On behalf of the Data Curation Network (DCN), we appreciate the opportunity to provide input on the Best Practices for Sharing NIH Supported Research Software, Notice NOT-OD-24-005. The DCN has been an advocate for open and ethically shared data since our inception, and we are excited to see and support NIH’s continued efforts to provide access to invaluable data and research outputs. In addition to our comments submitted directly to the request for input, we offer the following feedback.
In general, we recommend that NIH share a formal definition of research software. We suggest the definition from the FAIR4RS Principles: “Research Software includes source code files, algorithms, scripts, computational workflows and executables that were created during the research process or for a research purpose.” We also recommend that NIH emphasize the need to share data alongside well-documented, and ideally self-documenting, code for reproducible research.
Researchers and data stewards would also benefit from knowing, explicitly, the connections between Data Management and Sharing Plans (DMSPs) and other plans, such as the Resource Sharing Plan and Data Security and Monitoring Plan. These components seem interrelated, so more clarity from NIH on how they do, or do not, interact would be beneficial.
We would also like to emphasize that reusing open source software largely depends on two factors: documentation and licensing. For any software that is shared, it must be clear exactly how to put the code together in order to reuse it, and this depends on clear documentation. Other factors can play into reuse, such as the programming frameworks or language chosen, but a codebase with no license and no documentation is effectively impossible to reuse.
It is worth noting that current toolkits and informative materials may need to be refined and redeveloped for a number of reasons:
- Current tools may not address a specific need of a community of practitioners at all, or may address it insufficiently;
- Current tools may be exclusively proprietary and expensive, leaving people in need of a free alternative (e.g. qualitative analysis software is dominated by proprietary tools, with three free and open source alternatives);
- Current tools may not be deployable in a way that makes sense for a group (e.g. desktop vs. server-based software);
- Current or previous tools that once met every need may go unsupported and eventually become unusable on modern machines.
There are a number of reasons to create new software, but the paramount goal should be creating interoperable and open source software. Tools that work with each other can draw on a larger base of support, which bodes better for maintenance. Tools that are open promote interoperability (and verifiability for research).
Lastly, while we recognize that this guidance is aimed at a broad spectrum of researchers, and is therefore high-level, we also note that scholars would benefit from more in-depth support in sharing their software. Tools like a checklist or assessment criteria would be exceedingly useful for confirming that a standard or recommended practice has been sufficiently met.
Thank you again for the opportunity to provide feedback on the Best Practices for Sharing Research Software. We would be happy to provide additional information or clarification, and would welcome a meeting to discuss further.
Mikala Narlock,
Director, Data Curation Network
The full response has been archived and is accessible at: https://hdl.handle.net/11299/260307
With special thanks to Madina Grace, Laura Hjerpe, Greg Janée, Sherry Lake, Vicky Rampin, Nicholas Wolf, and Rachel Woodbrook for their comments and suggestions that are the foundation of this blog post and feedback.