This is the story of policy development. As federal data sharing mandates change and grow, we thought our experience of adding a new policy might be timely and could serve as a roadmap for others who find themselves in similar situations. We tell this story from our roles as data management educators in the health and social sciences and as curators for our institutional repository at the University of Minnesota.
The Data Repository for the University of Minnesota (DRUM), launched in 2014, is a collection of digital research data generated by University of Minnesota researchers, students, and staff. Datasets published in DRUM are openly available and can be downloaded by anyone without restriction. DRUM accepts all types of data, including human participant data, and was launched with a Data Collection Policy that governed expectations about what would be accepted. This policy included language that prohibited the acceptance of private, confidential, or legally restricted data.
As those who work with human participant data know, what counts as “private, confidential or legally restricted” can be ambiguous. Other features of the data, such as the sensitivity of the content and appropriateness of the consent, can also affect the potential risk involved in making data open. Because of these nuances, we have always assessed submissions that contain human participant data before acceptance. The nature and focus of these assessments, however, often depended on the specifics of the datasets received and were not necessarily consistent across time, as human participant data submissions were still relatively infrequent. While DRUM’s policies addressed privacy and de-identification issues, supporting the decisions we made to reject dataset submissions based on the possibility of re-identification, they did not cover another issue we encountered – (in)appropriateness of consent. After several years of curating human participant data, we grew increasingly concerned about whether participants had appropriately consented to having their data shared in an open access repository without restrictions. To better document and formalize how we were handling (and ultimately how we wanted to handle) human participant data in DRUM, the authors and the DRUM manager at that time reviewed DRUM’s past and current practices regarding these submissions, the ethical considerations weighed, and the ultimate actions taken. This enabled us to develop a set of clear internal guidelines, a first step in establishing guideposts for our assessment of human participant data submissions.
Policy formulation and research
In 2018, we rejected our first human participant datasets based on ethical considerations regarding language in the consent form. This decision, and many that followed, resulted in challenging conversations with researchers who pushed back on our recommendations. Researchers brought concerns to the Institutional Review Board (IRB), which also had no defined policies or guidance to limit data sharing practices after collection, especially for exempt and low-risk studies. Our own Libraries’ administration also questioned whether DRUM had the authority to turn away researchers, as the mission of the repository was to serve the research needs on campus. We attempted to formalize our internal assessment criteria for human participant data in a new DRUM policy, but were unable to gain the traction and support needed to make these criteria policy.
These challenges led us to carry out an analysis of repository policies in 2019. What were other repositories doing to ensure appropriate consent for data sharing? What policies were in place to guide collection and rejection of submissions? To answer these questions, we scanned 105 repository websites drawn from the Data Curation Network (DCN), the Big Ten Academic Alliance, the National Institutes of Health (NIH), generalist repositories (e.g., Zenodo, figshare), and repositories recommended by PLOS One. We searched for and evaluated language related to participant consent 1) on the website and 2) in the deposit agreement (see the full results of the study). The final analysis included 19 repositories whose website and/or deposit agreement contained information about participant informed consent. Only 5 website pages labeled the information as policy – the others used language such as guidelines, recommendations, FAQs, etc. A total of 15 of the 19 repositories had deposit agreements, and only 12 of those agreements stated that data sharing should be consistent with informed consent.
We also examined the consistency between the informed consent language found in the deposit agreement and the language found elsewhere. In 10 cases, the language found on the website was generally consistent with the language found in the deposit agreement. However, in a third of cases (5 of 15), the language found on the website or through other means did not align with the informed consent language in the deposit agreement. In sum, most repositories that provided guidance around the consent process advised explicit data sharing language in the consent form; however, there was enormous inconsistency in how the consent process guidance was labeled, where the information was located, and the level of detail provided. Also, the guidance was largely just that – guidance, rather than a formalized policy, resulting in little or unclear enforcement. We also did not find any rejection mechanisms in place for circumstances where a depositor submits data collected with a consent form that says the data will not be shared. Finally, repositories that made it to our final round of review tended to be NIH repositories that collected human subjects data within certain fields of medical research (although not all such repositories had guidance). Institutional or generalist repositories that collect data broadly tended to reflect DRUM’s own practice of addressing only personally identifiable information (PII) and confidential information in data submissions.
Equipped with our study results and recommendations recently made by the FORCE11 Research Data Publishing Ethics Working Group, of which Hunt was an active participant, curators drafted a new DRUM informed consent policy proposal. Because administrative staff experienced significant turnover in 2020, our policy proposal and supporting documentation were sent to a new group of library administrators in 2021. The proposal became official policy in October 2021.
We frequently turn researchers toward the policy before, during, and even after submissions. We have also worked closely with our IRB to help integrate data sharing language into consent form templates, which has been especially important with the release of NIH’s 2023 Data Management and Sharing Policy. This collaboration, along with our policy and paired guidelines, has provided a formal framework with which to make data sharing decisions and has helped align guidance across campus stakeholders. Since the policy was established, we have received less pushback from researchers, and we have incorporated the policy into an official review process before acceptance so that submitters are aware of it. This is done through an automated email sent upon upload: “If you submitted data that were collected from or about human participants, please share a blank copy of the informed consent form, information sheet, or other participant agreement used. We will review it to make sure it meets our Human Participant Data Policy.” When submissions are rejected from DRUM, we also support researchers in adjusting their consent forms, whether for future studies or for re-consenting participants, and in depositing in more appropriate restricted repositories.
The documented policy and guidelines have also become a training tool for new curators, interns, and other curators without human participant experience. Heightening awareness of the policy is still a challenge, and we continue to have submitters who don’t understand why we might reject their data even with the policy in place. Overall, adoption of the human participant data policy and paired guidelines has been a net benefit to DRUM. Overcoming the hurdle of researching, drafting, and obtaining approval for the policy established our autonomy and status as experts on ethical data sharing, which is perhaps the most critical takeaway.
Reflecting on the process of developing a new policy for DRUM has led to several recommendations we would make to others:
- Develop a framework for decision making and establish its parameters. Essentially, document your practices before writing and proposing a policy, even if initially this is for internal use only. Doing so allowed us to be explicit with decision makers when justifying our policy proposal and to be consistent in how we made decisions, especially when situations were ambiguous.
- Find out who has authority and ownership over the policy making process so that you can write a proposal tailored to your audience. Acknowledge that there could be other campus offices that have a say in the policy – being connected to those entities prior to your proposal helps establish cohesive and consensual messaging. For example, concurrent with this work, the authors also joined the campus-wide Education Advisory Group under the Office for the Vice President of Research. Our successful seating on that group may have reassured Libraries’ decision makers that we were not going rogue, but rather engaging with other offices.
- Perform a scan of other repositories for related policies and/or guidelines. We found it fruitful to be able to point to established precedents. The scan does not necessarily have to be a formalized study – we believe highlighting a handful could be just as impactful.
- Name the expertise of folks involved in the policy proposal. In our case, the two human participant curators (authors of this post) had research backgrounds and the DRUM manager was an archivist – we leaned on this pooled expertise in drafting our proposal. Consider your team expertise and how that may add leverage to a policy proposal. If there is expertise missing on your team, who are other experts you can turn to?
We briefly mentioned above that one of our challenges with the new policy has been how to communicate the changes to campus investors. It has been a slow process. We worked with Libraries’ Communication to get a short message into the University-wide newsletter, and leveraged our activity in other campus groups to share the message in their newsletters as well. But our primary communication about the new policy happens at the individual level as researchers approach us about sharing their human participant data (i.e., at the time of need). While we did work with our IRB to get the consent form template language updated to reflect data sharing options, it was an extremely slow process. We drafted recommendations in 2019, and they were implemented in late 2022, having been prioritized because of the NIH mandate taking effect at the beginning of 2023. Data sharing and repository policy changes that reflect new mandates require a shift in culture, and that takes time, patience, and persistent messaging. We continue to build more relationships with campus groups with similar interests to strengthen our collective efforts to share data responsibly and ethically.