We wanted to share some results about how data depositors feel about the curation services they receive from our repository staff.
In spring 2021, members of the Data Curation Network surveyed 568 researchers who had recently deposited data into one of six academic data repositories (see Table 1). Our Qualtrics survey asked respondents to consider their most recent data curation experience and prompted them with the name and DOI of their recent data publication. The 11-question survey ran for about two weeks at each institution and received 239 valid responses, a 42% response rate.
| Repository Affiliation | Date Sent | Distribution Count | Response Count | Response Rate |
|---|---|---|---|---|
| Cornell eCommons | 4-26-21 | 34 | 18 | 53% |
| DRUM (U Minnesota) | 4-26-21 | 197 | 82 | 42% |
| U Michigan Deep Blue Data | 5-11-21 | 130 | 63 | 48% |
| Duke Research Data Repository | 5-05-21 | 54 | 19 | 35% |
| Illinois Data Bank | 5-18-21 | 121 | 45 | 37% |
| JHU Data Archive | 6-17-21 | 32 | 12 | 38% |
| Total | | 568 | 239 | 42% |
Results Preview
The participants in our survey, though not a representative sample of all researchers, revealed some interesting things about their satisfaction with the curation process.
We defined data curation as: the various actions taken to ensure that data are fit for purpose and available for discovery and reuse. During the curatorial review process, a data curator may check files, review metadata, and/or make suggestions that would help others find and reuse your data. (Data Curation Network Satisfaction Survey, Spring 2021)
Did you expect repository staff to curate your data? Were you satisfied?
A majority of depositors (65%) said yes, they expected curation, while 34% answered no. And nearly all strongly agreed that they were satisfied with the result (see Figure 1).
Were any changes made to your data submission due to the curatorial review?
To get at this question, we asked depositors to consider changes made to the last dataset they published in the repository, and most (75%) said yes (Table 2).
| Results | Follow-Up Question |
|---|---|
| Yes (75%) | “If yes, what changes were made?” 8% essential changes were made (e.g., an error was corrected); 27% some major changes were made (e.g., files updated/added); 63% a few minimal changes were made (e.g., small edits/additions); 2% unsure. (100% total) |
| No (14%) | “If no changes were made, why not?” 94% no changes were needed; 0% I did not agree with the recommended changes; 3% I did not have time to work on this; 0% unsure; 3% other. (100% total) |
| Unsure (11%) | Skip to next question |
| Total (100%) | |
Due to the curation process, did you feel more confident sharing your data?
When asked whether the curation process made them feel more confident sharing their data, depositors responded:
- 67% Strongly agree
- 23% Somewhat agree
- 9% Neither agree nor disagree
- 0% Somewhat disagree
- 1% Strongly disagree (however, this response was contradicted by the same respondent's positive free-text comment, so it may be an error)
Would you recommend your colleagues submit data to this repository? May we contact you in the future to discuss data curation?
A whopping 98% said yes, they would recommend the repository, while just 2% were unsure and none said no. And when asked “May we contact you in the future to discuss data curation?” 67% (161 researchers) said yes.
Does data curation by this repository add value to the data sharing process?
Figure 2 shows that 81% Strongly agree, 13% Somewhat agree, 4% Neither agree nor disagree, 0% Disagree, 0% Strongly Disagree, and 2% left blank.
What is the most “value-add” curation action taken by this repository?
Here we received 182 free-text comments (76% of the 239 respondents!). For example:
“The data curation was helpful in that it forced me to consider the viewpoint of the end user of the data. The curators nudged me to add pieces that would help third parties navigate the data (e.g. readme file) months or years in the future. In retrospect this is a good thing; at the time I was focused on getting my thesis chapter out and published, and didn’t pay attention to this aspect as much as I needed to.”
“Meticulous review of our datasets and code enabled us to catch several errors that would have hindered replication efforts by other scholars. Gaps in documentation/codebooks, which are hard for us to detect because the data is so familiar to our team, were also detected and corrected. Feedback from someone who comes to the data/documents with fresh eyes is simply invaluable….The <repository name redacted> team has done this for us on several occasions, and it gives us greater confidence that if we make a mistake in our own quality control process, it will be caught before the data is disseminated.”
“It ensures the data from all my studies is shared in a uniform, easily identifiable, and accessible manner. This makes navigation for the end user easy and thus makes my work more impactful for the community as a whole.”
Preliminary Conclusions
Overall, we were very pleased with the response to our survey. We ran this survey alongside another DCN survey on the “Value of Curation” that was aimed at repository staff, and it will be interesting to compare the results between our two groups (repository staff and depositors).
We think this survey might be useful for others in the repository community wishing to obtain depositor feedback on their local curation process.
Full dataset and publication coming soon. Meanwhile, feel free to download our survey instrument and invitation letter.
- Sarah Wright and Wendy Kozlowski, Cornell University eCommons
- Lisa Johnston and Wanda Marsolek, Data Repository for the University of Minnesota (DRUM)
- Hoa Luong and Susan Braxton, University of Illinois Data Bank
- Joel Herndon and Sophia Lafferty-Hess, Duke University Research Data Repository
- Jake Carlson, University of Michigan Deep Blue Data
- Mara Blake and Marley Kalt, Johns Hopkins University Data Archive