by Liza Coburn

Cite as: Coburn, Liza. (October 22, 2020). Assessing the Satisfaction and Engagement of Curators. Data Curation Network. Retrieved from the University of Minnesota Digital Conservancy, http://hdl.handle.net/11299/216691.


The Data Curation Network [1] collectively curates research data deposited to academic and non-profit data repositories. Curators participating in the DCN, who bring a wide variety of subject and file type expertise, are matched with datasets from across the network of (currently 12) partner repositories. After the first year of piloting the DCN’s shared curation service, we determined that a key consideration for a cross-institutional collaboration is the satisfaction of those participating: in our case, the DCN curators (Coburn & Johnston, 2020, p. 17) [2].

To address this need, we launched a brief, anonymous survey in May 2020, about two years into our three-year grant cycle. The intent of the survey was to gauge the satisfaction of DCN curators and their engagement with the DCN community, and to assess their interest in continuing their participation beyond the current grant phase (May 2018 – June 2021). Overall, the survey sought to highlight any barriers to curator participation and satisfaction. The timing of the survey was important, as we wanted to have a chance to analyze the responses, present them at our third annual All Hands Meeting (held in July 2020), and brainstorm ways to address any barriers going forward.

We received 19 responses from the 24 DCN curators invited to take the survey (79% response rate). This summary report presents our findings and discusses the work done over the subsequent months to address curator feedback and better engage our community members in the DCN going forward.
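The percentages in this report can be checked against the respondent counts with a little arithmetic. The following is a minimal sketch in Python; the function names are our own, chosen for illustration, and the only inputs are the figures reported above (24 curators invited, 19 responses). With 19 respondents, each individual response accounts for roughly 5.3 percentage points.

```python
# Figures taken from the report: 24 curators invited, 19 responded.
INVITED = 24
RESPONDED = 19


def response_rate(responded: int, invited: int) -> float:
    """Return the response rate as a percentage."""
    return 100 * responded / invited


def respondents_for(percent: float, total: int = RESPONDED) -> int:
    """Convert a reported percentage back to a whole number of respondents."""
    return round(percent / 100 * total)


print(f"Response rate: {response_rate(RESPONDED, INVITED):.0f}%")
print(f"15.8% of {RESPONDED} respondents = {respondents_for(15.8)} people")
```

The same conversion applies to any percentage quoted in the figures below.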

A copy of the survey questions is available here.

Survey Results & Discussion

Part 1. Participation in the DCN

We received a wide range of comments from respondents in this section of the survey. Most of the comments were positive with respondents indicating that they enjoyed participating in and being a part of the DCN.

Fig 1. Enjoy the most? We found that the largest share of DCN curators most enjoy belonging to the DCN community (42.1%). This was something we noticed over the first year of our pilot as well, and mentioned in the conclusions of our recent paper (p. 18). Curating datasets assigned to them by the DCN Coordinator (26.3%) and participating in special interest groups (15.8%) also received high marks.

Fig 2. Enjoy the least? While some curators find participating in special interest groups the most enjoyable part of the DCN, others consider it the least enjoyable (21.1%), along with the process of submitting datasets to the DCN for curation (10.5%) and opportunities to present DCN-related research at conferences and meetings (10.5%).

Special Interest Groups 

The idea for launching special interest and working groups within the DCN community took form at the 2019 All Hands Meeting. At the meeting we identified numerous topics people were interested in working on, including:

  • Human subjects and restricted data
  • Curation services on campus and community development
  • Large datasets / big data
  • Zip files / compressed formats
  • Code curation
  • File assessment: tips, tricks & tools

From this list we decided to proceed immediately with three topics: human subjects, curation services on campus, and large datasets. Curators interested in each of these three topics met in person during that All Hands Meeting and made plans for a second meeting to take place afterwards. These interest groups, with the exception of the big data group, continued meeting over the course of the next year and gave brief presentations on their progress at our 2020 All Hands Meeting. After this latest All Hands Meeting we followed up with curators about their interest in the existing groups or in forming additional ones. As a result, we launched (or revived, as was the case with the large datasets group) a number of new groups in September 2020:

Table 1. Special Interest Groups (re)launched in Fall 2020

Topic | Interested Curators
Big Data (large datasets) | 13
Campus Outreach | 6
Education | 10
End User Satisfaction (data authors) | 4
Expanding DCN Submission Workflows | 4
Investigating New Partnerships | 7
Racial Justice | 14
Refining the Standard Curation Protocol | 8
Value of Curation | 6
Curators that expressed interest in participating in 2020 special interest groups

More details about these groups and their activities are available on our website.

Curation Assignments

Fig 3. Enjoy the most? When it comes to curation assignments, curators find assignments most enjoyable when they match their technical expertise (36.8%), even more so than when they match their domain expertise (26.3%). Early DCN testing in the planning phase indicated that this would be the case (Johnston et al., 2017) [3], and it has likewise been the foundation of our curation assignment process.

The DCN matchmaking process

When a new dataset is submitted to the DCN for curation, it is classified by data/file type and discipline and matched to a curator with appropriate expertise – first according to the technical expertise required and then domain expertise. We also assess curator expertise periodically to ensure we understand our curators’ interests and competencies.

We also assess curators’ satisfaction and enjoyment of an assignment after each dataset they curate by asking a series of questions about their comfort (with data type and discipline), confidence, and enjoyment while curating as part of a dataset “ticket” in our project management tool, Jira. When a curator transitions a ticket in the workflow to indicate they’ve finished curation, they’re prompted to answer these questions, which the DCN Coordinator then reviews and follows up on if necessary. We’ve had very positive feedback overall – curators are generally comfortable with assignments, confident in their curation work, and find assignments enjoyable.

Finally, curators are able to update their domain and data type expertise and preferences at any time by editing our “Curator Expertise Matrix”, a shared Google Sheet that the DCN Coordinator references when assigning datasets.
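The ordering described above (technical expertise first, then domain expertise) can be illustrated with a short sketch. This is not the DCN Coordinator's actual procedure, which is a manual review of the Curator Expertise Matrix; the class names, field names, and sample curators below are all hypothetical and exist only to show the priority rule.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Curator:
    name: str
    data_types: Set[str] = field(default_factory=set)   # technical expertise
    disciplines: Set[str] = field(default_factory=set)  # domain expertise


@dataclass
class Dataset:
    title: str
    data_types: Set[str]
    discipline: str


def rank_curators(dataset: Dataset, curators: List[Curator]) -> List[Curator]:
    """Order candidate curators: technical match first, then domain match."""

    def score(curator: Curator) -> Tuple[int, int]:
        technical = len(dataset.data_types & curator.data_types)
        domain = int(dataset.discipline in curator.disciplines)
        # Tuples compare element by element, so technical fit always
        # outranks domain fit; domain only breaks technical ties.
        return (technical, domain)

    candidates = [c for c in curators if score(c) > (0, 0)]
    return sorted(candidates, key=score, reverse=True)


# Hypothetical curators and submission, for illustration only.
curators = [
    Curator("Curator A", {"R code", "tabular"}, {"Ecology"}),
    Curator("Curator B", {"GIS"}, {"Ecology"}),
    Curator("Curator C", {"tabular"}, {"Chemistry"}),
]
dataset = Dataset("Example submission", {"tabular", "R code"}, "Ecology")
ranked = rank_curators(dataset, curators)
print([c.name for c in ranked])
```

In this example Curator C, who matches on data type but not discipline, ranks ahead of Curator B, who matches on discipline only, which mirrors the preference curators reported in Fig 3.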

Fig 4. Enjoy the least? Curators consider DCN dataset curation to be least enjoyable when they’re not clear what’s expected of them in curating datasets for other partners (36.8%), when a dataset is missing key parts like documentation or contextual information (26.3%), or if they don’t curate for the DCN very often and feel unclear about the workflow (15.8%).

Barriers to the DCN matchmaking process

“I’m not always clear about what’s expected of me in curating datasets for another repository.” We’re constantly striving to improve our workflows and procedures, including how to manage assignments using our project management tool, Jira. We’ve found that it’s very important to strike a careful balance between asking for enough information about a dataset so that a DCN curator can curate it well and not overwhelming the submitter with too many fields in the ticket they create for each new dataset they submit to the Network (see list of ticket fields below). 

Table 2. List of fields in new dataset tickets submitted to the DCN’s Jira instance.

Field name | Field type
Summary [title] | Text box
Submitting institution | Select from dropdown
Due date | Date picker
Access link | URL
I’d like to partner with a DCN curator on this one… | Select from dropdown (yes/no)
Notes for the DCN Curator | Text box
Primary Format Type Extensions | Label field (create new or choose from list)
Data Type(s) | Checkboxes (check all that apply)
Discipline | Select from dropdown
Discipline – other | Label field (create new or choose from list)
Data author department | Text box
Description | Text box
Attachment | Upload
Time Spent | System widget (enter units of work completed, ex. 1.5h, 30m)

In response to the feedback in the survey, we repurposed an existing field (“Notes for the DCN Curator”) in the dataset ticket instructing the submitter, when possible and appropriate, to provide explicit instructions about their expectations for curation besides applying the standard CURATED step protocol (e.g. if there is code, should the curator try to run the code or not). We also encourage conversation between the submitter and the DCN curator, mediated via the Jira ticket, whenever necessary.

“When a dataset is missing key parts like documentation or contextual information (e.g. an associated manuscript, thorough metadata).” We’ve had many discussions within the DCN about whether or not we should accept datasets that do not come with any sort of documentation or contextual information (a README, or an associated manuscript). However, we don’t feel that during this pilot implementation phase we should discount or exclude any datasets submitted. And, since we’ve found that most often DCN partners are submitting datasets to the DCN because they lack the technical or disciplinary expertise to effectively curate the dataset, a DCN curator with expertise in the necessary data type or domain might be able to offer valuable insights into what should be included in the documentation.

“I don’t curate datasets very often so when I am assigned a dataset I can’t remember what I’m supposed to do.” Finally, while each new curator does receive onboarding training including an introduction to DCN workflows, procedures and the Jira system, curation assignments for some curators may be few and far between and we recognize that it’s hard to remember these workflows. We try to provide extensive documentation of these workflows and instructions for using the Jira system, but curators may forget that these exist and/or how to find them. It also takes extra time to review them during a curation assignment. In response to this feedback we held multiple Jira refresher training sessions during this year’s All Hands Meeting (July 2020), one in October 2020, and we plan to offer additional sessions periodically going forward. We’ve also developed a simple wiki page on our website with quick links to all pertinent curator resources.

Time constraints involved in the DCN

Fig 5. Enough time? Most DCN curators commit 5% of their time to the project, or only about 8 hours per month, and some commit even less – as low as 1%. While most respondents (73.7%) consider this 5% commitment to be the right amount of time necessary to participate effectively, the rest (26.3%) consider it too much.
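As a rough check on the "about 8 hours per month" figure, the sketch below converts a percentage commitment into monthly hours. It assumes a 40-hour work week and a simple four-week month; both are our assumptions for illustration, and actual appointments will vary by institution.

```python
HOURS_PER_WEEK = 40   # assumption: full-time work week
WEEKS_PER_MONTH = 4   # assumption: simple four-week month


def monthly_hours(percent_of_time: float) -> float:
    """Convert a percentage time commitment into hours per month."""
    return HOURS_PER_WEEK * WEEKS_PER_MONTH * percent_of_time / 100


for pct in (5, 1):
    print(f"{pct}% commitment = about {monthly_hours(pct):.1f} hours per month")
```

Under these assumptions a 5% commitment works out to 8 hours per month, and a 1% commitment to under 2 hours.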

Fig 6. Data curation as a percentage of your job? In considering some of these barriers to participation, it’s also important to note that most, if not all, DCN curators aren’t devoted to data curation full-time. This can be a barrier to entry – the recurring need to become familiar with a process that one does not do often.

Part 2: Engagement

In the second part of the survey we asked DCN curators how engaged they felt with the DCN community. Curators are generally responsive to assignments or requests for information, and the vast majority regularly attend DCN events, but these activities may not fully map to their level of engagement.

Fig 7. Level of Engagement? The vast majority (78.9%) of respondents reported that they felt engaged with the community and are happy to be participating in the DCN. One respondent reported feeling neutral, and a combined 15.8% (3 respondents) reported feeling unengaged or completely unengaged. Unfortunately, due to the anonymous nature of the survey we’re not able to identify those curators who feel unengaged, but we did strongly encourage DCN partner representatives to follow up with curators on their teams, and vice versa. 

Fig 8. Communication methods? We employ many communication methods in the DCN community (e.g. email, Slack, Jira) and encourage DCN representatives to communicate regularly with the DCN curators at their institutions about DCN business or concerns. Jira, the primary method for communicating about datasets and curation assignments, and the DCN-Curator’s Google Group are considered the two most effective communication methods (31.6% each). Slack and communications with their DCN representative are also seen as highly effective by some (15.8% each). We continue to make use of all of these methods to date.

Stepping Up Our Curator Engagement Efforts

Since the survey, we’ve continued to use all of the communication methods mentioned above, but we’ve also stepped up our efforts in a few new areas:

  1. The DCN partner representatives team meets on a biweekly basis to report out on activities, work on initiatives and generally touch base. The DCN Coordinator has begun sending out recap emails to the curator email list after every meeting to keep curators better informed, and our hope is that discussion is also happening between partner representatives and the curators on their teams.
  2. We’ve held brief (15 minute) weekly standup meetings about the latest DCN datasets, which are open to the entire DCN community, since shortly after launching the pilot curation service. These are attended by 5-10 DCN curators on average. Since the survey the DCN Coordinator has begun issuing email reminders about these stand-ups to the DCN community on the day of the event.

Additional Respondent Comments

In the comments section for this part of the survey a few people mentioned that they found our routine (semi-annual) partner check-in meetings very engaging and productive – especially during the last round when we met with two partner teams at a time. These meetings began as a way for DCN staff to check in with partner teams to assess how the curation service was working for everyone (submitting to the DCN, curating for the DCN, etc.). 

However, as time passed and it became clear that things were running smoothly, we pivoted from discussing DCN workflows to facilitating discussions between different DCN partner teams about their local workflows. And at our virtual All Hands Meeting in July 2020 we planned a session to expand on some of the ideas and conversations that took place in these partner meetings, and held breakout sessions within the session as well as facilitated meetings throughout the week between partner teams and community members. 

A number of curators expressed interest in continuing to work on these initiatives after the All Hands Meeting; however, we felt that there was limited bandwidth to do so, especially considering the (re)launching of various interest groups and the time needed for the Network’s curation efforts.

Part 3: The Future of the DCN Community

The timing of this survey coincided with the beginning of our third year of the implementation phase, funded by Sloan. The DCN data curation service was up and running – smoothly – and we began preparation for transitioning the project off grant-funding.

Still, we asked: Was there anything else DCN curators would like to see us focus on in year 3? A common sentiment from respondents was that we should continue to focus on what’s working rather than try to do a lot of new things.

Fig 9. Participation post-implementation phase? We also asked curators to indicate if they would like to participate in the DCN past the implementation phase. The vast majority of respondents (84.2%) indicated that they would like to continue their participation, while a few (15.8%) were unsure.

While most curators seem to feel very positive about being a part of the DCN and our community, and consider their participation meaningful both personally and to the larger professional community, others reported specific aspects of the DCN that they’d like to continue participating in (e.g. primers, curation, education workshops), and a few weren’t so sure about their participation.

Offering curation services beyond DCN membership

Finally, offering curation (or other) services beyond the DCN partners has been a consideration since the inception of the project. As we’ve been planning for our future sustainability, governance and membership after the end of the implementation phase, the question of whether or not we should offer curation services to individuals and/or non-partner organizations has been a major topic of discussion among DCN partner representatives, and we wanted to get the curators’ opinions.

Fig 10. Curation service for non-members? The largest share of respondents were unsure (47.4%), 36.8% felt that the DCN should offer services to non-members, and 15.8% felt that the DCN should not offer services to non-members. As with the previous question, respondents were required to explain their response, and responses varied widely (Table 6).

Generally, a few respondents felt that offering our expertise and services to institutions lacking the same level of curation resources as DCN partner institutions was a positive thing and provided a way to expand our reach, further our mission, and bring attention to data curation in general. Others felt that accepting money for curation services from non-members may impact the cooperative, community feel of the DCN. Almost everyone, though, indicated in their comments that more information was needed. Curators are unsure of how it would work logistically (e.g. coordination, cost recovery, compensation).

Fig 11. Curation services for individual researchers? As a follow-up question we asked how curators would feel being asked to actually curate a dataset submitted by a non-DCN partner. About half of respondents (52.6%) remained unsure pending more details, and about a third (31.6%) indicated they would be willing to curate these datasets. One respondent said they would not be willing, one said they would only if the dataset was the product of federally-funded research, and one said they’d still be willing to curate despite their concerns. 

The question of how offering services beyond DCN membership would work in practice remains unresolved. We will be transparent and diligent in our efforts to include curators in the process as we explore this option via additional market research on this topic. We’re planning to identify prospective organizations and institutions in various sectors and hold focus groups to understand what their needs are so that we might better understand how the DCN could fulfill them. We’ve been keeping the DCN curators informed of our planning and are hoping curators will be interested in leading these focus groups.

Limitation

In addition to the aggregated survey data presented in this report, we received 75 rich qualitative comments from respondents in this survey. However, we did not obtain participant consent to share these comments publicly.

Conclusions 

We’ve learned a lot about community-building during the implementation phase of this project – namely that it’s very challenging, especially as the Data Curation Network has expanded from 6 to 12 partner organizations and from approximately 20 to close to 50 individuals. As with any cross-institutional collaborative project, it is hard to accommodate every community member’s wants and needs, but we do our best, and we’re happy to see that generally people seem satisfied.

We’ve always tried to adapt and increase efficiencies, but in response to some of the feedback in this most recent survey, we’ve made a special effort to increase our communication and outreach, adapt our workflows, provide expanded opportunities for engagement (e.g. special interest groups), and keep curators better informed and “in the loop” about discussions regarding the future sustainability of the DCN after the implementation phase ends in 2021.

We’ve also taken to heart survey feedback indicating that things are running smoothly and that generally, people seem satisfied with their participation and engagement in the community — that we should focus on what’s working and continue to do those things well rather than try to do more in the limited time remaining. We also acknowledge that curators may be at capacity for how much time and effort they can devote to the DCN, considering that DCN participation and data curation itself may only be small parts of their larger job responsibilities.

During the remainder of the DCN’s implementation phase we hope to continue strengthening our community by building on key activities like the special interest groups and involving curators in the planning and decision-making processes related to sustainability of the Network and its community post-implementation phase. 

Footnotes

  1. The Data Curation Network currently involves 12 partner institutions and is funded by the Alfred P. Sloan Foundation, https://datacurationnetwork.org/
  2. Coburn E, Johnston L. Testing Our Assumptions: Preliminary Results from the Data Curation Network. Journal of eScience Librarianship 2020;9(1): e1186. https://doi.org/10.7191/jeslib.2020.1186.
  3. Johnston, Lisa R; Carlson, Jake; Hudson-Vitale, Cynthia; Imker, Heidi; Kozlowski, Wendy; Olendorf, Robert; Stewart, Claire. (2017). Curate a data set to capture and compare curation activities across network partners. Retrieved from the University of Minnesota Digital Conservancy, http://hdl.handle.net/11299/188649.