This post was authored by Heidi Imker, University of Illinois at Urbana-Champaign, and Joel Herndon, Duke University.
As data curators, we spend much of our time and energy preparing research data for publication. While there are many reasons for curating research data, one is the potential for reuse. Data deposits of growing complexity and size led us to wonder: which aspects of curation most affect the reusability of a dataset? That is, given finite time, how is curation time best spent?
To answer this question, a few of us started by looking to the data reuse literature. This was prompted by many factors, including a discussion at All Hands 2023 that surfaced general interest in data reuse as a topic. As we explored the literature, we came across the work of Kathleen Gregory and Laura Koesten, namely their book Human-Centered Data Discovery, which discusses, among other things, an essential precursor to reuse: how researchers find data. Knowing this was of interest to the DCN (and us!), we reached out to Kathleen and Laura, who kindly agreed to host a half-day workshop building on their work.
All Hands 2024 at Duke University was the best venue for this workshop: we knew many members of the DCN would be in attendance, and we loved the idea of offering the opportunity to a large portion of our community. Additionally, we often host a CURATE(D) workshop at All Hands to ensure our curators' skills are up-to-date, but since this was our seventh All Hands, we thought a different flavor of training opportunity would complement other activities.
Here are a few concepts from the workshop that especially resonated with DCN members:
There is no typical data user, but there are typical data needs and making sense of data is done with others in a dialogue and collaboratively.
The ideas of data as object and data as process. Data as process meaning it is dynamic, situational, and shaped by context – I just love that, really spoke to my curator heart.
I really appreciated their presentation of simple frameworks for considering data use (e.g. Discovery -> Interpretation -> Use)…I’ve used this several times since May in considering data re-use.
Something that stuck with me was the reminder that people don’t necessarily reuse entire datasets. People often piece together parts of multiple datasets, which requires us to think creatively when we’re curating and describing data for reuse.
While the conversation included far too much to summarize here, Kathleen and Laura published a blog post with their takeaways. One particularly resonant point:
“Data curation is a translation activity. Data curators and researchers often view the value of data and data management differently. While researchers may view data management as a box-ticking exercise, data curators are motivated to preserve and archive data in ways that serve a community. The translation work involved between these two perspectives is vital, but not something that can be taken for granted or which is easily achieved.”
We are so grateful to Kathleen and Laura for traveling many thousands of miles to share their expertise and spark conversations among the DCN.