Data Curation – The encompassing work and actions taken by curators of a data repository or archive in order to provide meaningful and enduring access to research.
The Data Curation Network will enable data repositories to better support researchers who face a growing number of requirements to ethically share their research data in ways that make it findable, accessible, interoperable, and reusable (FAIR).
Data curation enables data discovery and retrieval, maintains data quality, adds value, and provides for re-use over time through activities including authentication, archiving, metadata creation, digital preservation, and transformation.
Data curation skills span a wide variety of data types and discipline-specific data formats such as spatial data, code, databases, chemical spectra, 3D images, and genomic sequencing data. Each repository alone cannot reasonably account for all the curation expertise needed. Sharing our staff enables data repositories to collectively, and more effectively, curate research data in ways that are measurably of greater value than non-curated data.
The Data Curation Network (DCN) serves as the “human layer” in the data repository stack and seamlessly connects local data sets to expert data curators via a cross-institutional shared staffing model. Our vision for a fully operational DCN is to:
- provide expert data curation services for Network partners and (forthcoming) end users,
- create and openly share data curation procedures and best practices,
- support training and development opportunities for an emerging data curator professional community.
The Data Curation Network is supported by grants from the Alfred P. Sloan Foundation (Primary Award: G-2018-10072; Planning Award: G-2016-7044) and the Institute of Museum and Library Services (Workshop Series: RE-85-18-0040-18).
The Data Curation Network project team includes representatives from each DCN partner institution: University of Minnesota (lead), Cornell University, Dryad Digital Repository, Duke University, Johns Hopkins University, Penn State University, University of Illinois, and the University of Michigan.
DCN representatives are managers and directors of their local curation services and often have supervisory responsibilities for the DCN Curators who contribute staff time to the project.
Lisa R. Johnston is the Research Data Management/Curation Lead and director of the Data Repository for the University of Minnesota (DRUM) (http://z.umn.edu/drum) at the University of Minnesota Twin Cities Libraries. Johnston coordinates the library’s efforts around research data management and leads a team of five data curation experts for archiving research data in DRUM. Since 2012, Johnston has also served as the co-director of the University of Minnesota’s institutional repository for research and publications, the University Digital Conservancy (http://conservancy.umn.edu). Johnston is the co-editor (with Jacob Carlson) of Data Information Literacy: Librarians, Data, and the Education of a New Generation of Researchers (2015, Purdue University Press) and editor of Curating Research Data: Practical Strategies for your Digital Repository (2017, ACRL Press).
Mara Blake is the Data Services Manager at the Sheridan Libraries and Museums at Johns Hopkins University. In this position, Blake leads the Data Management Services and Geographic Information Systems (GIS) services for the library, including the JHU Data Archive (https://archive.data.jhu.edu). Previously, Blake worked on the Big Ten Academic Alliance Geospatial Data Project, a multi-institutional collaborative project to create a shared collection of metadata and discovery interface for geospatial data resources.
Jake Carlson is the Research Data Services Manager at the University of Michigan Library. Carlson oversees the creation, implementation and operation of Research Data Services (RDS) at the Library, which includes Deep Blue Data, launched in 2016. RDS at Michigan is centered on harnessing the diverse knowledge and expertise of all librarians across the library in managing, organizing, describing, sharing and preserving information and applying it towards understanding and addressing researcher needs with their data. Carlson is a primary architect of the Data Curation Profile Toolkit (http://datacurationprofiles.org) and the PI of the Data Information Literacy project (http://datainfolit.org). He is the co-editor (with Lisa Johnston) of Data Information Literacy: Librarians, Data, and the Education of a New Generation of Researchers (2015, Purdue University Press) and the author of numerous articles on roles for librarians in managing and curating research data.
Elizabeth Coburn is Project Coordinator for the Data Curation Network at the University of Minnesota. Elizabeth oversees daily operations and coordinates the activities of the DCN. She completed her BA at Grinnell College, and her MLIS specializing in Data Curation at the University of Illinois. Most recently, Elizabeth consulted in the area of web and software development specializing in metadata, Islandora, data curation, large-scale data migrations, and project management. Previously, she completed internships at the National Library of Medicine and the National Snow and Ice Data Center before serving as the Data Librarian at the Woods Hole Oceanographic Institution.
Hannah Hadley is Project Manager for the IMLS DCN data curation training grant at Pennsylvania State University Libraries, Digital Scholarship and Data Services. The two-year grant funds three data curation education workshops between 2018 and 2019. Hannah has a B.S. in Anthropology from Washington University in St. Louis, and is currently a graduate student at Drexel University within the MSLIS program. Her research interests include data curation and scientific research using archival sources. She is pleased to serve the data librarian community in this effort to educate and share curation experience.
Joel Herndon is Head of Data and Visualization Services in Duke University Libraries. In this position, Herndon leads the libraries’ programs providing consulting and training for data management, data visualization, digital mapping, and data analysis. Herndon’s recent research considers the evolution of data management requirements in social science journals and how research libraries can provide consultation and training to meet these publishing requirements. Recent projects include the launch of Duke’s new Edge research space, the ICPSR-led Learning to Curate project, and the TLRN Research Data Management and Use Task Force.
Cynthia Hudson-Vitale is Head of Digital Scholarship and Data Services at Pennsylvania State University. In this position, Hudson-Vitale leads digital scholarship services for the University, which include services in support of digital humanities, research data management, maps and GIS, data analysis, and open publishing. Hudson-Vitale has worked on several funded faculty projects to facilitate data sharing and interoperability, while also providing scalable curation services for the entire University population. Hudson-Vitale currently serves as the Visiting Program Officer for SHARE with the Association of Research Libraries.
Elizabeth Hull is Associate Director for Dryad (http://datadryad.org), an independent, non-profit, general-purpose repository for research data underlying scholarly publications. In this role, Elizabeth keeps Dryad projects and processes on track and moving as smoothly and efficiently as possible. She oversees curation and journal integration, manages the user help desk, and participates in strategic planning and priority-setting for the organization. Elizabeth holds a varied background in libraries/archives, web content management, public history and archaeology, all of which converge into an overarching commitment to open access and usability of knowledge.
Heidi Imker is Associate Dean and Associate University Librarian for Research, Director for the Research Data Service, and Associate Professor of the University Library at the University of Illinois at Urbana-Champaign. Imker came to the University Library in 2014 to become the Director of Illinois’ Research Data Service (RDS). The Illinois RDS is a campus-wide initiative that provides the Illinois research community with the expertise, tools, and infrastructure necessary to manage and steward research data. Prior to this position, Imker was the Executive Director of a large scale collaborative grant funded by NIH, called the Enzyme Function Initiative. There Imker was the co-director of the Data Core which aimed to manage, disseminate, and integrate research data produced by 15 different research groups across the disciplines of microbiology, metabolomics, molecular biology, structural biology, enzymology, and computational biology.
Wendy Kozlowski is Data Curation Specialist at Cornell University. Kozlowski is coordinator of the Cornell Research Data Management Services Group (RDMSG), a cross-campus, collaborative organization that provides data management services to faculty, staff and students throughout the entire research process. Operating within Cornell University Library’s Metadata Services group and as part of the library’s institutional repository (eCommons) administrative team, Kozlowski is the point person for both repository-wide and scientific metadata, and works with subject liaisons to curate data sets deposited into eCommons. Kozlowski also serves as chair of the library’s Repository Executive Group.
Hoa Luong is the Research Data Specialist at the Research Data Service at the University of Illinois at Urbana-Champaign (UIUC). Previously, Hoa was a graduate assistant at Chemistry and Grainger Engineering libraries where she worked on Grainger’s massive DMP review project and other engineering-related projects. She also interned with the Illinois Advancement in Data Analytics. Currently, Hoa leads the data curation efforts for Illinois Data Bank. Hoa earned her Bachelor’s degree in Food Science & Human Nutrition at UIUC and earned her MLIS degree from the iSchool.
Tim McGeary is currently the Associate University Librarian for Digital Strategies and Technology. In his role, Tim provides leadership for information technology services and operations within the Duke University Libraries, including management of Data and Visualization Services, Digital Curation Services, Digital Collections, Software Development & Integration Services, Core Services, and Discovery Services. Tim has presented nationally and regionally on library technology development, managing electronic resources, open source solutions, and the Open Library Environment (OLE). He has also published on topics of library technology and managing electronic resources as an investment portfolio. Tim came to Duke University from the University of North Carolina at Chapel Hill, where he served as the Director of Library & Information Technology. Prior to that, he served as the Team Leader of Library Technology for Lehigh University. Tim received a B.A. in Music and an M.S. in Information and Systems Engineering from Lehigh University, with his thesis on a multi-period capital budgeting model of subscriptions to electronic resources.
Claire Stewart is the Associate University Librarian for Research and Learning at the University of Minnesota. Prior to arriving at Minnesota in January 2015, Stewart held several positions at Northwestern University over a 21-year period, including directing the Center for Scholarly Communication and Digital Curation and serving as Head of Digital Collections. At Northwestern, Stewart served as campus lead for repository services and e-science, directing the creation of an E-Science Working Group and data management services as a collaboration between the office for research, information technology, and the library. At the University of Minnesota, Stewart is a member of the Libraries senior leadership team and co-sponsor of the Data Management and Curation Initiative. She directs the Libraries’ education and research support programs, leading staff who provide general and specialized support, including GIS, digital humanities, and data management and curation services.
The DCN Partner Institutions may differ in how our data curation services are offered – we have unique audiences, different policies or scope, and we manage unique repository technologies – however, the values of open access to research data and common standards for data curation bring us together.
|Institution|Repository Name|Repository Type|Technology|Metadata|Open Access|
|---|---|---|---|---|---|
|Cornell University|Cornell eCommons|Institutional|DSpace 6.2|Dublin Core|Yes|
|Dryad Digital Repository|Dryad|General|DSpace|Dublin Core|Yes|
|Duke University|Duke Digital Repository|Institutional|Hyrax|Dublin Core|Yes|
|Johns Hopkins University|Johns Hopkins Data Archive|Institutional|Dataverse|Dublin Core|Yes|
|University of Illinois|Illinois Data Bank|Institutional|Custom Ruby on Rails|DataCite|Yes|
|University of Michigan|Michigan Deep Blue Data|Institutional|Hyrax 2|Dublin Core|Yes|
|University of Minnesota|Minnesota (DRUM)|Institutional|DSpace 5.5|Dublin Core|Yes|
|Pennsylvania State University|Penn State ScholarSphere|Institutional|Sufia 7|Dublin Core|Yes|
The Data Curation Network is interested in growing our project during our pilot implementation phase (July 1, 2018 – June 30, 2021), which is supported by a grant from the Alfred P. Sloan Foundation to the University of Minnesota and by in-kind staff effort from across the DCN partner institutions.
The Data Curation Network will benefit researchers, their disciplines, and the end users of data world-wide by providing coordinated data curation services via a distributed network of expert curation staff. To grow the DCN incrementally during the implementation phase, we will sponsor four new partner institutions that bring a unique set of domain and disciplinary expertise as well as established data repository services. Two institutions will be invited to join in Year 2 of the project (July 1, 2019 – June 30, 2021) and an additional two institutions will be invited to join in Year 3 of the project (July 1, 2020 – June 30, 2021). We seek partners that:
- Have an existing data repository service.
- Have data curation staff who are able to commit at the following levels: (1) a DCN Representative for the institution at 10% FTE and (2) 1-2 DCN Curators at 1-5% FTE.
- Agree to adhere to our team code of conduct.
- (preferred) Institutions that help to expand our curation expertise, while also bringing new diversity to the project, are particularly welcome during the implementation pilot phase. Examples include:
- Curation expertise in 3D images, chemistry, chemical engineering, civil engineering, crop sciences, electrical engineering, genomics, and coding expertise in Java and Python languages.
- Geographic distribution outside of the Mid-Atlantic and Midwestern United States.
- Institutional types including small-to-midsized academic institutions, independent nonprofit organizations, disciplinary data repositories, etc.
- Repository platforms not currently represented.
- Individuals with diverse backgrounds and identities to better reflect the broader digital archives community.
New partner institutions joining during the DCN implementation phase will receive:
- Access to a network of expert data curators, who are trained in specialized data curation procedures and bring a broad variety of format and domain expertise, to curate data sets intended for your local repository service.
- Travel support for each institution to send a ‘DCN Representative’ to attend and participate in the annual All Hands Meeting that will be held in Minneapolis, MN in July 2019 and June 2020.
- Travel support for each institution to send two ‘DCN Curators’ to attend two-day in-person training in specialized data curation procedures (workshop to be held in conjunction with the All Hands Meeting).
- Opportunities to present at conferences and co-author articles and research papers on topics related to the DCN.
- Opportunity to help shape the future direction of the DCN during the pilot implementation phase.
- Option to continue as a sustaining member post-implementation phase (or not).
No monetary fees are required during this fully funded implementation phase, and there is no expectation of long-term involvement (the project phase ends June 30, 2021). Partner contributions to the project during this pilot are in the form of staff effort only. The institution may choose to formally join the DCN once the sustainability model (under development) is implemented in July 2021.
We invite expressions of interest from institutions with existing staff and expertise that would help expand our project. A survey form (preview PDF of questions) collecting expressions of interest will be open in Year 2 and Year 3.
- Now accepting expressions of interest for potential new partner institutions interested in joining the DCN in Year 2.
- Deadline: May 31, 2019.
- Fill out the survey or contact the PI directly.
Please contact Lisa Johnston, Principal Investigator of the Data Curation Network, with any questions.
The Data Curation Network (DCN) is a collaborative project that includes institutional representatives and curators from a range of institutions and organizations. These individuals represent a broad range of identities and backgrounds. This diversity is an asset to the project and to many collaborative projects. Consequently, the Data Curation Network adheres to this Code of Conduct based on the Contributor Covenant Code of Conduct and Geoblacklight Code of Conduct as guiding principles for our work together.
In the interest of fostering an open and welcoming environment, we as project members, curators, and institutional representatives pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
Examples of behavior that contribute to creating a positive environment include:
- Using welcoming and inclusive language
- Being respectful of differing viewpoints and experiences
- Gracefully accepting constructive criticism
- Focusing on what is best for the community
- Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
- The use of sexualized language or imagery and unwelcome sexual attention or advances
- Trolling, insulting/derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing or sharing others’ private information, such as a physical or electronic address, without explicit permission
- Other conduct which could reasonably be considered inappropriate in a professional setting
The ‘DCN Representatives’ are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
DCN Representatives have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. This code of conduct will not supersede any terms and conditions in place by a project member’s parent institution.
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project lead, Lisa Johnston at firstname.lastname@example.org, or the project coordinator, Liza Coburn at email@example.com. Complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. We cannot guarantee confidentiality with regard to the reporter of an incident, but will proceed with discretion and treat every reporter with respect. No member of the project may retaliate against an individual because of the individual’s good faith reporting. Further details of specific enforcement procedures may be posted separately.
Project members who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by the Data Curation Network Representatives.
This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html.