This post is part of our Curators’ Corner series. Every so often we’ll feature a different DCN Curator. The series grew out of a community-building activity wherein curators at our partner organizations interview each other “chain-letter style” in order to get to know each other and their work outside of the DCN better. We hope you enjoy these posts!
How did you come to your current position?
I started at the Princeton Research Data Service (PRDS) in November 2019. Before this, I was a researcher. I completed my PhD at Notre Dame in a joint program in sociology and peace studies, focusing on multi-method research, mostly qualitative. I did some field research in the Middle East because my topical focus was mostly contentious politics and social movements. That led me into network science, social network analysis and computational methods.
Then, I was a postdoc at Notre Dame for about a year and a half working on an NIH-funded project investigating social networks and health behaviors that produced lots of data. I ended up doing data management most of the time even though it was something I wasn’t trained in. The data management skills I learned then, and that we now share in workshops, aren’t earth-shattering, but they take a lot of discipline. These are things people often end up learning on their own through painful experience. I learned a lot of things the hard way and realized there are so many skill gaps in contemporary research training, especially for folks in the social sciences who are collecting high-volume micro data like we were. PRDS was a brand-new service when I applied, and I came into it feeling the pain of not having a service like this. It was great to come in and to be a part of building something new.
Tell me what you do as a Research Data Management Specialist at Princeton University.
I work for our Research Data Service (RDS) in the libraries. This department consists of a small core team. We do a lot of outreach and education. For example, I’m doing a series on humanities data for people who are already working with data or are data curious. We also have a Research Data Stewardship Program, an eight-unit program over the academic year where we cover data management and open research across the research lifecycle, which we may eventually develop into a for-credit course. One of the silver linings of the pandemic is that for the last few semesters, all of our workshops were virtual, which meant that we got to record them and build up a virtual library researchers can watch on their own time, which is what researchers tell us they want. They want resources they can access at 3 am if they need to.
How much of your job involves Data Curation?
My colleague Neggin Keshavarzian and I both do some data curation for our institutional data repository, DataSpace. That’s been an evolving infrastructure. It used to be supported by OIT without any curation, but when we spun up PRDS, we started a curation service. The types of things we get coming in are mostly large datasets (over 100 GB), because the researchers are having trouble finding another repository. We also get people who are new to data publication and would like feedback. We tell them we’re happy to consult with them wherever they publish, but in consultations most people opt to go with DataSpace because it’s easy and available. People choose us because we’re here to help them, and we can handle big data. Because we’re pretty well staffed, we’re also pretty quick. That works in our favor.
Why is curation important to you?
A typical researcher may come to us because the journal they’re publishing in has a data sharing policy. We may say that the description is pretty thin, and they should expand it. At first, they might be annoyed, just because it takes additional time, but then what happens is they suddenly learn a lot about description or licensing. They thought publishing their data was just a quick chore they needed to get done so they can get on with the real work of publishing the paper. After working with us, they realize, OK, somebody could actually use this data. It’s well documented, the permissions are clear, it’s got a DOI, and I can give it a clear citation. Now it feels like valuable work and not a mere chore. Once they’re done with it, they’ll be glad they did it.
Why is the Data Curation Network important to you?
We’re still new to the DCN. We have this conundrum that a lot of services have, which is that if we do a good job with our outreach and encourage open scholarship and data publishing, then there’s no way we’ll be able to handle that volume. We need to be thinking in advance on how we could scale up, and the DCN is a part of that. I don’t know if anyone has a tally of how many papers are submitted by Princeton researchers each day, but there’s no way we could handle data for all of the papers being sent out. As we do more outreach, we need to be thinking about ways to be efficient and scale up, and as we start convincing more people to share their data, we will definitely be sending more things around to the DCN.
If you weren’t doing data curation, what would you be doing?
Before I started this role, I was at a real crossroads. I do enjoy doing academic research. I thought maybe I’d go back to working in nonprofits, in peace work or human rights work, and put my research skills to work there. I also see what I’m doing here as not that different. I’m trying to put some of my background and skills to use in mission-driven research that is ultimately about the democratization of knowledge and the advancement of science and growth of society.
What do you like to do outside of work?
It’s right there in the question. Outside of work, I like to be outside—hiking, traveling, fishing. I grew up in Oregon where outdoor activities are just a part of life. I’m always interested in traveling to a new, beautiful location.
To learn more about Matt and the datasets he has curated for the DCN, see his curator page!