This post is part of our Curators’ Corner series. Every so often we’ll feature a different DCN Curator. The series grew out of a community-building activity wherein curators at our partner organizations interview each other “chain-letter style” in order to get to know each other and their work outside of the DCN better. We hope you enjoy these posts!
Renata Curty is a Social Science Research Facilitator at the University of California – Santa Barbara. Renata was interviewed by Matt Chandler in April 2022.
How did you come to your current position?
I’ll have to go back to 2015, when I was completing my PhD as a Fulbright Scholar at Syracuse University–which I was drawn to because it had the first i-school in the US and many great mentors to work with. When I applied for the Fulbright, my interests were around human-computer interaction and leveraging the adoption of repositories, especially institutional repositories, but then I developed an interest in the data community. I was a strong believer in disciplinary approaches for institutional repositories, because after talking to researchers I could see that there would be a shift of attention to institutional repositories, in the sense that they can expand knowledge with re-usability and reproducibility. I was at first really focused on the platforms for institutional repositories, and how to advance the systems for usability, accessibility, and user friendliness–because that was already something I was teaching about in Brazil. But then I realized it’s not a platform issue; it’s a social issue. It’s more related to factors that are stimulating or motivating or hindering people from sharing data. So I transitioned from this focus on the platform and system design to behavior and to what extent people would really be interested in re-using shared data. Not many people were discussing data re-use at the time; we were mostly focused on data sharing. I connected to some people who were also interested in data re-use and the behavioral side.
My dissertation study was especially focused on the social sciences, including platforms and facilitating conditions, as well as social factors, like how others in a researcher’s field devalue secondary data use and instead push them to produce research based on original data. I could then identify some trends that promote or hinder data re-use, and curation was a part of that, because most of the struggles were related to poor data documentation. The results of my dissertation laid out some recommendations that could be implemented by repository managers and data curators.
But I had to go back to Brazil to complete my degree and then stay with my home institution for four years. When I left Syracuse, I was really interested in pursuing my career in the direction of data curation, but it happened that back in Brazil, things were still developing. We were still in the early discussions of repositories, and things were not at the same place as here in the US, so I went back to teaching. I was an assistant professor at a state university in southern Brazil, and I went back to teaching on scholarly communication and metadata. I had an opportunity to teach a few classes related to data re-use and curation, and I realized that it takes some time for change at library science and information schools. So just after I completed my fourth year in Brazil, I was looking for opportunities in the US, and I found this opportunity to join UCSB library and to be engaged in curation, teaching, and developing resources to support data management.
So it took me a good number of years to mature my interests, but also I was always in communication with some of the schools here in the US–going to IASSIST, RDAP, and other conferences to stay connected with what was going on in the data curation field. That involvement captured the momentum and my interest to come back to the US and pursue my career here. It was not an easy decision to leave a secure position in Brazil, but I’m really happy with my role at UCSB and all of the connections I’ve made here and in the DCN and other associations in the US.
Tell me about your position, and how much of your job involves curation.
My current job title is “Social Science Research Facilitator,” as of about a year ago, and it was changed from my original title to better represent what we do. When I applied for this position, the title was “Data Curator,” but I knew that curation was only one fraction of the tasks involved in the position, and it was clear to me that I would have opportunities to do research, to be engaged in consultations, and to be involved in teaching. But since I started, in February 2020, I have known that this position would also involve outreach. I am the liaison for some of the tools here, like DMPTool, and I advocate for the use of Dryad, which is our current institutional repository. We give these talks and demos across campus, but in my position, mostly to the social sciences. I would say that now curation accounts for 15-20% of my role, but that’s because we are still in the production stage for Dataverse. We are hoping to launch in the summer or fall of 2022.
Right now, we are part of the curation workflow with Dryad–and that was by request, because when I first joined, when curation was in my job title, I had access to the deposits, but I didn’t have a role in evaluating anything from UCSB. Most of the time, it was too late for us to request any changes after deposits were already published. So we had a meeting with the Dryad team, and we expressed interest in being curators for all of the deposits associated with UCSB, and we were granted that role. Since March 2021, we have been part of the workflow, performing curation for the deposits associated with UCSB in Dryad. We pitched this as a great opportunity for us to connect with researchers on campus, and it has been working great. I think it’s been a learning experience, especially because we are planning to have a Dataverse in addition to Dryad, and we won’t have a massive team of curators to help us. It has been helping our instructional efforts as well, because, based on what we are seeing in terms of best practice and opportunities to improve deposits, we are reshaping our instruction. When we give classes for data management or responsible conduct of research, we are able to bring in examples. Our “Data Literacy Series” and guidelines and handouts–and everything we do in terms of outreach and our data management services–are informed by our curation. Before having access to that part of the process in Dryad, we would get examples of things that we learned about, but not necessarily things that are happening on our campus. It’s also an opportunity for us to advocate for our researchers and encourage progress in their praxis. Students are also learning about Dryad from faculty, and more people have been prompted to come to our services and learn our recommendations.
Why is curation important to you?
The whole principle of data sharing is to make the resources reusable. When we talk about data accessibility, we are talking about time and effort, along with infrastructure. Even though reusability cannot be guaranteed, we have to make sure that data are not just piling up to sit idle for generations. The data need to be of good quality and have good documentation so that they can be reused. If we are giving the opportunity for the public to access data, we want to make sure we have checks to ensure quality. Some research fields still need to learn more about the data ecosystem and how data can have value in addition to their original research.
The value of curation might mean different things depending on who you talk to–like depositors versus curators–but we can see that all of the time invested, and all of the resources and infrastructure required, need to be taken into account and expressed in the robustness of the data, code, and documentation shared. If you look at other repositories that have very minimal metadata requirements and no inspection, like Figshare or Zenodo, and you talk to people who have worked with what has been shared, you can see how hard it is to explore these types of repositories and find something that will be valuable. I see curation as a really important stage; otherwise, we miss the link between sharing and reuse. Curation should be in the middle, and it should be done in a way that we are really seeking to improve the deposits of data and code.
Going back to my dissertation research, I understood not only the facilitating conditions, like having an infrastructure for data sharing, but also the policy implications. It’s really important to not only have a top-down approach, with mandates, but also incentives for sharing. I learned from depositors that they often doubted that anyone was checking their submissions and questioned whether it was anything other than a requirement to check off. What really resonated with me while I was doing my study was that we have to work on incentives. At that time, we were talking so much about the sticks, but not the carrots. When the mandates were new, researchers were just angry; they didn’t have direction, and they didn’t know where they should deposit their data. I was involved in RDAP at that time, starting about 2011, and I remember the community included researchers from every field, and they were really angry about not having direction. Learning from that experience, as well as my dissertation, informs the things we are doing at UCSB and the UC system at large. We don’t treat every field as the same, because they have different traditions. Even in the social sciences, it’s a huge umbrella, so if I talk with people from psychology who are invested in experiments and have more of a quantitative tradition for most of their studies, it will be completely different from talking to folks from other disciplines. So we can’t have a one-size-fits-all approach. So even though I didn’t have the experience on the data services front, because I was mostly in academia then, I think it helped me to envision and be more considerate of these factors when developing our services.
Why is the Data Curation Network important to you?
When I joined UCSB, I didn’t know about the existence of the DCN. But I remember in one meeting, I was discussing with a colleague the possibility of joining the DCN, so I investigated it. I was really impressed. I thought it was great, because this was a community where people were exchanging and defining best practice. It’s not only about following what is out there, but this is a group that has been leading discussions on data curation and coming up with solutions and primers. So I told my colleague it would be great to join. We had a long way to go to learn more about curation, and that was what stimulated us to join Dryad and engage more in the community of experts with curation at different institutions. For me, it was really inspiring to join this community and to learn from the members–to connect with this very knowledgeable group of people, who have a shared goal toward better, transparent, reproducible data.
We are a team of three at UCSB, focusing on the social sciences, humanities, and environmental studies, and we share some of the curation tasks we have. We also count on Dryad and DCN when we lack domain expertise or familiarity with the format of a dataset, so they can assign a different curator with more expertise. When I curate, I try to apply the CURATE(D) steps to my curation tasks every time, and I get some positive comments from Dryad about our more-involved curation process using what we learned from the DCN.
When we implement our Dataverse, it will be in addition to Dryad, because Dryad has an integrated workflow for several journals already, with features for private sharing with peer reviewers. Peer reviewers can check the data and request some changes, but we still spot issues with the data that were not caught during peer review. We hope that peer review of the data will be tied to peer review of the paper for some journals in the future. Dryad works for several fields we know, but, especially for me and the social sciences, we often have concerns with human subjects and need to restrict some of the files. And Dryad is not a good home for this type of data, because everything in Dryad is under a CC0 license and has to be public. Also, we would like to offer a platform that is more than an archiving platform, integrating with different platforms like Docker or TwoRavens. Dataverse is a very robust platform; it has several integrations that we believe would be very appealing to researchers, so it can be a data management tool as well.
If you weren’t doing data curation, what would you be doing?
Certainly, teaching would be related to my role. That was one of the important points in the job description here that caught my attention, because I really enjoy being involved in teaching and instruction. Here I have an opportunity to teach some of the Carpentry workshops. We also have what’s called the “RDS Workshop Series,” which is more on the intro level. We launched this and it has been nice because we are connecting and pairing them with some of the Carpentry workshops–like we would have one on data storytelling, and then there would be a data visualization Carpentry workshop–so we are developing this aligned with some of the interdisciplinary collaboratory instructional efforts. And I really enjoy doing these sessions on data management, and I see that this exchange with students is really important, and something I really value.
And also research, because here I have the opportunity to conduct my own research. I have an incentive for that here, as well, because my position is an academic position. It’s not only a requirement, but something that I really enjoy doing.
So I think I would be teaching or doing research in a library school or in a similar position to what I’m doing now.
What’s your favorite cuisine?
I’m a foodie. I would say my favorite cuisine is Thai food. I love all cuisines, but I’m a vegetarian, and Thai food is very inclusive of my type of diet, so I’m a huge fan. Once I had an opportunity to travel to Thailand and I did a cooking class at the Blue Elephant, which is also the restaurant that cooks for the king. For me, that was really a highlight of the trip, and it was by accident. I happened to stay next to the main restaurant, so I decided just to ask for the cooking class, and they said they were sold out because of cruises coming up. But then I was able to join a family cooking class, and it was amazing.
What do you like to do outside of work?
I enjoy cooking a lot; it’s one of my hobbies. If there is a social gathering, I really enjoy cooking and seeing others’ reactions to my food. Another one is camping and hiking. I’m a very outdoorsy person, so I feel like I’m exactly where I should be, because Santa Barbara is great for that. Another thing is I love to spend time reading.
Is there anything else you’d like to share?
As a closing statement, I would just say how appreciative I am of my time with the DCN and everything that I learned, and the opportunity to engage with very inspiring professionals. I think the community is really great. It’s a way for us to keep exchanging knowledge and new developments–and, as I said, to form these guidelines and recommendations for data curation.
To learn more about Renata, and the datasets she has curated for the DCN, see her curator page!