This post is part of our Curators’ Corner series. Every so often we’ll feature a different DCN Curator. The series grew out of a community-building activity wherein curators at our partner organizations interview each other “chain-letter style” in order to get to know each other and their work outside of the DCN better. We hope you enjoy these posts!

Peter Cerda,
University of Michigan

Peter Cerda is the Data Curation Specialist for Workflows and Big Data at the University of Michigan’s Deep Blue Repositories. Peter was interviewed by Leslie Delserone in October 2023.

How did you come to your current position?

When I was starting to wrap up my graduate studies [in ecology and evolutionary biology], I realized that I had to do something next. I knew I didn’t want to stay in my research field. While I enjoyed the questions I was working on, I didn’t see myself pursuing them further. But I still really enjoyed being in the academic world. I knew that I wanted to stay in this area in Michigan and saw the job posting for this position.

I started reading more about this position and the different job duties, and it was managing big data, which is what I was generating during my Ph.D. It was organizing research data, which is what I was doing. Even the services that I was using for my dissertation were services that I would potentially use in this position. When I started to look into this position, and just the field in general, I was really intrigued because data curation is really, really important. But it’s something you don’t think about as a researcher until the very end when you have to scramble and put it all together. Speaking from experience.

The outreach aspect of this position is something that is also really interesting. I’ve been that graduate student that has no idea how to manage data. When I open a file or folder that I haven’t opened in 4 or 5 years – I’m like, oh, I can’t find anything here. Realizing this is actually a skill. Helping [students] to realize that and to start thinking about it earlier in the process is something that really drew me to the position as well.

It’s exactly, “Don’t do what I did. Do what I’m telling you to do now.” I find it really encouraging to see a diversity of backgrounds, but sharing this interest in good data management and data curation and preservation. Bringing that disciplinary background to data management, I think I do a better job having been a scientist for a while. Even if it’s just my own screw-ups in that arena when I was active. One of the advantages of the DCN is having that diversity of experiences, diversity of thought. Having no library background, I don’t feel a disadvantage in this position because I have my own expertise, but I can draw on those that do have the librarian expertise to help me when I don’t know what I’m doing. I think that’s what makes us such a strong organization, that we have this diversity of experience and thought.

What do you do?

So I essentially manage the data deposits that we receive that are what we term big data. For us, that means it’s over 5 GB or has over a hundred files. It involves a little bit more technical curation just in terms of getting the data sets onto our website. The 5 GB limit is because of our web browser. The web browser isn’t able to actually upload the file, so we had to go through the back end to get those available. That’s a majority of what I do, but I also help with our outreach events that we’re presenting and with data management plans with researchers.

How much of your job involves data curation?

I’ll say about 50 to 70%. Depending on the week, of course, you can’t predict when data deposits are going to come in, so some weeks we’re heavy, some weeks we’re low in terms of deposits. We have curators that specialize in different areas, and they take the first crack at curation. But then if it’s a big data deposit or something that requires a little more technical expertise, then it gets triaged to me. I also assist with our other repository, Deep Blue Documents, our institutional repository. For example, I manage the batch upload process, which is essentially uploading multiple documents for multiple different works, which requires working through the back end as well.

Do you ever get data collections for deposit that are in the terabytes or petabytes?

We are getting into the terabytes, for sure. For example, I think we have one deposit that I just helped with that’s 1.7 TB. But those are generally the largest ones that we’ll manage. We’re definitely not set up to handle anything to petabytes. If we did get a deposit that size, we would have to point them towards somewhere else.

Why is data curation important to you?

Yeah, this is a great question. For me, it actually stems back to my graduate career. So I completed a Ph.D. in ecology and evolutionary biology here at the University of Michigan just this past year. It involved going into other researchers’ supplements or material or downloading sequences from GenBank – essentially trying to utilize the knowledge that’s already there.

But that comes with the challenge of working with data sets that are not curated, that don’t have a ReadMe. It’s a lot of reformatting, working with files that are not in the optimal formats or things like that. So I started learning the importance of data curation by having to deal with uncurated data essentially. Which then made me look back at myself and some things I had done in the past. I’m like, oh no, don’t do that.

Having to work with it on the other end made me really appreciate the work and detail that really go into it. The documentation – it adds so much more value to the data set, even though it is a single text file. As I was offboarding from my lab, I went through the process of writing documentation just so people would have something, at least a jumping board, a diving board to get into some of the work that I had done previously.

Why is the Data Curation Network important?

To me, it’s really important because I am brand new to the field. Having just started this position and not coming from a traditional library background, it’s been really useful for me to learn more about the field in general. And that’s why I’m really excited to be on the Governance Board as well, to hear the situations at other universities, at other institutions. But having that common ground that this is a group that is dedicated to this type of work and this upcoming or growing issue of data curation and public and open access sharing of data sets.

There could be a scenario – we might be thinking about doing something a certain way and to be able to reach out to someone at the other institutions, the other people, the curators and ask them: “Have you guys tried this? Well, did it work? Did it not? Do you have any solutions or ideas?” You know, it’s that common ground, to have that conversation about how different people are going about the same issue is instrumental. Working in a silo, when many others are doing the same thing is, in my opinion, not a great way to go about it. I guess the diversity of ideas and diversity of solutions, that’s really great when you have a network like the DCN to lean on. I’ve definitely been using the primers as I come across data types that I have no idea what’s going on there. It’s been really instrumental.

If you weren’t doing data curation, what would you be doing? 

That is a great question. So while I was going through the trials and tribulations of being a graduate student and thinking about my future, I knew early on that I didn’t want to be a full-on research professor, like at an R1 institution. I discovered that I really enjoy teaching and working with students. And so before I made the decision to start applying to jobs in the library/data curation field, I was heavily considering looking for positions at undergraduate institutions. You know, the small liberal arts colleges, things like that, because I really do enjoy teaching and trying to encourage younger students to pursue fields that aren’t traditionally thought of. So for example, I came from the ecology and evolutionary biology background. A lot of the undergrads that I taught were in the medical field. Every so often, I taught an animal diversity class. Some students were taking it because they needed a lab course, but then they were able to take the course and see the diversity of life that we have on this planet and just realize that this is a profession and this is really awesome. That was fantastic. One of the best comments I got – a student told me, “If I had taken this class one year earlier, I would be an EEB major instead of a pre-med.” And I was like, yay. Yes, so I think that’s something that I would love to pursue.

Something that I’m excited about with this position is still having that ability to teach workshops or work with students, maybe more through a data management plan or graduate seminars. Yeah, I really do enjoy the teaching side. If I wasn’t doing this work, that’s something I would have pursued.

What is your favorite cuisine?

Yeah, so I am from Texas, from a border town. I feel like I would be betraying everyone if I didn’t say Mexican food and barbecue. Texas barbecue specifically. No disrespect to the other barbecues, but it’s what I grew up with. I am actively dreaming of the day that I can buy a smoker and just maintain a brisket for 12 hours.

Mexican food is definitely a big thing because that’s what I grew up with. Food-wise, I’ve had a good life. Every time I go back home, one of the first things I do is find a taqueria, and just get basic tacos. Chopped up meat, cheese and cilantro and onions on it. One of the hot sauces on top. We also have a regional dish that’s called a botana. And the best way to describe it is basically tortilla chips layered with refried beans and cheese. And then on top of that, you have sliced skirt steak and chicken breast, cooked basically as if it was going to be in a taco. And then you also have quesadillas on the side. Yeah, it’s amazing. When you buy an order for one, it is really for 2 people. You can order for 2, which is really for 4 people. You can order for 4 people, that’s really for 8 people. It’s the best. Apparently it’s a thing only in the Rio Grande Valley, which is where I’m from. I’ve tried finding it anywhere else, even in other places in Texas, you can’t find it.

What do you like to do outside of work?

I really enjoy the outdoors. I think having a biology background definitely increases my enjoyment of the outdoors. Going camping, going to the local state parks. I try to see a national park whenever I can. Trail running is something that I’ve really started to enjoy recently. I have a history of running, but I’ve gotten back into it a little more now that I have time again.

So being outdoors, sure, I mean, barbecuing. Cooking too, cooking is something I really got into during the COVID years. When there wasn’t much else. Let me be sure to eat well.

What is your favorite city?

My favorite city is actually Austin, Texas. And the reason for that is, it’s the first city that I got to go to as a young adult on my own. Having been from South Texas, I had some friends that went to the University of Texas. So I was able to go out there and visit them. Learn a little bit more about what life was like outside of the area that I’m from. And to see a diversity of people, diversity of thought. The food is fantastic. The city is beautiful. So I’ve always had a soft spot for Austin.

Now that they have an MLS (Major League Soccer) team, I avidly support them, even though there’s very little chance of me going to a game anytime soon. Yeah, I really enjoy that town. Any time I get the chance to go back there, I try to spend at least 2 or 3 days just rolling around the city. It has a really nice charm to it.

Where would you most like to travel to next?

I mentioned in my previous answer that I’m a soccer fan. My first big club that I supported was Manchester United. So the place I would love to go to would be Manchester, England, to go watch a game. At Old Trafford, that would be phenomenal.

I would love to go to England. I think anywhere in Europe really. I’ve been fortunate enough to travel to Central and South America, so I always have a soft spot for those countries. But I haven’t made it to Europe yet, so I’d like to check that one off.

To learn more about Peter, and the datasets he has curated for the DCN, see his curator page!
