Xuying is a Data Analyst for the Penn State University Libraries. She was interviewed by Melinda Kernik in July 2021.

How did you come to your current position?

Before joining the Penn State University Libraries at University Park, I was a research associate in a biomedical lab at another Penn State campus. While at the lab, I did micro-CT imaging, image analysis, and visualization. I developed a database from the lab data and created some visualizations for the team, which I found very interesting! When my current position (Research Support Data Analyst) became available, I immediately applied for it. This position allows me to continue working with data, in both data visualization and data curation, and gives me the opportunity to leverage my research background to help researchers at different stages of the research lifecycle. It is great!

What do you do?

I curate data for our institutional repository, ScholarSphere, and I also work with the Data Curation Network (DCN). In addition, I provide workshops and consultations for the research community here on data analytics and visualization tools such as Tableau and Power BI, and on big data analytics on cloud platforms (Azure, AWS, GCP). Basically, I work with faculty, staff, and students to support their research projects with these tools. I am also part of the ITHAKA Big Data Research Support project at Penn State. We have just submitted our report and are moving forward to the next step.

How much of your job involves data curation?

About 25% – split between ScholarSphere and the DCN.

Why is data curation important to you?

Data often comes in lacking metadata, including information that is key for discoverability and reusability. It is very important to curate the data and identify the key information that should be added to improve its quality. I believe that a lot of researchers are focused on generating research data and are less focused on the metadata, which is equally important if the research product is to be reused by others. Data curation is an essential step in the research lifecycle!

Why is the Data Curation Network important?

The DCN is a very important way for curators to get to know each other and to help each other by sharing expertise and resources, including curation tools, practices, and primers! Without the group, I might feel isolated, but instead there is a sense of community. I am so glad to be part of it.

If you weren’t doing data curation, what would you be doing?

In my current position, I would be doing more workshops and consultations to support faculty, students, and staff around data analytics and visualization for their research projects. It is a good experience to work with researchers from various backgrounds.

Where would you most like to travel next?

I would like to travel with my family, most likely on a road trip to the Four Corners (Arizona, Colorado, Utah, and New Mexico) in the United States!

