This post was authored by Mikala Narlock, DCN Director.
When the DCN was spinning up 9 years ago, our principal investigator created the Google Drive Folder that would grow to be the nervous system of the DCN, tracking grants, curation, workshops, primers, research; you name it, it was likely in our Google Drive Folder.[1] The benefits of this were many, including that we could collaboratively work across institutions synchronously and know where to find our materials. Even as the number of directories grew, as nesting changed, as new personnel joined and colleagues changed positions, our Google Drive Folder was there for us.
In Spring 2024, we received the news we and Google Drive users everywhere had been dreading: files in our Google Drive created by individuals no longer employed at the University of Minnesota would be deleted in early December 2024. As two of our amazing former colleagues created a great number of the documents using Google Drive, we could no longer delay: we had to move our files (documents, spreadsheets, presentations, images, forms) to a Shared Drive administered by the University of Minnesota. We did not discuss alternative technical solutions to this problem. Since we knew this approach was working, and it was the technical solution supported by our fiscal home, we wanted to continue using Google Drive.
Below, we share a little more about the context of our situation, how we approached this challenge, and some of our lessons learned–in the hopes it will be useful for others starting their own Google Drive migrations.
Context
For those who may not be as versed in the nuances of Google Drive:
- Drive Folders are owned by an individual–but the files within the folder can be owned by different individuals, meaning those files are not necessarily saved if individual accounts are deactivated for any reason.
- Shared Drives are owned by an organization–so when individuals leave or their accounts are deactivated, their files are saved because they are automatically owned by, in our case, the University of Minnesota.
One important caveat (feature or bug, you decide) is that you cannot move items that you do not own into a Shared Drive, even if you own the top-level folder. You have to make copies of the files and transfer the copies, which comes with, and creates, a new set of problems.
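To make the ownership rule concrete, here is a minimal sketch of how one might sort a file listing into "can be moved" and "must be copied" before starting. It assumes you already have a listing of file records, each with a `name` and an `owners` list containing `emailAddress` entries; that shape, the function name `split_by_ownership`, and the example addresses are illustrative assumptions, not something we used in our migration.

```python
def split_by_ownership(files: list[dict], my_email: str) -> tuple[list[str], list[str]]:
    """Split file names into (movable, must_copy) based on who owns each file.

    Assumption: each record looks like
    {"name": "...", "owners": [{"emailAddress": "..."}]}.
    """
    movable, must_copy = [], []
    for record in files:
        # Collect owner addresses case-insensitively; a record may list no owners.
        owner_emails = {
            owner.get("emailAddress", "").lower()
            for owner in record.get("owners", [])
        }
        if my_email.lower() in owner_emails:
            movable.append(record["name"])
        else:
            # Files owned by someone else cannot be moved, only copied.
            must_copy.append(record["name"])
    return movable, must_copy
```

Running this over a listing early on would give a rough sense of how much of a folder can be moved directly and how much will need the copy-and-transfer treatment.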
Our Process
This led us to investigate and try different mechanisms for creating copies of files to transfer into our new Shared Drive.[2] Ultimately, because our Drive Folder was (all things considered) relatively small at around 20GB, we used a manual process of downloading folders and reuploading them. In practice, this looked like:
- Identify an appropriate folder size
  - For purposes of the migration, “appropriate folder size” was one that could be zipped into one file. If a folder is too large, Google Drive will break the folder into multiple zip files that then need to be manually rejoined, so we wanted to avoid that as much as possible.
- Download the folder (Google Drive zips it for you)[3]
- Unzip on desktop
- Upload to new structure in Shared Drive
  - We manually created new directories in the Shared Drive, and uploaded folders to create sub-directories.
- Check, file by file, that each file was present in the Shared Drive
  - This was just a confirmation of the file names (i.e., not using fixity checks)
  - If needed, make copies or different versions of files that were not transferred correctly
- Delete folder from original Drive Folder
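We did the file-by-file comparison by hand, but the same name-only check could be sketched in a few lines of Python. The example below assumes both the unzipped download and a copy of the uploaded folder are available as local directories; the function names and any paths you pass in are hypothetical. Like our manual check, it compares names only and is not a fixity check.

```python
from pathlib import Path


def relative_files(root: Path) -> set[str]:
    """Return the path of every file under root, relative to root."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def compare_trees(source: Path, dest: Path) -> tuple[list[str], list[str]]:
    """Compare two folder trees by file name only (no content or checksum check).

    Returns (missing_from_dest, unexpected_in_dest), each sorted.
    """
    source_files = relative_files(source)
    dest_files = relative_files(dest)
    missing = sorted(source_files - dest_files)
    unexpected = sorted(dest_files - source_files)
    return missing, unexpected
```

Anything in the first list would be a candidate for a manual copy before deleting the folder from the original Drive Folder.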
Unfortunately, this process changes the URLs associated with the file and breaks previous links. Of course, this makes sense: Google Drive is meant for active materials, not necessarily long-term preservation, so links can change, and we’ll move on. This does pose a challenge moving forward, which is discussed below under Lessons Learned.
You also cannot delete files you don’t own, even when the highest-level folders are deleted. As a result, we now have two of just about every document in the Google Drive sphere: the original version and the current version in our Shared Drive. Our community is diligently updating file names to indicate when something is deprecated, but it will take some time for us to adjust. The good news is that the files we were worried we’d lose? Well, they’ll be deleted in December and we won’t have duplicates of them!
Lessons Learned
Along the way, there were some lessons learned the hard way. We share these here for consideration as you prepare your own migration. This was our experience using these tools; your results may differ.
- Create test folders, and test your process multiple times.
This is an absolute must. When working with complicated files, folders, and ownership, be sure to create different tests:
- Have others outside of your organization create test files
- Create directories – then let someone else make a sub-directory.
Make a few versions of each of these, and test them out using different methods to see what will work. While it is unfortunate that we lost a few files in our migration, it could have been far worse if we hadn’t started with practice files.
- Google Drive for Desktop did not work for our needs because it quietly deleted files or created unusable files.
First, when trying to use Google Drive for Desktop (on a Mac), the process of moving from the Drive Folder to the Shared Drive did not work. Occasionally, the application would give me an error and halt the process. More regularly, for files that I did not own, this is what happened:
- Tried to move the files
- Received an error that some files could not be moved and the files would be deleted
- Selected undo
- The files were then deleted anyway
Unfortunately, there was a small folder of files that were lost this way: just poof, silently gone, never to be seen again. After the initial loss, I created test files to experiment with. Ultimately, I found this process unsuccessful and pivoted.
I also tried using Google Drive for Desktop to copy and paste files using my Finder (Mac equivalent to File Explorer on Windows). However, the files would be created as Google versions (with the extension .gdoc, for example). Interestingly, or perhaps ironically, when uploaded to Drive via Google Drive for Desktop, the files were unusable, as they would not open in Google Docs in a browser (thus defeating the purpose). So, we scrapped Google Drive for Desktop altogether in favor of the download and upload process. Which brings us to another lesson learned!
- All files are now in the Microsoft Office equivalent (e.g., Slides -> pptx, Google Docs -> docx, Sheets -> xlsx).
This makes sense to me, even if it isn’t my favorite. All of the Google Drive versions of items are downloaded, and by default then uploaded, in Microsoft Office equivalents. This is not a terrible change: Google Drive allows you to interact with these versions as if they were Slides, Sheets, or Docs. The only slightly frustrating thing is that sometimes our shared links now get a bit fussy – and some of our formatting of presentations has been drastically changed.
- We lost some documentation, in particular Forms and Drawings, during the migration.
One of the biggest losses, though, was the loss of files that do not have Microsoft Office equivalents. In particular? Google Forms. We use forms for so much in the DCN (like most orgs!) and, well. Poof. They’re gone. We made a few manual copies of forms we use a lot (e.g., our workshop application and registration forms) that we could manually move to the new Shared Drive. The rest? We accepted the fact that we lost them.
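In hindsight, a quick inventory before downloading could have flagged the files most at risk. The sketch below assumes you have a listing of file records with a `name` and a `mimeType`, where Google-native files use MIME types beginning with `application/vnd.google-apps.`; the function name `plan_migration` and the record shape are illustrative assumptions. It only mirrors what we observed (Docs, Sheets, and Slides came back as Office files; Forms and Drawings did not survive) and is not a complete account of how Google Drive exports every file type.

```python
GOOGLE_PREFIX = "application/vnd.google-apps."
FOLDER_TYPE = GOOGLE_PREFIX + "folder"

# Google-native types that came back as Microsoft Office files in our download.
OFFICE_EXTENSION = {
    GOOGLE_PREFIX + "document": ".docx",
    GOOGLE_PREFIX + "spreadsheet": ".xlsx",
    GOOGLE_PREFIX + "presentation": ".pptx",
}


def plan_migration(files: list[dict]) -> tuple[dict[str, str], list[str]]:
    """Return (converted, needs_manual_copy) for a list of file records.

    converted maps each original name to its expected Office file name.
    needs_manual_copy lists Google-native files with no Office equivalent
    in the table above (for example Forms and Drawings).
    """
    converted: dict[str, str] = {}
    needs_manual_copy: list[str] = []
    for record in files:
        mime = record["mimeType"]
        if mime in OFFICE_EXTENSION:
            converted[record["name"]] = record["name"] + OFFICE_EXTENSION[mime]
        elif mime.startswith(GOOGLE_PREFIX) and mime != FOLDER_TYPE:
            needs_manual_copy.append(record["name"])
        # Anything else (PDFs, images, folders) is left out of both results.
    return converted, needs_manual_copy
```

The second list is the one to act on: those are the files to copy by hand into the Shared Drive, as we did for our workshop application and registration forms.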
In conclusion
This was not an easy or quick task. It took quite a bit of time to prepare for, and a full week to execute. With this migration, we also moved some of our content to GitHub (have you seen the new CURATE(D) Steps?). We also seized this opportunity to clean some things up. As data professionals, we knew this was overdue, but we are also keenly aware of how hard it is to make time for such labor-intensive tasks. Now, we’ve slightly restructured our Shared Drive and are creating a readme to guide future document creation (including file naming!). If anyone is interested in learning more about this process, I’m happy to discuss further via email.
I know we’re all hopeful that this migration was a once-and-done process–but we know that all storage is ephemeral and needs to be periodically updated. Just maybe give us a few years, ok?
Thanks to Leslie Delserone, Heidi Imker, Kate Sheridan, and Matthew Murray🦇 for their thoughtful comments on earlier drafts of this post.
1. The exception here is for HR materials related to hiring. We retained communication templates and timeline, but removed all cover letters, resumes, etc.
2. The DCN Community was asked to refrain from creating, editing, or removing files from our shared folders during this period. We began communicating this process, with encouragement to download any files they felt strongly about, in July with a migration date in September.
3. As a precautionary measure, we also downloaded a zip of the entire DCN Drive, which has been stored in a separate Shared Drive, still administered by the University of Minnesota. It has ~26 zip files of data.