Last spring, I was looking for the perfect summer internship to bring together all that I had learned so far in my studies for the Master's in Library and Information Science at the School of Information Studies at Syracuse University (SU). My internship would also be the final piece to complete my Certificate of Advanced Study in Digital Libraries at SU. As I followed several leads through my network, all roads pointed me toward the Center for Digital Research and Scholarship (CDRS) at Columbia University. A place to continue my work with web-based Digital Humanities projects? CDRS. A place to continue learning more about Research Data Management (RDM), following up on my work at SU on the Capability Maturity Model for RDM? CDRS. A place to pursue hands-on experience with data curation, building on my coursework? CDRS. I was very grateful to be chosen as an intern, and I jumped right in.
From the very beginning of my time at CDRS, I was struck by how everyone really seems to care so much about the work they’re doing, and how they are a part of a bigger picture of Open Access (and consciously see themselves that way). It was also impressive to be a part of a library system the size of Columbia’s, and to see both the extreme specificity of certain job descriptions, and the collaborative nature of work across departments.
Over the course of my summer internship, working 2-3 days a week for 8 weeks, I worked on two data rescue projects. One was to create a detailed survey of Columbia researchers who have published in journals by the Ecological Society of America, to identify the funding sources for their research, find out if their raw data are still available, and collect information to facilitate the process of adding their datasets to Columbia’s digital repository, Academic Commons. That project will be ongoing, and I look forward to finding out about the results of the survey. The second project was a case study of a recent student’s Digital Humanities project published as a website, to collect and preserve the data components independent of the website, and to experiment with ways of preserving the website itself, all with detailed process documentation, especially concerning metadata remediation. Later this fall I will work more on this case study, in hopes of publishing my findings so that others may be able to apply some parts of my process to other projects.
I was surprised and pleased with the level of independence I was given for my projects. I did not expect the opportunities I had to meet with staff members across multiple departments, and I am very grateful to my site supervisor, Amy Nurnberger, for connecting me with the many people who could help with my projects. I was able to learn from a diverse group, each with very specialized knowledge, and these are valuable contacts to have in the future. Columbia library staff in the digital divisions are an incredibly knowledgeable and passionate group! The opportunity to attend many different meetings and presentations, both in and out of CDRS, learning about GIS and MODS-RDF for example, was another pleasant surprise that was great for my professional development.
The data curation projects I worked on during my internship gave me wonderful opportunities to apply many different concepts I had studied in school to a practical situation. I was able to directly apply work from my studies in these classes at SU: Digital Libraries; Metadata; Creating, Managing, and Preserving Digital Assets; Tech in Web Content Management; Information Architecture; and Information Policy. This experience really was the whole package, allowing me to tie together all that I had learned so far in my degree program. I had just taken Creating, Managing, and Preserving Digital Assets in the spring, which had a strong focus on digital preservation, and it was very helpful for me to see how each practical situation is usually far from an ideal situation. The work requires prioritization to determine the first small steps to take toward preservation with available resources, improving conditions with each continued step. The necessity of collaboration was also driven home to me through my work on these projects at CDRS. Everyone who touches data in even the smallest way during the data lifecycle has an impact on the future preservation of those data, so it’s incredibly important that we raise awareness with everyone involved, from the very beginning, about the choices that can negatively impact preservation.
My career has definitely taken a turn toward digital curation and preservation as a result of my work during the internship. When I began my library degree, it was with the goal of helping to create diverse digital library collections with value for education. Now I am much more aware of how building a web-based collection is just one part in the middle of the data lifecycle, in between the content specialists whose research data become material for a collection and the end users who may use these data in unexpected ways. Maintenance for prolonged access to such collections (and the raw data in them) remains a fragile prospect, and I hope to personally play a part in reducing that fragility in the future.
There was much that I learned in my internship that I will apply to my future work, regarding the importance of collaboration, documenting processes for future replication, and developing policies, guides, workshops, etc. to help stakeholders follow best practices. As I begin my post-graduate job search, I am grateful for the experience I had in my internship with writing a data management plan, working with OpenRefine to clean up data, developing XSL transformations, developing complex surveys using Qualtrics, and (perhaps most importantly) collaborating with diverse stakeholders.
All in all, this was an incredible experience for me, for which I am very grateful! I just wish I could have stayed longer and done more – the summer flew by far too fast. Thank you to everyone at CDRS for this wonderful opportunity!