Data Management Education: A Report from CURATEcamp

This is a guest post I wrote for the e-Science Community blog, where it was originally posted on May 16, 2012.

On May 7-8, 2012, I attended CURATEcamp at Georgia Tech in Atlanta. CURATEcamp is an un-conference dedicated to all things digital curation. Since it’s an un-conference, the schedule and topics of discussion are not predetermined. They are developed on the fly on the first morning of the conference based on the interests of the attendees. Everyone is expected to participate in some way. I wanted to have a discussion about data management (DM) education for faculty and students. Others had similar interests, so one session was devoted to DM planning and education. I believe DM education is essential to long-term curation of research data. Unfortunately, library personnel can only do so much regarding the curation of research data. Therefore, the researchers themselves must share responsibility for curatorial activities on their data prior to handing it over to the library. For this to happen, we need to teach them how to do it well, and I believe the library is exceptionally positioned to offer that training.

I began the CURATEcamp discussion by introducing the Data Information Literacy (DIL) Project, a joint venture among the libraries at Purdue University, University of Minnesota, University of Oregon, and Cornell University. The investigators surveyed faculty and students and found that faculty are leaving DM responsibilities to their graduate students, but what the graduate students are doing varies widely (Carlson, Fosmire, Miller, & Nelson, 2011). This combination is troubling and underscores the need for DM training.

Discussion participants raised questions about how to get researchers to attend DM training, admitting that DM won’t exactly keep them on the edge of their seats. How can we encourage attendance in a class like this? One suggestion was to offer a data management certificate. Others pointed to the need for faculty and institutional support. Carlson et al. (2011) found that faculty believe there is real benefit to grad students learning DM practices, but that they have neither the time nor the knowledge to teach them. If faculty support this endeavor, they can encourage, if not require, their grad students to complete this training. The consensus at CURATEcamp was that reaching young researchers early in their careers, both grad students and faculty, is the best way to inject sound DM practices into the research process.

Getting students to attend the class may be a hurdle, but when they come, what will we teach them? Earlier this year, the Frameworks for a Data Management Curriculum was announced on this blog. This document was developed by the Lamar Soutter Library at the University of Massachusetts Medical School and the George C. Gordon Library at Worcester Polytechnic Institute. It contains seven modules with topics including metadata, backups, data sharing, and ethics (Frameworks, 2012). Likewise, as part of the DIL project, the investigators proposed twelve core competencies that together make up data information literacy. Comparing the two shows that the Frameworks covers nine of the DIL’s twelve competencies.

The Frameworks does not cover databases, data analysis, or data visualization. I believe that omitting these three topics is wise for two reasons. First, I don’t believe these skills can be adequately taught in a course devoted to DM. To be competent enough to make effective use of these tools, one would need more advanced training. Second, while they are important, if not essential, skills for researchers to possess, they are not data management skills in the sense that metadata, file naming, file formats, and data conversion are. They are complementary skills. However, they can be considered part of data information literacy in a broader sense, of which data management skills are a subset.

Overall, the discussion at CURATEcamp was fruitful, though perhaps it raised more questions than it answered. Additional conversations are needed to fully work out details such as the ideal instructional format, the length of the course, and how to encourage attendance. But one thing is certain: these skills are important. The better we train the next generation of researchers to manage their research data competently, the easier the long-term curation of that data will be.


Carlson, J., Fosmire, M., Miller, C., & Nelson, M. (2011). Determining data information literacy needs – A study of students and research faculty. Portal: Libraries and the Academy, 11(2), 629-657.

Lamar Soutter Library, University of Massachusetts Medical School, & George C. Gordon Library, Worcester Polytechnic Institute. (2012). Frameworks for a Data Management Curriculum.