Someone from the Office of Research and Engagement contacted Ellie Read in the Libraries last week with a question about which repository was better for archiving data: ICPSR or Dataverse. Ellie wasn’t familiar with Dataverse, so she forwarded the email to me, and I did a little research to find out how the two differ. I found that ICPSR is a subscription-based service (which I already knew) and that the UTK Libraries pay $15,750 per year for it. That gives people at UT the ability to archive data there and get all the added benefits ICPSR provides, which leads to the main difference I found between the two repositories: data curation. ICPSR curates the data it archives, meaning it adds value to that data by converting it to additional formats, adding metadata, and updating the file format so that access is maintained over the years. Dataverse, on the other hand, does not do any of this. It simply archives the file and metadata and guarantees that it will preserve the bitstream. Another difference is that Dataverse is freely available, open-source software that anyone can download and install. Thus, anyone can create their own Dataverse. For example, a researcher can set up a Dataverse for their own research and data, or a department can set one up for the researchers within that department. There is, however, a Dataverse that is already implemented and open to all: the Harvard Dataverse. Anyone can deposit data (mainly social science data) in the Harvard Dataverse, but again, it will not be curated.
My conclusion was that ICPSR would probably be the better option for researchers here at UT, especially since we pay nearly $16,000 per year for the privilege. The additional curation of data sets is a valuable service that will help keep the data accessible and useful for years.