To apply the concepts I have been learning in my Foundations of Data Curation course, I created a research guide on the UT Libraries website. Jeanine and I met with the Associate Dean for Scholarly Communication, Holly Mercer, to run everything by her and make sure everything on there was OK from her standpoint. She was extremely excited that I had created it and said we needed something like that badly. She thought the Office of Research would be thrilled with it as well. It went live a few days ago. The link is: libguides.utk.edu/datamanagement.
We received our grades for Assignment 2 for Foundations of Data Curation. I got a 17/20. I was expecting a higher grade than that, but apparently I missed some of the assignment’s requirements. We were supposed to discuss our chosen data standard: its history, its strengths, its weaknesses, and so on. I didn’t talk about mine as much as I should have. I should have found a lot more information on TEI and spoken at length about it. I understand what I did wrong.
Another thing the instructor mentioned was that I wasn’t quite right on my explanation of the Semantic Level. I said at the semantic level, the document was a book by an author. He said I should have described the semantics of TEI, not the book. So at the semantic level, it is the meaning of the TEI elements. I’ve asked for more clarification, because I’m still not clear on this concept.
On March 3, our data curation class had a lecture at the University of Illinois. We talked about how FRBR can be refactored for data sets. FRBR is designed to handle “document-like” objects, but we need a new system to handle “dataset-like” objects. Allen Renear and Simone Sacchi explained this diagram and called it “Refactored FRBR for Datasets.”
Whereas FRBR has 4 levels as follows,
This new FRBR refactored for datasets, which they call the “Simple Stack,” has four levels as follows:
These levels are shown in the diagram below. I will attempt to explain the four levels and what they mean in a separate post or posts.
I was trying to explain the semantic level to a classmate today over email, and as I was explaining it to her, it became much clearer to me. This is what I wrote:
The document-like vs. dataset-like distinction comes from the comparison between FRBR and the new FRBR redone for datasets they [the instructors] introduced last Saturday. FRBR is designed to handle “document-like” objects, such as books, CDs, etc. But the new stack levels (semantic, syntax, serialization, encoding, etc.) are meant to handle “dataset-like” objects. The statement about data making assertions is how they differentiate datasets from document-like objects. Data is different from document-like objects because it asserts a fact about something. Remember them saying this: data involve assertions intended to be used as evidence? So if you have a data set full of temperature measurements at certain times and pressures, then each data entry would assert “The temperature at time X and pressure Y is Z.” That’s the assertion the data are making. It’s not expressed literally in the data, but you can infer it. This assertion is the semantic level of the document: it’s the fact being asserted.
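The temperature example can be sketched in a few lines of Python. This is only an illustration of the idea, not anything shown in the lecture: the field names, units, and values are made up, and the assertion function is a hypothetical helper.

```python
# Hypothetical rows: field names, units, and numbers are invented for illustration.
rows = [
    {"time": "09:00", "pressure_kpa": 101.3, "temperature_c": 21.5},
    {"time": "10:00", "pressure_kpa": 101.1, "temperature_c": 22.0},
]


def assertion(row):
    """Spell out the fact that a single data entry implicitly asserts."""
    return (
        f"The temperature at time {row['time']} and pressure "
        f"{row['pressure_kpa']} kPa is {row['temperature_c']} C."
    )


# The rows themselves hold only numbers; the sentences are what they mean.
for row in rows:
    print(assertion(row))
```

Each row stores nothing but values, yet each one stands for a full sentence. That sentence, rather than the way the numbers happen to be written down, is what the email above calls the semantic level.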
It’s nice to see these concepts becoming clearer to me. I’m beginning to see the clouds lift, much like they did when I was trying to come to terms with conceptual frameworks in my research methods class. This semester has been a pattern of clouds forming, my feeling totally lost, and then the clouds starting to lift as things become clearer to me. At least I recognize this pattern so the next time it happens I will know it’s part of the process.
A few weeks ago in my data curation class we began talking about how to tell whether two data sets are the same data set. You would think that would be so easy, right? WRONG! It’s completely and utterly complicated and way over my head. My head is swimming trying to figure this stuff out. I have a homework assignment due on it next week and I’m completely unprepared to do any meaningful work on it. I’m reading back through the slides from the previous few weeks in the hope that I can begin to understand it, but I’m not getting very far.
This is a blog post I wrote over at Hack Library School Blog.
Author’s note: My interests within the LIS field are data curation and e-science librarianship. This is a hot topic that is growing every day, and skilled e-science librarians are needed to fill the gap. If you’re interested in learning more about data curation librarianship as a future career, leave a comment here, and I’ll follow up with more information.
Back in the Fall, Micah wrote a post about Open Access Week. In it he discussed open journals, open data, and the ALA Code of Ethics. Open data is what today’s post is about. An important ongoing question in the world of data curation today is how to get scientists to share their data by placing it in a data repository. There are many scientists who are unaware of the fact that their data has value to anyone but them and their research team. On the other hand, there are…