Last Friday I attended a seminar at UC Berkeley’s iSchool given by MacKenzie Smith, a terrific presenter and colleague who is affiliated with Creative Commons (among other prestigious organizations). MacKenzie was talking about data governance, an issue I covered a few months back for the DCXL blog. However, on Friday MacKenzie brought up a few things that I think warrant another post.
First, let’s define data governance for those who aren’t familiar with the concept. Based on Wikipedia’s entry, it’s the set of policies surrounding data, including data risk management, assignment of roles and responsibilities for data, and, more generally, the formal management of data assets throughout the research cycle. Now on to the new things:
Thing 1: Facts cannot be copyrighted. It makes sense for things like, say, simple math. I can’t say “2+2=4” © 2011 Carly Strasser. Known facts can’t be copyrighted. So what about data? One might argue that data are facts (assuming you are doing science correctly). That means you don’t own the copyright to your data. Eeek! Scary thought, I know. You might be saved by the fact that a unique arrangement or collection of facts can be copyrighted. Huh. Data in a database? Can’t be copyrighted. The database itself? Can be copyrighted. This obviously makes things related to data quite messy when it comes to intellectual property.
Thing 2: Did you know that “attribution” can be legally imposed? The remedy for a lack of attribution where warranted is a lawsuit. Creative Commons licenses are built on this fact. This is not true, however, of citation. Citation is a “scholarly norm” that has no underlying legality.
Thing 3: Creative Commons is now working on a CC 4.0 license. Some of the goals of this new version are enabling internationalization and interoperability, and improving support for data, science, and education. They want input from scientists, librarians, administrators, and anyone else who might have an opinion about intellectual property, open science, and governance in general.
Thing 4: Open Knowledge Foundation is working on concepts related to governance with a global perspective. They have a range of projects in the works for improving the sharing of knowledge, data, and content.
Thing 5: While waiting for a consensus on how to properly govern digital data and other digital content, many data providers are dealing with governance by constructing data usage agreements. These are contracts created by lawyers for a specific data provider (e.g., an online database). The problem with data usage agreements is that they are all different. This means that if you want to use data from a source that requires you to agree to its terms, you have three options:
- Carefully read the terms before agreeing (and who does that?)
- Click that you agree without reading and hope you don’t accidentally break any rules
- Find the data that you need from another source that doesn’t have terms and conditions for data usage
Thing 6: What about international collaborations? As you might imagine, these add yet another layer of complication. As a scientist, you are expected to look into any data policies that may apply to your collaborators. From the NSF DMP FAQ (hello, alphabet soup!):
16. If I participate in a collaborative international research project, do I need to be concerned with data management policies established by institutions outside the United States?
Yes. There may be cases where data management plans are affected by formal data protocols established by large international research consortia or set forth in formal science and technology agreements signed by the United States Government and foreign counterparts. Be sure to discuss this issue with your sponsored projects office (or equivalent) and your international research partner when first planning your collaboration.
Hmm. It looks like the waters are very muddy right now, and until they clear, researchers should watch their step.