Tag Archives: open access

Embargoing the Term “Embargoes” Indefinitely

I’m two months into a position that lends part of its time to overseeing Dash, a data publication platform for the University of California. On my first day I was told that a big priority for Dash was to build out an embargo feature. Coming to the California Digital Library (CDL) from PLOS, an OA publisher with an OA data policy, I couldn’t understand why I would be leading an effort to embargo data rather than open it up, so I met this embargo directive with apprehension.

I began to acquaint myself with the campuses, and a couple of weeks ago, while at UCSF, I presented the prototype for this “embargo” feature and questioned why researchers would want to close data on an open data platform. This is where it gets fun.

“Our researchers really just want a feature to keep their data private while their associated paper is under peer review. We see this frequently when people submit to PLOS”.

Yes, I had contributed to my own conflict.

While I laughed about how I had previously been the person at PLOS convincing UC researchers to make their data public, I recognized that this would be an easy issue to clarify. And here we are.

Embargoes carry a negative connotation in the open community, and I ask that moving forward we not use this term to describe keeping data private until an associated manuscript has been accepted. Let us instead call this “Private for Peer Review” or “Timed Release”, with a “Peer Review URL” available for sharing data during the peer review process, as Dryad does.

  • Embargoes imply that data are being held private for reasons other than the peer review process.
  • Embargoes are not appropriate if you have a funder, publisher, or other mandate to open up your data.
  • Embargoes are not appropriate for sensitive data; such data should not be held in a public repository at all unless access is mediated by a data access committee and the repository has proper security.
  • Embargoes are not appropriate for open Data Publications.

To embargo your data for longer than the peer review process (or for other reasons) is to shield your data from being used, built upon, or validated. This runs contrary to “Open” as a strategy for furthering scientific findings and scholarly communication.

Dash is implementing features that will allow researchers to choose a publication date up to six months after submission, in line with what we believe is reasonable for peer review and revisions. Researchers who use this feature will be given a Peer Review URL that can be shared to download the data until the data are public. Note, though, that while the data may be private during this time, the DOI for the data and the associated metadata will be public and should be used for citation. These features are intended for peer review only; we do not believe that data should be held private on an open data publication platform for other reasons.
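To make the distinction concrete, here is a hypothetical sketch of the timed-release rule described above. This is not Dash’s actual implementation; only the six-month cap, the Peer Review URL idea, and the private-until-release behavior come from the description above, and all names are illustrative.

```python
# Hypothetical sketch of a "Timed Release" rule -- not Dash's real code.
from datetime import date, timedelta

MAX_DELAY = timedelta(days=182)  # roughly six months


def validate_release_date(submitted: date, requested: date) -> bool:
    """A requested publication date is valid if it falls between the
    submission date and roughly six months after submission."""
    return submitted <= requested <= submitted + MAX_DELAY


def is_data_private(today: date, release: date) -> bool:
    """Data stay private until the release date (shared only via the
    Peer Review URL); the DOI and metadata are public throughout, so
    the dataset can be cited either way."""
    return today < release
```

For example, a dataset submitted on 1 January could be scheduled for release any time up to early July, but a September date would be rejected.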

Opening up data, publishing data, and giving credit for data are all important in emphasizing that data are a credible and necessary piece of scholarly work. Dash and other repositories will allow data to be private through peer review (with the intent that the data become public and accessible in the near future). However, my hope is that as the data revolution evolves, incentives to open up data sooner will become apparent. The first step is to check our vocabulary and limit the use of the term “embargo” to cases where data are being held private without an open data intention.


UC Open Access: How to Comply

Free access to UC research is almost as good as free hugs! From Flickr by mhauri


My last two blog posts have been about the new open access policy that applies to the entire University of California system. For big open science nerds like myself, this is exciting progress and deserves much ado. For the on-the-ground researcher at a UC, knee-deep in grants and lecture preparation, the ado could probably be skipped in favor of a straightforward explanation of how to comply with the policy. So here goes.

Who & When:

  • 1 November 2013: Faculty at UC Irvine, UCLA, and UCSF
  • 1 November 2014: Faculty at UC Berkeley, UC Merced, UC Santa Cruz, UC Santa Barbara, UC Davis, UC San Diego, UC Riverside

Note: The policy applies only to ladder-rank faculty members. Of course, graduate students and postdocs should strongly consider participating as well.

To comply, faculty members have two options:

Option 1: Out-of-the-box open access

There are two ways to do this:

  1. Publishing in an open access-only journal (see examples here). Some have fees and others do not.
  2. Publishing with a more traditional publisher, but paying a fee to ensure the manuscript is publicly available. These are article-processing charges (APCs) and vary widely depending on the journal. For example, Elsevier’s Ecological Informatics charges $2,500, while Nature charges $5,200.

Learn more about different journals’ fees and policies: Directory of Open Access Journals: www.doaj.org

Option 2: Deposit your final manuscript in an open access repository.

In this scenario, you can publish in whatever journal you prefer – regardless of its openness. Once the manuscript is published, you take action to make a version of the article freely and openly available.

As UC faculty (or any UC researcher, including grad students and postdocs), you can comply via Option 2 above by depositing your publications in UC’s eScholarship open access repository. The CDL Access & Publishing Group is currently perfecting a user-friendly, efficient workflow for managing article deposits into eScholarship. The new workflow will be available as of November 1st. Learn more.

Does this still sound like too much work? Good news! The Publishing Group is also working on a harvesting tool that will automate deposit into eScholarship. Stay tuned – the estimated release of this tool is June 2014.

An Addendum: Are you not a UC affiliate? Don’t fret! You can find your own version of eScholarship (i.e., an open access repository) by going to OpenDOAR. Also see my full blog post about making your publications open access.


Academic libraries must pay exorbitant fees to provide their patrons (researchers) with access to scholarly publications.  The very patrons who need these publications are the ones who provide the content in the form of research articles.  Essentially, researchers are paying for their own work by proxy, via their institution’s library.

What if you don’t have access? Individuals without institutional affiliations (e.g., between jobs), or who are affiliated with institutions whose libraries are underfunded or nonexistent (e.g., in developing countries), depend on open access articles to keep up with the scholarly literature. The need for OA isn’t limited to jobless or international folks, though. For proof, one only has to notice that the Twitter community has developed a hashtag for this: #Icanhazpdf (hat tip to the Lolcats phenomenon). Basically, you tweet the name of the article you can’t access and add the hashtag, in hopes that someone out in the Twittersphere can help you out and send it to you.

Special thanks to Catherine Mitchell from the CDL Publishing & Access Group for help on this post.

Tagged , , , , ,

A Closer Look at the New UC Open Access Policy

The UC is opening up their research locker. From Flickr by sam.d

Last week, the University of California announced a new Open Access Policy. Here I will explore the policy in a bit more detail.  The gist of the policy is this: research articles authored by UC faculty will be made available to the public at no charge.

I’m sure most of this blog’s readers are familiar with paywalls and the nuances of scholarly publishing, but for those that aren’t – if you don’t have a license to get content from particular journals (via your institution’s library, for example) then you may pay upwards of $100 per article. For example, if I publish an amazing article in Nature (and don’t pay the $5,200 fee to make my article open access), my mom can’t get a copy of the article to hang on her fridge without either (1) getting a copy from someone with access, or (2) paying a big fee. Considering that my mom pays taxes that fund the NSF which funded my work, this is rather strange.

The UC policy is trying to change that. The idea is that UC faculty grant a license to the UC prior to any contractual arrangement with publishers. The faculty member then retains the right to make their research widely and publicly available, re-use it for various purposes, or modify it for future research publications, regardless of the publisher’s wish to lock down the work.

Faculty will continue to publish their work in the most appropriate journal (open access or not). The big change is that now they can also place a copy of the publication in UC’s open access repository, eScholarship, which is freely accessible to anyone. To re-emphasize: This policy does NOT require that faculty publish in particular journals or pay “Article Processing Charges” to ensure their article is open access.

From the policy’s FAQ  page:

Faculty are strongly encouraged to continue to publish as normal, in the most appropriate and prestigious journals. Faculty are not required to pay to publish articles or pay to deposit them in an open-access repository under this policy, unless they choose to do so.

How faculty can comply (from the FAQ page):

By passing the policy on July 24, 2013, UC faculty members have committed themselves to making their scholarly articles available to the public by granting a license to UC and depositing a copy of their publications in eScholarship, UC’s open access repository. The policy automatically grants UC a license to make any scholarly articles available in an open access repository. UC will not do so, however, until an author takes the action of depositing an article in UC’s eScholarship repository or confirms the availability of the article in another open access venue – i.e., a repository (such as PubMed Central, ArXiv or SSRN) or an open access journal.

The California Digital Library and the campus libraries will assist faculty by providing a streamlined system for depositing articles into eScholarship, as well as an automated ‘harvesting’ tool to ease the process; both are expected to be in place by June 2014.

And now, the downside. Michael Eisen, co-founder of the open access publisher PLOS, describes it in his blog post:

This policy has a major, major hole – an optional faculty opt-out. This is there because enough faculty wanted the right to publish their works in ways that were incompatible with the policy that the policy would not have passed without the provision.  Unfortunately, this means that the policy is completely toothless.

Eisen goes on to say

…because of the opt out, this is a largely symbolic gesture – a minor event in the history of open access, not the watershed event that some people are making it out to be.

Although I agree with Eisen that the opt-out clause significantly weakens the strength of this policy, I still believe this move on the UC’s part represents a major step forward in the battle to reclaim our scholarly work from some publishers. Perhaps it isn’t “watershed” but it’s certainly exciting, and it’s stimulating conversations about open science and accessibility to research.



UC Faculty Senate Passes #OA Policy

Big news! I just got this email regarding the new Open Access Policy for the University of California System. I’ll write a full blog post next week but wanted to share this as soon as possible. (emphasis is mine)

The Academic Senate of the University of California has passed an Open Access Policy, ensuring that future research articles authored by faculty at all 10 campuses of UC will be made available to the public at no charge. “The Academic Council’s adoption of this policy on July 24, 2013, came after a six-year process culminating in two years of formal review and revision,” said Robert Powell, chair of the Academic Council. “Council’s intent is to make these articles widely—and freely— available in order to advance research everywhere.”  Articles will be available to the public without charge via eScholarship (UC’s open access repository) in tandem with their publication in scholarly journals.  Open access benefits researchers, educational institutions, businesses, research funders and the public by accelerating the pace of research, discovery and innovation and contributing to the mission of advancing knowledge and encouraging new ideas and services.

Chris Kelty, Associate Professor of Information Studies, UCLA, and chair of the UC University Committee on Library and Scholarly Communication (UCOLASC), explains, “This policy will cover more faculty and more research than ever before, and it sends a powerful message that faculty want open access and they want it on terms that benefit the public and the future of research.”

The policy covers more than 8,000 UC faculty at all 10 campuses of the University of California, and as many as 40,000 publications a year. 

It follows more than 175 other universities who have adopted similar so-called “green” open access policies.  By granting a license to the University of California prior to any contractual arrangement with publishers, faculty members can now make their research widely and publicly available, re-use it for various purposes, or modify it for future research publications.  Previously, publishers had sole control of the distribution of these articles.  All research publications covered by the policy will continue to be subjected to rigorous peer review; they will still appear in the most prestigious journals across all fields; and they will continue to meet UC’s standards of high quality.  Learn more about the policy and its implementation here: http://osc.universityofcalifornia.edu/openaccesspolicy/

UC is the largest public research university in the world and its faculty members receive roughly 8% of all research funding in the U.S.

With this policy UC Faculty make a commitment to the public accessibility of research, especially, but not only, research paid for with public funding by the people of California and the United States.  This initiative is in line with the recently announced White House Office of Science and Technology Policy (OSTP) directive requiring “each Federal Agency with over $100 million in annual conduct of research and development expenditures to develop a plan to support increased public access to results of the research funded by the Federal Government.” The new UC Policy also follows a similar policy passed in 2012 by the Academic Senate at the University of California, San Francisco, which is a health sciences campus.

“The UC Systemwide adoption of an Open Access (OA) Policy represents a major leap forward for the global OA movement and a well-deserved return to taxpayers, who will now finally be able to see first-hand the published byproducts of their deeply appreciated investments in research,” said Richard A. Schneider, Professor, Department of Orthopaedic Surgery and chair of the Committee on Library and Scholarly Communication at UCSF. “The ten UC campuses generate around 2-3% of all the peer-reviewed articles published in the world every year, and this policy will make many of those articles freely available to anyone who is interested anywhere, whether they are colleagues, students, or members of the general public.”

The adoption of this policy across the UC system also signals to scholarly publishers that open access, in terms defined by faculty and not by publishers, must be part of any future scholarly publishing system.  The faculty remains committed to working with publishers to transform the publishing landscape in ways that are sustainable and beneficial to both the University and the public.

More information: http://osc.universityofcalifornia.edu/openaccesspolicy/


University of California, Berkeley campus, 1901. Contributed to Calisphere by the Berkeley Public Library.



The Who’s Who of Publishing Research

This week’s blog post is a bit more of a sociology-of-science topic, perhaps only marginally related to the usual content surrounding data, but still worth consideration. I recently heard a talk by Laura Czerniewicz, from the University of Cape Town’s Centre for Educational Technology. She was among the speakers during the Context session at Beyond the PDF2, and she asked the following questions about research and science:

Whose interests are being served? Who participates? Who is enabled? Who is constrained?

She brought up points I had never really considered, related to the distribution of wealth and how it affects scientific outputs. First, she examined who actually produces the bulk of knowledge. Based on an editorial in Science in 2008, she reported that US academics produce about 30% of the articles published in international peer-reviewed journals, while developing countries (China, India, Brazil) produce another 20%. Sub-Saharan Africa? A mere 1%.

She then explored what factors are shaping knowledge production and dissemination. She cited infrastructure (i.e., high speed internet, electricity, water, etc.), funding, culture, and reward systems. For example, South Africa produces more articles than other countries on the continent, perhaps because the government gives universities $13,000 for every article published in a “reputable journal”, and 21 of 23 universities surveyed give a cut of that directly to the authors.

Next, she asked “Who’s doing the publishing? What research are they publishing?” She put up some convincing graphics showing the number of articles published by authors from various countries, in which the US and Western Europe led the pack by sixfold. I couldn’t hunt down the original publication, so take this rough statistic with a grain of salt. What about book publishing? The Atlantic Wire published a great chart back in October (based on an original article in Digital Book World) that scaled each country’s size by the value of its domestic publishing market:

Scaled map of the world based on book publishing. From Digital Book World via Atlantic Wire.


When asking whose interests are served by international journals, she focused on a commentary by R. Horton, titled “Medical journals: Evidence of bias against the diseases of poverty” (The Lancet 361, 1 March 2003; behind a paywall). Granted, it’s a bit out of date, but it still raises points worth considering. Horton reported that the five top medical journals have little or no editorial-board representation from countries with low Human Development Indices. He then postulates that this might be the cause of the so-called 10/90 gap, where 90% of research funding is allocated to diseases that affect only 10% of the world’s population. Although Horton does not go so far as to blame the commercial nature of publishing, he points out that journal editorial boards must consider their readership and cater to those who can afford subscription fees.

I wonder how this commentary holds up, 10 years later. I would like to think that we’ve made a lot of progress towards better representation of research affecting people who live in poverty. I’m not sure, however, that we’ve done better with access to published research. I’ll leave you with something Laura said during her talk (paraphrased): “If half of the world is left out of knowledge exchange and dissemination, science will suffer.”

Check out Laura Czerniewicz’s Blog for more on this. She’s also got a Twitter feed.


Open Up

Open Access Week came and went last week, and I marked the event on the blog with a post on Open Access.  But the Open movement goes far beyond just Open Access: there are lots of different flavors of open, with a select few explored in this post.

Watch out for loose data and stray knowledge, folks. From Flickr by osiatynska

First let’s start with Open Notebook Science. This concept throws out the idea that you should be a hoarder, not telling others of your results until the Big Reveal in the form of a publication.  Instead, you keep your lab notebook (you do have one, right?) out in a public place, for anyone to peruse.  Most often ONS takes the form of a blog or a wiki.  The researcher updates their notebook daily, weekly, or whatever is most appropriate. There are links to data, code, relevant publications, or other content that helps readers, and the researcher themselves, understand the research workflow.

The most obvious reason for doing Open Notebook Science is that you can get feedback while you are still working on your research. If you are having problems or are stuck, the community might be able to help you. Another potential benefit is more opportunity for collaboration with others working on similar or related projects. Of course, the altruistic reason for keeping an open notebook is to contribute to the reproducibility and credibility of your research. For more information, check out Carl Boettiger’s great site, which tells you more about ONS and contains his own notebook.

Open Science is basically the same concept as open notebook science: you make sure anyone who wants information on your work, your data, or your process can find it easily. You may or may not keep a lab notebook online, however.

Open Source refers to software (it actually refers to lots of stuff, but I’m only going to talk about software here). From Wikipedia:

Open-source software is software whose source code is published and made available to the public, enabling anyone to copy, modify and redistribute the source code without paying royalties or fees.

An important component of the open source software model is the community.  Developers and individuals can rally around the code, making it better and working as a group to improve the software.

Open-source code can evolve through community cooperation. These communities are composed of individual programmers as well as very large companies.

The statistical program R is a great example of open source software with an active, strong community.

Open Data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents, or other mechanisms of control. Data that are truly open should be released into the public domain (e.g., with a CC0 waiver). For those who use the ONEShare repository via DataUp, your data will be open data.

And finally, Open Knowledge encompasses all of these concepts. It’s described as a set of principles and methodologies related to the production and distribution of “knowledge works” in an open manner. In this definition, knowledge can include data, content and general information. To learn more about the OK movement, check out the materials and resources on the Open Knowledge Foundation website.



A few months back I received an invite to visit the University of Florida in sunny Gainesville.  The invite was from organizers of an annual symposium for the Quantitative Spatial Ecology, Evolution and Environment (QSE3) Integrative Graduate Education and Research Traineeship (IGERT) program.  Phew! That was a lot of typing for the first two acronyms in my blog post’s title.  The third acronym  (OA) stands for Open Access, and the fourth acronym should be familiar.

I presented a session on data management and sharing for scientists, and afterward we had a round table discussion focused on OA.  There were about 25 graduate students affiliated with the QSE3 IGERT program, a few of their faculty advisors, and some guests (including myself) involved in the discussion.  In 90 minutes we covered the gamut of current publishing models, incentive structures for scientists, LaTeX advantages and disadvantages, and data sharing.  The discussion was both interesting and energetic in a way that I don’t encounter from scientists who are “more established”.  Some of the themes that emerged from our discussion warrant a blog post.

First, we discussed that data sharing is an obvious scientific obligation in theory, but when it comes to their own data, most scientists get a bit more cagey.  This might be with good reason – many of the students in the discussion were still writing up their results in thesis form, never mind in journal-ready form.  Throwing your data out into the ether without restrictions might result in some speedy scientist scooping you while you are dotting i’s and crossing t’s in your thesis draft.  In the case of grad students and scientists in general, embargo periods seem to be a good response to most of this apprehension. We agreed as a group, however, that such embargoes should be temporary and should be phased out over time as cultural norms shift.

The current publishing model needs to change, but there was disagreement about how this change should manifest. For instance, one (very computer-savvy) student who uses R, LaTeX, and Sweave asked, “Why do we need publishers? Why can’t we just put the formatted text and code online?”  This is an obvious solution for someone well-versed in document preparation in the vein of LaTeX: you get fully formatted, high-quality publications simply by compiling documents. But many in attendance argued against this, because LaTeX use is not widespread and most articles need heavy formatting before publication.  Of course, that is work the overburdened scientist would need to do themselves if they published their own work, which is not likely to become the norm any time soon.
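As a rough illustration of that compile-it-yourself workflow (a hypothetical sketch, not anything presented at the symposium), a short Python script can write analysis results straight into a LaTeX fragment that the manuscript pulls in at compile time. The dataset, file name, and function names here are invented for the example.

```python
# Illustrative sketch: regenerate a results table from data on every
# compile, so the manuscript can never drift out of sync with the data.
import statistics


def summary_table(samples: dict) -> str:
    """Render group means and standard deviations as a LaTeX tabular."""
    rows = []
    for group, values in sorted(samples.items()):
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        rows.append(f"{group} & {mean:.2f} & {sd:.2f} \\\\")
    body = "\n".join(rows)
    return (
        "\\begin{tabular}{lrr}\n"
        "Group & Mean & SD \\\\\n\\hline\n"
        f"{body}\n"
        "\\end{tabular}\n"
    )


if __name__ == "__main__":
    # Hypothetical field measurements; in practice these would be read
    # from a data file versioned alongside the manuscript.
    data = {"control": [2.1, 2.4, 2.2], "treatment": [3.0, 3.3, 2.9]}
    with open("results_table.tex", "w") as f:
        f.write(summary_table(data))
```

The manuscript would then `\input{results_table.tex}` and be recompiled, which is the kind of hands-off reproducibility the student had in mind.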

No journals means empty library shelves. Perhaps the newly freed up space could be used to store curmudgeonly professors resistant to change.

Let’s pretend that we have overhauled both scientists and the publishing system.  In this scenario, scientists use free open-source tools like LaTeX and Sweave to generate beautiful documents.  They document their workflows and create Python scripts that run on the command line for reproducible results.  Given this scenario, one of the students in the discussion asked, “How do you decide what to read?” His argument was that the current journal system provides some structure for scientists to home in on interesting publications and judge their quality based (at least partly) on the journal in which an article appears.

One of the other grad students had an interesting response to this: use tags and keywords, create better search engines for academia, and provide capabilities for real-time peer review of articles, data, and publication quality.  In essence, he used the argument that there’s no such thing as too much information. You just need a better filter.

One of the final questions of the discussion came from the notable scientist Craig Osenberg. It was in reference to the shift in science towards “big data”, including remote sensing, text mining, and observatory datasets. To paraphrase: Is anyone worrying about the small datasets? They are the most unique, the hardest to document, and arguably the most important.

My answer was a resounding YES! Enter the DCXL project.  We are focusing on providing support for the scientists that don’t have data managers, IT staff, and existing data repository accounts that facilitate data management and sharing.  One of the main goals of the DCXL project is to help “the little guy”.  These are often scientists working on relatively small datasets that can be contained in Excel files.

In summary, the very smart group of students at UF came to the same conclusion that many of us in the data world have: there needs to be a fundamental shift in the way science is incentivized, and this is likely to take a while.  Of course, given how early these students are in their careers, and their high levels of interest and intelligence, they are likely to be part of that change.

Special thanks goes to Emilio Bruna (@brunalab) who not only scored me the invite to UF, but also hosted me for a lovely dinner during my visit (albeit NOT the Tasty Budda…)
