Category Archives: Altmetrics

Make Data Count: Building a System to Support Recognition of Data as a First Class Research Output

The Alfred P. Sloan Foundation has made a 2-year, $747K award to the California Digital Library, DataCite and DataONE to support collection of usage and citation metrics for data objects. Building on pilot work, this award will result in the launch of a new service that will collate and expose data level metrics.

The impact of research has traditionally been measured by citations to journal publications: journal articles are the currency of scholarly research.  However, scholarly research is made up of a much larger and richer set of outputs beyond traditional publications, including research data. In order to track and report the reach of research data, methods for collecting metrics on complex research data are needed.  In this way, data can receive the same credit and recognition that is assigned to journal articles.

“Recognition of data as valuable output from the research process is increasing and this project will greatly enhance awareness around the value of data and enable researchers to gain credit for the creation and publication of data” – Ed Pentz, Crossref.

This project will work with the community to create a clear set of guidelines on how to define data usage. In addition, the project will develop a central hub for the collection of data level metrics. These metrics will include data views, downloads, citations, saves, and social media mentions, and will be exposed through customized user interfaces deployed at partner organizations. Working in an open source environment, and including extensive user experience testing and community engagement, the products of this project will be available to data repositories, libraries and other organizations to deploy within their own environment, serving their communities of data authors.

Are you working in the data metrics space? Let’s collaborate.

Find out more and follow us at @makedatacount

About the Partners

California Digital Library was founded by the University of California in 1997 to take advantage of emerging technologies that were transforming the way digital information was being published and accessed. The University of California Curation Center (UC3), one of four main programs within the CDL, helps researchers and the UC libraries manage, preserve, and provide access to their important digital assets, and develops tools and services that serve the community throughout the research and data life cycles.

DataCite is a leading global non-profit organization that provides persistent identifiers (DOIs) for research data. Our goal is to help the research community locate, identify, and cite research data with confidence. Through collaboration, DataCite supports researchers by helping them to find, identify, and cite research data; data centres by providing persistent identifiers, workflows and standards; and journal publishers by enabling research articles to be linked to the underlying data/objects.

DataONE (Data Observation Network for Earth) is an NSF DataNet project which is developing a distributed framework and sustainable cyberinfrastructure that meets the needs of science and society for open, persistent, robust, and secure access to well-described and easily discovered Earth observational data.

California Digital Library Supports the Initiative for Open Citations

California Digital Library (CDL) is proud to announce our formal endorsement for the Initiative for Open Citations (I4OC). CDL has long supported free and reusable scholarly work, as well as organizations and initiatives supporting citations in publication. With a growing database of literature and research data citations, there is a need for an open global network of citation data.

The Initiative for Open Citations will work with Crossref and their Cited-by service to open up all references indexed in Crossref. Many publishers and stakeholders have opted in to participate in opening up their citation data, and we hope that each year this list will grow to encompass all fields of publication. Furthermore, we are looking forward to seeing how research data citations will be a part of this discussion.

CDL is a firm believer in and advocate for data citations and persistent identifiers in scholarly work. However, if research publications are cited but those citations are not freely accessible and searchable, our goal is not accomplished. We are proud to support the Initiative for Open Citations and invite you to get in touch with any questions you may have about the need for open citations or ways to be an advocate for this necessary change.

Below are some frequently asked questions about the need for open citations, ways to get involved, and misconceptions regarding citations. The answers are provided by the board and founders of I4OC:

I am a scholarly publisher not enrolled in the Cited-by service. How do I enable it?

If not already a participant in Cited-by, a Crossref member can register for this service free of charge. Note that participation in Cited-by alone does not automatically make references available via Crossref’s standard APIs; having registered, the only further step a publisher needs to take to ‘open’ its reference data is to give its consent to Crossref.

I am a scholarly publisher already depositing references to Crossref. How do I publicly release them?

We encourage all publishers to make their reference metadata publicly available. If you are already submitting article metadata to Crossref as a participant in their Cited-by service, opening your references can be achieved in a matter of days. Publishers can do this easily and free of charge:

  • either by contacting Crossref support directly by e-mail, asking them to turn on reference distribution for all of the relevant DOI prefixes;
  • or by themselves setting the <reference_distribution_opt> metadata element to “any” for each DOI deposit for which they want to make references openly available.

How do I access open citation data?

Once made open, the references for individual scholarly publications may be accessed immediately through the Crossref REST API.
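As an illustration of what this looks like in practice, the sketch below pulls the reference list out of a works-API response. The endpoint pattern (https://api.crossref.org/works/{DOI}) is Crossref’s, but the sample response here is mocked and heavily trimmed, and it assumes open references appear under message.reference as the API documents describe; treat the field names as illustrative rather than authoritative.

```python
def extract_references(works_response: dict) -> list:
    """Return the open reference list from a Crossref works-API response.

    When a publisher has opened its references, the response for
    https://api.crossref.org/works/{DOI} includes a "reference" array
    inside "message"; when references are closed, the key is absent.
    """
    return works_response.get("message", {}).get("reference", [])

# Mocked, trimmed example of a works-API response with open references.
sample = {
    "status": "ok",
    "message": {
        "DOI": "10.1234/example",
        "reference": [
            {"key": "ref1", "DOI": "10.5555/cited-work-1"},
            {"key": "ref2", "unstructured": "Smith, J. (2015). A cited book."},
        ],
    },
}

refs = extract_references(sample)
print(len(refs))        # 2
print(refs[0]["DOI"])   # 10.5555/cited-work-1
```

Note that a closed-reference response simply omits the array, so the helper returns an empty list rather than raising an error.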

Open citations are also available from the OpenCitations Corpus, a database created to house scholarly citations that progressively and systematically harvests citation data from Crossref and other sources. An advantage of accessing citation data from the OpenCitations Corpus is that they are available in standards-compliant machine-readable RDF format, and include information about both incoming and outgoing citations of bibliographic resources (published articles and books).

Does this initiative cover future citations only or also historical data?

Both. All DOIs under a prefix set for open reference distribution will have open references through Crossref, for past, present, and future publications.

Past and present publications that lack DOIs are not dealt with by Crossref, and gaining access to their citation data will require separate initiatives by their publishers or others to extract and openly publish those references.

Under what licensing terms is citation data being made available?

Crossref exposes article and reference metadata without a license, since it regards these as raw facts that cannot be licensed.

The structured citation metadata within the OpenCitations Corpus are published under a Creative Commons CC0 public domain dedication, to make it explicitly clear that these data are open.

My journal is open access. Aren’t its articles’ citations automatically available?

No. Although Open Access articles may be open and freely available to read on the publisher’s website, their references are not separate, and are not necessarily structured or accessible programmatically. Additionally, although their reference metadata may be submitted to Crossref, Crossref historically set the default for references to “closed,” with a manual opt-in being required for public references. Many publisher members have not been aware that they could simply instruct Crossref to make references open, and, as a neutral party, Crossref has not promoted the public reference option. All publishers therefore have to opt in to open distribution of references via Crossref.

Is there a programmatic way to check whether a publisher’s or journal’s citation data is free to reuse?

For Crossref metadata, their REST API reveals how many and which publishers have opened references. Any system or tool (or a JSON viewer) can be pointed at a query that shows the count and the list of publishers with “public-references”: true.

To check a specific publisher’s status, query the members endpoint for that publisher (for example, with the query parameter query=springer) and find the public-references flag in the result. In some cases it will be set to false.
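As a sketch, the helper below filters a members-API response for publishers whose public-references flag is true. The flag name comes from the description above; the response shape here is mocked for illustration and may not match the live API exactly.

```python
def members_with_open_references(members_response: dict) -> list:
    """Return names of members whose public-references flag is true.

    Assumes each member record carries a "flags" object containing a
    "public-references" boolean, as described in the post; the sample
    below is mocked for illustration.
    """
    members = members_response.get("message", {}).get("items", [])
    return [
        m.get("primary-name", "?")
        for m in members
        if m.get("flags", {}).get("public-references") is True
    ]

# Mocked, trimmed response for a query like
# https://api.crossref.org/members?query=springer
sample = {
    "message": {
        "items": [
            {"primary-name": "Open Publisher A",
             "flags": {"public-references": True}},
            {"primary-name": "Closed Publisher B",
             "flags": {"public-references": False}},
        ]
    }
}

print(members_with_open_references(sample))  # ['Open Publisher A']
```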


You can contact the founding group by e-mail.

Data: Do You Care? The DLM Survey

We all know that data is important for research. So how can we quantify that? How can you get credit for the data you produce? What do you want to know about how your data is used?

If you are a researcher or data manager, we want to hear from you. Take this 5-10 minute survey and help us craft data-level metrics:

Please share widely! The survey will be open until December 1st.

Read more about the project, or check out our previous post. Thanks to John Kratz for creating the survey and jumping through IRB hoops!

What do you think of data metrics? We’re listening. Click for more pics of dogs + radios.



UC3, PLOS, and DataONE join forces to build incentives for data sharing

We are excited to announce that UC3, in partnership with PLOS and DataONE, is launching a new project to develop data-level metrics (DLMs). This 12-month project is funded by an Early Concept Grants for Exploratory Research (EAGER) grant from the National Science Foundation, and will result in a suite of metrics that track and measure data use. The proposal is available via CDL’s eScholarship repository; more information is also available on the NSF website.

Why DLMs? Sharing data is time consuming and researchers need incentives for undertaking the extra work. Metrics for data will provide feedback on data usage, views, and impact that will help encourage researchers to share their data. This project will explore and test the metrics needed to capture activity surrounding research data.

The DLM pilot will build on Lagotto, the successful open source Article-Level Metrics (ALM) community project originally started by PLOS in 2009. ALMs provide a view into the activity surrounding an article after publication, across a broad spectrum of ways in which research is disseminated and used (e.g., viewed, shared, discussed, cited, and recommended).

About the project partners

PLOS (Public Library of Science) is a nonprofit publisher and advocacy organization founded to accelerate progress in science and medicine by leading a transformation in research communication.

Data Observation Network for Earth (DataONE) is an NSF DataNet project which is developing a distributed framework and sustainable cyberinfrastructure that meets the needs of science and society for open, persistent, robust, and secure access to well-described and easily discovered Earth observational data.

The University of California Curation Center (UC3) at the California Digital Library is a creative partnership bringing together the expertise and resources of the University of California. Together with the UC libraries, we provide high quality and cost-effective solutions that enable campus constituencies – museums, libraries, archives, academic departments, research units and individual researchers – to have direct control over the management, curation and preservation of the information resources underpinning their scholarly activities.

The official mascot for our new project: Count von Count.



Researchers – get your ORCID

Yesterday I remotely joined a lab meeting at my old stomping grounds, Woods Hole Oceanographic Institution. My former advisor, Mike Neubert, asked me to join his math ecology lab meeting to “convince them to get ORCID Identifiers. (Or try anyway!)”. As a result, I’ve spent a bit of time thinking about ORCIDs over the last few days. I figured I might put the proverbial pen to paper and write a blog post about it for the benefit of other researchers.

What is ORCID?

An acronym, of course! ORCID stands for “Open Researcher & Contributor ID”. The ORCID Organization is an open, non-profit group working to provide a registry of unique researcher identifiers and a transparent method of linking research activities and outputs to these identifiers (from their website). The endgame is to support the creation of a permanent, clear and unambiguous record of scholarly communication by enabling reliable attribution of authors and contributors.

Wait – let’s back up.

What is a “Researcher Identifier”?

Wikipedia’s entry on ORCIDs might summarize researcher identifiers best:

An ORCID [i.e., researcher identifier] is a nonproprietary alphanumeric code to uniquely identify scientific and other academic authors. This addresses the problem that a particular author’s contributions to the scientific literature can be hard to recognize electronically, as most personal names are not unique: they can change (such as with marriage), have cultural differences in name order, contain inconsistent use of first-name abbreviations and employ different writing systems. It would provide for humans a persistent identity — an “author DOI” — similar to that created for content-related entities on digital networks by digital object identifiers (DOIs).

Basically, researcher identifiers are like social security numbers for scientists. They unambiguously identify you throughout your research life. It’s important to note that, unlike SSNs, there isn’t just one researcher ID system. Existing researcher identifier systems include ORCID, ResearcherID, Scopus Author Identifier, arXiv Author ID, and eRA Commons Username. So why ORCID?

ORCID is an open system – that means web application developers, publishers, grants administrators, and institutions can hook into ORCID and use those identifiers for all kinds of stuff. It’s like having one identifier to rule them all – imagine logging into all kinds of websites, entering your ORCID ID, and having them know who you are, what you’ve published, and what impact you have had on scientific research. A bonus of the ORCID organization is that they are committed to “transcending discipline, geographic, national and institutional boundaries” and ensuring that ORCID services will be based on transparent and non-discriminatory terms posted on the ORCID website.

How does this differ from Google Scholar, ResearchGate and the like?

This is one of the first questions most researchers ask. In fact, CV creation sites like Google Scholar profiles, Academia.edu, ResearchGate and the like are a completely different thing. ORCID is an identifier system, so comparing ORCIDs to ResearchGate is like comparing your social security number to your Facebook profile. Note, however, that ORCID could work with these CV creation sites in the future – which would make identifying your research outputs even easier. The confusion probably stems from the fact that you can create an ORCID profile on their website. Note that this is not required; however, it helps ensure that past research products are connected to your ORCID ID.

Metrics + ORCID

One of the most exciting things about ORCID is its potential to influence the way we think about credit and metrics for researchers. If researchers have unique identifiers, it makes it easier to round up all of their products (data, blog posts, technical documents, theses) and determine how much they have influenced the field. In other words, ORCID plays nice with altmetrics. Read more about altmetrics in these previous Data Pub blog posts on the subject. A 2009 Nature Editorial sums up this topic about altmetrics and identifiers nicely:

…But perhaps the largest challenge will be cultural. Whether ORCID or some other author ID system becomes the accepted standard, the new metrics made possible will need to be taken seriously by everyone involved in the academic-reward system — funding agencies, university administrations, and promotion and tenure committees. Every role in science should be recognized and rewarded, not just those that produce high-profile publications.

What should you do?

  1. Go to the ORCID website
  2. Follow the Register Now Link and fill out the necessary fields (name, email, password)

You can stop here: you’ve claimed your ORCID ID! It will be a numeric string that looks something like this: 0000-0001-9592-2339 (that’s my ORCID ID!).
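As an aside for the curious: that final character is actually a checksum. ORCID IDs use the ISO 7064 11,2 check-digit algorithm, so you can verify that an ID is at least well-formed before trusting it. A minimal Python sketch:

```python
def orcid_checksum_ok(orcid: str) -> bool:
    """Check an ORCID ID's final character with the ISO 7064 11,2 algorithm."""
    digits = orcid.replace("-", "")
    if len(digits) != 16:
        return False
    total = 0
    for ch in digits[:-1]:
        if not ch.isdigit():
            return False
        # ISO 7064 11,2 running total over the first 15 digits.
        total = (total + int(ch)) * 2
    remainder = total % 11
    expected = (12 - remainder) % 11
    # A remainder of 10 is written as the letter "X" in ORCID IDs.
    check = "X" if expected == 10 else str(expected)
    return digits[-1] == check

print(orcid_checksum_ok("0000-0001-9592-2339"))  # True  (the ID above)
print(orcid_checksum_ok("0000-0001-9592-2330"))  # False (bad check digit)
```

This only confirms the ID is syntactically valid, of course; it says nothing about whether the ID is registered or belongs to you.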

…OR you can go ahead and build out your ORCID profile. To add previous work:

  1. On your profile page (which opens after you’ve registered), select the “Import Works” button.
  2. A window will pop up with organizations who have partnered with ORCID. When in doubt, start with “CrossRef Metadata Search”. CrossRef provides DOIs for publishers, which means if you’ve published articles in journals, they will probably show up in this metadata search.
  3. Grant approval for ORCID to access your CrossRef information. Then peruse the list and identify which works are yours.
  4. By default, the list of works on your ORCID profile will be private. You can change your viewing permission to allow others to see your profile.
  5. Consider adding a link to your ORCID profile on your CV and/or website. I’ve done it on mine.

ORCID is still quite new – that means it won’t find all of your work, and you might need to manually add some of your products. But given their recently-awarded funding from the Alfred P. Sloan Foundation, and interest from many web application developers and companies, you can be sure that the system will only get better from here.


Orchis morio (Green-winged Orchid) Specimen in Derby Museum herbarium. From Flickr by Derby Museum.



Two Altmetrics Workshops in San Francisco

Last week, a group of forward-thinking individuals interested in measuring scholarly impact gathered at Fort Mason in San Francisco to talk about altmetrics. The Alfred P. Sloan Foundation funded the events at Fort Mason, which included (1) an altmetrics-focused workshop run by the open-access publisher (and leader in ALM) PLOS, and (2) a NISO Alternative Assessment Initiative Project Workshop to discuss standards and best practices for altmetrics.

In lieu of a blog post for Data Pub, I wrote up something for the folks over at the London School of Economics Impact of Social Sciences Blog. Here’s a snippet that explains altmetrics:

Altmetrics focuses on broadening the things we are measuring, as well as how we measure them. For instance, article-level metrics (ALMs) report on aspects of the article itself, rather than the journal in which it can be found. ALM reports might include the number of article views, the number of downloads, and the number of references to the article in social media such as Twitter. In addition to measuring the impact of articles in new ways, the altmetrics movement is also striving to expand what scholarly outputs are assessed – rather than focusing on journal articles, we could also be giving credit for other scholarly outputs such as datasets, software, and blog posts.

So head on over and read up on the role of higher education institutions in altmetrics: “Universities can improve academic services through wider recognition of altmetrics and alt-products.”

Related Data Pub posts:


It’s Time for Better Project Metrics

I’m involved in lots of projects, based at many institutions, with multiple funders and oodles of people involved. Each of these projects has requirements for reporting metrics that are used to prove the project is successful. Here, I want to argue that many of these metrics are arbitrary, and in some cases misleading. I’m not sure what the solution is – but I am anxious for a discussion to start about reporting requirements for funders and institutions, metrics for success, and how we measure a project’s impact.

What are the current requirements for projects to assess success? The most common request is for text-based reports – which are reminiscent of junior high book reports. My colleague here at the CDL, John Kunze, has been working for the UC in some capacity for a long time. If anyone is familiar with the bureaucratic frustrations of metrics, it’s John. Recently he brought me a sticky-note with an acronym he’s hoping will catch on:

SNωωRF: Stuff nobody wants to write, read, or fund

The two lower-case omegas stand in for the “w”s of the acronym but are read as the letter “o”, which facilitates pronunciation: “snorf”. He was prompted to invent this catchy acronym after writing up a report for a collaborative project we work on, based in Europe. After writing the report, he was told it “needed to be longer by two or three pages”. The necessary content was there in the short version – but it wasn’t long enough to look thorough. Clearly brevity is not something that’s rewarded in project reporting.

Which orange dot is bigger? Overall impressions differ from what the measurements say. Measuring and comparing projects doesn't always reflect success.


Outside of text-based reports, there are other reports and metrics that higher-ups like: number of website hits, number of collaborations, number of conferences attended, number of partners/institutions involved, et cetera. A really successful project can look weak by all of these measures. Similarly, a crap project can look quite successful based on the metrics listed. So if there is no clear correlation between the metrics used to report project success and actual project success, why do we measure them?

So what’s the alternative? The simplest alternative – not measuring/reporting metrics – is probably not going to fly with funders, institutions, or organizations. In fact, metrics play an important role. They allow for comparisons among projects, provide targets to strive for, and allow project members to assess progress. Perhaps rather than defaulting to the standard reporting requirements, funders and institutions could instead take some time to consider what success means for a particular project, and customize the metrics based on that.

In the space I operate in (data sharing, data management, open science, scholarly publishing, etc.), project success is best assessed by whether the project has (1) resulted in new conversations, debates, and dialogue, and/or (2) changed the way science is done. Examples of successful projects based on this definition: figshare, ImpactStory, PeerJ, IPython Notebook, and basically anything funded by the Alfred P. Sloan Foundation. Many of these would also pass the success test based on more traditional metrics, but not necessarily. I will refrain from making enemies by listing projects that I deem unsuccessful despite their passing the test based on traditional metrics.

The altmetrics movement is focused on reviewing researcher and research impact in new, interesting ways (see my blog posts on the topic here and here). What would this altmetrics movement look like in terms of projects? I’m not sure, but I know that its time has come.


Impact Factors: A Broken System


How big is your impact? Sedan Plowshare Crater, 1962. From Flickr by The Official CTBTO Photostream

If you are a researcher, you are very familiar with the concept of a journal’s Impact Factor (IF). Basically, it’s a way to grade journal quality. From Wikipedia:

The impact factor (IF) of an academic journal is a measure reflecting the average number of citations to recent articles published in the journal. It is frequently used as a proxy for the relative importance of a journal within its field, with journals with higher impact factors deemed to be more important than those with lower ones.

The IF was devised in the 1970s as a tool for research libraries to judge the relative merits of journals when allocating their subscription budgets. However, it is now being used as a way to evaluate the merits of individual scientists – something for which it was never intended. As Björn Brembs puts it, “…scientific careers are made and broken by the editors at high-ranking journals.”

In his great post, “Sick of Impact Factors”, Stephen Curry says that the real problem started when impact factors began to be applied to papers and people.

I can’t trace the precise origin of the growth but it has become a cancer that can no longer be ignored. The malady seems to particularly afflict researchers in science, technology and medicine who, astonishingly for a group that prizes its intelligence, have acquired a dependency on a valuation system that is grounded in falsity. We spend our lives fretting about how high an impact factor we can attach to our published research because it has become such an important determinant in the award of the grants and promotions needed to advance a career. We submit to time-wasting and demoralising rounds of manuscript rejection, retarding the progress of science in the chase for a false measure of prestige.

Curry isn’t alone. Just last week Bruce Alberts, Editor-in-Chief of Science, wrote  a compelling editorial about Impact Factor distortions. Alberts’ editorial was inspired by the recently released San Francisco Declaration on Research Assessment (DORA). I think this is one of the more important declarations/manifestoes peppering the internet right now, and has the potential to really change the way scholarly publishing is approached by researchers.

DORA was created by a group of editors and publishers who met up at the Annual Meeting of the American Society for Cell Biology (ASCB) in 2012. Basically, it lays out all the problems with impact factors and provides a set of general recommendations for different stakeholders (funders, institutions, publishers, researchers, etc.). The goal of DORA is to improve “the way in which the quality of research output is evaluated”.  Read more on the DORA website and sign the declaration (I did!).

An alternative to IF?

If most of us can agree that impact factors are not a great way to assess researchers or their work, then what’s the alternative? Curry thinks the solution lies in Web 2.0 (quoted from this post):

…we need to find ways to attach to each piece of work the value that the scientific community places on it though use and citation. The rate of accrual of citations remains rather sluggish, even in today’s wired world, so attempts are being made to capture the internet buzz that greets each new publication…

That’s right, skeptical scientists: he’s talking about buzz on the internet as a way to assess impact. Read more about “alternative metrics” in my blog post on the subject: The Future of Metrics in Science. Also check out the list of altmetrics-related tools. The great thing about altmetrics is that they don’t rely solely on citation counts, plus they are capable of taking other research products into account (like blog posts and datasets).

Other good reads on this subject:


The Future of Metrics in Science

Ask any researcher what they need for tenure, and the answer is virtually the same across institutions and disciplines: publications. The “publish or perish” model has reigned supreme for generations of scientists, despite ignoring whether a researcher’s publications favor quality over quantity, how many collaborations have been established, or even the novelty or difficulty of a particular research project. This archaic measure of impact tends to rely on measures like a scientist’s number of citations and the impact factor of the journals in which they publish.

With the upswing in blogs, Twitter feeds, and academic social sites like Mendeley, Zotero, and (my favorite) CiteULike, some folks are working on developing a new model for measuring one’s impact on science. Jason Priem, a graduate student at UNC’s School of Information and Library Science, coined the term “altmetrics” rather recently, and the idea has taken off like wildfire.

altmetrics is the creation and study of new metrics based on the Social Web for analyzing, and informing scholarship.

The concept is simple: instead of using traditional metrics for measuring impact (citation counts, journal impact factors), Priem and his colleagues want to take into account more modern measures of impact like number of bookmarks, shares, or re-tweets.  In addition, altmetrics seeks to consider not only publications, but associated data or code downloads.


The original alternatives: The Sex Pistols. From Arroz Do Ceu. Read more about the beginnings of alternative rock in Dave Thompson’s book “Alternative Rock”.

Old-school scientists and Luddites might balk at the idea of measuring a scientist’s impact on the community by the number of re-tweets their article received, or by the number of downloads of their dataset.  This reaction can be attributed to several causes, one of which may be an irrational fear of change.  But the reality is that the landscape of science is changing dramatically, and the trend towards social media as a scientific tool is only likely to continue.  See my blog post on why scientists should tweet for more information on the benefits of embracing one of the aspects of this trend.

Need another reason to get onboard? Funders see the value in altmetrics.  Priem, along with his co-PI (and my DataONE colleague) Heather Piwowar, just received $125K from the Sloan Foundation to expand their Total Impact project.  Check out the Total Impact website for more information, or read the UNC SILS news story about the grant.

The DCXL project feeds right into the concept of altmetrics. By providing citations for datasets housed in data centers, DCXL makes it easy to incorporate the impact of a scientist’s data into measures of their overall impact.
