Category Archives: Data Management Planning

An RDM Model for Researchers: What we’ve learned

Thanks to everyone who gave feedback on our previous blog post describing our data management tool for researchers. We received a great deal of input about our guide’s use of the term “data sharing” and its position relative to other RDM tools, as well as quite a few questions about what the guide will include as we develop it further.

As stated in our initial post, we’re building a tool to enable individual researchers to assess the maturity of their data management practices within an institutional or organizational context. To do this, we’ve taken the concept of RDM maturity from existing tools like the Five Organizational Stages of Digital Preservation, the Scientific Data Management Capability Model, and the Capability Maturity Guide and placed it within a framework familiar to researchers: the research data lifecycle.


A visualization of our guide as presented in our last blog post. An updated version, including changes made in response to reader feedback, is presented later in this post.

Data Sharing

The most immediate feedback we received was about the term “Data Sharing”. Several commenters pointed out the ambiguity of this term in the context of the research data life cycle. In the last iteration of our guide, we intended “Data Sharing” as a shorthand to describe activities related to the communication of data. Such activities may range from describing data in a traditional scholarly publication to depositing a dataset in a public repository or publishing a data paper. Because existing data sharing policies (e.g. PLOS, The Gates Foundation, and The Moore Foundation) refer specifically to the latter rather than the former, the term is clearly too imprecise for our guide.

Like “Data Sharing”, “Data Publication” is a popular term for describing activities surrounding the communication of data. Even more than “Sharing”, “Publication” conveys our desire to advance practices that treat data as a first-class research product. Unfortunately, the term is simultaneously too precise and too ambiguous to be useful in our guide. On one hand, “Data Publication” can refer specifically to a peer-reviewed document that presents a dataset without offering any analysis or conclusion. While data papers may be a straightforward way of inserting datasets into the existing scholarly communication ecosystem, they represent a single point on the continuum of data management maturity. On the other hand, there is currently no clear consensus among researchers about what it means to “publish” data.

For now, we’ve given that portion of our guide the preliminary label of “Data Output”. As the development process proceeds, this row will include a full range of activities, from describing data in traditional scholarly publications (which may or may not include a data availability statement) to depositing data into public repositories and publishing data papers.

Other Models and Guides

While we correctly identified that there is a range of rubrics, tools, and capability models with aims similar to our guide’s, we overstated that ours uniquely allows researchers to assess where they are and where they want to be with regard to data management. Several of the tools we cited in our initial post can be applied by researchers to measure the maturity of data management practices within a project or institutional context.

Below we’ve profiled four such tools and indicated how we believe our guide differs from each. In differentiating our guide, we do not mean to position it strictly as an alternative. Rather, we believe that our guide could be used in concert with these other tools.

Collaborative Assessment of Research Data Infrastructure and Objectives (CARDIO)

CARDIO is a benchmarking tool designed to be used by researchers, service providers, and coordinators for collaborative data management strategy development. Designed to be applied at a variety of levels, from entire institutions down to individual research projects, CARDIO enables its users to collaboratively assess data management requirements, activities, and capacities using an online interface. Users of CARDIO rate their data management infrastructure relative to a series of statements concerning their organization, technology, and resources. After completing CARDIO, users are given a comprehensive set of quantitative capability ratings as well as a series of practical recommendations for improvement.

Unlike CARDIO, our guide does not necessarily assume its users are in contact with data-related service providers at their institution. As we stated in our initial blog post, we intend to guide researchers to specialist knowledge without necessarily turning them into specialists. Therefore, we would consider a researcher making contact with their local data management, research IT, or library service providers for the first time as a positive application of our guide.

Community Capability Model Framework (CCMF)

The Community Capability Model Framework is designed to evaluate a community’s readiness to perform data-intensive research. Intended to be used by researchers, institutions, and funders to assess current capabilities, identify areas requiring investment, and develop roadmaps for achieving a target state of readiness, the CCMF encompasses eight “capability factors” including openness, skills and training, research culture, and technical infrastructure. When used alongside the Capability Profile Template, the CCMF provides its users with a scorecard containing multiple quantitative scores related to each capability factor.

Unlike the CCMF, our guide does not necessarily assume that its users should all be striving towards the same level of data management maturity. We recognize that data management practices may vary significantly between institutions or research areas and that what works for one researcher may not necessarily work for another. Therefore, we would consider researchers understanding the maturity of their data management practices within their local contexts to be a positive application of our guide.

Data Curation Profiles (DCP) and DMVitals

The Data Curation Profile toolkit is intended to address the needs of an individual researcher or research group with regards to the “primary” data used for a particular project. Taking the form of a structured interview between an information professional and a researcher, a DCP can allow an individual research group to consider their long-term data needs, enable an institution to coordinate their data management services, or facilitate research into broader topics in digital curation and preservation.

DMVitals is a tool designed to take information from a source like a Data Curation Profile and use it to systematically assess a researcher’s data management practices in direct comparison to institutional and domain standards. Using DMVitals, a consultant matches a list of evaluated data management practices with responses from an interview and ranks the researcher’s current practices by their level of data management “sustainability.” The tool then generates customized and actionable recommendations, which a consultant provides to the researcher as guidance to improve his or her data management practices.

Unlike DMVitals, our guide does not calculate a quantitative rating to describe the maturity of data management practices. From a measurement perspective, the range of practice maturity may differ between the four stages of our guide (e.g. the “Project Planning” stage could have greater or fewer steps than the “Data Collection” stage), which would significantly complicate the interpretation of any quantitative ratings derived from our guide. We also recognize that data management practices are constantly evolving and likely dependent on disciplinary and institutional context. On the other hand, we also recognize the utility of quantitative ratings for benchmarking. Therefore, if, after assessing the maturity of their data management practices with our guide, a researcher chooses to apply a tool like DMVitals, we would consider that a positive application of our guide.

Our Model (Redux)

Perhaps the biggest takeaway from the response to our last blog post is that it is very difficult to give detailed feedback on a guide that is mostly whitespace. Below is an updated mock-up, which describes a set of RDM practices along the continuum of data management maturity. At present, we are not aiming to illustrate a full range of data management practices. More simply, this mock-up is intended to show the types of practices that could be described by our guide once it is complete.


An updated visualization of our guide based on reader feedback. At this stage, the example RDM practices are intended to be representative, not comprehensive.

Project Planning

The “Project Planning” stage describes practices that occur prior to the start of data collection. Our examples are all centered around data management plans (DMPs), but other considerations at this stage could include training in data literacy, engagement with local RDM services, inclusion of “sharing” in project documentation (e.g. consent forms), and project pre-registration.

Data Collection

The “Data Collection” stage describes practices related to the acquisition, accumulation, measurement, or simulation of data. Our examples relate mostly to standards around file naming and structuring, but other considerations at this stage could include the protection of sensitive or restricted data, validation of data integrity, and specification of linked data.
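To make the idea of a file naming standard concrete, here is a minimal sketch in Python. The naming pattern (project_site_YYYYMMDD_vNN.ext) and all the example values are hypothetical, not a recommendation from our guide; the point is only that a convention written as code can be enforced automatically at collection time:

```python
import re
from datetime import date

# A hypothetical convention: project_site_YYYYMMDD_vNN.ext
# (illustrative only -- substitute your own project's standard)
NAME_PATTERN = re.compile(r"^[a-z0-9]+_[a-z0-9]+_\d{8}_v\d{2}\.[a-z0-9]+$")

def _clean(text: str) -> str:
    """Lowercase a label and strip everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def make_filename(project: str, site: str, collected: date,
                  version: int, ext: str = "csv") -> str:
    """Build a data file name that follows the convention above."""
    name = f"{_clean(project)}_{_clean(site)}_{collected:%Y%m%d}_v{version:02d}.{ext}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"{name!r} does not follow the naming convention")
    return name

print(make_filename("Kelp Survey", "Monterey", date(2016, 11, 8), 3))
# -> kelpsurvey_monterey_20161108_v03.csv
```

Because the pattern is checked at the moment a name is generated, inconsistencies surface during collection rather than months later during analysis.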

Data Analysis

The “Data Analysis” stage describes practices that involve the inspection, modeling, cleaning, or transformation of data. Our examples mostly relate to documenting the analysis workflow, but other considerations at this stage could include the generation and annotation of code and the packaging of data within sharable files or formats.

Data Output

The “Data Output” stage describes practices that involve the communication of either the data itself or conclusions drawn from the data. Our examples are mostly related to the communication of data linked to scholarly publications, but other considerations at this stage could include journal and funder mandates around data sharing, the publication of data papers, and the long-term preservation of data.

Next Steps

Now that we’ve solicited a round of feedback from the community that works on issues around research support, data management, and digital curation, our next step is to broaden our scope to include researchers.

Specifically we are looking for help with the following:

  • Do you find the divisions within our model useful? We’ve used the research data lifecycle as a framework because we believe it makes our tool user-friendly for researchers. At the same time, we also acknowledge that the lines separating planning, collection, analysis, and output can be quite blurry. We would be grateful to know if researchers or data management service providers find these divisions useful or overly constrained.
  • Should there be more discrete “steps” within our framework? Because we view data management maturity as a continuum, we have shied away from creating discrete steps within each division. We would be grateful to know how researchers or data management service providers view this approach, especially when compared to the more quantitative approach employed by CARDIO, the Capability Profile Template, and DMVitals.
  • What else should we put into our model? Researchers are faced with changing expectations and obligations in regards to data management. We want our model to reflect that. We also want our model to reflect the relationship between research data management and broader issues like openness and reproducibility. With that in mind, what other practices and considerations should our model include?

We are Hiring a DMPTool Manager!

Do you love all things data management as much as we do? Then join our team! We are hiring a person to help manage the DMPTool, including development prioritization, promotion, outreach, and education. The position is funded for two years with the potential for an extension pending funding and budgets. You would be based in the amazing city of Oakland, CA, home of the California Digital Library. Read more at jobs.ucop.edu or download the PDF description: Data Management Product Manager (4116).

Job Duties

Product Management (30%): Ensure the DMPTool remains a viable and relevant application. Update funder requirements, maintain the integrity of publicly available DMPs, contact partner institutions to report issues, and review DMPTool guidance and content for currency. Evaluate and present new technologies and industry trends, recommending those applicable to current products or services and to the organization’s long-range strategic plans. Identify, organize, and participate in technical discussions with key advisory groups and other customers/clients. Identify additional opportunities for value-added product/service delivery based on customer/client interaction and feedback.

Marketing and Outreach (20%): Develop and implement strategies for promoting the DMPTool. Create marketing materials, update website content, contact institutions, and present at workshops and/or conferences. Develop and participate in marketing and professional outreach activities and informational campaigns to raise awareness of the product or service, including communicating developments and updates to the community via social media. This includes maintaining the DMPTool blog, Twitter and Facebook accounts, GitHub Issues, and listservs.

Project Management (30%): Develop project plans including goals, deliverables, resources, budget, and timelines for enhancements of the DMPTool. Act as product/service liaison across the organization, external agencies, and customers to ensure effective production, delivery, and operation of the DMPTool.

Strategic Planning (10%): Assist in strategic planning, prioritizing and guiding future development of the DMPTool. Pursue outside collaborations and funding opportunities for future DMPTool development including developing an engaged community of DMPTool users (researchers) and software developers to contribute to the codebase. Foster and engage open source community for future maintenance and enhancement.

Reporting (10%): Provide periodic progress reports outlining key activities and progress toward achieving overall goals. Develop and report on metrics/key performance indicators and provide corresponding analysis.

To apply, visit jobs.ucop.edu (Requisition No. 20140735)

From Flickr by Brenda Gottsabend


The Data Lineup for #ESA2013

Why am I excited about Minneapolis? Potential Prince sightings, of course! From http://www.emusic.com

In less than a week, the Ecological Society of America’s 2013 Meeting will commence in Minneapolis, MN. There will be zillions of talks and posters on topics ranging from microbes to biomes, along with special sessions on education, outreach, and citizen science. So why am I going?

For starters, I’m a marine ecologist by training, and this is an excuse to meet up with old friends. But of course the bigger draw is to educate my ecological colleagues about all things data: data management planning, open data, data stewardship, archiving and sharing data, et cetera et cetera. Here I provide a rundown of must-see talks, sessions, and workshops related to data. Many of these are tied to the DataONE group and the rOpenSci folks; see DataONE’s activities and rOpenSci’s activities. Follow the full ESA meeting on Twitter at #ESA2013. See you in Minneapolis!

Sunday August 4th

0800-1130 / WK8: Managing Ecological Data for Effective Use and Re-use: A Workshop for Early Career Scientists

For this 3.5 hour workshop, I’ll be part of a DataONE team that includes Amber Budden (DataONE Community Engagement Director), Bill Michener (DataONE PI), Viv Hutchison (USGS), and Tammy Beaty (ORNL). This will be a hands-on workshop for researchers interested in learning about how to better plan for, collect, describe, and preserve their datasets.

1200-1700 / WK15: Conducting Open Science Using R and DataONE: A Hands-on Primer (Open Format)

Matt Jones from NCEAS/DataONE will be assisted by Karthik Ram (UC Berkeley & rOpenSci), Carl Boettiger (UC Davis & rOpenSci), and Mark Schildhauer (NCEAS) to highlight the use of open software tools for conducting open science in ecology, focusing on the interplay between R and DataONE.

Monday August 5th

1015-1130 / SS2: Creating Effective Data Management Plans for Ecological Research

Amber, Bill and I join forces again to talk about how to create data management plans (like those now required by the NSF) using the free online DMPTool. This session is only 1.25 hours long, but we will allow ample time for questions and testing out the tool.

1130-1315 / WK27: Tools for Creating Ecological Metadata: Introduction to Morpho and DataUp

Matt Jones and I will be introducing two free, open-source software tools that can help ecologists describe their datasets with standard metadata. The Morpho tool can be used to locally manage data and upload it to data repositories. The DataUp tool helps researchers not only create metadata, but check for potential problems in their dataset that might inhibit reuse, and upload data to the ONEShare repository.

Tuesday August 6th

0800-1000 / IGN2: Sharing Makes Science Better

This two-hour session organized by Sandra Chung of NEON is composed of 5-minute long “ignite” talks, which guarantees you won’t nod off. The topics look pretty great, and the crackerjack list of presenters includes Ethan White, Ben Morris, Amber Budden, Matt Jones,  Ed Hart, Scott Chamberlain, and Chris Lortie.

1330-1700 / COS41: Education: Research And Assessment

In my presentation at 1410, “The fractured lab notebook: Undergraduates are not learning ecological data management at top US institutions”, I’ll give a brief talk on results from my recent open-access publication with Stephanie Hampton on data management education.

2000-2200 / SS19: Open Science and Ecology

Karthik Ram and I are getting together with Scott Chamberlain (Simon Fraser University & rOpenSci), Carl Boettiger, and Russell Neches (UC Davis) to lead a discussion about open science. Topics will include open data, open workflows and notebooks, open source software, and open hardware.

2000-2200 / SS15: DataNet: Demonstrations of Data Discovery, Access, and Sharing Tools

Amber Budden will demo and discuss DataONE alongside folks from other DataNet projects like the Data Conservancy, SEAD, and Terra Populus.


Webinar Series on Data Management & DMPTool

Operators will be standing by to connect you to our awesome webinars. From Flickr by MarkGregory007


One of the services we run at the California Digital Library is the DMPTool – this is an online tool that helps researchers create data management plans by guiding them through a series of prompts based on funder requirements. The tool provides resources and help in the form of links, help text, and suggested answers. It was developed by the CDL and many partners a couple of years ago, and it’s been wildly successful.

As a result of this success, we received two generous one-year grants: one from the Alfred P. Sloan Foundation to build out and improve the existing DMPTool (read more in this post); and one from the Institute of Museum and Library Services, focused on creating resources for librarians interested in promoting the DMPTool at their institutions.

Based on input from a group of librarians back in February, we determined that a webinar series would be useful for introducing the tool, communicating how to use it effectively, and describing how it can be customized for institutional needs. We plan to present roughly 15 webinars, held on Tuesdays, with the series running into Fall 2013.

A few things to note:

  • All webinars will be recorded and made available for viewing afterward.
  • The webinar schedule might change a bit depending on presenters’ availability.
  • We are always interested in new webinar ideas; please send them to carly.strasser@ucop.edu or leave them as a comment below.
  • We plan to collect these webinars and make them available as a set. We then hope to create a short course in Data Management with the DMPTool that will offer certification for librarians as “DMPTool Experts” (we are still working on the title!).

Webinar Schedule

Note: for the most up-to-date schedule & links, visit the DMPTool Webinar Series Page or view the Google Calendar.

  • 28 May: Introduction to the DMPTool (details & registration)
  • 4 Jun: Learning about data management: Resources, tools, materials (details & registration)
  • 18 Jun: Customizing the DMPTool for your institution (details & registration)
  • 25 Jun: Environmental Scan: Who’s important at your campus (details & registration)
  • 9 Jul: Promoting institutional services with the DMPTool; EZID as example (details & registration)
  • 16 Jul: Health Sciences & DMPTool – Lisa Federer, UCLA (details & registration)
  • 23 Jul: Digital humanities and the DMPTool – Miriam Posner, UCLA (details & registration)
  • 13 Aug: Data curation profiles and the DMPTool – Jake Carlson, Purdue (details soon)
  • TBD: How to give the data management sales pitch to various audiences
  • TBD: Other tools and resources that work with/complement the DMPTool
  • TBD: Beyond funder requirements: more extensive DMPs
  • TBD: Case studies 1 – How librarians have successfully used the tool
  • TBD: Case studies 2 – How librarians have successfully used the tool
  • TBD: Outreach Kit introduction
  • TBD: Certification program introduction

Good DMP Examples + Going Beyond Two Pages

Did you know that data management plans existed before the NSF started requiring them? I know, it’s shocking. But they have inherent value, despite being relatively unknown to researchers until now. A proper, thorough data management plan (DMP) is potentially a major time saver and a huge asset for a project. Funders tend to have minimal requirements for DMPs (e.g., a mere two pages allowed for an NSF proposal), and as a result researchers tend to underestimate the importance of the document. I’ve spoken to many researchers who wait until the last minute to start creating their DMP; as a result, their plans reflect their lack of knowledge about data stewardship, and they are not properly prepared when their project starts generating data.

Here are a few ways to ensure you create a high-quality, thorough DMP:

You take advantage of experts. Librarians should be partners with the researcher in creating their data management plans. Librarians are information professionals, and their business is essentially figuring out how to manage and preserve information (i.e., data). Consult them regularly when creating a plan: even if they don’t fully understand your data, they know how to find good standards, appropriate repositories, and who to talk to on campus.

You take advantage of institutional resources, such as departmental servers, backup services, and IT professionals. Often researchers are unaware of the hardware and software available from their institutions; often the institutional services and resources are available at no or low cost.

You think carefully about your data, including considering file formats, common vocabularies, codes and metadata needed, and standards that will be used for metadata. This should be done as thoroughly as possible before any data are collected to prevent the need to go back and edit your datasets (i.e., the dreaded “find/replace” tasks).
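One lightweight way to do this up-front thinking is to write the data dictionary down in a form a computer can check. In the sketch below, the column names, units, and vocabulary terms are invented for illustration; the point is only that each row can be validated against the agreed dictionary before it ever enters your dataset:

```python
# A hypothetical data dictionary, agreed on before any data are collected.
# The column names, units, and vocabulary terms are made up for illustration.
DATA_DICTIONARY = {
    "site":    {"type": str,   "vocabulary": {"north_reef", "south_reef"}},
    "temp_c":  {"type": float, "units": "degrees Celsius"},
    "species": {"type": str,   "vocabulary": {"M. pyrifera", "S. purpuratus"}},
}

def validate_row(row: dict) -> list:
    """Return a list of problems with one data row (an empty list means clean)."""
    problems = []
    for column, spec in DATA_DICTIONARY.items():
        if column not in row:
            problems.append(f"missing column: {column}")
            continue
        try:
            value = spec["type"](row[column])
        except ValueError:
            problems.append(f"{column}: {row[column]!r} is not a {spec['type'].__name__}")
            continue
        if "vocabulary" in spec and value not in spec["vocabulary"]:
            problems.append(f"{column}: {value!r} not in the controlled vocabulary")
    return problems

print(validate_row({"site": "north_reef", "temp_c": "14.2", "species": "M. pyrifera"}))  # []
print(validate_row({"site": "East Reef", "temp_c": "warm", "species": "M. pyrifera"}))
```

Catching a stray vocabulary term or a unit mix-up at entry time is far cheaper than the dreaded find/replace pass later.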

You think carefully about your workflow and sketch out the plan for data processing and analysis.  Workflows can be very informal, consisting of a simple flow chart (read my blog post about this). By considering the iterations of the data before you start collecting, you are more likely to arrange your files, datasets, and collection procedures in a logical way.
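The flow chart can even live as a short script. Everything in this sketch is hypothetical (the step functions are toy stand-ins), but it shows how writing the workflow down as an ordered list of steps makes it explicit, repeatable, and easy to share with collaborators:

```python
# A toy, hypothetical workflow: each processing step is a named function,
# and the "flow chart" is simply the order of the WORKFLOW list.
def acquire(records):
    return records + [{"site": "A", "value": 42}, {"site": "B", "value": None}]

def clean(records):
    return [r for r in records if r["value"] is not None]

def summarize(records):
    return {"n_records": len(records)}

WORKFLOW = [acquire, clean, summarize]

def run(records):
    for step in WORKFLOW:
        print(f"running step: {step.__name__}")
        records = step(records)
    return records

print(run([]))  # {'n_records': 1}
```

When real steps replace the toy ones, this file itself becomes the informal flow chart: version-controllable and re-runnable end to end.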

You know exactly where your data will be stored, both during the project and after the project is completed.


Be a good manager of your data. Need a good example manager? Michael Scott of The Office. From NBC, M. Haaseth

Perhaps most importantly, consider this:  a data management plan should be created early on and should be revisited throughout the project.  Add a reminder to your calendar – every six months, re-read your plan. Make sure new members of your lab group have read the plan and understand it. Make changes based on new developments in the project, and ensure that the work of archiving the data is not pushed entirely to the end of the project.

What about examples? There are lots of examples out there for two-page NSF DMPs:

But be sure to check out more extensive examples and resources too:

Note that future development of the DMPTool will include a “DMP Library”, full of example DMPs where researchers can access others’ plans and share their DMPs. Now go forth and plan!


Sustaining Data

Last week, folks from DataONE gathered in Berkeley to discuss sustainability (new to DataONE? Read my post about it). Of course, lots of people are talking about sustainability in Berkeley, but this discussion focused on sustaining scientific data and its support systems. The truth is, no one wants to pay for data sustainability. Purchasing servers and software and paying for IT personnel are not cheap, and given our current grim financial times, room in the budget is unlikely to be found. So who should pay? Let’s first think about the different groups that might pay.

  1. Private foundations
  2. Public agencies (e.g., NSF, NIH)
  3. Institutions
  4. Professional societies and organizations
  5. Researchers

Although the NSF provides funds for organizations like DataONE to develop, it is not interested in funding “sustainability”. The NSF is in the business of funding research, which means that come 2019, when NSF funding ends for DataONE, someone else is going to have to pick up the tab.

Any researcher (including myself) will tell you that the thought of paying for data archiving and personnel is not appealing.  Budgets are already tight in proposals (which have record low acceptance rates); combine that with the lack of clarity about data management and archiving costs, and researchers are not eager to take on sustainability.

Many researchers see data sustainability as the domain of their institutions: providing data management and archiving services in bulk to their faculty would allow institutions to both regulate how their researchers handle their data, and remove the guesswork and confusion for the researchers themselves.  However with budget crises plaguing higher education due to rising costs and decreasing revenue, this is not a cost that institutions are likely to take on in the near future.

Obviously I was going to reference Pink Floyd for this post on money… From Wikipedia.

Lack of funds for critical data infrastructure is a systemic problem, and DataUp is no exception. Although we have funds to promote DataUp and publish our findings in the course of the project, we do not have funds to continue development. There is also the question of storage for datasets. Storage is not free, and we have not yet solved the problem of who will pay, in the long term, for storing data ingested into the ONEShare repository via DataUp.

Now that I’ve completed this post, it seems rather bleak. I am confident, however, that we have the right people working on the problem of data sustainability. It is certainly a critical piece in the current landscape of digital data.

Love Pink Floyd AND The Flaming Lips? Check out the FL cover album for Dark Side of the Moon, including a spectacular version of “Money”.


Data Questions: Who Can Help?

Discussions about data seem to be everywhere.  For evidence of this, look at recent discussions of big data, calls for increasing cyber-infrastructure for data, data management requirements by funders, and data sharing requirements by journals.  Given all of this discussion, researchers are (or should be) considering how to handle their own data for both the long term and the short term.

Admit it: You can’t help yourself. You need the expertise of others! The Four Tops knew it. Image from http://www.last.fm (click for more). Check out this live performance of “I can’t help myself”: http://www.youtube.com/watch?v=qXavZYeXEc0

The popularity of discussions about data is good and bad for the average researcher.  

Let’s start with the bad first: it means researchers are now, more than ever, responsible for being good data stewards (before commenting that this “isn’t a bad thing!!”, read on). Gone are the days when you could manage your data in-house, with no worries that others might notice your terrible file naming schemes or scoff at the color coding system in your spreadsheets. With increasing requirements for managing and sharing data, researchers should construct their datasets and perform their analyses knowing that they will eventually have to share those files. This means that researchers need to learn a bit about best practices for data management and invest some time in creating data management plans that go beyond funder requirements alone (which are NOT adequate for actually managing your data properly – see next week’s blog post for more).

Arguably, the “bad” I mention above is not actually bad at all. Speaking from the point of view of a researcher, however, anything that places more demands on your time can be taxing. Moving on to the good: all of this attention being given to data stewardship means that there are lots of places to go for help and guidance. You aren’t in this alone, researchers. In previous posts I’ve written about the stubbornness of scientists and our inherent inability to believe that someone might be able to help us. In the case of data management and related topics, it will pay off in the long run to put aside your ego and ask for help. Who? Here are a few ideas:

  1. Librarians.  I’ve blogged about how great and under-used academic libraries and librarians tend to be, but it is worth mentioning again.  Librarians are very knowledgeable about information.  Yes, your information is special. No, no one can possibly understand how great/complex/important/nuanced your data set is.  But I promise you will learn something if you go hang out with a librarian.  Since my entry into the libraries community, I have found that librarians are great listeners.  They will actively listen while you babble on endlessly about your awesome data and project, and then provide you with insight that only someone from the outside can provide.  Bonus: many librarians are active in the digital data landscape, and therefore are likely to be able to guide you towards helpful resources for scientific data management.
  2. Data Centers/repositories.  If you have never submitted data to a data center for archiving, you will soon.  Calls for sharing data publicly will only get louder in the next few years, from funders, journals, and institutions interested in maximizing their investment and increasing credibility.  Although you might be just hearing of data centers’ existence, they have been around for a long time and have been thinking about how to organize and manage data.  How to pick a data center? A wonderful searchable database of repositories is available at www.databib.org. Once you zero in on a data center that’s appropriate for your particular data set, contact them.  They will have advice on all kinds of useful stuff, including metadata, file formats, and getting persistent identifiers for your data.
  3. Publishers and Funders.  Although they wouldn’t be my first resource for topics related to data, many publishers and funders are increasingly providing guidance, help text, and links to resources that might help you in your quest for improved data stewardship.

My final takeaway is this: researchers, you aren’t in this alone. There is lots of support available for those humble enough to accept it.


NSF Panel Review of Data Management Plans

With the clarity of the New Year, I realized I broke a promise to you DCXL readers… in my post on data policies, I stated that my next post would be about the current state of data management plan evaluation on NSF panels.  Although it is a bit late, here’s that post.

My information is from a couple of different sources: a program officer or two at NSF, a few scientists who have served on panels for several different directorates, and some miscellaneous experts in data management plans.  In general, they all said about the same thing: we are in early days for data management plans as an NSF requirement, and the process is still evolving.  With that in mind, here are a few more specific pieces of information I gathered (note, these should be taken with a grain of salt since this is not the official position of NSF):


Just like Zach Morris' cell phone, data management plans are sure to evolve into something much fancier in a few years. From zackmorriscellphone.wordpress.com

  1. The NSF program officer who leads the panel sets the tone for DMP evaluation.  Scientists who serve on the proposal review panels generally are not experts in data management or archiving, and therefore are unsure what to look for in DMPs.
  2. The contents of a data management plan will not tank a proposal unless the plan is completely absent. Since no one is quite sure what should be in these DMPs, it’s tough to eliminate a good proposal on the basis of its DMP. Overall, DMPs are not currently a part of the merit review process.  One person said it very succinctly:

    PIs received a slap on the wrist if they had a good proposal with a bad DMP. If it was a bad proposal, the bad DMP was just another nail in the coffin.

  3. The panelists are merely trying to determine whether a DMP is “adequate”.  What does this mean? It generally boils down to two criteria: (1) Is the DMP present? and (2) Does the PI discuss how they will archive the data?  Even (2) is up for debate, since proposals have made it to the top despite no clear plans for archiving, e.g. no mention of where the data will be stored.
  4. Finally, there is buzz about some knowledgeable PIs using DMPs as a strategic tool.  Rather than considering this two-page requirement a burden, they use the DMP as part of their proposal’s narrative.  Food for thought.