9 thoughts on “Wanted: Better Tools and Websites for Data Management Help”

  1. Carl Boettiger (@cboettig) says:

    You mean you don’t find it helpful to be linked to a standard that has an intuitive name like ISO 19115? After all, if you want to know more about that standard you can simply buy the pdf for a mere CHF 224,00. Google tells me that’s about 23,731.23 US dollars. Surely a bargain for something described as “Geographic information — Metadata”. (http://www.iso.org/iso/catalogue_detail.htm?csnumber=26020)

  2. Gail Steinhart says:

    I agree with what you say, and I think the problem is even more complicated. Or maybe… it isn’t? My point is that you don’t usually have the luxury of picking the standard that seems to fit best (right discipline! fabulous tools!). Aside from creating your own personal metadata library, you pretty much have to use whatever is required by the repository where you’re putting your data. If that’s KNB, you use EML. If it’s a GIS data repository, it’s ISO-19115 (or FGDC-CSDGM). You could (and I’ve done this) create a metadata record according to any standard you like and deposit it as a supplementary file if the repository you choose supports that, but you still need to create metadata according to whatever the repository’s requirements are. So, perhaps the *choices* are not so overwhelming after all? The tools issue, I’ll concede, is harder for many standards if the repositories themselves don’t have reasonable tools or interfaces.

    • Carly Strasser says:

      I totally agree that the easiest way to figure out what metadata to use is to ask your favorite repo. However, often those that are new to data management don’t know about that step… Ideally they would be in touch with an expert librarian like yourself. So perhaps it’s back in the outreach camp?

      • Great conversation starter, Carly. I agree with Gail’s point, as you do, but I’ve noticed that in many (too many) cases, the “sharing” strategy of investigators is either posting data to their personal web page or stating that they will make data available by request. Sadly, both of these are perfectly acceptable options to proposal reviewers reading over a data management plan. In these cases, there are no guiding principles for the investigator in terms of selecting a metadata schema, and they probably aren’t likely to share their data anyway (regardless of the “requirement” from NSF).

        I think the choices are still somewhat overwhelming to investigators, but the actual process of creating metadata (in XML, god forbid!) is a far greater hurdle. We need more consensus within disciplinary communities on a standard schema within that group, and we certainly need better tools to help them create metadata that is useful. The reality of the situation is that creating metadata is a huge amount of work, and it’s not a priority for many PIs. We need to create and market tools that make the process easier to have any chance of a widespread commitment within the scientific community to share their data in a productive way.

    • Matt Jones says:

      Great point Gail, and I can see where you are coming from. But for accuracy’s sake, let me say that many repositories support multiple metadata standards, and some support arbitrary metadata standards. The KNB, for example, supports arbitrary metadata standards (anything expressed in XML), and can easily be used to house EML, FGDC, ISO19115, Dublin Core, etc. That’s how we house such diverse metadata as EML and Kepler’s MoML workflow specifications in the same repository.

      • Gail Steinhart says:

        Two thumbs up for KNB’s standards-agnostic infrastructure. I think, though, that Carly’s point (or at least mine) is that there is still a great gulf between what is *possible* and what is actually *easy* for researchers to do themselves.

      • Carly Strasser says:

        Gail – you hit the nail on the head. This was the disconnect I wrote about in a previous blog post (referenced in this one). Although tools exist, and there is flexibility and help available, there is a complete lack of effective communication among the different groups. Researchers need easy, which requires good communication.

  3. Sarah Jones says:

    Great post, Carly.

    I completely agree that more mediation is needed. Collating resources and doing some basic filtering is only one initial step. Hopefully researchers will be able to draw on local support and experts from data repositories to help them to understand and navigate the dark and stormy waters of metadata further.

    There’s definitely a need for more outreach so people know where they can turn for support.

    Sarah, DCC

  4. Limor Peer says:

    Great topic. I just wanted to second Sarah’s point here about local support. A model that allows for an “embedded” data person in the research environment would be tremendously helpful in many cases. Researchers can’t (really) be expected to do all this on their own, and the library or repository experts who can help them are not always on their radar.
    ISPS, Yale University
