Just in time for the holidays (and the AGU Fall 2011 Meeting): a list of the overarching requirements we are working on for the Excel add-in project! After talking with about 150 scientists about their Excel use and their data management and archiving practices, we think we have narrowed the giant list of potential Excel improvements down to a manageable one for the project.
- Generate metadata. Over and over, scientists said they didn’t have good data documentation. Wouldn’t it be great if Excel helped you create it? We think so. Data centers require a certain amount of metadata before you can archive your data with them; the process would be much easier if you had a way to generate metadata in the correct format and structure for the archive you designate.
- Generate a data citation for the data file. Data citation is the fastest way to encourage good data stewardship since it gives scientists incentives to publish and share their data. For more information on data citation, read my post on the subject.
- Check the spreadsheet for export compatibility. Most scientists don’t use Excel alone. Instead, Excel is a stepping stone to other programs: it is used to organize the data and perform basic quality control, but the data are then promptly copied and pasted or exported to another program like R, MATLAB, ArcGIS, or SAS. The spreadsheet format needed to import your data into other programs is similar to what’s needed to submit your data to an archive. That means if you eliminate problems that would cause statistical programs to choke (see my post on problematic features in Excel), your data are one step closer to being archived quickly and easily.
- Link to archive services. We want you to be able to archive your data with the click of a button. You will need to have a relationship with the archive you plan to submit to, but hopefully that’s already established. Don’t worry: you can specify your usage restrictions and access policies depending on the archive you choose. This requirement is where DataONE comes in. We hope to make the connection between Excel and DataONE as seamless as possible.
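To make the export-compatibility requirement a little more concrete, here is a rough Python sketch of the kind of checks we have in mind. It is not the add-in's actual code: the function name `check_export_compatibility`, the helper `_is_number`, and the particular checks (missing or duplicate headers, blank rows, ragged rows, columns that mix numbers and text) are all assumptions made for illustration. It assumes the sheet has already been read into a list of rows, with the header in the first row.

```python
from collections import Counter


def _is_number(value):
    """Hypothetical helper: True if the cell can be parsed as a float."""
    try:
        float(str(value))
        return True
    except ValueError:
        return False


def _is_blank(value):
    """Hypothetical helper: True for None or whitespace-only cells."""
    return value is None or str(value).strip() == ""


def check_export_compatibility(rows):
    """Return a list of human-readable problems found in a table.

    `rows` is a list of lists; rows[0] is assumed to be the header row.
    The checks are illustrative, not an exhaustive or official list.
    """
    if not rows:
        return ["table is empty"]

    problems = []
    header = ["" if _is_blank(cell) else str(cell).strip() for cell in rows[0]]

    # Every column needs a name, and names should be unique.
    for i, name in enumerate(header):
        if not name:
            problems.append(f"column {i + 1} has no header")
    for name, count in Counter(n for n in header if n).items():
        if count > 1:
            problems.append(f"header '{name}' appears {count} times")

    # Blank rows and rows of the wrong width tend to break importers.
    width = len(header)
    for r, row in enumerate(rows[1:], start=2):
        if all(_is_blank(cell) for cell in row):
            problems.append(f"row {r} is blank")
        elif len(row) != width:
            problems.append(f"row {r} has {len(row)} cells, expected {width}")

    # A column that mixes numbers and text is often read as text downstream.
    for i, name in enumerate(header):
        kinds = set()
        for row in rows[1:]:
            if i < len(row) and not _is_blank(row[i]):
                kinds.add("number" if _is_number(row[i]) else "text")
        if len(kinds) > 1:
            problems.append(f"column {i + 1} ('{name}') mixes numbers and text")

    return problems
```

For example, calling it on a small sheet with a repeated `temp` header, an unnamed fourth column, an empty third row, and an `n/a` typed into a numeric column would report each of those as a separate problem, while a tidy two-column table returns an empty list.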
What do you think? Ideas? Concerns? More details will come, but suggestions are always welcome. Just shoot me an email.