Category Archives: DataUp add-in & web application

DataUp is Merging with Dash!

Exciting news! We are merging the DataUp tool with our new data sharing platform, Dash.

About Dash

Dash is a University of California project to create a platform that allows researchers to easily describe, deposit and share their research data publicly. Currently the Dash platform is connected to the UC3 Merritt Digital Repository; however, we have plans to make the platform compatible with other repositories using protocols such as SWORD and OAI-PMH. The Dash project is open-source and we encourage community discussion and contribution to our GitHub site.
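
For readers unfamiliar with OAI-PMH, here is a minimal sketch of what a harvest request looks like. It only builds the request URL; the `ListRecords` verb and the `oai_dc` metadata prefix come from the OAI-PMH specification, while the base URL, the set name, and the function name are hypothetical placeholders rather than actual Dash or Merritt endpoints.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    # 'verb' and 'metadataPrefix' are required arguments for ListRecords.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # 'set' is optional and limits the harvest to one collection.
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only for illustration.
print(build_list_records_url("https://repository.example.org/oai"))
```

A harvester would fetch this URL and page through the XML responses; that part is left out here to keep the sketch short.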

About the Merge

There is significant overlap in functionality between Dash and DataUp (see below), so we will merge these two projects to enable better support for our users. This merge is funded by an NSF grant (available on eScholarship) supplemental to the DataONE project.

The new service will be an instance of our Dash platform (to be available in late September), connected to the DataONE repository ONEShare. Previously the only way to deposit datasets into ONEShare was via the DataUp interface, thereby limiting deposits to spreadsheets. With the Dash platform, this restriction is removed and any dataset type can be deposited. Users will be able to log in with their Google ID (other options being explored). There are no restrictions on who can use the service, and therefore no restrictions on who can deposit datasets into ONEShare, and the service will remain free. The ONEShare repository will continue to be supported by the University of New Mexico in partnership with CDL/UC3. 

The NSF grant will continue to fund a developer to work with the UC3 team on implementing the DataONE-Dash service, including enabling login via Google and other identity providers, ensuring that metadata produced by Dash will meet the conditions of harvest by DataONE, and exploring the potential for implementing spreadsheet-specific functionality that existed in DataUp (e.g., the best practices check). 

Benefits of the Merge

  • We will be leveraging work that UC3 has already completed on Dash, which has fully-implemented functionality similar to DataUp (upload, describe, get identifier, and share data).
  • ONEShare will continue to exist and be a repository for long tail/orphan datasets.
  • Because Dash is an existing UC3 service, the project will move much more quickly than if we were to start from “scratch” on a new version of DataUp in a language that we can support.
  • Datasets will get DataCite digital object identifiers (DOIs) via EZID.
  • All data deposited via Dash into ONEShare will be discoverable via DataONE.

FAQ about the change

What will happen to DataUp as it currently exists?

The current version of DataUp will continue to exist until November 1, 2014, at which point we will discontinue the service and the dataup.org website will be redirected to the new service. The DataUp codebase will still be available via the project’s GitHub repository.

Why are you no longer supporting the current DataUp tool?

We have limited resources and can’t properly support DataUp as a service due to a lack of local experience with the C#/.NET framework and the Windows Azure platform. Although DataUp and Dash were originally started as independent projects, over time their functionality converged significantly. It is more efficient to continue forward with a single platform, and we chose to use Dash as a more sustainable basis for this consolidated service. Dash is implemented in the Ruby on Rails framework that is used extensively by other CDL/UC3 service offerings.

What happens to data already submitted to ONEShare via DataUp?

All datasets now in ONEShare will be automatically available in the new Dash discovery environment alongside all newly contributed data.  All datasets also continue to be accessible directly via the Merritt interface at https://merritt.cdlib.org/m/oneshare_dataup.

Will the same functionality exist in Dash as in DataUp?

Users will be able to describe their datasets, get an identifier and citation for them, and share them publicly using the Dash tool. The initial implementation of DataONE-Dash will not have capabilities for parsing spreadsheets and reporting on best practices compliance. Also, the user will not be able to describe column-level (i.e., attribute) metadata via the web interface. Our intention, however, is to develop these functions and other enhancements in the future. Stay tuned!

Still want help specifically with spreadsheets?

  • We have pulled together some best practices resources: Spreadsheet Help 
  • Check out the Morpho Tool from the KNB – free, open-source data management software you can download to create/edit/share spreadsheet metadata (both file- and column-level). Bonus – The KNB is part of the DataONE Network.

 

It’s the dawn of a new day for DataUp! From Flickr by David Yu.

DataUp-Date

It’s been over a year since the DataUp tool went live, and we figure it’s time for an update. I’m co-writing this blog post with Susan Borda from UC Merced, who joined the UC3 DataUp project a few months ago.

DataUp Version 1

We went live with the DataUp tool in November 2012. Since then, more than 600 people have downloaded the add-in for Excel, and countless others have accessed the web application. We have had more than 50 submissions of datasets to the ONEShare Repository via DataUp, and many more inquiries about using the free repository. Although the DataUp tool was considered a success by many measures, we recognized that it had even more potential for improvement and expanded features (see our list of suggested improvements and fixes on BitBucket).

"Going Up". From Flickr by vsai

Unfortunately, development on DataUp stopped once we went live. The typical reasons apply here: lack of staff and resources to devote to the project. We therefore partnered with DataONE and requested funds from the National Science Foundation to continue work on the tool (full text of the grant available on eScholarship). Shortly after learning that we had been awarded the grant, the UC3 team met with Microsoft Research, our original partners on DataUp. We discovered that our interests were still aligned, and that Microsoft had been using in-house resources to continue work on DataUp as an internal project titled “Sequim”. Rather than work in parallel, we decided to join forces and work on DataUp Version 2 (see more below).

In the interim, we published our work on DataUp Version 1 at F1000Research, an open access journal that focuses on rapid dissemination of results and open peer review. In this publication, we describe the project background, requirements gathering including researcher surveys, and a description of the tool’s implementation.

DataUp Version 2

The NSF grant allowed us to hire Susan Borda, a librarian at UC Merced with a background in IT and knowledge of the DataUp project. She has been serving as the project manager for DataUp Version 2, and has liaised with Microsoft Research on the project. Susan will take over from here to describe what’s on the horizon for DataUp.

The new version of DataUp will be available after February 24th, 2014. This version will have a new, clean web interface with functionality for both users and administrators. A DataUp administrator (i.e., repository manager) will be able to define the file-level metadata that will be captured from the user upon data deposit. In addition, an administrator will be able to activate the “Data Quality Check”, which allows the DataUp tool to verify whether a user’s uploaded file meets certain requirements for their repository. The “Best Practices” and file “Citation” features from DataUp version 1 are still available in version 2.
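
To give a flavor of what a repository-defined check on an uploaded file could look like, here is a small illustrative sketch. It is not the DataUp implementation: the function name, the two rules (required columns present, no empty cells), and the sample data are all assumptions made up for this example.

```python
import csv
import io


def check_spreadsheet(csv_text, required_columns):
    """Return a list of human-readable problems found in CSV text.

    Hypothetical rules: every required column must be present,
    and no cell in any data row may be empty.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []

    # Rule 1: the repository's required columns must appear in the header.
    for name in required_columns:
        if name not in header:
            problems.append(f"missing required column: {name}")

    # Rule 2: flag empty cells (row 1 is the header, so data starts at row 2).
    for row_number, row in enumerate(reader, start=2):
        for name in header:
            value = row.get(name)
            if value is None or value.strip() == "":
                problems.append(f"row {row_number}: empty cell in column '{name}'")

    return problems


# Made-up sample data for illustration.
sample = "site,date,temp_c\nA,2014-01-01,3.5\nB,,4.0\n"
for problem in check_spreadsheet(sample, ["site", "date", "depth_m"]):
    print(problem)
```

In a real deployment the rule set would come from the administrator's configuration rather than being hard-coded as it is here.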

Note that we will be phasing out DataUp version 1 over the next few weeks, which means the add-in for Excel will no longer be operational.

Dying to see the new tool?

Microsoft Research will be at the International Digital Curation Conference (#IDCC14) in San Francisco at the end of February, demoing and discussing their suite of research tools, including DataUp. Susan will also be at IDCC, demoing DataUp version 2 more informally during the poster session with the goal of getting feedback from delegates.

All Things Data in Amsterdam

The International Digital Curation Conference is wrapping up today, and I feel like I just finished a big, tasty Thanksgiving dinner: full and slightly uncomfortable, but in the brain rather than the gut.  IDCC is a meeting that draws about 300 individuals from all over the world. Participants include librarians, repository administrators, publishers, funders, information technology folks, and people working at all manner of data and archiving organizations. Get these people in the same room, and the result is interesting talks, an amazing twitter backchannel, and novel ideas for collaboration. This was my first IDCC conference, and I was not disappointed.

Pre-workshops started on Monday, and I participated in a data management tools update (Data Management Planning: what’s happened, what’s happening and what’s coming next?), organized primarily by Martin Donnelly of the Digital Curation Centre in the UK. It was interesting to hear about the future of the DMPTool and DMPOnline, as well as an overview of current data policies in the UK, Europe, Australia, and the US. Martin and I are arranging a similar workshop for the iConference, held next month in Fort Worth TX.

On Tuesday, I was inundated with really great talks and conversations. The keynote by Ewan Birney of the European Bioinformatics Institute, on bioinformatics infrastructure in Europe, was chock full of great examples of how data sharing can benefit research. There was also a talk by Kaitlin Thaney from Digital Science, who discussed the many projects they are funding, including Figshare and Altmetric. These two talks highlighted the many approaches people are taking to tackle digital data: we need both infrastructure and tools, as well as incentives and changes in the culture of research data.

Fun fact: Eddie and Alex Van Halen are Dutch! Photo from bumslogic.wordpress.com

Tuesday afternoon was devoted to a poster session where I schmoozed with folks over the DataUp poster. The DataUp team (Trisha Cruse, John Kunze, and myself) won 2nd place for best poster; first place went to the Right Field project (Right Field: Spreadsheet Annotation by Stealth), which was especially interesting given how closely aligned this project is with DataUp. Wednesday was more talks, meetings, and discussions. I’m excited about the post-conference workshop today on data publication. I’m guessing I will be inspired by this workshop and my next blog post will be about all things data publication.

Hungry for some Dutch music trivia? Wikipedia has a great list of songs about Amsterdam… including one by Van Halen.

DataUp is Live!

We are celebrating. From Boston Public Library via Flickr.

That’s right: DataUp is LIVE! I’m so excited I needed to type it twice.  So what does “DataUp is Live!” mean? Several things:

  • The DataUp website (dataup.cdlib.org) is up and running, and is chock full of information about the project, how to participate, and how to get the tool (in either web app or add-in form).
  • The DataUp web application is up and running (www.dataup.org). Anyone with internet access can start creating high-quality, archive-ready data! Would you rather use the tool within Excel? Download the add-in instead (available via the main site).
  • The DataUp code is available. DataUp is an open source project, and we strongly encourage community members to participate in the tool’s continued improvement. Check out the code on BitBucket.
  • The special repository for housing DataUp data, ONEShare, is up and running. This new repository is a special instance of the CDL’s Merritt Repository, and is connected to the DataONE project. ONEShare is the result of collaborations between CDL, University of New Mexico, and DataONE.  Read more in my blog post about ONEShare.
  • Please note that the current version of DataUp is Beta: this means it’s a work in progress. We apologize for any hiccups you may encounter; in particular, there is a known issue that currently prevents spreadsheets archived via DataUp from appearing in DataONE searches.

Today also marks the integration of the old DCXL/DataUp blog with the Data Pub Blog. You probably noticed that they are combined since the banner at the top says “Data Pub”. I will be posting here from now on, rather than at dataup.cdlib.org. The DataUp URL now holds the DataUp main website. Read more about these changes in my blog post about them. The Data Pub Blog is intended to hold “Conversations About Data”. That means we will run the gamut of potential topics, including (but not limited to) data publication, data sharing, open data, metadata, and digital archiving. There are likely to be posts from others at CDL from time to time, which means you will have access to more than just my myopic views on all things data.

The DataUp project’s core team included yours truly, Patricia Cruse (UC3 Director), John Kunze (UC3 Associate Director), and Stephen Abrams (UC3 Associate Director). Of course, no project at CDL is an island. We had SO MUCH help from the great folks here:

  • DataUp Website: Eric Satzman, Abhishek Salve, Robin Davis-White, Rob Valentine, Felicia Poe
  • DataUp Communications: Ellen Meltzer (DataUp Press Release PDF)
  • DataUp development: Mark Reyes, David Loy, Scott Fisher, Marisa Strong
  • Machine configuration: Joseph Somontan
  • Administrative support: Beaumont Yung, Rondy Epting-Day, Stephanie Lew

Thanks to all of you!

Counting Down Plus DataUp Webinar

Next week: The CDL DataUp team will be performing “Celebration” at a karaoke bar (undisclosed location).

We are nearing the (revised) launch date for DataUp: on Tuesday 2 October, one week from today, we plan to officially release the tool. This includes the DataUp website, the code, and the ability to download the add-in. Of course, you never know what the next week will bring. We aren’t promising these will be delivered on Tuesday, but we will do our very best!

Last week at the annual DataONE All Hands Meeting, I presented a demonstration of DataUp and showcased its capabilities for assisting in good data stewardship practices.  DataUp was met with much excitement, especially from the Citizen Science Working Group (technically called the PPSR group, which stands for Public Participation in Scientific Research). The PPSR folks were very excited about shaping DataUp to be something that will help their data contributors to submit high quality, well-documented data. This is one of the many extensions for which DataUp is ripe; others include its integration with repositories other than ONEShare.

If you would like a guided introduction and walk-through of the tool, mark your calendar for the DataUp webinar, scheduled for Wednesday 3 October.  You need to pre-register for the webinar to receive the connection information.  If you can’t make the webinar, don’t fret: we will record it and make it available afterward on the UC3 webinar page.

Have Patience

From Flickr by London Permaculture

Like all good projects, DataUp hit a few snags near the finish line. As a result, the DataUp launch will not take place today, as described in last week’s post. We have rescheduled the launch for two weeks from today. Stay tuned!

Did you notice? We tidied up.

If you didn’t notice, check out the URL above for this post: unbeknownst to you, you have been rerouted from DataUp to Data Pub. If you are still reeling from our first change (DCXL to DataUp), we apologize. Keep in mind, however, that change is good. Turn and face the strain.

The newest move is a harbinger of many changes that are coming up in the next eight days: on September 18, we will be releasing the DataUp tool! In preparation for this release, a little housekeeping needed to be done:

It’s time for DataUp housekeeping! From Flickr by clotho98

First, we created a lovely new website for DataUp (hat tip to the crackerjack team of user experience design folks here at the California Digital Library). The new website will have all of the bells and whistles needed to fully enjoy DataUp: links to the add-in, the web application, user guides and documentation, and the code, to name a few. Where should this website live? At dataup.cdlib.org, of course! But this requires a bit of musical chairs. So…

We are moving the DataUp blog (formerly the DCXL blog) to the Data Pub URL (datapub.cdlib.org). The CDL already has a blog residing at this URL; however, it is in dire need of sustenance. And let’s face it: although they are all data-related, many of the blog posts you’ve read here are not specific to the DataUp project. So as of now, Data Pub will be the official blog for all things data-related at CDL, but not exclusively related to DataUp. It will be written by yours truly (with the occasional guest post), so if you are hungry for more blog content with tenuous links to music and pop culture, then re-bookmark now.

On Tuesday next week, check out the new dataup.cdlib.org website. Stay tuned for the announcement blog post, found here on Data Pub! This URL/website will be re-branded Data Pub on Tuesday next week.

Progress & Plans for DataUp Release

Unlike Mick Jagger, our beta testers are satisfied. From wikipedia.org

It was one year ago today that I moved up to the Bay Area to work on DataUp (then DCXL) in earnest.  It seems fitting that this milestone be marked by some significant progress on the project.  No, we haven’t released DataUp to the public yet, but we have a release date slated for this September.  This is very exciting news, especially since the project got off to a bit of a slow  start.  We have been cooking with gas since March, however, and the DataUp tool promises to do much of what I had envisioned on my drive from Santa Barbara last year.

If you are wondering what DataUp looks like, you will need to be patient.  You can, however, see some preliminary responses from our very gracious beta testers.  The good news is this: most folks seem pretty happy with the tool as-is, and many offered some really great feedback that will improve the tool as we move into the community involvement phase of the development effort.

We asked 21 beta testers what they thought of DataUp features, and here are the results:

We expect that the DataUp tool will only improve from here on out, so stay tuned for our big debut in less than two months!

DataUp Demo at #DUG2012

Going up? From Flickr by tarotastic

This past Sunday and Monday, DataONE had their annual Users Group Meeting (DUG) in Madison, WI. The meeting is a chance for librarians, information specialists, data center managers, developers, and other interested folks to get an update on DataONE and provide feedback on how the project is proceeding.  One of the many reports given covered the Investigator Toolkit that DataONE is developing alongside the cyber-infrastructure. The DataUp tools will be part of this toolkit, and I attended DUG to demonstrate the add-in.

I’m happy to report that DataUp was a resounding success. More importantly, there is much interest in extending the current functionality and capabilities of the tool, which is possible because the DataUp project’s code will be open source. There was talk of hackathons, hiring summer student interns to improve on the code, and whether it was worthwhile to extend the add-in to Mac versions of Excel. Discussions were lively and interesting, and ended only with the announcement that cookies were available at the break.

I’ve posted the slides for my DataUp presentation, but you will have to wait for the software a bit longer.  The beta version for testing is due out this week, and the final version of the software will be completed later this month.  Stay tuned!
