Step-by-step guide for curating and sharing data
The typical process for curating and sharing a dataset according to the SPARC guidelines consists of:
- Organizing your data according to the SPARC Data Structure (SDS)
- Adding metadata files
- Uploading everything to the Pennsieve data platform, where more metadata needs to be added
- Sharing the dataset with the SPARC Curation Team, who will review it for compliance and help with subsequent steps until your dataset becomes publicly accessible through the SPARC Data Portal
We describe below the suggested steps for implementing this process with SODA. The process differs slightly depending on whether you are sharing a dataset associated with a SPARC-funded study (i.e., a SPARC dataset) or not (i.e., a non-SPARC dataset).
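SODA takes care of the folder organization for you, so no scripting is required. For readers who want to picture what the first step of the process produces on disk, here is a minimal Python sketch. It assumes the SDS high-level folder names are primary, source, derivative, code, docs, and protocol; the function name `create_sds_skeleton` and the dataset path are hypothetical and are not part of SODA. Check the current SDS specification before relying on these names.

```python
from pathlib import Path
import tempfile

# Assumption: these are the SDS high-level folder names.
# Verify against the current SDS specification before use.
SDS_FOLDERS = ("primary", "source", "derivative", "code", "docs", "protocol")


def create_sds_skeleton(root, folders=SDS_FOLDERS):
    """Create empty high-level folders under `root` and return their paths."""
    root = Path(root)
    created = []
    for name in folders:
        folder = root / name
        # exist_ok=True makes the call safe to repeat on an existing dataset.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Illustrative run in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_sds_skeleton(Path(tmp) / "my_dataset"):
            print(path.name)
```

The sketch only creates empty folders. Metadata files and the upload to Pennsieve are still handled through SODA as described in the sections that follow.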
Sharing SPARC datasets
If you have data files generated by a SPARC-funded study, you must share them according to the SPARC data curation and sharing guidelines so that they eventually become openly accessible through the SPARC Data Portal. Follow the steps below to do so. See the next section for non-SPARC data submission.
A. Preliminary steps
These steps only need to be completed once.
- Download and install SODA
- All SPARC datasets must be uploaded to the Pennsieve data platform. Get access to Pennsieve as well as the SPARC organization on Pennsieve by filling out this form. We also suggest requesting access to the SPARC Airtable sheet through the same form, as it will come in handy when you prepare your SPARC metadata files.
- Download and install the Pennsieve agent required to upload files through SODA (ignore the deprecation warning)
- Watch our quick video to familiarize yourself with the user interface of SODA (note: optional but recommended)
B. Curate and share data with SODA
Use the Guided Mode of SODA, accessible through the sidebar of the app, to prepare and share your dataset according to the SPARC guidelines. The Guided Mode walks you step by step through all the requirements for curating and sharing datasets according to the SPARC data standards. Its interfaces follow the logical order of the curation steps and include all necessary information, so no prior knowledge of the SPARC data standards is required.
Sharing non-SPARC datasets
As of August 2022, SPARC is accepting datasets from investigators who are not funded through the NIH SPARC program. This is an excellent opportunity to make your research data FAIR through the SPARC Data Portal and get credit whenever someone reuses your data!
A. Initial inquiry
The process for sharing non-SPARC data on the SPARC Data Portal starts with reaching out via email to the SPARC Curation Team (curation@sparc.science). Briefly describe the data you want to share, and the SPARC Curation Team will follow up with you regarding the suitability of your data for the SPARC Data Portal.
B. Preliminary steps
If your dataset is deemed suitable for the SPARC Data Portal, the SPARC Curation Team will typically instruct you to download SODA and request access to Pennsieve.
- Download and install SODA
- All SPARC datasets must be uploaded to the Pennsieve data platform. Get access to Pennsieve as well as the SPARC organization on Pennsieve by filling out this form.
- Download and install the Pennsieve agent required to upload files through SODA (ignore the deprecation warning)
- Watch our quick video to familiarize yourself with the user interface of SODA (note: optional but recommended)
C. Curate and share data with SODA
Use the Guided Mode of SODA, accessible through the sidebar of the app, to prepare and share your dataset according to the SPARC guidelines. The Guided Mode walks you step by step through all the requirements for curating and sharing datasets according to the SPARC data standards. Its interfaces follow the logical order of the curation steps and include all necessary information, so no prior knowledge of the SPARC data standards is required.
A specific workflow is currently being implemented for easily processing non-SPARC datasets through the Guided Mode. In the meantime, the Guided Mode can still be used for non-SPARC datasets with the following instructions:
- On the "Before Getting Started" page, ignore references to the Data Deliverables document and the Airtable account.
- On the "SPARC Award number" page, manually enter your own award number associated with your dataset. If you have multiple awards, specify the main one here and you will be able to specify other award numbers at a later step. If you don't have any award number associated with your dataset, write "None".
- Throughout the user interface, ignore references to "SPARC" such as "SPARC datasets" or "SPARC investigators".