-
Data must be generated by a Rice faculty member, postdoc, staff member, or graduate student for research purposes; it should be original or unique data. Data created by Rice undergraduates can be published with the support of a faculty or staff sponsor.
-
Due to storage and technical limitations, the maximum size of a single file is 1 GB, and the maximum size of the total dataset is 10 GB. (The online submission form accepts files of up to 100 MB only, but we can arrange other methods to transfer larger files.) For cases that do not fit these parameters, please email cds@rice.edu.
-
Datasets should be finalized, and researchers should ensure that they are ready to be published. If it is essential to provide access to a new version, the original will be retained.
-
Making the datasets publicly available must not violate privacy, intellectual property, or other laws or guidelines. You should have the rights to share the data or have obtained permission to share it from others who hold rights over the data. Data from human subjects research must be properly anonymized, and sharing this data should generally have been cleared (or envisioned) through an approved Institutional Review Board (IRB) protocol.
-
Datasets must be accompanied by sufficient documentation and metadata so that others can find and understand them. You should provide basic metadata for the dataset as a whole (such as title and creator) along with a readme file describing the contents of the data package.
-
Datasets must be made publicly available. To facilitate data reuse, we recommend that researchers use a public domain license such as the CC0 license or the Open Data Commons Public Domain Dedication and License (PDDL). If you need to wait before making your data publicly available, we can embargo the data for a defined period of time (such as 6 months or 1 year).
-
Datasets must not contain corrupted files or malware.
-
We strongly suggest that you convert your data to an open format (such as CSV or TXT) to facilitate usage and preservation. We can’t accept encrypted or password-protected data.
-
If you want to share datasets from multiple projects, please create a separate submission for each one.
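The size limits above can be checked before submission with a short script. The following is a minimal sketch, not an official tool: the function name `check_dataset_size` is hypothetical, and it assumes that 1 GB means 1024**3 bytes, which the guidelines do not specify.

```python
from pathlib import Path

# Limits from the guidelines, interpreted as binary units (an assumption;
# the guidelines do not say whether 1 GB means 10**9 or 1024**3 bytes).
MAX_FILE_BYTES = 1 * 1024**3
MAX_DATASET_BYTES = 10 * 1024**3


def check_dataset_size(root, max_file=MAX_FILE_BYTES, max_total=MAX_DATASET_BYTES):
    """Return a list of size problems; an empty list means the dataset fits."""
    root = Path(root)
    problems = []
    total = 0
    # Walk every file under the dataset folder, including subfolders.
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        size = path.stat().st_size
        total += size
        if size > max_file:
            problems.append(
                f"{path.relative_to(root).as_posix()}: {size} bytes exceeds "
                f"the per-file limit of {max_file}"
            )
    if total > max_total:
        problems.append(
            f"dataset total of {total} bytes exceeds the limit of {max_total}"
        )
    return problems
```

Calling `check_dataset_size("my_dataset")` on a hypothetical folder returns an empty list when everything fits. If it reports problems, email cds@rice.edu as described above.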
-
Review your data to ensure that it can be shared publicly. You should aim to share all files required to replicate the analysis in your publication. Are there any privacy or intellectual property issues to consider?
-
Make sure that your data is well-organized. Is the dataset finalized and well-documented? Are the files well-named? Are there missing data, unlabeled variables, corrupted files, or other issues?
-
Determine whether the Rice Digital Scholarship Archive is the best repository for your data. In some cases, a repository associated with your discipline may be a better choice, since it may offer more appropriate metadata templates or display features and be better recognized in your discipline. To identify an appropriate repository, please see:
-
Nature’s Recommended Data Repositories
-
PLOS Recommended Repositories
-
Assessing General Data Repositories
-
Re3data
-
If there are more than a few files, consider packaging your dataset, such as by creating a zip file.
-
Make a readme file to document your data; this file may contain information about methods, files, and variables.
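The packaging and readme steps can be combined in one small script. This is a sketch under stated assumptions rather than a required workflow: the function name `package_dataset` is hypothetical, and it assumes the readme is any file whose name starts with "readme". It uses Python's standard `zipfile` module, which writes unencrypted archives, in keeping with the rule against password-protected data.

```python
import zipfile
from pathlib import Path


def package_dataset(source_dir, zip_path):
    """Zip every file under source_dir, refusing to proceed without a readme."""
    source_dir = Path(source_dir)
    zip_path = Path(zip_path)
    files = sorted(p for p in source_dir.rglob("*") if p.is_file())
    # Assumption: the readme is any file whose name starts with "readme".
    if not any(p.name.lower().startswith("readme") for p in files):
        raise FileNotFoundError("no readme file found in the dataset folder")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store paths relative to the dataset folder.
            archive.write(path, arcname=path.relative_to(source_dir).as_posix())
    # Reopen the archive and verify checksums; testzip() returns the name of
    # the first corrupted member, or None if every member is intact.
    with zipfile.ZipFile(zip_path) as archive:
        bad_member = archive.testzip()
    if bad_member is not None:
        raise OSError(f"corrupted member in archive: {bad_member}")
    return zip_path
```

For example, `package_dataset("my_dataset", "my_dataset.zip")` would produce a single archive from a hypothetical `my_dataset` folder.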
-
Review the terms of use for sharing your data through RDSA and determine which license to use for your dataset. Choices include