Data Sharing Requirements for Biomedical Research
Over the last decade, data sharing in biomedicine went from optional to expected to mandatory. The NIH Data Management and Sharing Policy, which took effect January 25, 2023, now requires NIH-funded researchers whose work generates scientific data to submit a data management and sharing plan and then actually share their data.
This guide covers what the NIH policy requires, what major journals ask for in data availability statements, where to deposit different types of biomedical data, and what FAIR principles mean in practice.
The NIH Data Management and Sharing Policy (2023)
The NIH DMS Policy applies to all NIH-funded research that generates scientific data, including grants, contracts, and intramural research. It covers all applications and proposals submitted on or after January 25, 2023.
The policy requires:
- 1. Submit a Data Management and Sharing (DMS) Plan with your grant application
- 2. Deposit data in an established repository as soon as possible, and no later than publication or the end of the award, whichever comes first
- 3. Data management and sharing costs may be budgeted as direct costs (up to ~$30,000/year without special justification)
- 4. Cite the shared dataset in any resulting publications
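The deposit deadline in item 2 reduces to a date comparison. The following is a minimal sketch, assuming the deadline is whichever of the two dates comes first; the function name and the example dates are hypothetical and do not come from the policy text.

```python
from datetime import date

def sharing_deadline(publication_date, award_end_date):
    """Return the latest date by which data should be shared.

    Assumption: the deadline is the earlier of the associated
    publication date and the end of the award. Either date may be
    None when it is not yet known.
    """
    known = [d for d in (publication_date, award_end_date) if d is not None]
    if not known:
        raise ValueError("need a publication date or an award end date")
    return min(known)

# Hypothetical dates, for illustration only.
deadline = sharing_deadline(date(2025, 3, 14), date(2026, 6, 30))
print(deadline.isoformat())
```

When the paper is not yet published, pass None for the publication date and the award end date becomes the working deadline.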
Other Major Funder Data Sharing Mandates
Wellcome Trust
Since 2017 (updated 2021). Strong data sharing mandate. Requires a data management plan, data sharing in an appropriate repository, and a data availability statement in all publications. All publications must be open access (CC BY), and the underlying data must be available.
UKRI (all councils)
Since 2015 (RCUK policy, now UKRI). Research outputs including data must be made available as openly as possible. Data underlying publications must be deposited in an appropriate repository. Applies to BBSRC, MRC, ESRC, EPSRC, and other UKRI councils.
Gates Foundation
Since 2017. One of the strictest. It requires CC BY for all publications and immediate open access to underlying data. A data sharing plan is required with the grant application.
European Research Council (ERC)
Since 2017 (expanded under Horizon Europe). The Open Research Data pilot is now the default for all funded projects. Requires a Data Management Plan and deposition in a trusted repository where possible.
Journal Data Availability Policies
Almost every major journal now requires a data availability statement: a brief paragraph describing where the data are, how to access them, and any restrictions. "Data available on request" was acceptable five years ago; many journals now explicitly prohibit it, requiring actual deposition in a repository.
Nature family
Mandatory data availability statement. Raw data for figures required. Specific repositories required for genomics, structures, sequences. Code must be deposited.
Science / AAAS
Mandatory data availability statement. All data must be available to reviewers and readers. Structured data deposited in appropriate repositories.
Cell Press
Data availability statement required. Original data for all figures must be deposited or available on request. Specific NCBI/PDB repositories for applicable data types.
NEJM
Data availability statement required. For clinical trials, data sharing plan must be registered. Patient-level trial data sharing increasingly expected.
JAMA Network
Data and statistical code availability statement required. Deidentified participant data must be available for clinical trials with planned sharing.
Lancet family
Data sharing statement required. Original data must be available on reasonable request. Clinical trial data sharing plan required.
BMJ
Data availability statement required. Supports open data; where possible, data should be deposited. Structured data sharing encouraged via OSF or similar.
PLOS ONE / PLOS Medicine
All data underlying figures and results must be fully available. Data must be deposited in an appropriate repository or included as supplementary material. 'Available on request' is not accepted; the data must actually be available.
eLife
All data must be available. No 'available on request.' Code must be available. Extremely transparent data sharing expectations.
PNAS
Data availability statement required. Data must be deposited in an appropriate repository where one exists.
Nature Communications
Same as Nature family: data availability statement, repository deposition for applicable data types.
Where to Deposit Your Data
Use domain-specific repositories where they exist: GEO for gene expression, SRA for sequencing, PDB for structures. For everything else, general-purpose repositories like Zenodo, Figshare, or Dryad work well. All generate DOIs for citation.
GEO (Gene Expression Omnibus): gene expression, genomics arrays
Required by most molecular biology journals for microarray and RNA-seq data. Generates GSE accession number.
SRA (Sequence Read Archive): raw sequencing data (NGS, WGS, RNA-seq)
Required for raw sequencing data. Submit before manuscript submission to include accession in paper.
dbGaP: human genomic + phenotype data
NIH-approved repository for controlled-access human genomic data with identifiability concerns.
PDB (Protein Data Bank): protein and nucleic acid structures
Required by Nature, Cell, and structural biology journals for all new structures. Generates PDB ID.
Zenodo: any type (general purpose)
CERN-hosted, FAIR-compliant, generates DOI. Good for computational code, datasets, and materials not accepted elsewhere.
Figshare: any type (general purpose)
Good for figures, datasets, posters, and code. Many institutions have sponsored accounts. Generates DOI.
OSF (Open Science Framework): any type (general purpose)
Good for pre-registration, protocols, and data. 5GB private storage per project; public projects have higher limits. Strong in psychology/social science; growing in biomedicine.
Dryad: any type (general purpose)
Partners with many journals; some automatically transfer data to Dryad at acceptance.
Analysis code and software
GitHub for active development; Zenodo integration for a frozen, citable version. Required for computational methods papers.
PhysioNet: physiological and clinical signals
For ECG, EEG, ICU monitoring data. Strong in clinical and biomedical signals.
ImmPort: immunology data
NIH-supported repository for immunological data including flow cytometry and clinical trial immunology data.
TCIA (The Cancer Imaging Archive): medical imaging (CT, MRI, PET)
For de-identified cancer imaging datasets. Required/recommended by imaging and oncology journals.
Dataverse: any type (general purpose)
Well-regarded general repository. Many universities have institutional Dataverse installations.
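The routing rule in this section (domain-specific repository first, general-purpose otherwise) can be sketched as a lookup with a fallback. This is only an illustration: the data-type keys and the function name are hypothetical labels, and the mapping covers only a few of the repositories discussed here.

```python
# Domain-specific repositories named in this guide. The keys are
# hypothetical labels invented for this sketch.
DOMAIN_REPOSITORIES = {
    "gene_expression": "GEO",
    "raw_sequencing": "SRA",
    "controlled_human_genomic": "dbGaP",
    "macromolecular_structure": "PDB",
}

# General-purpose fallbacks named in this guide.
GENERAL_PURPOSE = ["Zenodo", "Figshare", "Dryad"]

def suggest_repositories(data_type):
    """Prefer a domain-specific repository; otherwise list general ones."""
    repository = DOMAIN_REPOSITORIES.get(data_type)
    if repository is not None:
        return [repository]
    return list(GENERAL_PURPOSE)

print(suggest_repositories("raw_sequencing"))
print(suggest_repositories("survey_responses"))
```

A real project would still check the target journal's list of recommended repositories, since some journals require a specific one.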
FAIR Principles in Practice
FAIR data principles (Findable, Accessible, Interoperable, Reusable) are now the standard framework for data sharing. NIH, Wellcome, UKRI, and most major journals reference FAIR compliance as the goal. Here's what each means operationally:
Findable
Deposit in a repository that assigns a persistent identifier (DOI or accession number). Include rich metadata so search engines can index it. Don't just put data on a lab website that disappears.
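A quick way to confirm that a dataset reference is a persistent identifier rather than a lab URL is to check its shape. The sketch below uses heuristic patterns, not formal grammars: the DOI pattern is an approximation of common modern DOIs, the GEO series pattern assumes "GSE" followed by digits, and the function name is hypothetical.

```python
import re

# Heuristic patterns (assumptions, not official syntax definitions).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
GEO_SERIES_PATTERN = re.compile(r"^GSE\d+$")

def identifier_kind(value):
    """Classify a string as 'doi', 'geo_series', or None if neither."""
    candidate = value.strip()
    # Accept the resolver and "doi:" forms as well as the bare DOI.
    for prefix in ("https://doi.org/", "doi:"):
        if candidate.lower().startswith(prefix):
            candidate = candidate[len(prefix):]
    if DOI_PATTERN.match(candidate):
        return "doi"
    if GEO_SERIES_PATTERN.match(candidate):
        return "geo_series"
    return None

# The DOI is the FAIR principles paper cited in the references;
# the URL is a made-up lab website.
print(identifier_kind("10.1038/sdata.2016.18"))
print(identifier_kind("https://mylab.example.org/data.zip"))
```

A match only shows that the string looks like an identifier; it does not confirm that the identifier resolves to a deposited dataset.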
Accessible
Data should be retrievable via open protocols. For controlled-access data (e.g., dbGaP), the access procedure itself must be publicly documented. Data with a documented request process count as 'accessible' even if the data themselves aren't open.
Interoperable
Use standard file formats (CSV over Excel, FASTQ over proprietary formats, TSV over custom delimiters). Include a data dictionary. Make it so someone without your specific software can use the data.
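As an illustration of the advice above, the sketch below writes a small table to CSV with Python's standard library and checks that every column is described in a data dictionary. The column names, values, and helper functions are all hypothetical.

```python
import csv
import io

# Hypothetical records and data dictionary, for illustration only.
records = [
    {"sample_id": "S001", "age_years": 54, "treatment": "placebo"},
    {"sample_id": "S002", "age_years": 61, "treatment": "drug_a"},
]
data_dictionary = [
    {"column": "sample_id", "type": "string", "description": "Sample label"},
    {"column": "age_years", "type": "integer", "description": "Age in years"},
    {"column": "treatment", "type": "string", "description": "Study arm"},
]

def to_csv(rows, fieldnames):
    """Serialize rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def undocumented_columns(rows, dictionary):
    """Return columns used in the data but missing from the dictionary."""
    documented = {entry["column"] for entry in dictionary}
    used = {key for row in rows for key in row}
    return sorted(used - documented)

columns = [entry["column"] for entry in data_dictionary]
print(to_csv(records, columns))
print(undocumented_columns(records, data_dictionary))
```

Shipping the dictionary as its own CSV next to the data file keeps both readable without any particular software.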
Reusable
Attach a license (CC BY for open data). Include enough provenance (how the data were collected, processed, and what quality checks were applied) so that someone else can reproduce your decisions.
References
- National Institutes of Health. (2023). NIH Data Management and Sharing Policy. U.S. Department of Health and Human Services. sharing.nih.gov
- Wilkinson MD, Dumontier M, Aalbersberg IJ, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018. doi.org/10.1038/sdata.2016.18
- UK Research and Innovation (UKRI). Research data management and sharing policy. Retrieved February 2026. ukri.org
- Zenodo. About Zenodo: open repository for research. CERN. Retrieved February 2026. zenodo.org
- Figshare. About figshare: open data repository. Digital Science. Retrieved February 2026. figshare.com
- Dryad. About Dryad: open data repository. Dryad Digital Repository. Retrieved February 2026. datadryad.org
Suggested Citation
APA
Manusights. (2026). Data sharing requirements for biomedical research: NIH policy, journals, and repositories. Retrieved from https://manusights.com/resources/data-sharing-requirements
MLA
Manusights. "Data Sharing Requirements for Biomedical Research: NIH Policy, Journals, and Repositories." Manusights, 2026, manusights.com/resources/data-sharing-requirements.
VANCOUVER
Manusights. Data sharing requirements for biomedical research: NIH policy, journals, and repositories [Internet]. 2026. Available from: https://manusights.com/resources/data-sharing-requirements
CC BY 4.0 - share and adapt freely with attribution to Manusights (manusights.com/resources).
Frequently Asked Questions
Are data sharing requirements mandatory for all research types?
Requirements vary by journal, publisher, and research type. Nature Portfolio journals, Cell Press, PLOS journals, and BMJ require data sharing for all original research by default, with exceptions for privacy-restricted data. Most journals with data sharing policies require a Data Availability Statement in the manuscript regardless of whether data can be fully shared. If data cannot be shared, you must explain why; privacy, legal restrictions, and third-party ownership are all accepted reasons at most journals.
Where should I deposit research data before submitting a manuscript?
The right repository depends on your data type. Genomics data goes to NCBI (GEO for gene expression, SRA for raw sequencing, dbGaP for controlled human genomics data). Protein structures go to the RCSB Protein Data Bank. Clinical trials are registered, and their summary results reported, on ClinicalTrials.gov, which is a registry and does not host participant-level data. For general research data without a domain-specific repository, Zenodo, Figshare, or Dryad are widely accepted. Most journals provide a list of recommended repositories in their data sharing policy.
What should a Data Availability Statement include?
A Data Availability Statement should specify: (1) where the data can be accessed (repository name and URL or DOI), (2) any access restrictions and the reason for them, and (3) the accession number or identifier for the deposited dataset. If all data are in the manuscript itself (in figures and tables), state that explicitly. If data cannot be shared due to privacy or ethical restrictions, state the restriction and whether data can be requested from the corresponding author. Many journals provide a template; use it, since reviewers and editors check the statement during review.
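The three elements listed above can be assembled mechanically. The sketch below is one possible wording, not a journal template: the function name, the sentence phrasing, and the placeholder DOI are hypothetical.

```python
def data_availability_statement(repository, identifier, url, restrictions=None):
    """Build a statement from repository, identifier, URL, and restrictions.

    The wording is an illustrative assumption; use the target
    journal's own template where one exists.
    """
    statement = (
        f"The data supporting this study are available in {repository} "
        f"under identifier {identifier} ({url})."
    )
    if restrictions:
        statement += f" Access is restricted: {restrictions}."
    else:
        statement += " There are no access restrictions."
    return statement

# Placeholder identifier, not a real dataset.
print(data_availability_statement(
    repository="Zenodo",
    identifier="10.5281/zenodo.0000000",
    url="https://doi.org/10.5281/zenodo.0000000",
))
```

For controlled-access data, pass the reason and the request route in the restrictions argument so the statement covers element (2).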