Reference notes

Coverage

NIH DMS Policy · 11 journal policies · 14 repositories

Sources

NIH DMS Policy + publisher guidelines

Last reviewed

February 2026

Prepared by the Manusights editorial team.

Compliance-and-repository guide

Data Sharing Requirements for Biomedical Research

Data sharing in biomedicine went from optional to expected to mandatory over the last decade. The NIH Data Management and Sharing Policy, which took effect January 25, 2023, now requires every NIH-funded researcher to submit a data management and sharing plan and to actually share their data.

This guide covers what the NIH policy requires, what major journals ask for in data availability statements, where to deposit different types of biomedical data, and what FAIR principles mean in practice.

Quick orientation

Use this page when the manuscript is getting close to submission and the data-sharing plan needs to be concrete, not aspirational.

This guide helps translate policy language into operational choices: what the NIH expects, what journals usually require in the statement itself, which repository fits the data type, and what “FAIR” means at the moment of deposit.

NIH policy summary · 11 journal-policy records · 14 repository options · FAIR in practice

The NIH Data Management and Sharing Policy (2023)

The NIH DMS Policy applies to all NIH-funded research that generates scientific data, including grants, contracts, and intramural research. It covers all applications and proposals submitted on or after January 25, 2023.

The policy requires:

  1. Submit a Data Management and Sharing (DMS) Plan with your grant application.
  2. Deposit data in an established repository as soon as possible, and no later than publication or the end of the award.
  3. Data management costs are allowable as direct costs (up to ~$30,000/year without special justification).
  4. Cite the shared dataset in any resulting publications.

What "scientific data" means under the policy: Recorded factual material commonly accepted in the scientific community as necessary to validate research findings. It does not include lab notebooks, preliminary analyses, or materials that would typically be in a methods section.

Exceptions: Data may be exempt from sharing if sharing is restricted by law (HIPAA, tribal law) or by contract, or if the data contain information that can't be de-identified. Document the reason in your DMS Plan.

NIH Institute-specific policies: Some NIH institutes have stricter requirements on top of the base policy, such as NHGRI for genomics and NCI for cancer data. Check your funding institute's data sharing policy in addition to the base DMS Policy.

Other Major Funder Data Sharing Mandates

Wellcome Trust

Since 2017 (updated 2021)

Strong data sharing mandate. Requires a data management plan, data sharing in an appropriate repository, and a data availability statement in all publications. All publications must be OA (CC BY), and the underlying data must be available.

UKRI (all councils)

Since 2015 (RCUK policy, now UKRI)

Research outputs including data must be made available as openly as possible. Data underlying publications must be deposited in an appropriate repository. Applies to BBSRC, MRC, ESRC, EPSRC, and other UKRI councils.

Gates Foundation

Since 2017

One of the strictest. It requires CC BY for all publications and immediate open access to underlying data. Data sharing plan required with grant application.

European Research Council (ERC)

Since 2017 (expanded Horizon Europe)

Open Research Data pilot is now the default for all funded projects. Requires a Data Management Plan and deposition in a trusted repository where possible.

Journal data-availability policies

Compare how strict major journal families are about data availability statements and repository deposition. The entries below can be copied into a lab or library data-sharing guide.

BMJ
Strictness: Moderate
Policy summary: Data availability statement required. Supports open data; where possible, data should be deposited. Structured data sharing encouraged via OSF or similar.

Cell Press
Strictness: Strict
Policy summary: Data availability statement required. Original data for all figures must be deposited or available on request. Specific NCBI/PDB repositories for applicable data types.

eLife
Strictness: Strict
Policy summary: All data must be available. No 'available on request.' Code must be available. Extremely transparent data sharing expectations.

JAMA Network
Strictness: Moderate
Policy summary: Data and statistical code availability statement required. Deidentified participant data must be available for clinical trials with planned sharing.

Lancet family
Strictness: Moderate
Policy summary: Data sharing statement required. Original data must be available on reasonable request. Clinical trial data sharing plan required.

Nature Communications
Strictness: Strict
Policy summary: Same as Nature family: data availability statement, repository deposition for applicable data types.

Nature family
Strictness: Strict
Policy summary: Mandatory data availability statement. Raw data for figures required. Specific repositories required for genomics, structures, sequences. Code must be deposited.

NEJM
Strictness: Moderate
Policy summary: Data availability statement required. For clinical trials, data sharing plan must be registered. Patient-level trial data sharing increasingly expected.

PLOS ONE / PLOS Medicine
Strictness: Strict
Policy summary: All data underlying figures and results must be fully available. Data deposited in appropriate repository or included as supplementary material. No 'available on request': must be actually available.

PNAS
Strictness: Moderate
Policy summary: Data availability statement required. Data must be deposited in an appropriate repository where one exists.

Science / AAAS
Strictness: Strict
Policy summary: Mandatory data availability statement. All data must be available to reviewers and readers. Structured data deposited in appropriate repositories.

Where to deposit your data

Choose a repository by data type and cost. The entries below can be copied directly into a DMS plan, lab SOP, or author-support page.

Dryad
Data type: Any type (general purpose)
Cost: $120 DPC (often journal-covered)
URL: datadryad.org
Notes: Partner with many journals: some journals automatically transfer data to Dryad at acceptance.

Figshare
Data type: Any type (general purpose)
Cost: Free up to 20GB
URL: figshare.com
Notes: Good for figures, datasets, posters, and code. Many institutions have sponsored accounts. Generates DOI.

GitHub / Zenodo (code)
Data type: Analysis code and software
Cost: Free (code); Zenodo for archival DOI
URL: github.com
Notes: GitHub for active development; Zenodo integration for a frozen, citable version. Required for computational methods papers.

Harvard Dataverse
Data type: Any type (general purpose)
Cost: Free
URL: dataverse.harvard.edu
Notes: Well-regarded general repository. Many universities have institutional Dataverse installations.

ImmPort
Data type: Immunology data
Cost: Free
URL: immport.org
Notes: NIH-supported repository for immunological data including flow cytometry and clinical trial immunology data.

NCBI dbGaP
Data type: Human genomic + phenotype data
Cost: Free
URL: www.ncbi.nlm.nih.gov/gap/
Notes: NIH-approved repository for controlled-access human genomic data with identifiability concerns.

NCBI dbSNP / ClinVar
Data type: Genetic variants
Cost: Free
URL: www.ncbi.nlm.nih.gov/snp/
Notes: For novel SNPs and variant-disease associations.

NCBI GEO
Data type: Gene expression, genomics arrays
Cost: Free
URL: www.ncbi.nlm.nih.gov/geo/
Notes: Required by most molecular biology journals for microarray and RNA-seq data. Generates GSE accession number.

NCBI SRA (Sequence Read Archive)
Data type: Raw sequencing data (NGS, WGS, RNA-seq)
Cost: Free
URL: www.ncbi.nlm.nih.gov/sra
Notes: Required for raw sequencing data. Submit before manuscript submission to include accession in paper.

OSF (Open Science Framework)
Data type: Any type (general purpose)
Cost: Free (5GB private/project)
URL: osf.io
Notes: Good for pre-registration, protocols, and data. 5GB private storage per project; public projects have higher limits. Strong in psychology/social science; growing in biomedicine.

PhysioNet
Data type: Physiological and clinical signals
Cost: Free
URL: physionet.org
Notes: For ECG, EEG, ICU monitoring data. Strong in clinical and biomedical signals.

Protein Data Bank (PDB)
Data type: Protein and nucleic acid structures
Cost: Free
URL: www.rcsb.org
Notes: Required by Nature, Cell, and structural biology journals for all new structures. Generates PDB ID.

The Cancer Imaging Archive (TCIA)
Data type: Medical imaging (CT, MRI, PET)
Cost: Free
URL: www.cancerimagingarchive.net
Notes: For de-identified cancer imaging datasets. Required/recommended by imaging and oncology journals.

Zenodo
Data type: Any type (general purpose)
Cost: Free up to 50GB
URL: zenodo.org
Notes: CERN-hosted, FAIR-compliant, generates DOI. Good for computational code, datasets, and materials not accepted elsewhere.

FAIR Principles in Practice

FAIR data principles (Findable, Accessible, Interoperable, Reusable) are now the standard framework for data sharing. NIH, Wellcome, UKRI, and most major journals reference FAIR compliance as the goal. Here's what each means operationally:

Findable

Deposit in a repository that assigns a persistent identifier (DOI or accession number). Include rich metadata so search engines can index it. Don't just put data on a lab website that disappears.

Accessible

Data should be retrievable via open protocols. For controlled-access data (e.g., dbGaP), the access procedure itself must be publicly documented. Data with a documented request process count as 'accessible' even if they aren't open.

Interoperable

Use standard file formats (CSV over Excel, FASTQ over proprietary formats, TSV over custom delimiters). Include a data dictionary. Make it so someone without your specific software can use the data.
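As one way to make the "include a data dictionary" advice concrete, the sketch below drafts a dictionary skeleton from a CSV file using only the Python standard library. The function names, the three inferred type labels, and the sample columns are assumptions made for illustration; the descriptions still have to be written by whoever knows the data.

```python
import csv
import io


def _infer_type(values):
    """Guess a coarse type label from the non-empty values of one column."""
    if not values:
        return "empty"
    for caster, label in ((int, "integer"), (float, "number")):
        try:
            for value in values:
                caster(value)
            return label
        except ValueError:
            continue
    return "text"


def build_data_dictionary(csv_text):
    """Return one entry per column: name, inferred type, and an example value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for column in reader.fieldnames or []:
        values = [row[column] for row in rows if row[column] not in ("", None)]
        entries.append({
            "column": column,
            "type": _infer_type(values),
            "example": values[0] if values else "",
            "description": "",  # left blank on purpose: filled in by the data owner
        })
    return entries


# Hypothetical sample data, used only to show the shape of the output.
sample = "sample_id,age,weight_kg,tissue\nS1,34,70.5,liver\nS2,41,,kidney\n"
for entry in build_data_dictionary(sample):
    print(entry["column"], entry["type"], entry["example"])
```

The type inference is deliberately coarse. It is meant to flag columns that need a unit or a controlled vocabulary, not to replace a hand-written description.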

Reusable

Attach a license (CC BY for open data). Include enough provenance (how the data were collected, processed, and what quality checks were applied) so that someone else can reproduce your decisions.

Practical note

Three data-sharing mistakes that create late-stage submission problems

  1. Waiting until the manuscript is drafted to identify the repository, accession timing, or data-availability wording the journal expects.
  2. Assuming a general-purpose repository is always acceptable when a domain-specific archive is the norm for that data type.
  3. Treating FAIR as a slogan rather than a deposit checklist covering identifier, metadata, format, and reuse information.
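The idea of FAIR as a deposit checklist can be sketched as a small pre-deposit check. Everything named here (the `Deposit` fields, the required metadata keys, the list of open file suffixes) is a hypothetical illustration, not a standard or any repository's actual validation. It covers identifier, metadata, format, and reuse information only; the access-procedure side of "Accessible" is left out.

```python
from dataclasses import dataclass, field
from pathlib import PurePath

# Illustrative values only; extend them for the formats and metadata your field expects.
OPEN_SUFFIXES = {".csv", ".tsv", ".txt", ".json", ".fastq"}
REQUIRED_METADATA = ("title", "creators", "description")


@dataclass
class Deposit:
    identifier: str = ""                 # DOI or accession number
    metadata: dict = field(default_factory=dict)
    filenames: list = field(default_factory=list)
    has_data_dictionary: bool = False
    license: str = ""                    # e.g. "CC BY"
    provenance_notes: str = ""           # collection, processing, quality checks


def fair_gaps(deposit):
    """Return human-readable gaps; an empty list means every check passed."""
    gaps = []
    if not deposit.identifier:
        gaps.append("Findable: no persistent identifier (DOI or accession number)")
    missing = [key for key in REQUIRED_METADATA if not deposit.metadata.get(key)]
    if missing:
        gaps.append("Findable: metadata missing " + ", ".join(missing))
    closed = [name for name in deposit.filenames
              if PurePath(name).suffix.lower() not in OPEN_SUFFIXES]
    if closed:
        gaps.append("Interoperable: non-standard formats " + ", ".join(closed))
    if not deposit.has_data_dictionary:
        gaps.append("Interoperable: no data dictionary")
    if not deposit.license:
        gaps.append("Reusable: no license attached")
    if not deposit.provenance_notes:
        gaps.append("Reusable: no provenance notes")
    return gaps


# Hypothetical draft deposit with several gaps.
draft = Deposit(metadata={"title": "Pilot cohort"},
                filenames=["counts.xlsx", "reads.fastq"])
for gap in fair_gaps(draft):
    print(gap)
```

Running a check like this when the repository is chosen, rather than at submission, is what keeps these gaps from becoming late-stage problems.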

References

  1. National Institutes of Health. (2023). NIH Data Management and Sharing Policy. U.S. Department of Health and Human Services. sharing.nih.gov
  2. Wilkinson MD, Dumontier M, Aalbersberg IJ, et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data. 2016;3:160018. doi.org/10.1038/sdata.2016.18
  3. UK Research and Innovation (UKRI). Research data management and sharing policy. Retrieved February 2026. ukri.org
  4. Zenodo. About Zenodo: open repository for research. CERN. Retrieved February 2026. zenodo.org
  5. Figshare. About figshare: open data repository. Digital Science. Retrieved February 2026. figshare.com
  6. Dryad. About Dryad: open data repository. Dryad Digital Repository. Retrieved February 2026. datadryad.org
Data note: NIH policy information sourced from sharing.nih.gov. Journal policies sourced from individual journal author instructions as of February 2026. Funder and journal data sharing policies evolve: always check your funder's current policy and the journal's author guidelines before submission. These pages are permanently maintained. For accuracy corrections or updates, contact hello@manusights.com.

Frequently Asked Questions

Are data sharing requirements mandatory for all research types?

Requirements vary by journal, publisher, and research type. Nature Portfolio journals, Cell Press, PLOS journals, and BMJ require data sharing for all original research by default, with exceptions for privacy-restricted data. Most journals with data sharing policies require a Data Availability Statement in the manuscript regardless of whether data can be fully shared. If data cannot be shared, you must explain why: privacy, legal restrictions, and third-party ownership are all accepted reasons at most journals.

Where should I deposit research data before submitting a manuscript?

The right repository depends on your data type. Genomics data goes to NCBI (GEO for gene expression, SRA for raw sequencing, dbGaP for controlled human genomics data). Protein structures go to the RCSB Protein Data Bank. Clinical trial registrations and summary results go to ClinicalTrials.gov. For general research data without a domain-specific repository, Zenodo, Figshare, or Dryad are widely accepted. Most journals provide a list of recommended repositories in their data sharing policy.
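The "domain-specific first, general-purpose as fallback" rule can be sketched as a simple lookup. The repository names come from this guide; the data-type keys and the `suggest_repositories` function are hypothetical labels chosen for the example, and a journal's own recommended list still takes precedence.

```python
# Data-type keys are illustrative labels, not an official vocabulary.
DOMAIN_REPOSITORIES = {
    "gene expression": "NCBI GEO",
    "raw sequencing": "NCBI SRA",
    "controlled human genomics": "NCBI dbGaP",
    "genetic variants": "NCBI dbSNP / ClinVar",
    "protein structure": "Protein Data Bank (PDB)",
    "immunology": "ImmPort",
    "physiological signals": "PhysioNet",
    "cancer imaging": "The Cancer Imaging Archive (TCIA)",
    "analysis code": "GitHub / Zenodo",
}

GENERAL_PURPOSE = ["Zenodo", "Figshare", "Dryad"]


def suggest_repositories(data_type):
    """Prefer a domain-specific archive; fall back to general-purpose options."""
    key = data_type.strip().lower()
    if key in DOMAIN_REPOSITORIES:
        return [DOMAIN_REPOSITORIES[key]]
    return list(GENERAL_PURPOSE)


print(suggest_repositories("Raw sequencing"))
print(suggest_repositories("survey responses"))
```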

What should a Data Availability Statement include?

A Data Availability Statement should specify: (1) where the data can be accessed (repository name and URL or DOI), (2) any access restrictions and the reason for them, and (3) the accession number or identifier for the deposited dataset. If all data are in the manuscript itself (in figures and tables), state that explicitly. If data cannot be shared due to privacy or ethical restrictions, state the restriction and whether data can be requested from the corresponding author. Many journals provide a template. Use it, since reviewers and editors check the statement during review.
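The three elements above can be assembled mechanically. The following is a minimal sketch, assuming a hypothetical `data_availability_statement` helper and generic wording; the accession number shown is a placeholder, and a journal's own template should replace these sentences where one exists.

```python
def data_availability_statement(repository=None, identifier=None, url=None,
                                restriction=None):
    """Compose a statement from repository, identifier, URL, and any restriction."""
    sentences = []
    if repository:
        if not (identifier and url):
            raise ValueError("a deposited dataset needs an identifier and a URL or DOI")
        sentences.append(
            f"The data supporting this study are available in {repository} "
            f"under {identifier} ({url})."
        )
    if restriction:
        sentences.append(f"Access is restricted because {restriction}.")
        if not repository:
            sentences.append(
                "Requests for access can be directed to the corresponding author."
            )
    if not sentences:
        # Nothing deposited and nothing restricted: the data are in the paper itself.
        sentences.append(
            "All data supporting this study are included in the figures and "
            "tables of the article."
        )
    return " ".join(sentences)


# Placeholder accession number, for illustration only.
print(data_availability_statement(
    repository="NCBI GEO",
    identifier="GSE000000",
    url="https://www.ncbi.nlm.nih.gov/geo/",
))
print(data_availability_statement(
    restriction="the records contain identifiable patient information",
))
```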