Pre-Submission Review for Computational Biology Papers: Reproducibility, Code, and What Reviewers Check
Computational biology manuscripts face unique reproducibility scrutiny. About half of published computational models are not reproducible. Here is what to verify before submission to avoid being part of that statistic.
Senior Researcher, Oncology & Cell Biology
Author context
Specializes in manuscript preparation and peer review strategy for oncology and cell biology, with deep experience evaluating submissions to Nature Medicine, JCO, Cancer Cell, and Cell-family journals.
Readiness scan
Find out if this manuscript is ready to submit.
Run the Free Readiness Scan before you submit. Catch the issues editors reject on first read.
How to use this page well
These pages work best when they behave like tools, not essays. Use the quick structure first, then apply it to the exact journal and manuscript situation.
| Question | What to do |
|---|---|
| Use this page for | Verifying that your code, data, and methods documentation will pass reproducibility screening before submission. |
| Start with | The reviewer checks and the pre-submission checklist below, applied to your exact pipeline and target journal. |
| Common mistake | Treating code and data sharing as an afterthought instead of a primary evaluation criterion. |
| Best next step | Work through the checklist section by section before you finalize the submission. |
Decision cue: Computational biology has a reproducibility problem that editors and reviewers are now actively screening for. A study analyzing published computational models found that about half were not reproducible due to incorrect or missing information in the manuscript. Journals like Genome Biology, PLOS Computational Biology, and Bioinformatics have responded with increasingly strict requirements for code sharing, data availability, and methodology documentation. If your computational pipeline is not fully documented and your code is not publicly available, reviewers will flag it before they evaluate the science.
Check your computational biology manuscript readiness in 60 seconds with the free scan.
Why computational biology manuscripts face unique scrutiny
The reproducibility crisis is well-documented
The problem is not theoretical. Multiple studies have demonstrated that computational biology results frequently cannot be reproduced:
- about half of published computational models were not reproducible due to incorrect or missing information
- key details in bioinformatics data processing are often omitted, including software versions, parameter settings, and configuration files
- changes in reference data, software versions, and missing code make replication impossible even when the original analysis was correct
This means reviewers at top computational biology journals are specifically looking for reproducibility gaps. Not as a secondary concern, but as a primary evaluation criterion.
The five pillars of reproducible computational research
A 2023 framework published in Briefings in Bioinformatics identified five pillars that reviewers increasingly expect:
| Pillar | What it requires | Common failure |
|---|---|---|
Literate programming | Analysis documented in notebooks (Jupyter, R Markdown) that interleave code, results, and interpretation | Code exists but is not documented or explained |
Code version control | Code in a version-controlled repository (GitHub, GitLab) with tagged releases | Code shared as a zip file or "available upon request" |
Compute environment control | Containerized environments (Docker, Singularity) or explicit dependency specifications | Software versions not recorded, conda/pip environments not exported |
Persistent data sharing | Data in FAIR-compliant repositories with persistent identifiers (DOI, accession numbers) | Data "available upon request" or on a lab website that may disappear |
Documentation | README files, parameter descriptions, example inputs and outputs | Code exists without instructions on how to run it |
Not every journal requires all five, but the direction is clear. Reviewers increasingly check for these and flag their absence.
What computational biology reviewers check first
Code availability
Is the code in a public repository? Not "available upon request" but actually accessible right now. GitHub with a Zenodo DOI is the standard. The repository should include:
- all custom scripts and pipelines used in the analysis
- a README explaining how to run the code
- version tags matching the submitted manuscript
- example input data or test cases
- dependency specifications (requirements.txt, environment.yml, or Dockerfile)
Software versions
Every piece of software used in the analysis must be specified with its version number. "We used STAR for alignment" is not reproducible. "We used STAR v2.7.10b with default parameters except --outFilterMismatchNmax 5" is reproducible.
This applies to every step: alignment, variant calling, differential expression, pathway analysis, visualization. If you used R, the R version and every package version matter. If you used Python, the Python version and library versions matter.
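One low-effort way to capture this is to generate the version record programmatically rather than transcribing it by hand. A minimal sketch using only the standard library (the `environment_record` helper and the package list are illustrative, not part of any specific tool):

```python
import sys
from importlib import metadata

def environment_record(packages):
    """Build 'name==version' strings for the interpreter and the given
    installed packages, suitable for a methods section or a pinned
    requirements file."""
    info = sys.version_info
    lines = [f"python=={info.major}.{info.minor}.{info.micro}"]
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Flag missing packages rather than silently omitting them.
            lines.append(f"{name}==MISSING")
    return lines

# Example: record the version of pip, present in most environments.
print("\n".join(environment_record(["pip"])))
```

Running this at the end of the analysis and committing the output alongside the code gives reviewers the exact environment, not a reconstruction from memory.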
Data availability
Raw data should be deposited in appropriate repositories:
- sequencing data: GEO, SRA, ENA
- proteomics: PRIDE, ProteomeXchange
- metabolomics: MetaboLights
- structural data: PDB, EMDB
- general: Figshare, Dryad, Zenodo
Processed data (count matrices, normalized expression values, variant calls) should also be available, either in the repository or as supplementary material. Reviewers need to be able to start from the raw data and arrive at the same processed data using your documented pipeline.
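Reviewers also need confidence that the files they download are byte-identical to the files you deposited. Listing a checksum for each file in the README makes that verifiable; a minimal sketch using Python's standard library (the `sha256sum` name is illustrative):

```python
import hashlib

def sha256sum(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 in 1 MiB chunks so large files
    (FASTQ, BAM) can be checksummed without loading them into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Publish the hex digest next to each deposited file so anyone starting from the raw data can confirm they have the same inputs before rerunning the pipeline.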
Statistical methods
Computational biology papers often involve multiple testing across thousands of genes, proteins, or genomic regions. Reviewers check:
- multiple testing correction method (Bonferroni, Benjamini-Hochberg, or permutation-based)
- significance thresholds justified, not arbitrary
- effect size reported alongside statistical significance
- batch effects addressed in multi-sample analyses
- validation approach (cross-validation, independent cohort, or orthogonal method)
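To make the first item concrete, the Benjamini-Hochberg step-up procedure fits in a few lines. A plain-Python sketch for illustration; in practice you would typically cite and use an established implementation such as `statsmodels.stats.multitest.multipletests` or R's `p.adjust`:

```python
def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values (q-values) for a list of raw p-values,
    preserving the original order of the inputs."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    prev = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    # so adjusted values never decrease as raw p-values increase.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        prev = min(prev, pvals[i] * m / rank)
        adjusted[i] = prev
    return adjusted
```

Whichever implementation you use, state it (with its version) in the methods, along with the threshold applied to the adjusted values.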
Benchmarking against existing methods
If the paper introduces a new method or pipeline, reviewers expect comparison against established alternatives using standard benchmark datasets. A new tool that has only been tested on the authors' own data is not convincing.
The computational biology pre-submission checklist
Code and reproducibility
- all code is in a public repository with a DOI (GitHub + Zenodo)
- the repository has a README with instructions for running the analysis
- software versions are specified for every tool in the pipeline
- dependency specifications are included (requirements.txt, environment.yml, Dockerfile)
- example data or test cases are provided
- the analysis can be run from start to finish by someone outside your lab
Data
- raw data deposited in appropriate domain-specific repositories with accession numbers
- processed data available as supplementary material or in a general repository
- data availability statement includes specific repository names and accession numbers
- any access restrictions are explained and justified
Methodology
- every step of the computational pipeline is described in enough detail for reproduction
- parameter choices are stated and justified (not just "default parameters")
- statistical methods are appropriate for the data type and multiple testing burden
- batch effects are addressed where applicable
- validation is performed using an independent approach or dataset
For new methods papers
- benchmarking against existing alternatives using standard datasets
- runtime and memory requirements documented
- scalability discussed (does it work on larger datasets?)
- limitations acknowledged (what the method cannot do)
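Runtime and peak memory are easy to capture with the standard library alone. A minimal sketch (the `profile_once` helper is illustrative; `tracemalloc` tracks Python-level allocations only, so tools with large C or GPU footprints need a system-level profiler instead):

```python
import time
import tracemalloc

def profile_once(func, *args, **kwargs):
    """Run func once and return (result, wall_seconds, peak_bytes),
    the runtime and memory figures reviewers expect to see documented."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args, **kwargs)
    wall = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, wall, peak
```

Reporting these numbers at several input sizes also gives you the scalability discussion for free.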
Where pre-submission review helps most in computational biology
Computational biology manuscripts are uniquely well-suited for automated review because many of the reproducibility requirements are systematic and checkable:
- Citation verification catches references to tools that have been superseded or papers that have been retracted. The field moves fast, and citing an outdated version of a widely used tool signals that the pipeline may not be current.
- Methodology evaluation checks whether the computational approach is described in enough detail and whether the statistical methods are appropriate.
- Journal-specific calibration evaluates whether the paper meets the specific requirements of your target journal (Genome Biology has different standards than Bioinformatics).
The Manusights free readiness scan evaluates these in about 60 seconds. The $29 AI Diagnostic provides a full report with 15+ verified citations from 500M+ live papers, figure-level feedback, and a prioritized revision checklist calibrated to your target journal.
For manuscripts targeting Genome Biology, Nature Methods, or Cell Systems, Manusights Expert Review ($1,000 to $1,800) connects you with a reviewer experienced in computational biology methodology at your target journal.
Reference library
Use the core publishing datasets alongside this guide
This article answers one part of the publishing decision. The reference library covers the recurring questions that usually come next: how selective journals are, how long review takes, and what the submission requirements look like across journals.
Dataset / reference guide
Peer Review Timelines by Journal
Reference-grade journal timeline data that authors, labs, and writing centers can cite when discussing realistic review timing.
Dataset / benchmark
Biomedical Journal Acceptance Rates
A field-organized acceptance-rate guide that works as a neutral benchmark when authors are deciding how selective to target.
Reference table
Journal Submission Specs
A high-utility submission table covering word limits, figure caps, reference limits, and formatting expectations.
Final step
Find out if this manuscript is ready to submit.
Run the Free Readiness Scan. See score, top issues, and journal-fit signals before you submit.
Anthropic Privacy Partner. Zero-retention manuscript processing.
Need deeper scientific feedback? See Expert Review Options