Audit
Stats Sanity Checker
Paste your Results section. We recompute every reported p-value and run GRIM, GRIMMER, and DEBIT in one go to flag inconsistent statistics, impossible means, and decision-flipping rounding errors before reviewers do.
We don’t train AI on your data. Pasted text is deleted within 24 hours.
Paste your Results
APA-style stats prose, descriptives table, or LaTeX. 200-character minimum.
We extract every claim
NHST tests (t / F / χ² / r / z), descriptive triples (mean ± SD ± N), and binary proportions.
Math-checked flags
p-recompute (statcheck), GRIM, GRIMMER, and DEBIT. All in pure code, not LLM guesses.
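To illustrate the p-recompute idea (this is our own minimal sketch, not the tool's internals; `recompute_p_from_z` and its tolerance are illustrative names and choices), a two-tailed p for a reported z statistic can be rebuilt from the statistic alone and compared against the paper's value within rounding tolerance:

```python
import math

def recompute_p_from_z(z, reported_p, tol=0.0005):
    """Recompute a two-tailed p for a z statistic and compare it to
    the reported value within a rounding tolerance (half a unit in
    the last place when p is reported to three decimals)."""
    # Two-tailed p = 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return p, abs(p - reported_p) <= tol

# "z = 1.96, p = .05" recomputes cleanly; "z = 1.96, p = .02" would not.
```

The same logic extends to t, F, and χ² via their CDFs; only the distribution changes.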
Why stats integrity matters
Nuijten et al. (2016) scanned 30,000+ psychology papers and found that roughly half of those reporting NHST results contained at least one inconsistent p-value, and ~13% contained a decision-flipping inconsistency. A 2024 follow-up by Nuijten & Wicherts found that integrating statcheck-equivalent checks into peer review correlated with a 4.5× reduction in reporting errors at submission.
Reviewers and meta-analysts run these checks routinely. The point of this tool is to put the same checks in front of you before submission, with plain-English flag explanations, so a fixable arithmetic error never becomes a public correction notice.
Limitations
- Reported stats only. We can recompute p only when test statistic + df + reported p are all present in the paste. We can’t check raw data we don’t see.
- Corrections suspend recompute. If you tick “Bonferroni / FDR / Holm,” we suppress p-recompute (the math is family-specific) and run only descriptive checks (GRIM / GRIMMER / DEBIT).
- GRIM range. GRIM is informative for N ≤ 200 with integer-bounded scales (Likert, counts). Above that, the rounding band swallows the scale's granularity, so we report "skipped" rather than risk a false positive.
- GRIMMER upper-bound only. V1 flags SDs that exceed the theoretical maximum given the integer scale and N. The full Anaya (2017) algorithm additionally checks integer-partition consistency; that fuller enumeration ships in a future update. Cases we flag are genuinely impossible; some impossible sub-maximal SDs may currently slip through.
- Not a substitute for statcheck-on-PDFs. For batch checks of full manuscripts (PDF / DOCX), use the official statcheck R package or the upcoming statcheck Word add-in. Audit is built for the “paste my Results paragraph and tell me what’s wrong” loop.
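To make the GRIM and GRIMMER-style checks above concrete, here is a minimal sketch under the same assumptions the limitations describe (our own illustrative code and function names, not the tool's implementation):

```python
import math

def grim_consistent(mean, n, decimals=2):
    """GRIM: can `mean`, reported to `decimals` places, arise from an
    integer sum of n scale scores? Checks the integer totals adjacent
    to mean * n against the reported rounding."""
    target = round(mean, decimals)
    nearest = round(mean * n)
    return any(
        round(total / n, decimals) == target
        for total in (nearest - 1, nearest, nearest + 1)
    )

def sd_upper_bound(mean, n, lo, hi):
    """GRIMMER-style upper bound: the largest sample SD achievable by
    n values on the integer scale [lo, hi] with the given mean, which
    occurs when scores pile up at the endpoints. A reported SD above
    this bound is impossible."""
    return math.sqrt(n / (n - 1) * (mean - lo) * (hi - mean))

# M = 3.49 with N = 25 is GRIM-inconsistent (no integer total of 25
# scores rounds to 3.49), while M = 3.48 passes.
```

Note how the GRIM check loses power as N grows: with N = 500, nearly every two-decimal mean has a consistent integer total, which is why the tool skips rather than flags in that range.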
Manuscript-level read
Want stats integrity checked in manuscript context?
The audit recomputes math from a paste. The full readiness scan reads your entire manuscript and flags missing power analyses, multiple-comparison gaps, methodology issues, and reviewer-flag patterns alongside arithmetic checks. Free preview, $29 only if you want the full report.
Built with reviewers who have published in Cell, Nature, The Lancet, NEJM, and Science. Used by researchers at every institution below.
[institution logo strip]
Want the math behind every flag? Read the full methodology · closed-form CDF formulas, severity classification, and citations to Nuijten 2016/2020, Brown & Heathers 2017, Anaya 2017, and Heathers 2018. Or read About for credits to the original tool authors we built on.
Picking a journal next? Run Manusights Compass · paste your title and abstract, get the top 5 best-fit venues with scope reasoning and a fit score.
Sanity-checking your bibliography too? Run Manusights Sentry · paste your reference list, get per-reference flags for retractions, expressions of concern, hijacked journals, and DOAJ-withdrawn venues before reviewers screen them.