How Manusights Sentry works

Full data-source and algorithm disclosure for Sentry, the Reference Integrity Checker. Every per-reference verdict traces back to a public, citable dataset (Crossref, Retraction Watch, DOAJ, the Retraction Watch Hijacked Journal Checker). No black boxes; no hand-wavy “AI says.”

Last reviewed: April 2026 · Sentry v1.0

Architecture: parse → resolve → enrich → classify

  1. Parse: auto-detect input format (DOIs / BibTeX / RIS / plain text) using citation-js for structured formats and doi-regex for plaintext sweeps. Output: normalized ParsedReference[].
  2. Resolve: single Redis round-trip against the nightly Retraction Watch mirror to fast-path retractions; concurrency-3 batch fetch against Crossref REST API for paper metadata + update-to[] arrays.
  3. Enrich: for non-retracted references, check ISSN against Retraction Watch’s Hijacked Journal Checker and DOAJ’s withdrawn-journals log.
  4. Classify: severity precedence: retracted > expression-of-concern > hijacked-journal > doaj-withdrawn > correction > clean.
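The plain-text sweep in the parse step can be pictured as a single regular-expression pass. The sketch below is in Python for illustration only; Sentry itself uses citation-js and doi-regex, and the pattern here is a simplified assumption rather than the expression doi-regex ships. The function name extract_dois is hypothetical.

```python
import re

# Simplified DOI pattern (an assumption for illustration; not the exact
# expression used by the doi-regex package).
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+', re.IGNORECASE)


def extract_dois(text: str) -> list[str]:
    """Return unique DOIs found in free text, lowercased, in first-seen order."""
    seen: dict[str, None] = {}
    for match in DOI_PATTERN.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the DOI.
        doi = match.group(0).rstrip(".,;:)]}").lower()
        seen.setdefault(doi, None)
    return list(seen)
```

Stripping trailing punctuation is a heuristic: a DOI that legitimately ends in a closing bracket would be truncated, so a production parser would need a more careful rule.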

Crossref REST API

Primary source for paper metadata and update-to[] retraction notices

Sentry queries https://api.crossref.org/works/{DOI} with a mailto-identified User-Agent, which places requests in Crossref’s polite pool (50 req/s rate limit, effective Dec 1 2025). We cap concurrency at 3 to stay well under that limit. The response includes:

  • title, author, container-title, ISSN, publisher, issued.date-parts — for the per-reference card display
  • update-to[] — the critical field. Array of update notices linked TO this work. Non-empty for retracted / EoC / corrected papers. Each entry carries a type (“retraction” / “expression-of-concern” / “correction” / etc.), the notice DOI, the notice label, and the registration timestamp.
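As a rough illustration of the lookup described above, the following Python sketch builds a polite-pool request and pulls the update notices out of a parsed response. It makes no network call. The client name example-client/0.1, the function names, and the output keys are hypothetical; the entry keys read from each notice (type, DOI, label, updated.date-time) reflect our reading of Crossref's JSON and should be checked against the Crossref documentation.

```python
from urllib.parse import quote

CROSSREF_WORKS = "https://api.crossref.org/works/"


def build_request(doi: str, mailto: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a mailto-identified lookup of one DOI."""
    url = CROSSREF_WORKS + quote(doi, safe="/")
    # "example-client/0.1" is a placeholder client name, not Sentry's.
    headers = {"User-Agent": f"example-client/0.1 (mailto:{mailto})"}
    return url, headers


def update_notices(response: dict) -> list[dict]:
    """Extract update notices from a parsed Crossref JSON response."""
    message = response.get("message", {})
    notices = []
    for entry in message.get("update-to", []):
        notices.append({
            "type": entry.get("type"),            # e.g. "retraction"
            "notice_doi": entry.get("DOI"),       # DOI of the notice itself
            "label": entry.get("label"),
            # Assumed location of the registration timestamp.
            "registered": entry.get("updated", {}).get("date-time"),
        })
    return notices
```

An empty list from update_notices means no update notice was attached to the record, which on its own is not proof that the paper is clean; the Retraction Watch mirror is consulted separately.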

Per-DOI responses are cached in Upstash Redis for 24 hours. Crossref data updates daily; retraction notices typically propagate within 24-72 hours of publication.
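The 24-hour cache behaves like a key-value store with an expiry per entry. The Python sketch below is an in-memory stand-in for that idea, not the Upstash Redis setup Sentry uses; the class name TTLCache and the injectable clock are assumptions made so the expiry logic can be exercised without waiting.

```python
import time


class TTLCache:
    """Minimal in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: dict[str, tuple[float, object]] = {}

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is stale: drop it so the caller refetches.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: object) -> None:
        self._store[key] = (self._clock() + self._ttl, value)
```

A caller would check get(doi) first and only query Crossref on a miss, then store the fresh response with set(doi, response).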

Reference: Crossref REST API documentation.

Retraction Watch dataset

ISSN: 2692-4579 · Fast-path retraction lookup with full reason metadata

Crossref Labs hosts the Retraction Watch dataset as a downloadable CSV with per-DOI retraction metadata. We mirror this nightly via a cron job into a Redis hash keyed by DOI, so retraction lookups are O(1) and don’t require an extra Crossref call. Fields used:

  • OriginalPaperDOI — primary key, lowercase normalized
  • RetractionNature — controlled vocabulary: “Retraction”, “Expression of Concern”, “Correction”, “Reinstatement”
  • RetractionDate — ISO YYYY-MM-DD
  • Reason — semicolon-separated controlled-vocabulary categories from the Retraction Watch user guide (e.g., “Falsification/Fabrication of Data”, “Investigation by Company/Institution”, “Plagiarism of Article”)
  • RetractionDOI — linkable DOI of the retraction notice itself

When a DOI appears in both Crossref’s update-to[] and the RW CSV, the RW entry takes precedence because it carries richer reason metadata. Source attribution (“via Retraction Watch” or “via Crossref”) is shown per-flag in the result UI.
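The mirror-and-precedence logic can be sketched as follows, in Python, with a plain dict standing in for the Redis hash. The column names come from the field list above; the function names, the output keys, and the shape of the Crossref notice dicts are hypothetical, and the real CSV may contain formatting quirks this sketch does not handle.

```python
import csv
import io


def load_retraction_index(csv_text: str) -> dict[str, dict]:
    """Build a DOI-keyed lookup from Retraction Watch CSV text."""
    index: dict[str, dict] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        doi = (row.get("OriginalPaperDOI") or "").strip().lower()
        if not doi:
            continue  # Rows without a DOI cannot be keyed.
        index[doi] = {
            "nature": row["RetractionNature"],
            "date": row["RetractionDate"],
            "reasons": [r.strip() for r in row["Reason"].split(";") if r.strip()],
            "notice_doi": row["RetractionDOI"],
            "source": "Retraction Watch",
        }
    return index


def pick_flag(doi: str, rw_index: dict[str, dict], crossref_notices: list[dict]):
    """Prefer the Retraction Watch entry; fall back to Crossref notices."""
    rw_entry = rw_index.get(doi.lower())
    if rw_entry is not None:
        return rw_entry
    if crossref_notices:
        first = crossref_notices[0]
        return {
            "nature": first["type"],
            "notice_doi": first["notice_doi"],
            "source": "Crossref",
        }
    return None
```

If a DOI appears in more than one CSV row, this sketch keeps the last row; how Sentry resolves that case is not stated in the text.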

Reference: The Retraction Watch Database. Retraction Watch is operated by The Center for Scientific Integrity, a 501(c)(3) nonprofit, edited by Ivan Oransky and Adam Marcus.

DOAJ withdrawn-journals log

Directory of Open Access Journals · Misconduct-removal flag

DOAJ maintains a public log of journals removed from the directory. Removal reasons include editorial misconduct, predatory practices, and ceased publication. We mirror this list nightly and check the journal ISSN of each Sentry result against it. Sentry surfaces these as “DOAJ-withdrawn” with the DOAJ-provided reason category when available.

Reference: Directory of Open Access Journals (DOAJ). The withdrawn-journals public sheet is updated by DOAJ’s editorial team.

Retraction Watch Hijacked Journal Checker

Tracker of titles being impersonated by predatory websites

Retraction Watch maintains a Hijacked Journal Checker (public Google Sheet) tracking journal titles being impersonated by predatory websites that mimic established venues. We mirror this list nightly and match by ISSN (preferred) or by exact, case-insensitive journal name. We do not use title-similarity heuristics because they produce too many false positives; only exact matches count.
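A minimal Python sketch of that matching rule, assuming the mirrored list has already been reduced to two sets: normalized ISSNs and case-folded journal names. The function names and the normalization (hyphen removed, uppercased) are assumptions for illustration.

```python
def normalize_issn(issn: str) -> str:
    """Drop the hyphen and uppercase the check character (e.g. a final X)."""
    return issn.replace("-", "").strip().upper()


def is_hijacked(
    issns: list[str],
    journal_name: str,
    hijacked_issns: set[str],
    hijacked_names: set[str],
) -> bool:
    """Match by ISSN first, then by exact case-insensitive journal name."""
    if any(normalize_issn(i) in hijacked_issns for i in issns):
        return True
    # Exact comparison only: no fuzzy or similarity matching.
    return journal_name.strip().casefold() in hijacked_names
```

Because the name comparison is exact, a near-miss such as a singular versus plural title does not trigger a flag, which is the conservative behavior the text describes.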

When flagged, Sentry surfaces the venue name and notes that “the original paper may still be legitimate, but the citation as it stands could be ambiguous to reviewers; verify whether this is the original journal’s content.” The flag is conservative on purpose — a false positive on a hijacked-journal flag damages the tool’s credibility more than a missed catch.

Reference: Retraction Watch Hijacked Journal Checker.

Severity precedence

When multiple checks return positives for the same reference, Sentry collapses to a single highest-severity verdict for the card UI:

  1. Retracted — trumps everything; remove or replace the citation
  2. Expression of Concern — publisher-level concern, paper not retracted
  3. Hijacked Journal — venue-level integrity flag; verify the citation source
  4. DOAJ-Withdrawn — venue removed from DOAJ for misconduct
  5. Correction — corrected/erratum/addendum; verify any specific numerical claims
  6. Clean — DOI resolves, no integrity flags
  7. Unverifiable — no DOI in the entry, or DOI did not resolve via Crossref

Hijacked Journal and DOAJ-Withdrawn outrank Correction because they are venue-level integrity categories: a correction in a normal journal is a footnote, but a paper in a hijacked journal is a citation reviewers may flag regardless of the paper’s own merit.
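The collapse to a single verdict amounts to walking the precedence list and returning the first level present. The Python sketch below uses the level names from the architecture overview; the function name collapse and the resolved parameter are hypothetical, and treating Unverifiable as the fallback only when no flag fired is our assumption about how the seventh level fits in.

```python
SEVERITY_ORDER = [
    "retracted",
    "expression-of-concern",
    "hijacked-journal",
    "doaj-withdrawn",
    "correction",
]


def collapse(flags: set[str], resolved: bool = True) -> str:
    """Return the single highest-severity verdict for one reference."""
    for level in SEVERITY_ORDER:
        if level in flags:
            return level
    # No integrity flag fired: the verdict depends on whether the DOI resolved.
    return "clean" if resolved else "unverifiable"
```

For example, a reference flagged with both a correction and a hijacked-journal match collapses to hijacked-journal, while a reference with no flags and no resolvable DOI collapses to unverifiable.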

Data handling

  • Pasted bibliographies are not stored. Submitted text is sent to the parser and discarded. Per-DOI Crossref responses are cached for 24 hours; per-bibliography results are cached server-side for 24 hours keyed by content hash, with no original text stored alongside the cache key.
  • No account, no email gate. The tool works without sign-up. Rate limits are per-IP, not per-account: 20 checks per hour, 80 per day.
  • Share URLs are not indexed. Per-result share URLs (/share/{token}) carry noindex so they don’t dilute the canonical tool page in search results.
  • Crossref polite-pool identification. All Crossref API calls include the mailto:erik@manusights.com identifier in the User-Agent, per Crossref’s requested practice.
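Keying cached results by content hash means the cache can recognize a repeated submission without keeping the submitted text. A minimal Python sketch, assuming SHA-256 and light whitespace normalization; the hash choice, the normalization, and the sentry:result: key prefix are all hypothetical and not taken from Sentry's implementation.

```python
import hashlib


def result_cache_key(bibliography_text: str) -> str:
    """Derive a cache key from submitted text without retaining the text."""
    # Normalize whitespace so trivially different pastes share one key.
    lines = bibliography_text.strip().splitlines()
    normalized = "\n".join(line.strip() for line in lines)
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"sentry:result:{digest}"
```

Only the digest appears in the key, so the original bibliography cannot be read back from the cache entry.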

How to cite Manusights Sentry

If you reference Sentry in a manuscript methods section or supplementary materials, please cite as:

Manusights. (2026). Sentry v1.0: Reference Integrity Checker
  (Crossref + Retraction Watch + DOAJ + Hijacked Journal Checker)
  [Free academic tool]. https://manusights.com/tools/reference-integrity
  Methodology: https://manusights.com/tools/reference-integrity/methodology
  (Accessed: YYYY-MM-DD)

When citing the underlying data sources directly (which we recommend for methods-section disclosures), please credit Crossref, Retraction Watch (ISSN 2692-4579), DOAJ, and the Retraction Watch Hijacked Journal Checker. See the About page for full attribution.

Want every citation claim verified, not just every reference? The full Manusights Readiness Scan reads your entire manuscript: it runs the same Crossref + Retraction Watch + DOAJ + hijacked-journal checks AND verifies whether each citation actually supports the claim it’s attached to, alongside methods, statistics, and journal fit. Free preview, $29 only if you want the full report.