Forensic Reference-Integrity Auditor

A prompt-engineered deep-scan verification system for academic reference lists. Built for managing editors in nursing and health sciences publishing who need to catch fabricated, manipulated, and suspicious citations before they reach print.

The Problem

Academic reference lists are a trust surface. Paper mills, AI-generated citations, and increasingly sophisticated metadata manipulation mean that a reference can look perfectly formatted while being completely fabricated — or worse, a composite of real elements assembled to resist casual verification.

Existing tools address slices of this problem:

Tool	What It Does	What It Misses
Edifix	Formatting correction, DOI lookup	No adversarial verification
Scite.ai	Citation context analysis	Doesn't detect fabricated metadata
iThenticate	Text similarity / plagiarism	Ignores reference list integrity
Papermill Alarm	Paper mill pattern detection	Narrow heuristic scope
RefChecker	Basic DOI/metadata validation	No forensic depth

None of them perform adversarial forensic verification across multiple heuristic dimensions simultaneously. That's what this tool does.

How It Works

The auditor runs as a structured prompt on Anthropic's Claude (Opus), using live web search to verify every citation against authoritative sources:

Crossref — DOI resolution, metadata matching, retraction status
PubMed / PMC — Biomedical citation verification
Retraction Watch — Known retraction and expression-of-concern database
Publisher sites — Direct verification against journal archives
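
The auditor itself performs these lookups through Claude's web search, but the Crossref check can also be sketched outside the prompt. The following is a minimal illustration, not the auditor's implementation: it builds the lookup URL for Crossref's public REST endpoint and compares a cited title with an already-fetched record. The function names, the 0.85 threshold, and the assumption that `record` is the `message` object of a Crossref response (with `title` as a list of strings) are all choices made for this example.

```python
from difflib import SequenceMatcher
from urllib.parse import quote

# Public Crossref REST endpoint for single-work lookups.
CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the lookup URL for a DOI, escaping unsafe characters."""
    return CROSSREF_WORKS + quote(doi.strip(), safe="/")


def title_similarity(cited: str, registered: str) -> float:
    """Case-insensitive similarity in [0, 1] between two titles."""
    return SequenceMatcher(None, cited.casefold(), registered.casefold()).ratio()


def doi_matches_citation(cited_title: str, record: dict, threshold: float = 0.85) -> bool:
    """Return True if the registered title is close to the cited one.

    `record` is assumed to be the metadata object for the DOI,
    with `title` given as a list of strings.
    """
    titles = record.get("title") or [""]
    return title_similarity(cited_title, titles[0]) >= threshold
```

A DOI that resolves to a record whose title falls below the threshold is a candidate for the "DOIs pointing to wrong papers" case and deserves a manual look rather than automatic rejection.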

Forensic Heuristics (v3)

Each reference is evaluated against seven forensic heuristics designed to catch progressively more sophisticated fabrication:

#	Heuristic	What It Catches
1	DOI Resolution	Dead DOIs, DOIs pointing to wrong papers, fabricated DOI patterns
2	Homoglyph Detection	Cyrillic or other Unicode substitutions in titles, author names, or journal names designed to defeat string matching
3	Digit-Swap Analysis	Transposed volume/issue/page numbers that make a real citation unfindable
4	Author-Shifting	Subtly rearranged, added, or removed authors compared to the actual publication record
5	Double-Real Trap	Real DOI + real-sounding metadata from a different paper, creating a composite that passes surface checks
6	Journal Mutation	Slightly altered journal titles (word substitution, abbreviation manipulation) that point to nonexistent or different journals
7	Shadow-Paper Signatures	Citations with plausible metadata that match no known publication — fully fabricated but constructed to look legitimate
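
As an illustration of heuristic 2, the sketch below flags words that mix letters from more than one Unicode script, which is how a Cyrillic "о" hidden inside a Latin-script journal title would show up. It is a simplified stand-in rather than the auditor's prompt logic: the script label is taken from the first word of each character's Unicode name, and the function names are invented for this example. Legitimate mixed-script words (a Greek letter in a drug name, for instance) would also be flagged, so the output is a prompt for review, not a verdict.

```python
import unicodedata


def script_of(ch: str) -> str:
    """Rough script label taken from the Unicode character name."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def mixed_script_words(text: str) -> list[str]:
    """Return the words whose letters come from more than one script."""
    flagged = []
    for word in text.split():
        scripts = {script_of(c) for c in word if c.isalpha()}
        if len(scripts) > 1:
            flagged.append(word)
    return flagged


# "J\u043eurnal" contains CYRILLIC SMALL LETTER O in place of Latin "o".
suspect = "J\u043eurnal of Nursing"
print(mixed_script_words(suspect))
```

Accented Latin letters such as "é" keep the LATIN label under this scheme, so ordinary author names with diacritics are not flagged.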

Risk Classification

Every reference receives one of four risk tiers:

Tier	Label	Meaning
H	High	Strong evidence of fabrication or manipulation. Recommend rejection or author query.
E	Elevated	Multiple anomalies detected. Requires manual verification before acceptance.
M	Moderate	Minor anomalies or incomplete verification. Flag for editorial awareness.
D	Defensible	Verified or consistent with known publication records. No action required.

Scoring Formula

Reference List Score = 100 − (H × 12) − (E × 5) − (M × 2) − (D × 0)

Here H, E, M, and D are the number of references assigned to each tier. Defensible references carry no penalty. The weights punish fabrication heavily while avoiding over-penalization of grey literature (government reports, organizational white papers, URLs) that legitimately lacks DOIs.
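
A worked example of the arithmetic, as a small sketch. Two things here are assumptions rather than documented behavior: Defensible references are given a weight of zero, which follows from the tier definition ("No action required"), and the result is clamped to the 0–100 range used by the dashboard gauge. The function and constant names are illustrative.

```python
from collections import Counter

# Per-reference penalties by tier. D = 0 is an assumption (see lead-in).
PENALTY = {"H": 12, "E": 5, "M": 2, "D": 0}


def reference_list_score(tiers: list[str]) -> int:
    """Score a reference list from the tier assigned to each reference."""
    counts = Counter(tiers)
    raw = 100 - sum(PENALTY[tier] * n for tier, n in counts.items())
    # Clamp so very bad lists bottom out at 0 (assumed gauge range).
    return max(0, min(100, raw))


# 25 references: 1 High, 2 Elevated, 4 Moderate, 18 Defensible.
tiers = ["H"] * 1 + ["E"] * 2 + ["M"] * 4 + ["D"] * 18
print(reference_list_score(tiers))  # 100 - 12 - 10 - 8 = 70
```

Under these weights a single High finding costs as much as six Moderate ones, which is the intended asymmetry between fabrication and incomplete verification.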

Output

The auditor produces a self-contained HTML report with six sections, designed for editorial decision-making:

Executive Dashboard — Confidence gauge (0–100), risk-tier heatmap, summary stat cards. A managing editor can glance at this and know whether to worry.
Forensic Audit Table — Per-reference findings with heuristic flags, verification sources consulted, and risk tier assignments.
Ranked Suspicion Index — References ordered by risk severity. Highest-risk citations surface first.
Cleaned APA Reference List — Corrected formatting for all verified references (APA 7th edition).
PRISMA-Style Flow Diagram — Visual representation of how references moved through the verification pipeline (verified, flagged, unresolvable, grey literature).
Forensic Appendix — Methodology documentation, heuristic definitions, and scoring explanation. Supports editorial audit trails and COPE-aligned documentation.
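
The ordering behind the Ranked Suspicion Index can be sketched as a sort over per-reference findings. The `Finding` record and the tie-breaking rule (more heuristic flags first, then original position) are assumptions made for this example; the source only specifies that references are ordered by risk severity.

```python
from dataclasses import dataclass, field

# Lower number sorts first: High, Elevated, Moderate, Defensible.
TIER_ORDER = {"H": 0, "E": 1, "M": 2, "D": 3}


@dataclass
class Finding:
    """Hypothetical per-reference result record."""
    ref_number: int
    tier: str
    flags: list[str] = field(default_factory=list)


def ranked_suspicion_index(findings: list[Finding]) -> list[Finding]:
    """Order findings by tier, then by flag count (descending), then position."""
    return sorted(
        findings,
        key=lambda f: (TIER_ORDER[f.tier], -len(f.flags), f.ref_number),
    )


findings = [
    Finding(1, "D"),
    Finding(2, "H", ["doi"]),
    Finding(3, "E", ["author-shift"]),
    Finding(4, "H", ["doi", "journal-mutation"]),
]
print([f.ref_number for f in ranked_suspicion_index(findings)])  # [4, 2, 3, 1]
```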

Usage

Requirements

Anthropic Claude (Opus recommended for forensic interpretation quality)
Web search enabled (the auditor performs live verification against external sources)

Running an Audit

Provide the prompt (see prompts/v3-auditor.md) to Claude with web search enabled.
Paste or upload the reference list to be audited.
The auditor will systematically verify each reference and produce the HTML report.

Note: A single audit of 25–40 references typically requires 5–15 minutes of processing time and significant tool-call volume. This is by design — thorough forensic verification is not a quick-check operation.

Input Formats

The auditor accepts reference lists in:

Raw text (pasted APA-formatted references)
Text extracted from manuscript PDFs or Word documents
Mixed formats (the auditor will normalize during processing)
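
Normalization happens inside the prompt, but a light pre-processing pass can make a pasted list easier to check by hand. The sketch below pulls DOI-like strings out of raw reference text. The regular expression is a common heuristic, not an official DOI grammar, and the trailing-punctuation stripping can occasionally truncate an unusual DOI; both are assumptions of this example rather than part of the auditor.

```python
import re

# Heuristic DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def extract_dois(text: str) -> list[str]:
    """Return unique DOI-like strings in order of first appearance."""
    seen = set()
    dois = []
    for match in DOI_PATTERN.finditer(text):
        # Drop sentence punctuation that commonly trails a DOI in running text.
        doi = match.group(0).rstrip(".,;)").lower()  # DOIs are case-insensitive
        if doi not in seen:
            seen.add(doi)
            dois.append(doi)
    return dois


sample = (
    "Doe, J. (2020). Placeholder title. Journal, 1(2), 3-4. "
    "https://doi.org/10.1000/ABC.123. See also doi:10.1000/abc.123"
)
print(extract_dois(sample))  # ['10.1000/abc.123']
```

References that yield no DOI are not suspicious on that basis alone, since grey literature often has none.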

Testing

The system has been validated against:

Adversarial Test Set

A deliberately constructed 30-reference list containing layered traps:

Homoglyph substitutions (Cyrillic characters in journal titles)
Author-shifted citations (real papers with manipulated author lists)
Shadow papers (fully fabricated but plausible-sounding)
Double-Real composites (real DOI + metadata from a different paper)
Pop-culture junk citations (including a fabricated Obi-Wan Kenobi publication)
Clean references seeded throughout to test false-positive rates
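
One way to summarize a run against a seeded list like this is to compare the tier the auditor assigned with the tier each reference was constructed to deserve. The sketch below is an assumed scoring convention for illustration: any clean reference that receives a tier other than D counts as a false positive, and any trap that receives D counts as a miss. No results from the actual test set are implied.

```python
def evaluate(expected: dict[int, str], predicted: dict[int, str]) -> dict[str, float]:
    """Compare expected and assigned tiers, keyed by reference number."""
    clean = [ref for ref, tier in expected.items() if tier == "D"]
    traps = [ref for ref, tier in expected.items() if tier != "D"]

    # Clean references that were flagged anyway.
    false_positives = [ref for ref in clean if predicted.get(ref) != "D"]
    # Traps that slipped through as Defensible.
    missed = [ref for ref in traps if predicted.get(ref) == "D"]

    return {
        "false_positive_rate": len(false_positives) / len(clean) if clean else 0.0,
        "miss_rate": len(missed) / len(traps) if traps else 0.0,
    }


# Toy data, invented for the example: two clean references, two traps.
expected = {1: "D", 2: "D", 3: "H", 4: "H"}
predicted = {1: "D", 2: "M", 3: "H", 4: "D"}
print(evaluate(expected, predicted))
```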

Real Published Articles

Reference lists from multiple real articles in JOGNN, MCN, and related nursing journals were audited to confirm that the auditor classifies legitimate references as Defensible without over-flagging.

Roadmap

v4 Heuristics (Planned)

Batch-pattern detection — Statistical analysis across multiple submissions to identify coordinated fabrication campaigns
Crossref Retraction API integration — Direct programmatic retraction checking
Predatory journal flagging — Cabells-style methodology for identifying predatory or questionable venues
Temporal impossibility checks — Citations with dates that predate the journal's existence or postdate the submission
Sneaked-reference detection — References that appear in the list but are never cited in the manuscript body
COPE flowchart alignment — Structured recommendation output aligned with Committee on Publication Ethics investigation procedures
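
Of the planned items, the temporal impossibility check is the simplest to illustrate. The sketch below is a hypothetical shape for it, not a committed design: it compares years only, and the journal's first publication year is assumed to come from an external lookup. Online-first or in-press articles can legitimately carry a later cover date, so a "postdates submission" flag would need human review.

```python
from datetime import date
from typing import Optional


def temporal_flags(
    citation_year: int,
    journal_first_year: Optional[int],
    submission_date: date,
) -> list[str]:
    """Return temporal anomalies for one citation (year-level comparison)."""
    flags = []
    # A citation cannot be older than the journal it claims to appear in.
    if journal_first_year is not None and citation_year < journal_first_year:
        flags.append("predates journal")
    # A citation dated after the manuscript was submitted is suspect.
    if citation_year > submission_date.year:
        flags.append("postdates submission")
    return flags


print(temporal_flags(1995, 2003, date(2024, 5, 1)))  # ['predates journal']
```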

Architecture (Planned)

Pipeline decomposition across model tiers for cost optimization at editorial scale:

Stage	Model	Role
Forensic interpretation	Opus	Judgment calls, ambiguous cases, adversarial reasoning
Procedural verification	Sonnet	DOI resolution, metadata matching, systematic checks
Formatting and output	Haiku	APA correction, HTML report generation, structured output

Project Context

This project originated from a real editorial workflow need. I spoke with a managing editor who works with a few leading nursing journals, and the message was clear: these journals face the same reference-integrity threats as the rest of academic publishing, amplified by the rapid growth of AI-generated content and increasingly sophisticated paper mills.

The tool is designed to fit into a managing editor's actual workflow: receive a manuscript, run the reference list through the auditor, get a report that supports an editorial decision. Not a research tool — an editorial operations tool.

Development Methodology

This project uses imperative-to-declarative promotion as its core development methodology:

Exploratory run — Execute the prompt, observe what Claude produces, optimize for good raw output.
Identify what works — Name the specific behaviors, heuristics, and output patterns that succeeded.
Codify into spec — Write the successful behavior into the prompt as declarative instructions that any Claude instance can reproduce cold.

This is the same pattern as writing configuration management (Puppet, Ansible) from a hand-tuned known-good state: get the system working by hand, then capture that state as code.

Nothing gets added to the spec until it's been tested. The prompt is the artifact.

Repository Structure

├── README.md
├── prompts/
│   └── v3-auditor.md          # Current production prompt
├── test-sets/
│   ├── adversarial-30.md      # Adversarial reference list with layered traps
│   └── real-articles/         # Real article reference lists used for validation
├── reports/                   # Sample output reports
├── docs/
│   ├── heuristics.md          # Detailed heuristic documentation
│   ├── competitive-landscape.md
│   └── architecture.md        # Pipeline decomposition design
└── roadmap/
    └── v4-features.md         # Planned enhancements

License

MIT License — see LICENSE.

Credits

Built by Chris Pitzi — infrastructure professional turned prompt engineer. 30 years of production operations applied to making AI do useful, verifiable, adversarial work. Developed with Claude (Anthropic).
