AI Source Hygiene

Your AI research tools are pulling from AI-generated garbage. Here's how to fix it.

The Problem

AI assistants with web search are surfacing AI-generated "encyclopedias" with documented manipulation, conspiracy promotion, and ideological bias. These sources poison your research without your knowing it. The owner of the content often also controls the AI surfacing it, which creates a closed loop of self-citation.

Quick Fix (30 seconds)

Claude

Paste into Settings → User Preferences:

Never cite these sources (documented reliability issues):
infowars.com, grokipedia.com, vdare.com, thegatewaypundit.com,
revolver.news, infogalactic.com, conservapedia.com, rt.com,
sputniknews.com, oann.com, zerohedge.com, theepochtimes.com,
breitbart.com, telesurtv.net, presstv.ir, thegrayzone.com,
mintpressnews.com, globalresearch.ca, newsmax.com, dailymail.co.uk,
dailycaller.com, dailywire.com, theblaze.com, occupydemocrats.com,
palmerreport.com, bipartisanreport.com, dailykos.com

Note when excluding sources. Prefer Wikipedia, academic sources, primary sources.

ChatGPT

Paste into Custom Instructions → "How would you like ChatGPT to respond?":

Never cite these sources: infowars.com, grokipedia.com, vdare.com,
thegatewaypundit.com, revolver.news, infogalactic.com, conservapedia.com,
rt.com, sputniknews.com, oann.com, zerohedge.com, theepochtimes.com,
breitbart.com, telesurtv.net, presstv.ir, thegrayzone.com,
mintpressnews.com, globalresearch.ca, newsmax.com, dailymail.co.uk,
dailycaller.com, dailywire.com, theblaze.com, occupydemocrats.com,
palmerreport.com, bipartisanreport.com, dailykos.com

Skip in search results. Note when excluding unreliable sources.

Perplexity

Prefix your queries with:

Exclude: infowars.com, grokipedia.com, vdare.com, thegatewaypundit.com, rt.com, sputniknews.com, oann.com, breitbart.com, newsmax.com, dailymail.co.uk (documented reliability issues).

[Your question here]

That's it. For complete configuration, see platform-specific guides.
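If you would rather not paste the prefix by hand, it can be generated. The following is a minimal sketch, not part of this repository: the function names `build_exclusion_prefix` and `prefix_query` are hypothetical, and the output format simply mirrors the Perplexity prefix shown above.

```python
# Hypothetical helper: builds the "Exclude: ..." prefix shown in the Perplexity section.
# The domain list is the short list from that section; nothing here calls any AI platform.

SHORT_LIST = [
    "infowars.com", "grokipedia.com", "vdare.com", "thegatewaypundit.com",
    "rt.com", "sputniknews.com", "oann.com", "breitbart.com",
    "newsmax.com", "dailymail.co.uk",
]


def build_exclusion_prefix(domains, reason="documented reliability issues"):
    """Return a one-line exclusion instruction for the given domains."""
    # Normalize and de-duplicate while keeping the original order.
    cleaned = (d.strip().lower() for d in domains if d.strip())
    unique = list(dict.fromkeys(cleaned))
    return f"Exclude: {', '.join(unique)} ({reason})."


def prefix_query(question, domains=SHORT_LIST):
    """Prepend the exclusion line to a question, separated by a blank line."""
    return f"{build_exclusion_prefix(domains)}\n\n{question}"


if __name__ == "__main__":
    print(prefix_query("[Your question here]"))
```

The result is plain text, so it can be pasted into any assistant that accepts a free-form query.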

What We Block (and Why)

27 sources across 3 tiers, based on legal judgments, platform bans, FARA registrations, and Wikipedia deprecation. Sources are blocked regardless of political leaning; the same criteria apply to all.

Tier	Sources	Criteria
1: Demonstrably Harmful	InfoWars, VDare, Gateway Pundit, Revolver News, Grokipedia, Infogalactic, Conservapedia	$1.5B+ legal judgments, hate group designations, platform bans
2: Propaganda/State Media	RT, Sputnik, TeleSUR, Press TV, OAN, Zero Hedge, Epoch Times, Breitbart, The Grayzone, MintPress News, Global Research	FARA registrations, EU/US sanctions, state ownership, NATO/State Dept flagged
3: Highly Partisan	Newsmax, Daily Mail, Daily Caller, Daily Wire, The Blaze, Occupy Democrats, Palmer Report, Bipartisan Report, Daily Kos	Wikipedia deprecated, defamation settlements, low credibility ratings
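The table above can also be held as data, which makes the 27-source total easy to check. This is an illustrative sketch only: the `TIERS` dictionary and `domains_up_to` are hypothetical names, the domains are the ones from the Quick Fix lists, and selecting fewer tiers is an assumption about how someone might use the tiers, not something this project prescribes.

```python
# Hypothetical in-memory form of the tier table; domains taken from the Quick Fix lists.
TIERS = {
    1: ["infowars.com", "vdare.com", "thegatewaypundit.com", "revolver.news",
        "grokipedia.com", "infogalactic.com", "conservapedia.com"],
    2: ["rt.com", "sputniknews.com", "telesurtv.net", "presstv.ir", "oann.com",
        "zerohedge.com", "theepochtimes.com", "breitbart.com", "thegrayzone.com",
        "mintpressnews.com", "globalresearch.ca"],
    3: ["newsmax.com", "dailymail.co.uk", "dailycaller.com", "dailywire.com",
        "theblaze.com", "occupydemocrats.com", "palmerreport.com",
        "bipartisanreport.com", "dailykos.com"],
}


def domains_up_to(max_tier):
    """Return every domain in tiers 1..max_tier, in tier order."""
    return [d for tier in sorted(TIERS) if tier <= max_tier for d in TIERS[tier]]


if __name__ == "__main__":
    for tier in sorted(TIERS):
        print(f"Tier {tier}: {len(TIERS[tier])} sources")
    print(f"Total: {len(domains_up_to(3))} sources")
```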

Full blocklist with evidence →

Machine-readable version: blocklist/sources.yaml
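A machine-readable list is mainly useful for filtering citations in your own scripts. The sketch below assumes you have already loaded the domains into a Python set; it does not parse blocklist/sources.yaml, because that file's schema is not shown here. The names `BLOCKED`, `is_blocked`, and `filter_citations` are hypothetical, and only a few domains are included for brevity.

```python
from urllib.parse import urlsplit

# Assumed subset of the blocklist, for illustration only.
BLOCKED = frozenset({"infowars.com", "grokipedia.com", "rt.com", "dailymail.co.uk"})


def is_blocked(url, blocked=BLOCKED):
    """True if the URL's host is a blocked domain or a subdomain of one.

    URLs need a scheme (https://...); without one, urlsplit finds no host
    and the URL is treated as not blocked.
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    parts = host.split(".")
    # Compare the host and each parent domain exactly, so "smart.com"
    # does not match "rt.com" the way a substring check would.
    return any(".".join(parts[i:]) in blocked for i in range(len(parts)))


def filter_citations(urls, blocked=BLOCKED):
    """Split URLs into (kept, dropped) so exclusions can be reported."""
    kept = [u for u in urls if not is_blocked(u, blocked)]
    dropped = [u for u in urls if is_blocked(u, blocked)]
    return kept, dropped


if __name__ == "__main__":
    kept, dropped = filter_citations([
        "https://en.wikipedia.org/wiki/Source_criticism",
        "https://www.rt.com/news/",
    ])
    print("kept:", kept)
    print("dropped:", dropped)
```

Returning the dropped URLs alongside the kept ones follows the "note when excluding sources" instruction in the Quick Fix prompts.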

Platform Guides

Platform	Instructions
Claude	platforms/claude.md
ChatGPT	platforms/chatgpt.md
Perplexity	platforms/perplexity.md
Microsoft Copilot	platforms/copilot.md
Google Gemini	platforms/gemini.md
Grok	platforms/grok.md ⚠️ see conflict of interest note
Any AI	platforms/generic.md

Contributing

Found a bad source? Have a fix for another platform?

See CONTRIBUTING.md for standards.

FAQ

Isn't this censorship?

No. It's source verification—standard practice in journalism, academia, and professional research. You're free to use whatever sources you want. This helps you avoid sources with documented reliability issues.

Why not block all AI-generated content?

Because not all AI content is problematic. The issue is AI content without editorial oversight, with documented bias, or with conflicts of interest (like the owner controlling both content creation AND the AI that surfaces it).

This seems political.

Source quality isn't political. A source that promotes conspiracy theories is unreliable regardless of which conspiracy theories. A source controlled by a single individual with editorial intervention is risky regardless of their politics. We apply the same standards to everyone—see our blocking criteria.

Why these specific sources?

These were the first documented cases of AI-generated or ideologically manipulated encyclopedias being surfaced by AI research tools. The blocklist grows based on evidence. Report sources that meet our criteria.

License

CC0 / Public Domain

Copy it, adapt it, share it. Information hygiene over attribution.
