Mission: Building a comprehensive AI information verification system to combat the growing issue of AI-generated misinformation polluting the web and creating feedback loops of false information.
AI is slowly poisoning its own well. With 1,200+ AI-generated content sites already online, a dangerous feedback loop has emerged:
- AI/LLM gives wrong info →
- Someone publishes that info on reputable sites →
- Content gets views and engagement →
- Next AI models treat high-engagement content as "must be true" →
- Misinformation becomes a trusted source → cycle repeats
A verification system that doesn't just fetch information but actively verifies and proves it through trusted sources and cross-referencing.
User prompt → Router (classifier) → Crawler → Knowledge Base → LLM (summarizer)
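The stages above can be sketched as a chain of small functions. Everything below is a hypothetical placeholder, not part of an existing codebase: a real router would be an ML classifier, the crawler would hit the web, and the summarizer would call an LLM.

```python
from typing import List

def route(query: str) -> str:
    """Classify the query into a coarse domain (keyword-matching stub)."""
    keywords = {"nasa": "science-news", "vaccine": "health"}
    for word, domain in keywords.items():
        if word in query.lower():
            return domain
    return "general"

def crawl(query: str, domain: str) -> List[str]:
    """Fetch candidate documents for the query (canned placeholder)."""
    return [f"[{domain}] document matching '{query}'"]

def summarize(documents: List[str]) -> str:
    """Condense retrieved documents (placeholder: join them)."""
    return " | ".join(documents)

def verify(query: str) -> str:
    domain = route(query)
    return summarize(crawl(query, domain))

print(verify("NASA confirms alien life on Mars."))
```

Each stage stays independently swappable, so the stubbed crawler or summarizer can later be replaced by a real implementation without touching the rest of the chain.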
- Define what we consider "correct" (research papers, authoritative sources)
- Domain-specific parameter selection
- Dynamic trust scoring for sources
- Option A: Use Firecrawler API → scrape research papers → LLM parsing
- Option B: Custom search → LLM-generated search queries → web scraping → relevance filtering
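The final relevance-filtering step of Option B could start as simply as keyword overlap before graduating to an LLM judge. This is a crude illustrative sketch; the function name and threshold are assumptions:

```python
from typing import List

def relevance_filter(query: str, pages: List[str], min_overlap: int = 2) -> List[str]:
    """Keep pages sharing at least `min_overlap` content words with the query.
    A crude stand-in for the LLM-based relevance step described in Option B."""
    query_words = {w for w in query.lower().split() if len(w) > 3}
    return [
        page for page in pages
        if len(query_words & set(page.lower().split())) >= min_overlap
    ]

pages = [
    "nasa mars rover findings about surface water",
    "celebrity gossip roundup",
]
print(relevance_filter("evidence of water on mars from nasa", pages))
# keeps only the NASA page
```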
Dynamic Trust Score (0-1 scale)
| Factor | Description | Scoring Rule |
|---|---|---|
| Domain Reputation | Authority level | 0.9 for .edu/.gov; 0.8 for .org |
| Source Verification | Cross-citations | +0.05 per citation |
| Fact-Check Record | Misinformation history | -0.2 for false reports |
| Recency | Data freshness | -0.1 for >2 years old |
| Consistency | Cross-source matching | +0.2 for high similarity |
Final Score Calculation:
FinalScore = 0.4D + 0.3C + 0.2R + 0.1F
Where: D=Domain trust, C=Cross-source consistency, R=Recency, F=Fact-check verification
Trust Levels:
- ≥ 0.75: "Trusted"
- 0.5–0.75: "Partially Trusted"
- < 0.5: "Untrusted"
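The formula and thresholds above translate directly into code. This is a minimal sketch assuming all four factors are already normalized to [0, 1]; the example input values are illustrative, not measured:

```python
def trust_score(domain: float, consistency: float,
                recency: float, fact_check: float) -> float:
    """FinalScore = 0.4*D + 0.3*C + 0.2*R + 0.1*F, all inputs in [0, 1]."""
    return 0.4 * domain + 0.3 * consistency + 0.2 * recency + 0.1 * fact_check

def trust_level(score: float) -> str:
    """Map a final score onto the three trust bands."""
    if score >= 0.75:
        return "Trusted"
    if score >= 0.5:
        return "Partially Trusted"
    return "Untrusted"

# A .gov source (D=0.9) with strong cross-source agreement:
score = trust_score(domain=0.9, consistency=0.8, recency=0.7, fact_check=1.0)
print(round(score, 2), trust_level(score))  # 0.84 Trusted
```

Keeping the weights as plain constants makes it easy to tune them later against a labeled evaluation set.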
- Search 3-5 trusted sources for same claim
- Compare entities, dates, numbers
- Use NLP techniques (BERT embeddings, RoBERTa-large-mnli)
- Textual entailment analysis
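A first-pass consistency check between a claim and the retrieved snippets can be sketched with stdlib string similarity. This is only a stand-in: the real pipeline would use BERT embeddings or a RoBERTa-large-mnli entailment model, and the function name is hypothetical:

```python
from difflib import SequenceMatcher
from typing import List

def consistency_score(claim: str, sources: List[str]) -> float:
    """Average surface similarity between a claim and source snippets.
    Crude stand-in for embedding cosine similarity / NLI entailment."""
    if not sources:
        return 0.0
    similarities = [
        SequenceMatcher(None, claim.lower(), snippet.lower()).ratio()
        for snippet in sources
    ]
    return sum(similarities) / len(similarities)

snippets = [
    "NASA has not announced any discovery of alien life on Mars.",
    "No official confirmation of life on Mars, Reuters reports.",
]
print(round(consistency_score("NASA confirms alien life on Mars.", snippets), 2))
```

Surface similarity cannot distinguish agreement from contradiction (both share vocabulary), which is exactly why the design calls for textual entailment rather than plain similarity.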
User Query: "NASA confirms alien life on Mars."
- Router: Classifies as "Science News"
- Crawler: Searches NASA, Reuters, BBC Science
- Knowledge Base: No official NASA report found
- Cross-check: Reuters/BBC report "No official confirmation"
- Fact-check API: Snopes lists as false
- Result: Trust score = 0.18 → FAKE
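The worked example maps onto the scoring formula roughly as follows. The factor values here are illustrative guesses chosen to land on the 0.18 figure, not outputs of a real pipeline:

```python
# Hypothetical factor values for the "alien life" claim:
# D: no authoritative source carries the claim
# C: contradicted by Reuters/BBC coverage
# R: the claim is recent, but recency alone earns little
# F: Snopes explicitly flags it as false
D, C, R, F = 0.2, 0.1, 0.35, 0.0

score = 0.4 * D + 0.3 * C + 0.2 * R + 0.1 * F
verdict = "FAKE" if score < 0.5 else "NEEDS REVIEW"
print(round(score, 2), verdict)  # 0.18 FAKE
```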
- When false alarms cause serious harm
- Examples: Reputation damage, legal action, wrongful content takedown
- When missing real issues is dangerous
- Examples: Health misinformation, safety-critical information
- Verified trustworthy users across the internet
- Training data based on their contributions
- Continuous knowledge base refinement
- ScienceDirect - Domain General Cognitive Ability
- ScienceDirect - Research Article
- PhD Discussion Thread
- https://www.mdpi.com/2079-9292/12/24/5041
- 1100+ Professor Signatures - Stop Uncritical AI Adoption in Academia
- Educators Against GenAI in Education
- Udit: Research Lead - Core problem analysis and solution design
- Divya: Research - Trust scoring and verification algorithms
- Rishik: Research - Cross-verification and NLP implementation
/
├── README.md            # This file
├── docs/                # Documentation
│   ├── api-reference.md # API documentation
│   ├── architecture.md  # System architecture details
│   ├── contributing.md  # Contribution guidelines
│   └── use-cases.md     # Detailed use cases
├── research/            # Research materials
│   ├── papers/          # Academic papers and references
│   ├── experiments/     # Research experiments
│   └── findings/        # Research findings and notes
├── src/                 # Source code
│   ├── crawler/         # Web crawling modules
│   ├── classifier/      # Content classification
│   ├── verifier/        # Verification engine
│   └── api/             # API endpoints
├── daily-progress/      # Daily progress tracking
│   ├── templates/       # Log templates
│   └── logs/            # Individual daily logs
└── tests/               # Test suites
    ├── unit/            # Unit tests
    └── integration/     # Integration tests
- Python 3.8+
- Node.js 16+
- Docker (optional)
git clone https://github.com/uditjainstjis/Project-Clandestine.git
cd Project-Clandestine
pip install -r requirements.txt

# Run the verification system
python src/main.py --query "your query here"

# Start development server
npm run dev

- Problem identification and analysis
- Core architecture design
- Trust scoring algorithm design
- Initial prototype development
- Core verification engine
- API development
- Web interface
- Testing suite
- Advanced NLP integration
- Real-time verification
- Browser extension
- Mobile app
See CONTRIBUTING.md for detailed contribution guidelines.
All team members should log daily progress in daily-progress/logs/YYYY-MM-DD-[name].md
This project is licensed under the MIT License - see the LICENSE file for details.
Building the next generation of information verification - a system that doesn't just compete with Perplexity but sets the standard for verified, trustworthy AI-powered information retrieval.
Note: This is an active research project. The system is currently in the research and development phase. Contributions and feedback are welcome!