A comprehensive operational toolkit for conducting AI/LLM red team assessments on Large Language Models, AI agents, RAG pipelines, and AI-enabled applications. This repository provides both tactical field guidance and strategic consulting frameworks.
📖 GitBook Navigation: See SUMMARY.md for the complete chapter structure.
This repository contains three core resources:
A comprehensive consultancy guide with all chapters now featuring standardized metadata, abstracts, and consistent structure:
- Part I: Professional Foundations (Chapters 1-4) - Ethics, legal framework, mindset, and engagement setup
- Part II: Project Preparation (Chapters 5-8) - SOW, threat modeling, coping, and lab setup
- Part III: Technical Fundamentals (Chapters 9-11) - LLM architectures and components
- Part IV: Pipeline Security (Chapters 12-13) - RAG and supply chain
- Part V: Attacks & Techniques (Chapters 14-24) - Comprehensive coverage of all major LLM attack vectors
- Part VI: Defense & Mitigation (Chapters 25-30) - Adversarial ML and advanced defense
- Part VII: Advanced Operations (Chapters 31-39) - Reporting, remediation, and automation
- Part VIII: Advanced Topics (Chapters 40-46) - Future threats, compliance, and program building
Fully Complete Chapters:
- Introduction to AI Red Teaming (Beginner, 15 min)
- Ethics, Legal, and Stakeholder Communication (Beginner, 18 min)
- The Red Teamer's Mindset (Beginner, 12 min)
- SOW, Rules of Engagement, and Client Onboarding (Intermediate, 20 min)
- Threat Modeling and Risk Analysis (Intermediate, 16 min)
- Scoping an Engagement (Intermediate, 14 min)
- Lab Setup and Environmental Safety (Intermediate, 25 min, Hands-on)
- Evidence, Documentation, and Chain of Custody (Intermediate, 18 min, Hands-on)
- LLM Architectures and System Components (Intermediate, 22 min, Hands-on)
- Tokenization, Context, and Generation (Intermediate, 20 min, Hands-on)
- Plugins, Extensions, and External APIs (Intermediate, 16 min)
- Retrieval-Augmented Generation (RAG) Pipelines (Advanced, 24 min, Hands-on)
- Data Provenance and Supply Chain Security (Intermediate, 18 min)
- Prompt Injection (Intermediate, ~30 min, Hands-on)
- Data Leakage and Extraction (Intermediate, ~30 min, Hands-on)
- Jailbreaks and Bypass Techniques (Intermediate, ~20 min, Hands-on)
- Plugin and API Exploitation (Advanced, ~25 min, Hands-on)
- Evasion, Obfuscation, and Adversarial Inputs (Advanced, ~20 min, Hands-on)
- Training Data Poisoning (Advanced, ~18 min, Hands-on)
- Model Theft and Membership Inference (Advanced, ~20 min, Hands-on)
- Model DoS and Resource Exhaustion (Advanced, ~18 min, Hands-on)
- Cross-Modal and Multimodal Attacks (Advanced, ~20 min, Hands-on)
- Advanced Persistence and Chaining (Advanced, ~18 min, Hands-on)
- Social Engineering with LLMs (Intermediate, ~20 min, Hands-on)
- Advanced Adversarial ML (Advanced, ~25 min)
Additional Content:
- Chapter 36: Reporting and Communication
- Chapter 37: Remediation Strategies
- Chapter 38: Continuous Red Teaming
- Chapter 45: Building an AI Red Team Program
Remaining chapters are currently in development as stubs.
Chapter Features:
- ✅ Standardized Metadata: Category, difficulty, time estimates, prerequisites
- ✅ Compelling Abstracts: 2-3 sentence chapter summaries
- ✅ Theoretical Foundations: Attack mechanisms and research citations (Ch 14-24)
- ✅ Research Landscapes: Evolution of attacks and current gaps (Ch 14-24)
- ✅ Quick References: Attack vectors, detection, mitigation (Ch 14-24)
- ✅ Checklists: Pre/post-engagement validation
📖 GitBook Navigation: See SUMMARY.md for the complete chapter structure.
Compact operational reference for field use:
- Quick-reference attack prompts and payloads
- Testing checklists and methodology
- Tool commands and configurations
- OWASP Top 10 for LLMs mapping
- MITRE ATLAS framework alignment
Automated testing suite including:
- Prompt injection attacks
- Safety bypass and jailbreak tests
- Data leakage and PII extraction
- Tool/plugin misuse testing
- Adversarial fuzzing
- Model integrity validation
# Clone the repository
git clone https://github.com/shiva108/ai-llm-red-team-handbook.git
cd ai-llm-red-team-handbook
# Manual testing: Start with the Field Manual
open docs/AI_LLM\ Red\ Team\ Field\ Manual.md
# Automated testing:
cd scripts
pip install -r requirements.txt
python runner.py --config config.py📖 Detailed setup: See Configuration Guide
ai-llm-red-team-handbook/
├── docs/
│ ├── SUMMARY.md # GitBook navigation
│ ├── Chapter_01_Introduction_to_AI_Red_Teaming.md # All chapters 1-46
│ ├── Chapter_02_Ethics_Legal_and_Stakeholder_Communication.md
│ ├── ... # (Chapters 3-45)
│ ├── Chapter_46_Conclusion_and_Next_Steps.md
│ ├── AI_LLM Red Team Field Manual.md # Operational reference
│ ├── Configuration.md # Setup guide
│ ├── templates/ # Report templates
│ ├── field_manuals/ # Modular field guides
│ ├── assets/ # Images and graphics
│ └── archive/ # Historical versions
├── scripts/
│ ├── runner.py # Test orchestration
│ ├── test_prompt_injection.py # Prompt injection tests
│ ├── test_safety_bypass.py # Jailbreak tests
│ ├── test_data_exposure.py # Data leakage tests
│ ├── test_tool_misuse.py # Plugin/tool abuse tests
│ ├── test_fuzzing.py # Adversarial fuzzing
│ └── requirements.txt # Python dependencies
├── assets/ # Images and resources
└── README.md # This file
| Use Case | Resources | Description |
|---|---|---|
| Red Team Assessments | Field Manual + Python Framework | Conduct comprehensive LLM security assessments |
| Consultant Engagements | Handbook + Report Template | Full methodology for client projects |
| Team Training | Handbook Foundations (Ch 1-13) | Onboard and develop security teams |
| Research & Development | Attack Chapters (Ch 14-24) | Deep dives into specific attack surfaces |
| Compliance & Audit | Threat Modeling (Ch 5) + Tools | Risk assessments and control validation |
Manual Testing:
- Any text editor + target LLM access
Automated Testing:
- Python 3.8+
- Dependencies:
requests,pytest,pydantic,python-dotenv - API credentials for target LLM
test_prompt_injection.py- Automated prompt injection attackstest_safety_bypass.py- Jailbreak and guardrail bypass teststest_data_exposure.py- Data leakage and PII extractiontest_tool_misuse.py- Function-calling and plugin abusetest_fuzzing.py- Adversarial input fuzzingtest_integrity.py- Model integrity and consistency
Create scripts/.env:
API_ENDPOINT=https://api.example.com/v1/chat/completions
API_KEY=your-secret-api-key
MODEL_NAME=gpt-4Run tests:
python runner.py # All tests
python runner.py --test prompt_injection # Specific test
python runner.py --verbose # Verbose output📖 Full configuration options: Configuration Guide
Completed (December 2024):
- ✅ 24 comprehensive chapters (1-13 foundations, 14-24 attack techniques)
- ✅ Standardized metadata across all chapters
- ✅ Theoretical foundations and research landscapes (Ch 14-24)
- ✅ Quick reference guides for attack chapters
- ✅ Pre/post-engagement checklists
- ✅ Modular field manual structure
In Development:
- 🔄 Advanced attack chapters (25-35): Adversarial ML, model inversion, backdoors
- 🔄 Professional practice chapters (36-46): Some completed, others in progress
- 🔄 Comprehensive linting and code block improvements
- 🔄 Cross-chapter reference validation
Future Enhancements:
- Sample RAG and LLM test environments
- Interactive attack case studies with recordings
- Video tutorials and walkthroughs
- Auto-generated learning paths from metadata
- Chapter completion tracking tools
Contributions welcome via issues and PRs.
Licensed under CC BY-SA 4.0 (Creative Commons Attribution-ShareAlike 4.0 International).
See LICENSE for details.
For authorized security testing only.
Ensure:
- Written authorization (SOW/RoE) is in place
- Compliance with applicable laws and regulations (CFAA, GDPR, etc.)
- Testing conducted in isolated environments when appropriate
- No unauthorized testing on production systems
The authors accept no liability for misuse or unauthorized use of this material.
We welcome contributions! Please:
- Review existing issues and PRs
- Follow the established format and style
- Test any code additions
- Submit clear, well-documented PRs
For major changes, please open an issue first to discuss.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Last Updated: December 2025 | Chapters: 46 total (25 complete, standardized with metadata) | Handbook Status: Production-Ready