Skip to content

Shiva108/ai-llm-red-team-handbook

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AI / LLM Red Team Field Manual & Consultant's Handbook

Repository Banner

License Python Contributions Welcome Last Updated

A comprehensive operational toolkit for conducting AI/LLM red team assessments on Large Language Models, AI agents, RAG pipelines, and AI-enabled applications. This repository provides both tactical field guidance and strategic consulting frameworks.

📖 GitBook Navigation: See SUMMARY.md for the complete chapter structure.


📚 What's Inside

This repository contains three core resources:

1. AI LLM Red Team Handbook (46 Chapters, Fully Standardized, GitBook-ready)

A comprehensive consultancy guide with all chapters now featuring standardized metadata, abstracts, and consistent structure:

  • Part I: Professional Foundations (Chapters 1-4) - Ethics, legal framework, mindset, and engagement setup
  • Part II: Project Preparation (Chapters 5-8) - SOW, threat modeling, coping, and lab setup
  • Part III: Technical Fundamentals (Chapters 9-11) - LLM architectures and components
  • Part IV: Pipeline Security (Chapters 12-13) - RAG and supply chain
  • Part V: Attacks & Techniques (Chapters 14-24) - Comprehensive coverage of all major LLM attack vectors
  • Part VI: Defense & Mitigation (Chapters 25-30) - Adversarial ML and advanced defense
  • Part VII: Advanced Operations (Chapters 31-39) - Reporting, remediation, and automation
  • Part VIII: Advanced Topics (Chapters 40-46) - Future threats, compliance, and program building

Fully Complete Chapters:

  1. Introduction to AI Red Teaming (Beginner, 15 min)
  2. Ethics, Legal, and Stakeholder Communication (Beginner, 18 min)
  3. The Red Teamer's Mindset (Beginner, 12 min)
  4. SOW, Rules of Engagement, and Client Onboarding (Intermediate, 20 min)
  5. Threat Modeling and Risk Analysis (Intermediate, 16 min)
  6. Scoping an Engagement (Intermediate, 14 min)
  7. Lab Setup and Environmental Safety (Intermediate, 25 min, Hands-on)
  8. Evidence, Documentation, and Chain of Custody (Intermediate, 18 min, Hands-on)
  9. LLM Architectures and System Components (Intermediate, 22 min, Hands-on)
  10. Tokenization, Context, and Generation (Intermediate, 20 min, Hands-on)
  11. Plugins, Extensions, and External APIs (Intermediate, 16 min)
  12. Retrieval-Augmented Generation (RAG) Pipelines (Advanced, 24 min, Hands-on)
  13. Data Provenance and Supply Chain Security (Intermediate, 18 min)
  14. Prompt Injection (Intermediate, ~30 min, Hands-on)
  15. Data Leakage and Extraction (Intermediate, ~30 min, Hands-on)
  16. Jailbreaks and Bypass Techniques (Intermediate, ~20 min, Hands-on)
  17. Plugin and API Exploitation (Advanced, ~25 min, Hands-on)
  18. Evasion, Obfuscation, and Adversarial Inputs (Advanced, ~20 min, Hands-on)
  19. Training Data Poisoning (Advanced, ~18 min, Hands-on)
  20. Model Theft and Membership Inference (Advanced, ~20 min, Hands-on)
  21. Model DoS and Resource Exhaustion (Advanced, ~18 min, Hands-on)
  22. Cross-Modal and Multimodal Attacks (Advanced, ~20 min, Hands-on)
  23. Advanced Persistence and Chaining (Advanced, ~18 min, Hands-on)
  24. Social Engineering with LLMs (Intermediate, ~20 min, Hands-on)
  25. Advanced Adversarial ML (Advanced, ~25 min)

Additional Content:

  • Chapter 36: Reporting and Communication
  • Chapter 37: Remediation Strategies
  • Chapter 38: Continuous Red Teaming
  • Chapter 45: Building an AI Red Team Program

Remaining chapters are currently in development as stubs.

Chapter Features:

  • Standardized Metadata: Category, difficulty, time estimates, prerequisites
  • Compelling Abstracts: 2-3 sentence chapter summaries
  • Theoretical Foundations: Attack mechanisms and research citations (Ch 14-24)
  • Research Landscapes: Evolution of attacks and current gaps (Ch 14-24)
  • Quick References: Attack vectors, detection, mitigation (Ch 14-24)
  • Checklists: Pre/post-engagement validation

📖 GitBook Navigation: See SUMMARY.md for the complete chapter structure.

2. AI LLM Red Team Field Manual (64KB)

Compact operational reference for field use:

  • Quick-reference attack prompts and payloads
  • Testing checklists and methodology
  • Tool commands and configurations
  • OWASP Top 10 for LLMs mapping
  • MITRE ATLAS framework alignment

3. Python Testing Framework (scripts/)

Automated testing suite including:

  • Prompt injection attacks
  • Safety bypass and jailbreak tests
  • Data leakage and PII extraction
  • Tool/plugin misuse testing
  • Adversarial fuzzing
  • Model integrity validation

🚀 Quick Start

# Clone the repository
git clone https://github.com/shiva108/ai-llm-red-team-handbook.git
cd ai-llm-red-team-handbook

# Manual testing: Start with the Field Manual
open docs/AI_LLM\ Red\ Team\ Field\ Manual.md

# Automated testing:
cd scripts
pip install -r requirements.txt
python runner.py --config config.py

📖 Detailed setup: See Configuration Guide


📁 Repository Structure

ai-llm-red-team-handbook/
├── docs/
│   ├── SUMMARY.md                                       # GitBook navigation
│   ├── Chapter_01_Introduction_to_AI_Red_Teaming.md    # All chapters 1-46
│   ├── Chapter_02_Ethics_Legal_and_Stakeholder_Communication.md
│   ├── ...                                              # (Chapters 3-45)
│   ├── Chapter_46_Conclusion_and_Next_Steps.md
│   ├── AI_LLM Red Team Field Manual.md                  # Operational reference
│   ├── Configuration.md                                 # Setup guide
│   ├── templates/                                       # Report templates
│   ├── field_manuals/                                   # Modular field guides
│   ├── assets/                                          # Images and graphics
│   └── archive/                                         # Historical versions
├── scripts/
│   ├── runner.py                                        # Test orchestration
│   ├── test_prompt_injection.py                         # Prompt injection tests
│   ├── test_safety_bypass.py                            # Jailbreak tests
│   ├── test_data_exposure.py                            # Data leakage tests
│   ├── test_tool_misuse.py                              # Plugin/tool abuse tests
│   ├── test_fuzzing.py                                  # Adversarial fuzzing
│   └── requirements.txt                                 # Python dependencies
├── assets/                                              # Images and resources
└── README.md                                            # This file

🎯 Use Cases

Use Case Resources Description
Red Team Assessments Field Manual + Python Framework Conduct comprehensive LLM security assessments
Consultant Engagements Handbook + Report Template Full methodology for client projects
Team Training Handbook Foundations (Ch 1-13) Onboard and develop security teams
Research & Development Attack Chapters (Ch 14-24) Deep dives into specific attack surfaces
Compliance & Audit Threat Modeling (Ch 5) + Tools Risk assessments and control validation

⚙️ Prerequisites

Manual Testing:

  • Any text editor + target LLM access

Automated Testing:

  • Python 3.8+
  • Dependencies: requests, pytest, pydantic, python-dotenv
  • API credentials for target LLM

🧪 Python Testing Framework

Test Suites

  • test_prompt_injection.py - Automated prompt injection attacks
  • test_safety_bypass.py - Jailbreak and guardrail bypass tests
  • test_data_exposure.py - Data leakage and PII extraction
  • test_tool_misuse.py - Function-calling and plugin abuse
  • test_fuzzing.py - Adversarial input fuzzing
  • test_integrity.py - Model integrity and consistency

Configuration

Create scripts/.env:

API_ENDPOINT=https://api.example.com/v1/chat/completions
API_KEY=your-secret-api-key
MODEL_NAME=gpt-4

Run tests:

python runner.py                           # All tests
python runner.py --test prompt_injection   # Specific test
python runner.py --verbose                 # Verbose output

📖 Full configuration options: Configuration Guide


🗺️ Roadmap

Completed (December 2024):

  • ✅ 24 comprehensive chapters (1-13 foundations, 14-24 attack techniques)
  • ✅ Standardized metadata across all chapters
  • ✅ Theoretical foundations and research landscapes (Ch 14-24)
  • ✅ Quick reference guides for attack chapters
  • ✅ Pre/post-engagement checklists
  • ✅ Modular field manual structure

In Development:

  • 🔄 Advanced attack chapters (25-35): Adversarial ML, model inversion, backdoors
  • 🔄 Professional practice chapters (36-46): Some completed, others in progress
  • 🔄 Comprehensive linting and code block improvements
  • 🔄 Cross-chapter reference validation

Future Enhancements:

  • Sample RAG and LLM test environments
  • Interactive attack case studies with recordings
  • Video tutorials and walkthroughs
  • Auto-generated learning paths from metadata
  • Chapter completion tracking tools

Contributions welcome via issues and PRs.


📄 License

Licensed under CC BY-SA 4.0 (Creative Commons Attribution-ShareAlike 4.0 International).

See LICENSE for details.


⚠️ Disclaimer

For authorized security testing only.

Ensure:

  • Written authorization (SOW/RoE) is in place
  • Compliance with applicable laws and regulations (CFAA, GDPR, etc.)
  • Testing conducted in isolated environments when appropriate
  • No unauthorized testing on production systems

The authors accept no liability for misuse or unauthorized use of this material.


🤝 Contributing

We welcome contributions! Please:

  1. Review existing issues and PRs
  2. Follow the established format and style
  3. Test any code additions
  4. Submit clear, well-documented PRs

For major changes, please open an issue first to discuss.


📬 Contact & Support


Last Updated: December 2025 | Chapters: 46 total (25 complete, standardized with metadata) | Handbook Status: Production-Ready

About

AI / LLM Red Team Field Manual & Consultant’s Handbook

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages