Skip to content

feat: cache poisoning + cross-agent injection detection#321

Merged
andres-linero merged 1 commit intomainfrom
feat/cache-poisoning-cross-agent-injection
Mar 8, 2026
Merged

feat: cache poisoning + cross-agent injection detection#321
andres-linero merged 1 commit intomainfrom
feat/cache-poisoning-cross-agent-injection

Conversation

@msaad00
Copy link
Owner

@msaad00 msaad00 commented Mar 8, 2026

Summary

  • VectorDBInjectionDetector: new runtime detector for cache poisoning via RAG/vector DB retrieval tools — detects when retrieved context contains prompt injection payloads and upgrades severity to CRITICAL for confirmed vector tools
  • RESPONSE_INJECTION_PATTERNS: 7 patterns covering role overrides, jailbreak triggers, system prompt injection, instruction injection, task hijacking, exfil instructions, prompt delimiter attacks
  • ResponseInspector: extended to detect prompt injection in any tool response
  • ToxicPattern.CACHE_POISON: CVE in package backing a RAG/vector server + retrieval tool = poisoned LLM context on every query
  • ToxicPattern.CROSS_AGENT_POISON: shared MCP server with write+read tool pair across 2+ agents = one agent can poison another's retrieved context

Attack coverage added

Attack Detection
Cache poisoning (RAG/vector DB) VectorDBInjectionDetector + ToxicPattern.CACHE_POISON
Cross-agent context injection ToxicPattern.CROSS_AGENT_POISON (structural, no CVE needed)
Prompt injection in tool responses ResponseInspector injection patterns

Test plan

  • 17 new tests — all passing (test_runtime_detectors.py, test_toxic_combos.py)
  • Existing 83 tests — all passing

Adds two new attack pattern detections targeting RAG/vector DB pipelines
and multi-agent shared context surfaces:

Runtime (detectors.py / patterns.py):
- RESPONSE_INJECTION_PATTERNS: 7 regex patterns covering role overrides,
  jailbreak triggers, system prompt injection, instruction injection,
  task hijacking, exfil instructions, and prompt delimiter attacks
- ResponseInspector: now detects prompt injection in any tool response
  (CRITICAL severity)
- VectorDBInjectionDetector: new detector for cache poisoning via vector
  DB / RAG retrieval tools; identifies vector tool names, upgrades severity
  of all findings to CRITICAL for confirmed vector tools, tags alerts as
  cache_poison vs content_injection

Toxic combos (toxic_combos.py):
- ToxicPattern.CACHE_POISON: CVE in package backing a RAG/vector server
  + retrieval tool exposure = poisoned LLM context on every query
- ToxicPattern.CROSS_AGENT_POISON: shared MCP server with write+read tool
  pair across 2+ agents = one agent can poison another's context
- Context-graph-based detectors now run even when blast_radii is empty
  (structural risk detection without CVE data)

Tests: 17 new test cases across TestResponseInspectorInjection,
TestVectorDBInjectionDetector, TestCachePoison, TestCrossAgentPoison
@msaad00 msaad00 requested a review from andres-linero as a code owner March 8, 2026 00:13
@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2026

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@andres-linero andres-linero merged commit 22b6ff6 into main Mar 8, 2026
16 checks passed
@andres-linero andres-linero deleted the feat/cache-poisoning-cross-agent-injection branch March 8, 2026 00:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants