Add ATR detection rules as community detection resource by eeee2345 · Pull Request #187 · safe-agentic-framework/safe-mcp

eeee2345 · 2026-03-27T11:12:48Z

April 2026 Update — ATR has grown from 71 to 108 rules (v1.1.1). Cisco AI Defense merged 34 ATR rules into their open-source skill-scanner (PR #79). Coverage of SAFE-MCP techniques remains at 78/85 (91.8%).

TL;DR

SAFE-MCP defines the threats. ATR detects them.

Your framework tells people what to watch for. ATR tells their scanners how to detect it. Every SAFE-MCP user who installs ATR gets automated detection coverage of 91.8% of your threat taxonomy — one command:

npm install agent-threat-rules && npx atr scan .

What is ATR?

ATR (Agent Threat Rules) is an open-source, MIT-licensed detection ruleset — Sigma/YARA-style YAML signatures for AI agent threats.

	SAFE-MCP	ATR
Role	Threat knowledge base (like MITRE ATT&CK)	Detection ruleset (like Sigma/YARA)
Output	"These attacks exist"	"Here's how to detect them"
Format	Markdown technique descriptions	Machine-readable YAML with regex patterns
Usage	Reference for security teams	Executable by scanners and CI pipelines

Key stats (April 2026)

108 YAML detection rules across 9 threat categories
99.7% precision / 62.7% recall on PINT adversarial benchmark
36,394 MCP skills scanned from the ecosystem (182 CRITICAL, 1,124 HIGH)
Cisco AI Defense merged 34 ATR rules as upstream detection source
Engine-agnostic — reference engines in TypeScript and Python
MIT licensed — no commercial component

SAFE-MCP Coverage: 78/85 techniques (91.8%)

SAFE-MCP Tactic	Techniques	ATR Covered	Coverage
Initial Access (TA0001)	9	9	FULL
Execution (TA0002)	9	8	STRONG
Persistence (TA0003)	8	8	FULL
Privilege Escalation (TA0004)	9	8	STRONG
Defense Evasion (TA0005)	8	7	STRONG
Credential Access (TA0006)	7	7	FULL
Discovery (TA0007)	6	5	STRONG
Lateral Movement (TA0008)	7	7	FULL
Collection (TA0009)	5	5	FULL
Command and Control (TA0011)	4	4	FULL
Exfiltration (TA0010)	6	5	STRONG
Impact (TA0040)	6	6	FULL
Resource Development (TA0042)	1	1	FULL
Total	85	78	91.8%

Detailed Mapping by Tactic

Initial Access — 9/9 FULL

SAFE-MCP Technique	ATR Rules
SAFE-T1001 Tool Poisoning Attack	ATR-010, ATR-011, ATR-100, ATR-101, ATR-103, ATR-105
SAFE-T1002 Supply Chain Compromise	ATR-060, ATR-095, ATR-096
SAFE-T1003 Malicious MCP-Server Distribution	ATR-095, ATR-096
SAFE-T1004 Server Impersonation / Name-Collision	ATR-060, ATR-117
SAFE-T1005 Exposed Endpoint Exploit	ATR-012, ATR-013
SAFE-T1006 User-Social-Engineering Install	ATR-119
SAFE-T1007 OAuth Authorization Phishing	ATR-114
SAFE-T1008 Tool Shadowing Attack	ATR-089, ATR-106
SAFE-T1009 Authorization Server Mix-up	ATR-114

Execution — 8/9 STRONG

SAFE-MCP Technique	ATR Rules
SAFE-T1101 Command Injection	ATR-066, ATR-110, ATR-111
SAFE-T1102 Prompt Injection (Multiple Vectors)	ATR-001, ATR-002, ATR-003, ATR-004, ATR-005, ATR-080, ATR-081, ATR-083, ATR-084, ATR-091, ATR-097, ATR-104
SAFE-T1103 Fake Tool Invocation	ATR-012
SAFE-T1104 Over-Privileged Tool Abuse	ATR-040, ATR-064
SAFE-T1105 Path Traversal via File Tool	ATR-113
SAFE-T1106 Autonomous Loop Exploit	ATR-050, ATR-051
SAFE-T1109 Debugging Tool Exploitation	— (gap: CVE-specific)
SAFE-T1110 Multimodal Prompt Injection	— (gap: requires image/audio detection)
SAFE-T1111 AI Agent CLI Weaponization	ATR-110, ATR-111

Persistence — 8/8 FULL

SAFE-MCP Technique	ATR Rules
SAFE-T1201 MCP Rug Pull Attack	ATR-065, ATR-089
SAFE-T1202 OAuth Token Persistence	ATR-114
SAFE-T1203 Backdoored Server Binary	ATR-095
SAFE-T1204 Context Memory Implant	ATR-075
SAFE-T1205 Persistent Tool Redefinition	ATR-065
SAFE-T1206 Credential Implant in Config	ATR-113
SAFE-T1207 Hijack Update Mechanism	ATR-095, ATR-096
SAFE-T2106 Vector Store Contamination	ATR-070, ATR-075

Privilege Escalation — 8/9 STRONG

SAFE-MCP Technique	ATR Rules
SAFE-T1301 Cross-Server Tool Shadowing	ATR-074, ATR-089
SAFE-T1302 High-Privilege Tool Abuse	ATR-040, ATR-012
SAFE-T1303 Sandbox Escape via Server Exec	— (gap: infrastructure-level)
SAFE-T1304 Credential Relay Chain	ATR-074, ATR-114
SAFE-T1305 Host OS Priv-Esc (RCE)	ATR-040, ATR-110
SAFE-T1306 Rogue Authorization Server	ATR-114
SAFE-T1307 Confused Deputy Attack	ATR-074, ATR-117
SAFE-T1308 Token Scope Substitution	ATR-114
SAFE-T1309 Privileged Tool Invocation via Prompt	ATR-001, ATR-004, ATR-040

Defense Evasion — 7/8 STRONG

SAFE-MCP Technique	ATR Rules
SAFE-T1401 Line Jumping	ATR-094
SAFE-T1402 Instruction Steganography	ATR-002, ATR-080, ATR-086
SAFE-T1403 Consent-Fatigue Exploit	ATR-118
SAFE-T1404 Response Tampering	ATR-088, ATR-105
SAFE-T1405 Tool Obfuscation/Renaming	ATR-061
SAFE-T1406 Metadata Manipulation	ATR-082
SAFE-T1407 Server Proxy Masquerade	— (gap: network-level)
SAFE-T1408 OAuth Protocol Downgrade	ATR-114

Credential Access — 7/7 FULL

SAFE-MCP Technique	ATR Rules
SAFE-T1501 Full-Schema Poisoning	ATR-100, ATR-103
SAFE-T1502 File-Based Credential Harvest	ATR-113
SAFE-T1503 Env-Var Scraping	ATR-115
SAFE-T1504 Token Theft via API Response	ATR-021, ATR-114
SAFE-T1505 In-Memory Secret Extraction	ATR-021
SAFE-T1506 Infrastructure Token Theft	ATR-114
SAFE-T1507 Authorization Code Interception	ATR-114

Discovery — 5/6 STRONG

SAFE-MCP Technique	ATR Rules
SAFE-T1601 MCP Server Enumeration	ATR-087, ATR-090
SAFE-T1602 Tool Enumeration	ATR-087
SAFE-T1603 System Prompt Disclosure	ATR-020
SAFE-T1604 Server Version Enumeration	— (gap: infrastructure fingerprinting)
SAFE-T1605 Capability Mapping	ATR-087, ATR-090
SAFE-T1606 Directory Listing via File Tool	ATR-113

Lateral Movement — 7/7 FULL

SAFE-MCP Technique	ATR Rules
SAFE-T1701 Cross-Tool Contamination	ATR-063, ATR-074
SAFE-T1702 Shared-Memory Poisoning	ATR-070, ATR-092
SAFE-T1703 Tool-Chaining Pivot	ATR-063
SAFE-T1704 Compromised-Server Pivot	ATR-074
SAFE-T1705 Cross-Agent Instruction Injection	ATR-030, ATR-116
SAFE-T1706 OAuth Token Pivot Replay	ATR-114
SAFE-T1707 CSRF Token Relay	ATR-114

Collection — 5/5 FULL

SAFE-MCP Technique	ATR Rules
SAFE-T1801 Automated Data Harvesting	ATR-102
SAFE-T1802 File Collection	ATR-113
SAFE-T1803 Database Dump	ATR-013
SAFE-T1804 API Data Harvest	ATR-102
SAFE-T1805 Context Snapshot Capture	ATR-075, ATR-090

Command and Control — 4/4 FULL

SAFE-MCP Technique	ATR Rules
SAFE-T1901 Outbound Webhook C2	ATR-010, ATR-013
SAFE-T1902 Covert Channel in Responses	ATR-080, ATR-086
SAFE-T1903 Malicious Server Control Channel	ATR-095
SAFE-T1904 Chat-Based Backchannel	ATR-080

Exfiltration — 5/6 STRONG

SAFE-MCP Technique	ATR Rules
SAFE-T1910 Covert Channel Exfiltration	ATR-080, ATR-102
SAFE-T1911 Parameter Exfiltration	ATR-084
SAFE-T1912 Stego Response Exfil	ATR-086
SAFE-T1913 HTTP POST Exfil	ATR-010, ATR-013
SAFE-T1914 Tool-to-Tool Exfil	ATR-063
SAFE-T1915 Cross-Chain Laundering	— (gap: blockchain-specific)

Impact — 6/6 FULL

SAFE-MCP Technique	ATR Rules
SAFE-T2101 Data Destruction	ATR-012, ATR-098
SAFE-T2102 Service Disruption	ATR-051, ATR-052
SAFE-T2103 Code Sabotage	ATR-062
SAFE-T2104 Fraudulent Transactions	ATR-098
SAFE-T2105 Disinformation Output	ATR-032, ATR-119
SAFE-T3001 RAG Backdoor Attack	ATR-070

Resource Development — 1/1 FULL

SAFE-MCP Technique	ATR Rules
SAFE-T2107 AI Model Poisoning via Training Data	ATR-073

7 Gaps — Why They Exist

SAFE-MCP Technique	Reason	Priority
SAFE-T1109 Debugging Tool Exploitation	CVE-specific (MCP Inspector)	MEDIUM
SAFE-T1110 Multimodal Prompt Injection	Requires image/audio detection, ATR is text-based	HIGH
SAFE-T1303 Sandbox Escape via Server Exec	Infrastructure-level, outside agent interaction layer	LOW
SAFE-T1407 Server Proxy Masquerade	Network-level, outside agent interaction layer	LOW
SAFE-T1604 Server Version Enumeration	Infrastructure fingerprinting	LOW
SAFE-T1915 Cross-Chain Laundering	Blockchain/DeFi-specific	LOW

3 of 7 gaps are infrastructure-level threats outside ATR's agent interaction focus. The 2 actionable gaps (multimodal injection, debugging tool exploitation) are on the roadmap.

Changes in This PR

README.md: Added "Community Detection Tools" section (extensible — other projects can add themselves via PR)
SAFE-T1001 (Tool Poisoning Attack): Added 6 ATR detection rules to Security Tool Integration section, alongside existing MCP-Scan reference
SAFE-T1102 (Prompt Injection): Added 12 ATR detection rules to Detection Methods section

Paper

Methodology and design rationale: https://doi.org/10.5281/zenodo.19178002

Full cross-reference mapping: ATR SAFE-MCP Mapping

Happy to adjust format, add references to additional techniques, or discuss coverage gaps. The full mapping covers all 14 SAFE-MCP tactics.

eeee2345 · 2026-03-28T00:47:01Z

ATR x SAFE-MCP Integration Details (PR #187 Follow-up)

Thanks for reviewing this mapping. Wanted to add technical details that may be useful for the review.

Testing Methodology

ATR rules were validated against the PINT (Prompt Injection Needle Test) benchmark:

62.7% recall — ATR detects 62.7% of known malicious patterns
99.7% precision — 3 false positives per 1,000 scans
257 unit tests covering all 71 rules
Real-world validation: Full scan of 36,394 ClawHub skills (9,676 with content). Found 182 CRITICAL, 1,124 HIGH, 1,016 MEDIUM, 7,354 LOW findings.

Rule Format

Each ATR rule is a standalone YAML file:

id: ATR-010
title: Malicious Content in MCP Tool Response
severity: critical
patterns:
  - regex: '<pattern>'
    location: tool_description | tool_response | full_content
tags: [SAFE-T1001, OWASP-A01]

Rules map directly to SAFE-MCP technique IDs via tags, making cross-referencing straightforward.

Known Limitations (Transparency)

62.7% recall means 37.3% of threats are missed. ATR is regex/pattern-based static analysis. Semantic attacks that don't match known patterns will evade detection.
64 known evasion techniques are documented in the ATR repo. These include encoding variations, semantic paraphrasing, and context-dependent attacks.
7 SAFE-MCP techniques have no ATR coverage (detailed in the mapping doc):
- 3 are infrastructure-level (sandbox escape, server proxy masquerade, version enumeration)
- 1 is multimodal (image/audio injection)
- 1 is CVE-specific (debugging tool exploitation)
- 1 is blockchain-specific (cross-chain laundering)
- All require runtime or infrastructure-layer detection beyond static analysis
Static analysis cannot detect runtime-only attacks. ATR complements but does not replace runtime monitoring.

Compatibility

ATR rules are engine-agnostic YAML. They can be consumed by:

Any regex engine (PCRE, RE2, JavaScript RegExp)
YARA-style rule runners
Custom MCP scanners
CI/CD pipelines (pre-commit hooks, GitHub Actions)

Question for Maintainers

Is there a specific rule format or testing framework you'd like me to adapt these to? If SAFE-MCP plans to include detection signatures alongside technique descriptions, I'm happy to contribute ATR rules in whatever format works best for the project.

ATR Repository: https://github.com/Agent-Threat-Rule/agent-threat-rules
Paper: https://doi.org/10.5281/zenodo.19178002

… section Add references to ATR (Agent Threat Rules), an open-source MIT-licensed detection ruleset that provides machine-readable YAML rules for 78 of 85 SAFE-MCP techniques (91.8% coverage). Changes: - README.md: Add Community Detection Tools section with ATR coverage table - SAFE-T1001: Add ATR detection rules (6 rules) to Security Tool Integration - SAFE-T1102: Add ATR detection rules (12 rules) to Detection Methods ATR complements SAFE-MCP by providing the detection layer (like Sigma/YARA) on top of the threat knowledge base (like MITRE ATT&CK). Full cross-reference mapping available in the ATR repository. Signed-off-by: Panguard AI <support@panguard.ai>

eeee2345 · 2026-04-03T15:05:33Z

Hi maintainers - friendly follow-up on this PR and the technical comment above.

Quick update: ATR detection rules have been integrated into Cisco AI Defense (merged as PR #79 in cisco-ai-defense/skill-scanner). This adds enterprise-level validation for the detection approach, and further strengthens the case for ATR as a community detection resource alongside SAFE-MCP technique taxonomy.

ATR has also grown to 76 rules since this PR was opened. Happy to update the mapping if there are any changes on the SAFE-MCP side. Let me know if there is anything else needed for review. Thanks!

eeee2345 mentioned this pull request Mar 27, 2026

ATR: Open-source detection ruleset covering 91.8% of SAFE-MCP techniques #188

Open

eeee2345 mentioned this pull request Mar 28, 2026

Analysis: 6 SAFE-MCP techniques that require runtime detection (beyond static analysis) #189

Open

eeee2345 force-pushed the add-atr-detection-references branch from 8ad6b58 to 0ec42c4 Compare April 2, 2026 04:26

update: ATR now at 108 rules

e557b35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ATR detection rules as community detection resource#187

Add ATR detection rules as community detection resource#187
eeee2345 wants to merge 2 commits intosafe-agentic-framework:mainfrom
eeee2345:add-atr-detection-references

eeee2345 commented Mar 27, 2026 •

edited

Loading

Uh oh!

eeee2345 commented Mar 28, 2026

Uh oh!

eeee2345 commented Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

eeee2345 commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

TL;DR

What is ATR?

Key stats (April 2026)

SAFE-MCP Coverage: 78/85 techniques (91.8%)

Detailed Mapping by Tactic

Initial Access — 9/9 FULL

Execution — 8/9 STRONG

Persistence — 8/8 FULL

Privilege Escalation — 8/9 STRONG

Defense Evasion — 7/8 STRONG

Credential Access — 7/7 FULL

Discovery — 5/6 STRONG

Lateral Movement — 7/7 FULL

Collection — 5/5 FULL

Command and Control — 4/4 FULL

Exfiltration — 5/6 STRONG

Impact — 6/6 FULL

Resource Development — 1/1 FULL

7 Gaps — Why They Exist

Changes in This PR

Paper

Uh oh!

eeee2345 commented Mar 28, 2026

Uh oh!

eeee2345 commented Apr 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

eeee2345 commented Mar 27, 2026 •

edited

Loading