Analysis: 6 SAFE-MCP techniques that require runtime detection (beyond static analysis) #189

@eeee2345

SAFE-MCP Gap Analysis: 7 Uncovered Techniques

Source: ATR v0.4.0 x SAFE-MCP mapping (PR #187)
Date: 2026-03-28


Overview

ATR covers 78 of 85 SAFE-MCP techniques (91.8%). The 7 uncovered techniques fall into three categories: infrastructure-level threats (3), detection-boundary limitations (2), and out-of-scope domains (2). Six of them are detailed below.


Category 1: Infrastructure-Level Threats (3)

These operate below the agent interaction layer, at the container, network, or server infrastructure level. Static analysis of MCP tool descriptions and responses cannot observe them.

SAFE-T1303 — Sandbox Escape via Server Exec
What it is: Exploiting a vulnerability in the container/sandbox running the MCP server to break out and access the host OS.
Why static analysis can't detect it: The escape happens at runtime via memory corruption, kernel exploits, or misconfigured container boundaries. There is no pattern in the MCP protocol or tool definitions to scan for.
What's needed: Container security tooling (Falco, gVisor, seccomp profiles), runtime syscall monitoring, and hardened sandbox configurations.
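
As a rough illustration of the "hardened sandbox configurations" point, the sketch below audits the JSON that `docker inspect` returns for one container and reports settings that commonly widen the path to an escape. It is a minimal example under stated assumptions: the function name `audit_container_config` and the `RISKY_CAPS` list are hypothetical and not part of ATR. It complements runtime tools such as Falco or gVisor and does not replace them.

```python
from typing import Any

# Capabilities that commonly widen the path to a container escape.
# Illustrative list chosen for this sketch, not an exhaustive policy.
RISKY_CAPS = {"SYS_ADMIN", "SYS_PTRACE", "SYS_MODULE", "NET_ADMIN", "ALL"}


def audit_container_config(inspect: dict[str, Any]) -> list[str]:
    """Return human-readable findings for one `docker inspect` entry."""
    host = inspect.get("HostConfig") or {}
    findings = []

    if host.get("Privileged"):
        findings.append("container runs in privileged mode")

    # Normalize "CAP_SYS_ADMIN" and "sys_admin" to "SYS_ADMIN".
    caps = {c.upper().removeprefix("CAP_") for c in (host.get("CapAdd") or [])}
    for cap in sorted(caps & RISKY_CAPS):
        findings.append(f"risky capability added: {cap}")

    # Docker accepts both "seccomp=unconfined" and "seccomp:unconfined".
    security_opt = host.get("SecurityOpt") or []
    if any(opt.replace(":", "=") == "seccomp=unconfined" for opt in security_opt):
        findings.append("seccomp is disabled (seccomp=unconfined)")

    if host.get("PidMode") == "host":
        findings.append("container shares the host PID namespace")

    if not host.get("ReadonlyRootfs"):
        findings.append("root filesystem is writable")

    return findings
```

A check like this only covers misconfiguration. Kernel exploits and memory corruption still need runtime syscall monitoring.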

SAFE-T1407 — Server Proxy Masquerade
What it is: A malicious proxy that sits between the client and the real MCP server, intercepting and modifying traffic while appearing legitimate.
Why static analysis can't detect it: The masquerade happens at the network layer. The MCP server's source code looks clean. The attacker is a separate process on the network path.
What's needed: mTLS enforcement, certificate pinning, network-level anomaly detection.
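
Certificate pinning on the client side can be sketched as follows. This assumes the client pins the SHA-256 fingerprint of the server's leaf certificate; the function names (`fingerprint`, `matches_pin`, `connect_pinned`) are hypothetical. Pinning the public key instead of the whole certificate is a common alternative, and mTLS would additionally require a client certificate, which this sketch does not show.

```python
import hashlib
import hmac
import socket
import ssl


def fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_sha256_hex: str) -> bool:
    """Compare a certificate against a pinned fingerprint in constant time."""
    normalized = pinned_sha256_hex.replace(":", "").lower()
    return hmac.compare_digest(fingerprint(der_cert), normalized)


def connect_pinned(host: str, port: int, pinned_sha256_hex: str,
                   timeout: float = 5.0) -> ssl.SSLSocket:
    """Open a TLS connection and reject it unless the leaf cert matches the pin."""
    context = ssl.create_default_context()  # normal CA and hostname checks still apply
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        tls = context.wrap_socket(sock, server_hostname=host)
    except Exception:
        sock.close()
        raise
    der_cert = tls.getpeercert(binary_form=True)
    if der_cert is None or not matches_pin(der_cert, pinned_sha256_hex):
        tls.close()
        raise ssl.SSLError("server certificate does not match the pinned fingerprint")
    return tls
```

With a pin in place, a proxy presenting its own certificate fails the check even if that certificate chains to a trusted CA.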

SAFE-T1604 — Server Version Enumeration
What it is: Fingerprinting MCP server versions to identify known vulnerabilities, similar to port scanning combined with banner grabbing.
Why static analysis can't detect it: Enumeration is a runtime behavior (the attacker sends probing requests). The server's source code does not contain evidence of being enumerated.
What's needed: Rate limiting on MCP endpoints, version header stripping, runtime request anomaly detection.
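
The two simplest mitigations, rate limiting and version header stripping, might look like the sketch below. It is illustrative only: `SlidingWindowLimiter` and `strip_version_headers` are hypothetical names, the `x-mcp-server-version` header is an invented placeholder, and a production gateway would more likely use its own built-in limiter.

```python
import time
from collections import defaultdict, deque
from typing import Callable

# Response headers that tend to leak version details.
# "x-mcp-server-version" is a hypothetical placeholder for this sketch.
VERSION_HEADERS = {"server", "x-powered-by", "x-mcp-server-version"}


def strip_version_headers(headers: dict[str, str]) -> dict[str, str]:
    """Drop headers that disclose server software or version."""
    return {k: v for k, v in headers.items() if k.lower() not in VERSION_HEADERS}


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested deterministically
        self._hits = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        hits = self._hits[client_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Repeated `False` results for one client are themselves a useful signal to feed into request anomaly detection.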


Category 2: Detection-Boundary Limitations (2)

These require input modalities or runtime behaviors that ATR's regex-based text analysis cannot observe.

SAFE-T1109 — Debugging Tool Exploitation (CVE-2025-49596)
What it is: Exploiting the MCP Inspector debugging tool to execute arbitrary code on the server.
Why static analysis can't detect it: The vulnerability is in the debugging tool's runtime behavior (how it processes debug commands), not in MCP tool descriptions.
What's needed: A targeted ATR rule checking for MCP Inspector usage patterns, or runtime monitoring of debug port access. This is the most actionable gap -- a future ATR rule could flag MCP servers that expose debug endpoints.
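
One possible shape for such a rule is sketched below: a regex scan over a server's launch command that flags references to MCP Inspector and a Node.js inspector bound to all interfaces. The rule IDs, the `DebugRule` type, and `scan_launch_command` are hypothetical and do not reflect ATR's actual rule format. The patterns are examples only and would not, on their own, detect exploitation of CVE-2025-49596.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DebugRule:
    rule_id: str        # hypothetical identifier, not a published ATR rule
    description: str
    pattern: re.Pattern


RULES = [
    DebugRule(
        "debug-inspector-reference",
        "launch command references MCP Inspector",
        re.compile(r"@modelcontextprotocol/inspector\b", re.IGNORECASE),
    ),
    DebugRule(
        "debug-node-inspect-public",
        "Node.js inspector bound to all interfaces",
        re.compile(r"--inspect(?:-brk)?=0\.0\.0\.0(?::\d+)?"),
    ),
]


def scan_launch_command(command: str) -> list[str]:
    """Return the IDs of all rules whose pattern appears in the command."""
    return [rule.rule_id for rule in RULES if rule.pattern.search(command)]
```

A match means a debug surface may be present in the deployment. Whether it is reachable still has to be confirmed by runtime monitoring of the debug port.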

SAFE-T1110 — Multimodal Prompt Injection via Images/Audio
What it is: Embedding malicious instructions in images (steganography, OCR-readable text) or audio (speech-to-text injection) that the AI agent processes.
Why static analysis can't detect it: ATR analyzes text content. An image containing "ignore previous instructions" rendered as pixels is invisible to regex.
What's needed: Multimodal content analysis -- OCR scanning of images for instruction-like text, audio transcription analysis, image metadata inspection. This is a fundamentally different detection modality.
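
Of the three approaches, image metadata inspection is the easiest to illustrate without external dependencies. The sketch below reads uncompressed `tEXt` chunks from a PNG file and flags instruction-like text. It is a narrow example: the function names and `INSTRUCTION_PATTERN` are hypothetical, CRCs are not verified, compressed `zTXt`/`iTXt` chunks are ignored, and text rendered as pixels or hidden by steganography would still need OCR or other analysis.

```python
import re
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Illustrative phrases only; a real rule set would be much broader.
INSTRUCTION_PATTERN = re.compile(
    r"ignore (?:all )?(?:previous|prior) instructions|disregard the system prompt",
    re.IGNORECASE,
)


def png_text_chunks(data: bytes) -> list[tuple[str, str]]:
    """Return (keyword, text) pairs from uncompressed tEXt chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks = []
    offset = len(PNG_SIGNATURE)
    while offset + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        body = data[offset + 8:offset + 8 + length]
        if chunk_type == b"tEXt" and b"\x00" in body:
            keyword, text = body.split(b"\x00", 1)
            chunks.append((keyword.decode("latin-1"), text.decode("latin-1")))
        if chunk_type == b"IEND":
            break
        offset += 8 + length + 4  # header + body + CRC (CRC is not verified here)
    return chunks


def suspicious_metadata(data: bytes) -> list[str]:
    """Return keywords of tEXt chunks whose text looks like an injected instruction."""
    return [kw for kw, text in png_text_chunks(data) if INSTRUCTION_PATTERN.search(text)]
```

Running the same pattern over OCR output or an audio transcript would reuse the text-matching half of this sketch unchanged.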


Category 3: Out-of-Scope Domains (1)

SAFE-T1915 — Cross-Chain Laundering via Bridges/DEXs
What it is: Using AI agents to move stolen funds across blockchain bridges and decentralized exchanges.
Why static analysis can't detect it: This is a financial crime pattern, not an agent security pattern. Detection requires blockchain transaction analysis, not tool description scanning.
What's needed: On-chain analytics (Chainalysis, TRM Labs), transaction pattern matching, wallet clustering.


Summary Table

| Gap | Addressable by ATR? | What's Needed Instead |
|---|---|---|
| Sandbox Escape (T1303) | No | Container security (Falco, gVisor) |
| Server Proxy Masquerade (T1407) | No | mTLS, cert pinning, network monitoring |
| Version Enumeration (T1604) | No | Rate limiting, header stripping |
| Debug Tool Exploit (T1109) | Partially | Targeted rule + runtime debug port monitoring |
| Multimodal Injection (T1110) | No | OCR/audio analysis pipeline |
| Cross-Chain Laundering (T1915) | No | On-chain analytics |

The Key Insight

Static analysis (ATR) and runtime monitoring (gateways, proxies) are complementary layers. ATR catches threats embedded in MCP server source code and tool definitions, covering 78/85 SAFE-MCP techniques before the server ever runs. The remaining gaps require runtime infrastructure, multimodal analysis, or domain-specific tooling that sits outside the agent interaction layer.

The security stack for MCP needs at minimum:

  1. Pre-deployment static analysis (ATR) -- catches embedded threats
  2. Runtime gateway/proxy (Runlayer, Helmet) -- catches in-transit attacks
  3. Infrastructure hardening (containers, network, mTLS) -- catches escape/masquerade
  4. Multimodal analysis (future) -- catches image/audio injection

This came out of the mapping work in PR #187 -- happy to discuss if useful.

ATR Repository: https://github.com/Agent-Threat-Rule/agent-threat-rules
Paper: https://doi.org/10.5281/zenodo.19178002
