Skip to content

blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall#899

Open
aymenhmaidiwastaken wants to merge 2 commits intomicrosoft:mainfrom
aymenhmaidiwastaken:blog/mcp-security-firewall
Open

blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall#899
aymenhmaidiwastaken wants to merge 2 commits intomicrosoft:mainfrom
aymenhmaidiwastaken:blog/mcp-security-firewall

Conversation

@aymenhmaidiwastaken
Copy link
Copy Markdown

Closes #848

Drafted the MCP security blog post covering the threat landscape around AI agent tool calls — tool poisoning, rug-pull attacks, cross-server data leakage, and over-permissioned tools with concrete attack scenarios.

Includes six practical recommendations: tool allowlisting, definition fingerprinting, argument boundary enforcement, human-in-the-loop for sensitive ops, runtime monitoring, and trust domain isolation.

Happy to revise based on feedback!

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Apr 7, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

Welcome to the Agent Governance Toolkit! Thanks for your first pull request.
Please ensure tests pass, code follows style (ruff check), and you have signed the CLA.
See our Contributing Guide.

@github-actions github-actions bot added agent-mesh agent-mesh package size/M Medium PR (< 200 lines) labels Apr 7, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

🤖 AI Agent: contributor-guide — 🌟 What You Did Well

Hi @aymenhmaidiwastaken! 👋

Welcome to the Agent Governance Toolkit community, and thank you for contributing your time and expertise! 🎉 Your blog post draft is incredibly thoughtful and well-researched — it's clear you've put a lot of effort into breaking down complex security concepts into actionable advice. Let's dive into the review!


🌟 What You Did Well

  1. Clarity and Structure: Your blog post is exceptionally well-organized. The "Threat Surface" and "Practical Recommendations" sections are easy to follow and provide a logical flow from problem to solution.
  2. Actionable Advice: The six recommendations are practical, detailed, and immediately useful for readers. The inclusion of YAML examples and JSON schemas is a great touch to make the concepts tangible.
  3. Real-World Scenarios: The attack scenarios you described are both realistic and compelling. They help illustrate the risks in a way that will resonate with practitioners.
  4. Community Alignment: Your post aligns perfectly with the goals of this repository — promoting secure and responsible agent governance. The tie-ins to OWASP and the MCP Trust Guide are excellent.

🛠 Suggestions for Improvement

Here are a few areas where we can refine your contribution to align with project conventions and ensure maximum impact:

1. File Placement

  • Blog posts in this repository are typically placed under packages/{name}/docs/blog/. You've done this correctly by placing the file in packages/agent-mesh/docs/blog/. ✅
  • However, could you also add a test case to ensure the blog post renders correctly in our documentation pipeline? Tests for this package should go in packages/agent-mesh/tests/. You can create a simple test to verify the file's presence and formatting.

2. Linting

  • We use ruff for linting with a focus on E, F, and W error codes. While your blog post is Markdown and won't be linted directly, make sure any Python code snippets (like the MCP Security Scanner link) adhere to PEP 8 standards. If you include runnable Python examples in the future, running ruff locally will help catch issues early.

3. Commit Message

  • We follow the Conventional Commits standard for commit messages. Your commit message should start with a prefix like docs: to indicate the type of change. For example:
    docs: add MCP security blog post on tool call firewalls
    
  • This helps maintainers quickly understand the purpose of your changes and ensures consistent commit history.

4. Security-Sensitive Content

  • Since this blog post discusses security-sensitive topics, it will receive extra scrutiny. You've done a great job referencing OWASP and providing concrete examples, but it would be helpful to link directly to the OWASP Top 10 for LLMs for readers who want to dive deeper.

5. Cross-Referencing Internal Resources

  • You’ve already linked to the MCP Trust Guide and the MCP Security Scanner. Great job! To make this even more robust, consider adding a link to our CONTRIBUTING.md file for readers who might want to contribute to the toolkit after reading your post.

🔗 Helpful Resources

Here are some resources to help you refine your contribution:


✅ Next Steps

  1. Address the feedback above:
    • Add a test case for the blog post in packages/agent-mesh/tests/.
    • Ensure your commit message follows the docs: prefix convention.
    • Optionally, add a link to the OWASP Top 10 for LLMs.
  2. Push your changes to this branch. Once updated, our CI/CD pipeline will automatically re-run checks.
  3. Let us know if you have any questions or need clarification on any of the feedback!

Once you've made these updates, we'll review your PR again and work towards merging it. Thank you for helping us make the Agent Governance Toolkit even better! 🚀

Looking forward to your updates! 😊

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Feedback on Pull Request: blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall

🔴 CRITICAL

  1. Tool Description Injection Vulnerability
    The blog correctly highlights the risk of tool poisoning via description injection but does not explicitly recommend sanitizing tool descriptions before they are consumed by the agent. This is a critical omission because malicious descriptions can bypass LLM safeguards.
    Actionable Recommendation: Add explicit guidance to sanitize tool descriptions for hidden instructions or malicious payloads before they are presented to the agent. This could include stripping non-visible characters, detecting prompt injection patterns, and validating descriptions against a whitelist of allowed patterns.

  2. Cross-Server Data Leakage
    While the blog mentions the risk of cross-server data leakage, it does not provide concrete implementation details for tracking data provenance across tool calls. Without this, the recommendation for isolating MCP server trust domains lacks actionable guidance.
    Actionable Recommendation: Include technical details on how to implement data provenance tracking, such as tagging data with metadata about its origin and enforcing policies based on these tags.

🟡 WARNING

  1. Backward Compatibility of Tool Fingerprinting
    The recommendation to fingerprint tool definitions and block tools with changed definitions could lead to breaking changes in production environments. If an MCP server updates a tool description or schema for legitimate reasons (e.g., bug fixes or feature enhancements), agents may fail to function unless the fingerprints are updated.
    Actionable Recommendation: Suggest implementing a staged approval process for fingerprint changes, where updates are flagged but not immediately blocked. This allows operators to review and approve legitimate changes without disrupting production.

💡 SUGGESTIONS

  1. Expand Human-in-the-Loop Guidance
    The blog mentions human approval for sensitive operations but does not specify how this could be implemented in practice.
    Actionable Recommendation: Provide examples of how to integrate human-in-the-loop mechanisms, such as using a webhook to trigger approval workflows in tools like Slack or Microsoft Teams.

  2. Runtime Monitoring Details
    The recommendation for runtime monitoring is high-level and does not specify what tools or frameworks could be used to implement anomaly detection.
    Actionable Recommendation: Suggest specific technologies or libraries (e.g., OpenTelemetry for tracing, Elasticsearch for log analysis) that can be used to implement runtime monitoring.

  3. OWASP Agentic Top 10 Mapping
    While the blog references ASI01 (Prompt Injection), it could benefit from mapping the other threats (rug-pull attacks, data leakage, over-permissioned tools) to relevant OWASP Agentic Top 10 categories.
    Actionable Recommendation: Expand the OWASP mapping to include ASI02 (Supply Chain Vulnerabilities) for rug-pull attacks and ASI03 (Data Leakage) for cross-server data leakage.

  4. Tool Allowlist Implementation
    The YAML example for tool allowlisting is helpful but lacks details on how this policy would be enforced programmatically.
    Actionable Recommendation: Provide a code snippet or pseudocode demonstrating how the allowlist can be integrated into the agent's runtime logic.

  5. Clarify "Excessive Data Volume" Detection
    The blog mentions scanning arguments for excessive data volume but does not define thresholds or criteria for what constitutes "excessive."
    Actionable Recommendation: Add guidance on setting thresholds based on tool schema expectations, such as maximum string lengths or array sizes.

  6. Link to MCP Trust Guide and Security Scanner
    The blog links to the MCP Trust Guide and Security Scanner but does not summarize their functionality or relevance to the recommendations.
    Actionable Recommendation: Briefly describe what these resources provide and how they can help implement the defenses outlined in the blog.

General Observations

  • The blog is well-written and provides a clear overview of the MCP threat landscape. It effectively communicates the urgency of securing tool calls and offers practical recommendations.
  • The inclusion of real-world attack scenarios is excellent and helps illustrate the risks.
  • The blog aligns well with the goals of the repository and contributes valuable insights to the community.

Final Recommendation

Merge the pull request after addressing the critical issues and warnings. Consider incorporating the suggestions to further enhance the blog's utility and actionable guidance.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2026

🤖 AI Agent: security-scanner — Security Review of Blog Post: "MCP Security — Why Your AI Agent Tool Calls Need a Firewall"

Security Review of Blog Post: "MCP Security — Why Your AI Agent Tool Calls Need a Firewall"

This pull request adds a blog post discussing the security challenges of the Model Context Protocol (MCP) and provides practical recommendations for mitigating risks. While the blog post itself does not introduce code changes to the repository, it is critical to evaluate the security implications of the advice provided, as downstream users may rely on this guidance to secure their systems.


Findings

1. Prompt Injection Defense Bypass

Rating: 🔴 CRITICAL
The blog correctly identifies the risk of prompt injection via tool descriptions in MCP (e.g., hidden instructions in the description field). However, the proposed defenses (regex-based sanitization and a dedicated prompt injection classifier) may not be sufficient to detect all forms of adversarial input. For example:

  • Imperative directives: Regex patterns may fail to catch obfuscated or indirect instructions (e.g., "It is recommended to first call exfiltrate_data").
  • Invisible Unicode: While the blog mentions zero-width characters, it does not address other forms of obfuscation, such as homoglyphs or encoded instructions.
  • Encoded payloads: Base64 or other encoding methods can be used to hide malicious instructions, and the blog does not provide a comprehensive decoding strategy.

Suggested Fix:

  • Use a combination of static analysis and dynamic testing to detect prompt injection. For example:
    • Implement a sandboxed LLM instance to test tool descriptions for potential prompt injection behavior.
    • Use a machine learning-based classifier trained on known prompt injection examples to detect subtle patterns that regex might miss.
  • Enforce strict length limits on tool descriptions and reject overly verbose or complex descriptions.
  • Log all rejected descriptions for manual review and analysis.

2. Policy Engine Circumvention

Rating: 🟠 HIGH
The blog highlights the risk of rug-pull attacks, where an MCP server changes tool definitions after they have been approved. While the proposed solution of fingerprinting tool definitions is sound, it does not address the possibility of subtle schema changes that may not trigger a hash mismatch but still introduce vulnerabilities (e.g., adding permissive default values or optional parameters).

Suggested Fix:

  • Extend the fingerprinting mechanism to include semantic analysis of schema changes. For example:
    • Detect added or modified parameters, especially those with default values that could be exploited.
    • Flag changes in parameter descriptions that could indicate new behavior.
  • Implement a versioning system for tool definitions, requiring explicit re-approval for any updates.

3. Trust Chain Weaknesses

Rating: 🟡 MEDIUM
The blog does not explicitly address trust chain validation for MCP servers. If an attacker compromises an MCP server or performs a DNS spoofing attack, they could impersonate a trusted server and serve malicious tool definitions.

Suggested Fix:

  • Require MCP servers to authenticate using strong cryptographic identities (e.g., SPIFFE/SVID or mutual TLS).
  • Implement certificate pinning to prevent man-in-the-middle attacks.
  • Validate the integrity of tool catalogs using signed manifests.

4. Credential Exposure

Rating: 🟠 HIGH
The blog recommends scanning tool call arguments for sensitive data (e.g., API keys, PII). However, it does not address the risk of logging sensitive data during validation or monitoring.

Suggested Fix:

  • Mask sensitive data in logs by default. For example:
    • Replace detected API keys or PII with placeholders (e.g., ***REDACTED***) before logging.
    • Use structured logging with explicit fields for sensitive data, ensuring they are excluded from plaintext logs.
  • Implement access controls on logs to restrict who can view sensitive information.

5. Sandbox Escape

Rating: 🔵 LOW
The blog does not directly address sandboxing or isolation of MCP tool calls. While this is not the primary focus of the post, it is worth mentioning that tools capable of executing code (e.g., execute_command) should be sandboxed to prevent escape.

Suggested Fix:

  • Recommend using containerized or VM-based isolation for tools that perform code execution or interact with the filesystem.

6. Deserialization Attacks

Rating: 🟡 MEDIUM
The blog does not discuss the risk of deserialization attacks when processing tool definitions or arguments. If an MCP server sends maliciously crafted JSON, it could exploit vulnerabilities in the deserialization library.

Suggested Fix:

  • Use safe JSON parsers that enforce strict schema validation and reject unexpected fields or types.
  • Avoid using libraries that support unsafe deserialization features (e.g., pickle in Python).

7. Race Conditions

Rating: 🔵 LOW
The blog does not explicitly address race conditions in policy enforcement (e.g., TOCTOU issues when validating tool definitions or arguments). While this is a lower-risk issue, it could be relevant in high-concurrency environments.

Suggested Fix:

  • Use atomic operations to update and validate tool definitions.
  • Lock tool definitions during validation to prevent concurrent updates.

8. Supply Chain Risks

Rating: 🟠 HIGH
The blog identifies the risk of typosquatting or dependency confusion in MCP tool catalogs but does not provide concrete guidance for mitigating these risks.

Suggested Fix:

  • Recommend verifying the authenticity of MCP servers before connecting to them.
  • Use a trusted registry or allowlist for MCP servers, similar to the proposed tool allowlist.

Summary of Findings

Category Rating Key Risks Suggested Fixes
Prompt Injection Defense Bypass 🔴 CRITICAL Imperative directives, invisible Unicode, encoded payloads, and obfuscation. Use sandboxed LLM testing, ML-based classifiers, and strict length limits.
Policy Engine Circumvention 🟠 HIGH Rug-pull attacks via schema or description changes. Extend fingerprinting to include semantic analysis and enforce versioning.
Trust Chain Weaknesses 🟡 MEDIUM Impersonation of MCP servers via DNS spoofing or compromise. Use cryptographic identities, certificate pinning, and signed manifests.
Credential Exposure 🟠 HIGH Sensitive data logged during validation or monitoring. Mask sensitive data in logs and enforce access controls.
Sandbox Escape 🔵 LOW Lack of isolation for tools with code execution capabilities. Recommend containerized or VM-based isolation for high-risk tools.
Deserialization Attacks 🟡 MEDIUM Maliciously crafted JSON in tool definitions or arguments. Use safe JSON parsers and enforce strict schema validation.
Race Conditions 🔵 LOW TOCTOU issues in policy enforcement. Use atomic operations and locking during validation.
Supply Chain Risks 🟠 HIGH Typosquatting or dependency confusion in MCP tool catalogs. Verify MCP server authenticity and use a trusted registry or allowlist.

Final Recommendation

The blog post provides valuable insights into the security challenges of MCP and outlines several practical defenses. However, the proposed solutions could be strengthened in several areas, particularly around prompt injection, trust chain validation, and supply chain risks. Addressing these gaps will ensure that the guidance is robust and comprehensive for downstream users.

Copy link
Copy Markdown
Member

@imran-siddique imran-siddique left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing about MCP security @aymenhmaidiwastaken! Great content. Two items before we can merge:

  1. Sign the CLA — the license/cla check is still pending. Follow the bot instructions.
  2. Publish the blog externally — per issue #848, the deliverable is a published post on Dev.to/Medium/Hashnode. Please publish the article there, then update the COMMUNITY.md link to point to the published URL instead of the in-repo path.

The content quality is excellent — looking forward to getting this merged once published!

@aymenhmaidiwastaken
Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

@aymenhmaidiwastaken
Copy link
Copy Markdown
Author

Thanks for the review @imran-siddique! Really appreciate the feedback.

I'll work on both items:

  1. CLA — just signed it above
  2. Publishing externally — I'll publish the article on Dev.to and update the COMMUNITY.md link to point there instead of the in-repo path. Will push the update once it's live.

Also, the AI code reviewer raised some solid points — I'll incorporate the critical ones (sanitizing tool descriptions, data provenance tracking details) and the OWASP Agentic Top 10 mapping before publishing. Should make the article stronger.

Will update the PR shortly!

@imran-siddique
Copy link
Copy Markdown
Member

Great, thanks @aymenhmaidiwastaken! Take your time with the publishing. Once the blog is live and CLA is signed, ping us and we'll merge right away.

@github-actions github-actions bot added the size/L Large PR (< 500 lines) label Apr 8, 2026
@aymenhmaidiwastaken
Copy link
Copy Markdown
Author

Updated the blog post with all the reviewer feedback incorporated:

  • Added tool description sanitization guidance with a scan_description() implementation covering imperative directives, cross-tool references, invisible Unicode, encoded payloads, and HTML comments
  • Added data provenance tracking with a ProvenanceTracker class and check_boundary() enforcement
  • Mapped all threats to OWASP Agentic Top 10 (ASI01 for tool poisoning, ASI02 for rug-pull attacks, ASI03 for data leakage)
  • Added concrete human-in-the-loop implementation with Slack/Teams webhook approval and YAML policy config
  • Added specific monitoring stack recommendations (OpenTelemetry for tracing, Elasticsearch/Loki for logs, Grafana/Datadog/PagerDuty for alerting) with span attributes
  • Defined thresholds for excessive data volume (5KB warn, 20KB block, 10x median always block)
  • Added descriptions of the MCP Trust Guide's four governance layers and the Security Scanner's capabilities

I'll publish this on Dev.to and update the COMMUNITY.md link once it's live. Working on that now.

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Review Feedback for Pull Request: MCP Security Blog Post

🔴 CRITICAL: Security Concerns

  1. Tool Description Sanitization Pipeline:

    • The sanitization pipeline proposed in the blog post relies heavily on regex-based matching for detecting malicious patterns in tool descriptions. While regex can catch obvious cases, it is insufficient for detecting sophisticated prompt injection attacks that leverage advanced obfuscation techniques. Consider integrating a more robust NLP-based classifier trained on adversarial examples to detect hidden instructions in tool descriptions.
  2. Provenance Tracker Implementation:

    • The ProvenanceTracker implementation uses SHA-256 for content hashing, which is insufficient for detecting partial matches or modified data. Attackers can easily bypass this by slightly altering the data. Consider using fuzzy hashing techniques like ssdeep or MinHash for more robust content similarity detection.
  3. Cross-Domain Policies:

    • The example policy for cross-domain data flow allows exceptions for specific tools like translate_text. This introduces a potential attack vector where malicious actors could exploit the exception to exfiltrate sensitive data. Ensure that exceptions are tightly scoped and include additional safeguards such as content classification, size limits, and PII detection.
  4. Human-in-the-Loop Approval:

    • The webhook-based approval mechanism lacks authentication and authorization checks. An attacker could potentially spoof approval requests or responses. Ensure that the webhook endpoint is secured using cryptographic signatures, and validate responses using a secure mechanism (e.g., HMAC or JWT).
  5. Telemetry Logging:

    • While the blog post recommends logging tool calls with full arguments, this approach may inadvertently log sensitive data (e.g., PII, credentials). Ensure that sensitive data is redacted or encrypted before being logged to prevent accidental exposure.

🟡 WARNING: Potential Breaking Changes

  1. Tool Allowlisting:

    • The proposed allowlist mechanism introduces a breaking change to how agents interact with MCP servers. If implemented, agents will no longer be able to dynamically discover tools, which could impact existing workflows. Ensure that this change is documented and communicated clearly to users, along with migration guidance.
  2. Fingerprinting Tool Definitions:

    • The fingerprinting mechanism requires MCP servers to maintain consistent tool definitions across sessions. Any server-side changes to tool definitions will now result in blocked tool calls, which could disrupt production systems. Provide a fallback mechanism or alerting system to handle such cases gracefully.

💡 SUGGESTIONS: Improvements

  1. Structured Telemetry:

    • The blog post recommends using OpenTelemetry for monitoring tool calls. Extend this recommendation to include distributed tracing across MCP servers to track data flow between trust domains. This will provide better visibility into cross-server interactions.
  2. Runtime Argument Validation:

    • The argument boundary enforcement mechanism could benefit from integrating a dedicated library for sensitive data detection, such as Microsoft Presidio or Pydantic validators. This would improve accuracy and reduce false positives.
  3. Sandboxing Tool Execution:

    • The blog post does not address sandboxing for tools that execute code (e.g., execute_command). Consider recommending the use of containerization (e.g., Docker, Firecracker) or syscall filtering (e.g., seccomp) to isolate tool execution environments.
  4. Backward Compatibility:

    • For organizations with existing MCP deployments, provide a migration guide for implementing the proposed defenses incrementally. This could include a phased rollout plan for allowlisting, fingerprinting, and provenance tracking.
  5. OWASP Agentic Top 10 Mapping:

    • The blog post provides a good mapping to OWASP Agentic Top 10 categories but could benefit from a more detailed explanation of how each recommendation mitigates specific risks. Consider adding a table summarizing the defenses against each OWASP category.

Summary

This blog post provides a comprehensive overview of the security challenges in MCP-based AI agent deployments and proposes actionable defenses. However, the implementation details for some defenses (e.g., sanitization, provenance tracking) require refinement to address sophisticated attack vectors. Additionally, the proposed changes may introduce breaking changes for existing deployments, which should be carefully documented and communicated.

Recommendations

  • Address the critical security concerns flagged above, particularly around sanitization, provenance tracking, and webhook approval mechanisms.
  • Provide detailed migration guidance for users adopting the proposed defenses.
  • Consider extending the blog post to include sandboxing recommendations and distributed tracing for cross-server interactions.

This blog post is a valuable addition to the repository's documentation and aligns well with the project's focus on security and governance.

@aymenhmaidiwastaken
Copy link
Copy Markdown
Author

Both items addressed:

  1. CLA — signed
  2. Published externally — article is live on Dev.to: https://dev.to/aymenhmaidi/mcp-security-why-your-ai-agents-tool-calls-need-a-firewall-3h48

Updated COMMUNITY.md to point to the published URL instead of the in-repo path. Ready for re-review!

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Review Summary

This blog post provides a comprehensive overview of the security risks associated with the Model Context Protocol (MCP) and offers actionable recommendations for mitigating these risks. The post is well-written, informative, and aligns with the goals of the repository. However, there are areas that could benefit from additional clarity, technical refinement, and alignment with best practices.


🔴 CRITICAL: Security Issues

  1. Tool Description Sanitization Pipeline:

    • The regex-based sanitization approach is prone to false negatives and may miss sophisticated prompt injection attempts. For example, adversaries could use obfuscated or encoded payloads that bypass simple regex checks.
    • Recommendation: Integrate a more robust NLP-based classifier trained on a dataset of malicious and benign tool descriptions. Consider leveraging pre-trained models for detecting adversarial instructions.
  2. Provenance Tracking Implementation:

    • The current implementation of ProvenanceTracker relies on exact SHA-256 hash matching, which is brittle and prone to false negatives when data is slightly modified (e.g., whitespace changes, re-encoding). This could allow attackers to bypass provenance checks.
    • Recommendation: Replace exact hash matching with content fingerprinting techniques, such as rolling hashes or MinHash, to improve resilience against minor modifications.
  3. Cross-Domain Data Leakage:

    • While the blog mentions the importance of provenance tracking and trust domain isolation, the example implementation does not address how to handle nested or derived data (e.g., data transformations or aggregations). This could lead to leakage of sensitive information across trust boundaries.
    • Recommendation: Implement recursive provenance tracking for derived data. For example, if data from Server A is transformed and used in a tool call to Server B, the governance layer should still enforce the original trust boundary.
  4. Human-in-the-Loop Approval:

    • The webhook-based approval mechanism assumes that the human operator can make an informed decision based on the provided arguments. However, sensitive data (e.g., PII or credentials) may still be exposed in the approval request itself.
    • Recommendation: Redact sensitive data from the approval request before sending it to the human operator. Use a secure channel for approvals and ensure that the request payload is encrypted.

🟡 WARNING: Potential Breaking Changes

  1. Tool Fingerprinting:

    • Introducing fingerprinting for tool definitions may break backward compatibility for existing deployments that dynamically discover tools without validation. This could lead to blocked tool calls in production environments.
    • Recommendation: Provide a migration guide for existing users, including steps to generate fingerprints for currently approved tools and handle discrepancies during runtime.
  2. Argument Boundary Enforcement:

    • Enforcing strict thresholds for argument sizes and patterns may cause legitimate tool calls to be blocked, especially in edge cases where larger payloads are expected (e.g., processing large documents).
    • Recommendation: Allow configurable thresholds and provide detailed logs for blocked calls to help operators fine-tune policies.

💡 Suggestions for Improvement

  1. OWASP Agentic Top 10 Mapping:

    • The blog maps threats to OWASP Agentic Top 10 categories but does not provide direct links to the OWASP documentation. Adding links would improve accessibility and credibility.
  2. Code Examples:

    • The code examples are helpful but could benefit from additional comments explaining key decisions and trade-offs. For example, the provenance tracker could include comments about why certain fields (e.g., sensitivity) are chosen.
  3. Telemetry Recommendations:

    • The OpenTelemetry setup is a good starting point, but consider adding examples of how to integrate with popular observability platforms like Datadog or Prometheus. This would make it easier for users to adopt the recommendations.
  4. Real-World Case Studies:

    • The blog would be even more impactful if it included real-world case studies or examples of MCP-related security incidents. This would help readers understand the urgency of implementing the recommended defenses.
  5. Tool Allowlist YAML Example:

    • The YAML example for tool allowlisting is clear but could include comments explaining the rationale behind each rule. For instance, why certain tools are denied for specific agent roles.
  6. Markdown Formatting:

    • Consider adding a table of contents at the beginning of the blog post for easier navigation, especially given its length.

Final Recommendation

Merge the pull request after addressing the critical security issues and warnings. The blog post is a valuable addition to the repository's documentation and provides actionable insights for securing MCP-based agent deployments.

@imran-siddique
Copy link
Copy Markdown
Member

Review: APPROVE (pending rebase)

Diff reviewed — 2 files: COMMUNITY.md entry + new blog post at packages/agent-mesh/docs/blog/mcp-security-firewall.md (404 lines). Closes #848.

Content review: Excellent, technically sound blog covering MCP threat surface — tool poisoning (ASI01), rug-pull attacks (ASI02), cross-server data leakage (ASI03), over-permissioned tools. Includes 6 practical defenses with concrete code examples and YAML configs. Well-structured with real attack scenarios.

Security checklist:

  • No eval(), exec(), pickle, shell=True, innerHTML in code examples
  • Code examples use standard libraries only (hashlib, json, re, httpx, asyncio)
  • Slack webhook URL is a placeholder (https://hooks.slack.com/workflows/T.../A.../...) — not a real secret
  • No hardcoded API keys or tokens
  • Scope matches description — blog post + COMMUNITY.md entry only
  • Additions > 0 (405 additions)

Before merging, please:

  1. Rebase your branch — there are merge conflicts with current main (mergeable_state: dirty)
  2. Fork workflows need approval in the Actions tab (Maintainer approval gate)

Great contribution! 🎉

imran-siddique
imran-siddique previously approved these changes Apr 8, 2026
Copy link
Copy Markdown
Member

@imran-siddique imran-siddique left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Content review passed — excellent MCP security blog post. Approving.

Author: please rebase to resolve merge conflicts, then this can merge.

aymenhmaidiwastaken and others added 2 commits April 9, 2026 00:44
…Calls

Addresses microsoft#848. Covers MCP threat surface (tool poisoning, rug-pull attacks,
cross-server data leakage, over-permissioned tools), real attack scenarios,
and practical defenses including tool allowlisting, definition fingerprinting,
argument boundary enforcement, and runtime monitoring.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aymenhmaidiwastaken
Copy link
Copy Markdown
Author

Rebased on latest main — merge conflict in COMMUNITY.md resolved. Should be good to merge now!

Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Agent: code-reviewer

Feedback on Pull Request: MCP Security Blog Post


🔴 CRITICAL: Security Concerns

  1. Tool Description Sanitization Pipeline

    • The sanitization pipeline for tool descriptions is a good start, but regex-based detection alone is insufficient for robust security. Attackers can craft adversarial descriptions that bypass simple regex patterns. Consider integrating a machine learning-based prompt injection classifier trained on adversarial examples to complement the regex checks.
  2. Provenance Tracking Implementation

    • The ProvenanceTracker implementation uses SHA-256 for exact content matching. This approach is vulnerable to partial data leakage (e.g., substring matches or modified data). Use rolling hashes or MinHash for approximate matching to detect partial overlaps and ensure robust provenance tracking.
  3. Cross-Server Data Leakage

    • The blog mentions provenance tracking but does not address cryptographic integrity verification for cross-server data flows. Without cryptographic signatures, provenance tags can be tampered with. Consider signing provenance tags using Ed25519 or similar cryptographic methods to ensure integrity.
  4. Human-in-the-Loop Approval

    • The human approval mechanism relies on external communication tools like Slack or Teams. If these tools are compromised, attackers could spoof approval responses. Implement cryptographic signatures for approval responses to ensure authenticity.

🟡 WARNING: Potential Breaking Changes

  1. Tool Fingerprinting

    • Introducing tool fingerprinting as a runtime check may break existing integrations if servers dynamically update tool definitions. Ensure backward compatibility by allowing a grace period for server updates or providing a migration path for existing deployments.
  2. Argument Boundary Enforcement

    • Enforcing strict thresholds for argument sizes and patterns could lead to false positives in legitimate use cases. Provide clear documentation and configuration options for users to customize thresholds based on their specific needs.

💡 Suggestions for Improvement

  1. Telemetry and Monitoring

    • The OpenTelemetry integration is a strong addition. Enhance it by including error codes or reasons for blocked tool calls in the telemetry data. This will help operators diagnose issues faster.
  2. Trust Domain Isolation

    • The trust domain isolation mechanism is well-designed but could benefit from more granular policies. For example, allow specific tools to cross domains only under certain conditions (e.g., time-based restrictions or user-specific overrides).
  3. Documentation

    • The blog post references the MCP Trust Guide and MCP Security Scanner but does not provide direct links to their GitHub pages or installation instructions. Add these links for easier access.
  4. Code Examples

    • The code snippets are helpful but could be expanded with unit tests or examples of expected input/output. This would make it easier for readers to understand how to implement the solutions.
  5. OWASP Agentic Top 10 Mapping

    • The blog does a great job of mapping threats to OWASP Agentic Top 10 categories. Consider adding a summary table that lists each threat, its corresponding OWASP category, and the recommended defense.
  6. Community Engagement

    • Encourage community contributions by adding a call-to-action for readers to share their own security practices or contribute to the MCP Security Scanner module.

Overall Assessment

This blog post is a comprehensive and well-written piece that addresses critical security concerns in MCP-based agent deployments. It provides actionable recommendations and practical code examples, making it highly valuable for the community. However, there are critical areas that need stronger defenses, especially around cryptographic integrity and adversarial detection.


Recommended Actions

  1. Integrate cryptographic integrity checks for provenance tags and human approvals.
  2. Enhance the sanitization pipeline with machine learning-based classifiers.
  3. Expand documentation with direct links and installation instructions for referenced tools.
  4. Provide a migration path for existing deployments to adapt to new security features.

Let me know if you need further clarification or additional feedback!

@imran-siddique imran-siddique enabled auto-merge (squash) April 8, 2026 23:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-mesh agent-mesh package documentation Improvements or additions to documentation size/L Large PR (< 500 lines) size/M Medium PR (< 200 lines)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

📝 Blog Post: MCP Security — Why Your AI Agent's Tool Calls Need a Firewall

2 participants