blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall #899
aymenhmaidiwastaken wants to merge 2 commits into microsoft:main from
Conversation
Welcome to the Agent Governance Toolkit! Thanks for your first pull request.
🤖 AI Agent: contributor-guide

Hi @aymenhmaidiwastaken! 👋 Welcome to the Agent Governance Toolkit community, and thank you for contributing your time and expertise! 🎉 Your blog post draft is incredibly thoughtful and well-researched — it's clear you've put a lot of effort into breaking down complex security concepts into actionable advice. Let's dive into the review!

🌟 What You Did Well
🛠 Suggestions for Improvement

Here are a few areas where we can refine your contribution to align with project conventions and ensure maximum impact:

1. File Placement
2. Linting
3. Commit Message
4. Security-Sensitive Content
5. Cross-Referencing Internal Resources
🔗 Helpful Resources

Here are some resources to help you refine your contribution:
✅ Next Steps
Once you've made these updates, we'll review your PR again and work towards merging it. Thank you for helping us make the Agent Governance Toolkit even better! 🚀 Looking forward to your updates! 😊
🤖 AI Agent: code-reviewer
Feedback on Pull Request: blog: MCP Security — Why Your AI Agent Tool Calls Need a Firewall
🔴 CRITICAL
- **Tool Description Injection Vulnerability**
The blog correctly highlights the risk of tool poisoning via description injection but does not explicitly recommend sanitizing tool descriptions before they are consumed by the agent. This is a critical omission because malicious descriptions can bypass LLM safeguards.
Actionable Recommendation: Add explicit guidance to sanitize tool descriptions for hidden instructions or malicious payloads before they are presented to the agent. This could include stripping non-visible characters, detecting prompt injection patterns, and validating descriptions against a whitelist of allowed patterns.
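To make the sanitization step concrete, here is a minimal stdlib-Python sketch of the kind of filter described above. The `sanitize_description` name, the pattern list, and the rejection behavior are illustrative assumptions, not part of any MCP SDK:

```python
import re
import unicodedata

# Patterns that commonly signal prompt injection in tool descriptions.
# Illustrative only -- a production filter needs a broader, maintained list.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"do not (tell|inform|mention)", re.IGNORECASE),
    re.compile(r"<\s*(system|instructions?)\s*>", re.IGNORECASE),
]

def sanitize_description(description: str) -> str:
    """Strip non-visible characters and reject suspicious tool descriptions."""
    # Remove zero-width and other format-category characters used to hide text.
    cleaned = "".join(
        ch for ch in description if unicodedata.category(ch) != "Cf"
    )
    for pattern in INJECTION_PATTERNS:
        if pattern.search(cleaned):
            raise ValueError(f"possible injection: {pattern.pattern!r}")
    return cleaned
```

A production filter would pair this with schema validation and an allowlist of expected description shapes, since regex alone cannot catch every obfuscated payload.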
- **Cross-Server Data Leakage**
While the blog mentions the risk of cross-server data leakage, it does not provide concrete implementation details for tracking data provenance across tool calls. Without this, the recommendation for isolating MCP server trust domains lacks actionable guidance.
Actionable Recommendation: Include technical details on how to implement data provenance tracking, such as tagging data with metadata about its origin and enforcing policies based on these tags.
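One way to make this actionable: tag every value with its originating server and check a flow policy before each tool call. The sketch below is a hedged illustration; the `Tagged` dataclass, the domain names, and the `ALLOWED_FLOWS` table are all hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tagged:
    """A value annotated with metadata about which MCP server produced it."""
    value: str
    origin: str       # e.g. "internal-crm" -- hypothetical server name
    sensitivity: str  # e.g. "pii", "public"

# Hypothetical policy: which origin domains may flow into which servers.
ALLOWED_FLOWS = {
    ("internal-crm", "internal-reporting"),
    ("public-web", "internal-reporting"),
}

def check_flow(data: Tagged, destination: str) -> None:
    """Enforce trust-domain policy before passing tagged data to a tool call."""
    if (data.origin, destination) not in ALLOWED_FLOWS:
        raise PermissionError(
            f"blocked: {data.sensitivity} data from {data.origin!r} "
            f"may not flow to {destination!r}"
        )
```

The governance layer would call `check_flow` for every argument of every outbound tool call, not just at ingestion time.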
🟡 WARNING
- **Backward Compatibility of Tool Fingerprinting**
The recommendation to fingerprint tool definitions and block tools with changed definitions could lead to breaking changes in production environments. If an MCP server updates a tool description or schema for legitimate reasons (e.g., bug fixes or feature enhancements), agents may fail to function unless the fingerprints are updated.
Actionable Recommendation: Suggest implementing a staged approval process for fingerprint changes, where updates are flagged but not immediately blocked. This allows operators to review and approve legitimate changes without disrupting production.
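A staged approach could look roughly like the following, assuming a hypothetical in-memory fingerprint store (`PINNED`); changed definitions are flagged for review rather than blocked outright:

```python
import hashlib
import json

def fingerprint(tool_def: dict) -> str:
    """Stable hash of a tool definition (name, description, schema)."""
    canonical = json.dumps(tool_def, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# Fingerprints recorded at approval time. A real deployment would persist
# these and gate updates behind an operator review step.
PINNED = {}

def check_tool(tool_def: dict) -> str:
    """Return 'ok', 'flagged' (definition changed, pending review), or 'new'."""
    name = tool_def["name"]
    fp = fingerprint(tool_def)
    if name not in PINNED:
        PINNED[name] = fp
        return "new"
    if PINNED[name] != fp:
        # Staged approval: surface the change instead of hard-blocking.
        return "flagged"
    return "ok"
```

Whether a "flagged" tool is still callable during the review window is a policy decision each operator would make.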
💡 SUGGESTIONS
- **Expand Human-in-the-Loop Guidance**
The blog mentions human approval for sensitive operations but does not specify how this could be implemented in practice.
Actionable Recommendation: Provide examples of how to integrate human-in-the-loop mechanisms, such as using a webhook to trigger approval workflows in tools like Slack or Microsoft Teams.
- **Runtime Monitoring Details**
The recommendation for runtime monitoring is high-level and does not specify what tools or frameworks could be used to implement anomaly detection.
Actionable Recommendation: Suggest specific technologies or libraries (e.g., OpenTelemetry for tracing, Elasticsearch for log analysis) that can be used to implement runtime monitoring.
- **OWASP Agentic Top 10 Mapping**
While the blog references ASI01 (Prompt Injection), it could benefit from mapping the other threats (rug-pull attacks, data leakage, over-permissioned tools) to relevant OWASP Agentic Top 10 categories.
Actionable Recommendation: Expand the OWASP mapping to include ASI02 (Supply Chain Vulnerabilities) for rug-pull attacks and ASI03 (Data Leakage) for cross-server data leakage.
- **Tool Allowlist Implementation**
The YAML example for tool allowlisting is helpful but lacks details on how this policy would be enforced programmatically.
Actionable Recommendation: Provide a code snippet or pseudocode demonstrating how the allowlist can be integrated into the agent's runtime logic.
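A rough enforcement sketch, under the assumption that the agent routes every tool invocation through a single dispatch function; the role names and the `ALLOWLIST` table are invented for illustration:

```python
# Hypothetical allowlist mirroring the blog's YAML policy, keyed by agent role.
ALLOWLIST = {
    "support-agent": {"search_tickets", "get_ticket", "reply_to_ticket"},
    "reporting-agent": {"run_report"},
}

def dispatch(role: str, tool_name: str, call, *args, **kwargs):
    """Gate every tool call through the allowlist before execution."""
    allowed = ALLOWLIST.get(role, set())
    if tool_name not in allowed:
        raise PermissionError(f"{role!r} may not call {tool_name!r}")
    return call(*args, **kwargs)
```

The key design point is default-deny: a role absent from the table gets an empty set, so newly discovered tools are unusable until explicitly approved.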
- **Clarify "Excessive Data Volume" Detection**
The blog mentions scanning arguments for excessive data volume but does not define thresholds or criteria for what constitutes "excessive."
Actionable Recommendation: Add guidance on setting thresholds based on tool schema expectations, such as maximum string lengths or array sizes.
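For example, thresholds might be derived per tool from its declared schema and checked before dispatch. The limits below are placeholder numbers, not recommended values:

```python
# Hypothetical per-tool thresholds derived from each tool's declared schema.
LIMITS = {"send_email": {"max_str": 10_000, "max_items": 50}}
DEFAULT = {"max_str": 100_000, "max_items": 1_000}

def check_args(tool_name: str, args: dict) -> None:
    """Reject tool-call arguments that exceed the tool's expected data volume."""
    limits = LIMITS.get(tool_name, DEFAULT)
    for key, value in args.items():
        if isinstance(value, str) and len(value) > limits["max_str"]:
            raise ValueError(f"{key}: string exceeds {limits['max_str']} chars")
        if isinstance(value, list) and len(value) > limits["max_items"]:
            raise ValueError(f"{key}: list exceeds {limits['max_items']} items")
```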
- **Link to MCP Trust Guide and Security Scanner**
The blog links to the MCP Trust Guide and Security Scanner but does not summarize their functionality or relevance to the recommendations.
Actionable Recommendation: Briefly describe what these resources provide and how they can help implement the defenses outlined in the blog.
General Observations
- The blog is well-written and provides a clear overview of the MCP threat landscape. It effectively communicates the urgency of securing tool calls and offers practical recommendations.
- The inclusion of real-world attack scenarios is excellent and helps illustrate the risks.
- The blog aligns well with the goals of the repository and contributes valuable insights to the community.
Final Recommendation
Merge the pull request after addressing the critical issues and warnings. Consider incorporating the suggestions to further enhance the blog's utility and actionable guidance.
🤖 AI Agent: security-scanner

Security Review of Blog Post: "MCP Security — Why Your AI Agent Tool Calls Need a Firewall"

This pull request adds a blog post discussing the security challenges of the Model Context Protocol (MCP) and provides practical recommendations for mitigating risks. While the blog post itself does not introduce code changes to the repository, it is critical to evaluate the security implications of the advice provided, as downstream users may rely on this guidance to secure their systems.

Findings

1. Prompt Injection Defense Bypass
Rating: 🔴 CRITICAL
Suggested Fix:
2. Policy Engine Circumvention
Rating: 🟠 HIGH
Suggested Fix:

3. Trust Chain Weaknesses
Rating: 🟡 MEDIUM
Suggested Fix:

4. Credential Exposure
Rating: 🟠 HIGH
Suggested Fix:

5. Sandbox Escape
Rating: 🔵 LOW
Suggested Fix:

6. Deserialization Attacks
Rating: 🟡 MEDIUM
Suggested Fix:

7. Race Conditions
Rating: 🔵 LOW
Suggested Fix:

8. Supply Chain Risks
Rating: 🟠 HIGH
Suggested Fix:
Summary of Findings
Final Recommendation

The blog post provides valuable insights into the security challenges of MCP and outlines several practical defenses. However, the proposed solutions could be strengthened in several areas, particularly around prompt injection, trust chain validation, and supply chain risks. Addressing these gaps will ensure that the guidance is robust and comprehensive for downstream users.
imran-siddique
left a comment
Thanks for writing about MCP security @aymenhmaidiwastaken! Great content. Two items before we can merge:
- Sign the CLA — the license/cla check is still pending. Follow the bot instructions.
- Publish the blog externally — per issue #848, the deliverable is a published post on Dev.to/Medium/Hashnode. Please publish the article there, then update the COMMUNITY.md link to point to the published URL instead of the in-repo path.
The content quality is excellent — looking forward to getting this merged once published!
@microsoft-github-policy-service agree
Thanks for the review @imran-siddique! Really appreciate the feedback. I'll work on both items:
Also, the AI code reviewer raised some solid points — I'll incorporate the critical ones (sanitizing tool descriptions, data provenance tracking details) and the OWASP Agentic Top 10 mapping before publishing. Should make the article stronger. Will update the PR shortly!
Great, thanks @aymenhmaidiwastaken! Take your time with the publishing. Once the blog is live and CLA is signed, ping us and we'll merge right away.
Updated the blog post with all the reviewer feedback incorporated:
I'll publish this on Dev.to and update the COMMUNITY.md link once it's live. Working on that now.
🤖 AI Agent: code-reviewer
Review Feedback for Pull Request: MCP Security Blog Post
🔴 CRITICAL: Security Concerns
- **Tool Description Sanitization Pipeline:**
- The sanitization pipeline proposed in the blog post relies heavily on regex-based matching for detecting malicious patterns in tool descriptions. While regex can catch obvious cases, it is insufficient for detecting sophisticated prompt injection attacks that leverage advanced obfuscation techniques. Consider integrating a more robust NLP-based classifier trained on adversarial examples to detect hidden instructions in tool descriptions.
- **Provenance Tracker Implementation:**
- The `ProvenanceTracker` implementation uses SHA-256 for content hashing, which is insufficient for detecting partial matches or modified data. Attackers can easily bypass this by slightly altering the data. Consider using fuzzy hashing techniques like ssdeep or MinHash for more robust content similarity detection.
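As a self-contained stand-in for ssdeep or MinHash (both external libraries), character n-gram Jaccard similarity illustrates the property the reviewer is asking for: small edits keep the score high, whereas an exact SHA-256 comparison fails on any one-byte change. This is a teaching sketch, not a production fuzzy hash:

```python
def ngrams(text: str, n: int = 4) -> set:
    """All character n-grams of a string (the whole string if shorter than n)."""
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity over character n-grams: survives small edits
    that would change an exact content hash completely."""
    ga, gb = ngrams(a), ngrams(b)
    return len(ga & gb) / len(ga | gb) if ga | gb else 1.0
```

A provenance checker could then treat any tool-call argument scoring above a chosen threshold against tagged sensitive content as a match, rather than requiring byte-exact equality.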
- **Cross-Domain Policies:**
- The example policy for cross-domain data flow allows exceptions for specific tools like `translate_text`. This introduces a potential attack vector where malicious actors could exploit the exception to exfiltrate sensitive data. Ensure that exceptions are tightly scoped and include additional safeguards such as content classification, size limits, and PII detection.
- **Human-in-the-Loop Approval:**
- The webhook-based approval mechanism lacks authentication and authorization checks. An attacker could potentially spoof approval requests or responses. Ensure that the webhook endpoint is secured using cryptographic signatures, and validate responses using a secure mechanism (e.g., HMAC or JWT).
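A minimal version of the suggested HMAC protection, using only the standard library; the shared secret and the payload shape are assumptions for illustration:

```python
import hashlib
import hmac
import json

SHARED_SECRET = b"rotate-me"  # hypothetical; keep in a secrets manager in practice

def sign_approval(payload: dict) -> str:
    """HMAC-SHA256 signature over a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(SHARED_SECRET, body, hashlib.sha256).hexdigest()

def verify_approval(payload: dict, signature: str) -> bool:
    """Constant-time check that an approval response was not spoofed or altered."""
    return hmac.compare_digest(sign_approval(payload), signature)
```

The same scheme (or JWT, as the reviewer notes) would be applied on both legs: the outgoing approval request and the incoming operator response.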
- **Telemetry Logging:**
- While the blog post recommends logging tool calls with full arguments, this approach may inadvertently log sensitive data (e.g., PII, credentials). Ensure that sensitive data is redacted or encrypted before being logged to prevent accidental exposure.
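A sketch of pre-log redaction with two illustrative rules; the patterns are deliberately simplistic, and a real deployment would use a dedicated PII detector as suggested elsewhere in this review:

```python
import re

# Illustrative redaction rules: email addresses and key/token assignments.
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<email>"),
    (re.compile(r"(?i)(api[_-]?key|token)\s*[:=]\s*\S+"), r"\1=<redacted>"),
]

def redact(text: str) -> str:
    """Scrub likely PII and credentials from tool-call arguments before logging."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text
```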
🟡 WARNING: Potential Breaking Changes
- **Tool Allowlisting:**
- The proposed allowlist mechanism introduces a breaking change to how agents interact with MCP servers. If implemented, agents will no longer be able to dynamically discover tools, which could impact existing workflows. Ensure that this change is documented and communicated clearly to users, along with migration guidance.
- **Fingerprinting Tool Definitions:**
- The fingerprinting mechanism requires MCP servers to maintain consistent tool definitions across sessions. Any server-side changes to tool definitions will now result in blocked tool calls, which could disrupt production systems. Provide a fallback mechanism or alerting system to handle such cases gracefully.
💡 SUGGESTIONS: Improvements
- **Structured Telemetry:**
- The blog post recommends using OpenTelemetry for monitoring tool calls. Extend this recommendation to include distributed tracing across MCP servers to track data flow between trust domains. This will provide better visibility into cross-server interactions.
- **Runtime Argument Validation:**
- The argument boundary enforcement mechanism could benefit from integrating a dedicated library for sensitive data detection, such as Microsoft Presidio or Pydantic validators. This would improve accuracy and reduce false positives.
- **Sandboxing Tool Execution:**
- The blog post does not address sandboxing for tools that execute code (e.g., `execute_command`). Consider recommending the use of containerization (e.g., Docker, Firecracker) or syscall filtering (e.g., seccomp) to isolate tool execution environments.
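A minimal process-level sketch of the idea: run the tool's code in a fresh interpreter with an empty environment and a hard timeout. As the comment in the code stresses, this alone is not a real security boundary; containerization or seccomp is still required:

```python
import subprocess
import sys

def run_untrusted(code: str, timeout: float = 5.0) -> str:
    """Minimal isolation sketch: fresh interpreter, empty environment, hard
    timeout. Real deployments should add containerization (Docker/Firecracker)
    or syscall filtering (seccomp) -- this alone is NOT a security boundary."""
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, no user site
        capture_output=True, text=True, timeout=timeout, env={},
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout
```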
- **Backward Compatibility:**
- For organizations with existing MCP deployments, provide a migration guide for implementing the proposed defenses incrementally. This could include a phased rollout plan for allowlisting, fingerprinting, and provenance tracking.
- **OWASP Agentic Top 10 Mapping:**
- The blog post provides a good mapping to OWASP Agentic Top 10 categories but could benefit from a more detailed explanation of how each recommendation mitigates specific risks. Consider adding a table summarizing the defenses against each OWASP category.
Summary
This blog post provides a comprehensive overview of the security challenges in MCP-based AI agent deployments and proposes actionable defenses. However, the implementation details for some defenses (e.g., sanitization, provenance tracking) require refinement to address sophisticated attack vectors. Additionally, the proposed changes may introduce breaking changes for existing deployments, which should be carefully documented and communicated.
Recommendations
- Address the critical security concerns flagged above, particularly around sanitization, provenance tracking, and webhook approval mechanisms.
- Provide detailed migration guidance for users adopting the proposed defenses.
- Consider extending the blog post to include sandboxing recommendations and distributed tracing for cross-server interactions.
This blog post is a valuable addition to the repository's documentation and aligns well with the project's focus on security and governance.
Both items addressed:
Updated
🤖 AI Agent: code-reviewer
Review Summary
This blog post provides a comprehensive overview of the security risks associated with the Model Context Protocol (MCP) and offers actionable recommendations for mitigating these risks. The post is well-written, informative, and aligns with the goals of the repository. However, there are areas that could benefit from additional clarity, technical refinement, and alignment with best practices.
🔴 CRITICAL: Security Issues
- **Tool Description Sanitization Pipeline:**
- The regex-based sanitization approach is prone to false negatives and may miss sophisticated prompt injection attempts. For example, adversaries could use obfuscated or encoded payloads that bypass simple regex checks.
- Recommendation: Integrate a more robust NLP-based classifier trained on a dataset of malicious and benign tool descriptions. Consider leveraging pre-trained models for detecting adversarial instructions.
- **Provenance Tracking Implementation:**
- The current implementation of `ProvenanceTracker` relies on exact SHA-256 hash matching, which is brittle and prone to false negatives when data is slightly modified (e.g., whitespace changes, re-encoding). This could allow attackers to bypass provenance checks.
- Recommendation: Replace exact hash matching with content fingerprinting techniques, such as rolling hashes or MinHash, to improve resilience against minor modifications.
- **Cross-Domain Data Leakage:**
- While the blog mentions the importance of provenance tracking and trust domain isolation, the example implementation does not address how to handle nested or derived data (e.g., data transformations or aggregations). This could lead to leakage of sensitive information across trust boundaries.
- Recommendation: Implement recursive provenance tracking for derived data. For example, if data from Server A is transformed and used in a tool call to Server B, the governance layer should still enforce the original trust boundary.
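The recursive idea can be sketched as follows: derived values carry the union of their parents' origin domains, so a transformation cannot launder a trust boundary away. The class and function names, and the policy-set shape, are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Provenance:
    """A value plus every trust domain that contributed to it."""
    value: str
    origins: frozenset

def derive(new_value: str, *parents: Provenance) -> Provenance:
    """Derived data inherits the union of its parents' origins."""
    merged = frozenset().union(*(p.origins for p in parents))
    return Provenance(new_value, merged)

def assert_may_send(data: Provenance, destination: str, allowed: set) -> None:
    """Every contributing origin must be allowed to flow to the destination."""
    for origin in data.origins:
        if (origin, destination) not in allowed:
            raise PermissionError(f"{origin!r} data may not reach {destination!r}")
```

With this shape, aggregating Server A data into a summary and then calling a Server B tool still trips the original A-to-B policy check.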
- **Human-in-the-Loop Approval:**
- The webhook-based approval mechanism assumes that the human operator can make an informed decision based on the provided arguments. However, sensitive data (e.g., PII or credentials) may still be exposed in the approval request itself.
- Recommendation: Redact sensitive data from the approval request before sending it to the human operator. Use a secure channel for approvals and ensure that the request payload is encrypted.
🟡 WARNING: Potential Breaking Changes
- **Tool Fingerprinting:**
- Introducing fingerprinting for tool definitions may break backward compatibility for existing deployments that dynamically discover tools without validation. This could lead to blocked tool calls in production environments.
- Recommendation: Provide a migration guide for existing users, including steps to generate fingerprints for currently approved tools and handle discrepancies during runtime.
- **Argument Boundary Enforcement:**
- Enforcing strict thresholds for argument sizes and patterns may cause legitimate tool calls to be blocked, especially in edge cases where larger payloads are expected (e.g., processing large documents).
- Recommendation: Allow configurable thresholds and provide detailed logs for blocked calls to help operators fine-tune policies.
💡 Suggestions for Improvement
- **OWASP Agentic Top 10 Mapping:**
- The blog maps threats to OWASP Agentic Top 10 categories but does not provide direct links to the OWASP documentation. Adding links would improve accessibility and credibility.
- **Code Examples:**
- The code examples are helpful but could benefit from additional comments explaining key decisions and trade-offs. For example, the provenance tracker could include comments about why certain fields (e.g., `sensitivity`) are chosen.
- **Telemetry Recommendations:**
- The OpenTelemetry setup is a good starting point, but consider adding examples of how to integrate with popular observability platforms like Datadog or Prometheus. This would make it easier for users to adopt the recommendations.
- **Real-World Case Studies:**
- The blog would be even more impactful if it included real-world case studies or examples of MCP-related security incidents. This would help readers understand the urgency of implementing the recommended defenses.
- **Tool Allowlist YAML Example:**
- The YAML example for tool allowlisting is clear but could include comments explaining the rationale behind each rule. For instance, why certain tools are denied for specific agent roles.
- **Markdown Formatting:**
- Consider adding a table of contents at the beginning of the blog post for easier navigation, especially given its length.
Final Recommendation
Merge the pull request after addressing the critical security issues and warnings. The blog post is a valuable addition to the repository's documentation and provides actionable insights for securing MCP-based agent deployments.
Review: APPROVE (pending rebase) ✅

Diff reviewed — 2 files: COMMUNITY.md entry + new blog post at packages/agent-mesh/docs/blog/mcp-security-firewall.md (404 lines). Closes #848.

Content review: Excellent, technically sound blog covering MCP threat surface — tool poisoning (ASI01), rug-pull attacks (ASI02), cross-server data leakage (ASI03), over-permissioned tools. Includes 6 practical defenses with concrete code examples and YAML configs. Well-structured with real attack scenarios.

Security checklist:
Before merging, please:
Great contribution! 🎉
imran-siddique
left a comment
Content review passed — excellent MCP security blog post. Approving.
Author: please rebase to resolve merge conflicts, then this can merge.
…Calls

Addresses microsoft#848. Covers MCP threat surface (tool poisoning, rug-pull attacks, cross-server data leakage, over-permissioned tools), real attack scenarios, and practical defenses including tool allowlisting, definition fingerprinting, argument boundary enforcement, and runtime monitoring.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed 7862abd to 45d97f4
Rebased on latest main — merge conflict in COMMUNITY.md resolved. Should be good to merge now!
🤖 AI Agent: code-reviewer
Feedback on Pull Request: MCP Security Blog Post
🔴 CRITICAL: Security Concerns
- **Tool Description Sanitization Pipeline**
- The sanitization pipeline for tool descriptions is a good start, but regex-based detection alone is insufficient for robust security. Attackers can craft adversarial descriptions that bypass simple regex patterns. Consider integrating a machine learning-based prompt injection classifier trained on adversarial examples to complement the regex checks.
- **Provenance Tracking Implementation**
- The `ProvenanceTracker` implementation uses SHA-256 for exact content matching. This approach is vulnerable to partial data leakage (e.g., substring matches or modified data). Use rolling hashes or MinHash for approximate matching to detect partial overlaps and ensure robust provenance tracking.
- **Cross-Server Data Leakage**
- The blog mentions provenance tracking but does not address cryptographic integrity verification for cross-server data flows. Without cryptographic signatures, provenance tags can be tampered with. Consider signing provenance tags using Ed25519 or similar cryptographic methods to ensure integrity.
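The reviewer suggests Ed25519 signatures; as a dependency-free illustration of the same tamper-evidence goal, the sketch below signs provenance tags with stdlib HMAC instead. Note the trade-off: HMAC is symmetric, so unlike Ed25519 every verifier must share the key. The tag layout and key handling are assumptions:

```python
import hashlib
import hmac
import json

TAG_KEY = b"provenance-signing-key"  # hypothetical; Ed25519 would avoid sharing this

def make_tag(origin: str, content: str) -> dict:
    """Build a provenance tag binding an origin domain to a content hash."""
    tag = {"origin": origin, "sha256": hashlib.sha256(content.encode()).hexdigest()}
    body = json.dumps(tag, sort_keys=True).encode()
    tag["sig"] = hmac.new(TAG_KEY, body, hashlib.sha256).hexdigest()
    return tag

def verify_tag(tag: dict) -> bool:
    """Reject provenance tags whose origin or content hash was tampered with."""
    unsigned = {k: v for k, v in tag.items() if k != "sig"}
    body = json.dumps(unsigned, sort_keys=True).encode()
    expected = hmac.new(TAG_KEY, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag.get("sig", ""))
```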
- **Human-in-the-Loop Approval**
- The human approval mechanism relies on external communication tools like Slack or Teams. If these tools are compromised, attackers could spoof approval responses. Implement cryptographic signatures for approval responses to ensure authenticity.
🟡 WARNING: Potential Breaking Changes
- **Tool Fingerprinting**
- Introducing tool fingerprinting as a runtime check may break existing integrations if servers dynamically update tool definitions. Ensure backward compatibility by allowing a grace period for server updates or providing a migration path for existing deployments.
- **Argument Boundary Enforcement**
- Enforcing strict thresholds for argument sizes and patterns could lead to false positives in legitimate use cases. Provide clear documentation and configuration options for users to customize thresholds based on their specific needs.
💡 Suggestions for Improvement
- **Telemetry and Monitoring**
- The OpenTelemetry integration is a strong addition. Enhance it by including error codes or reasons for blocked tool calls in the telemetry data. This will help operators diagnose issues faster.
- **Trust Domain Isolation**
- The trust domain isolation mechanism is well-designed but could benefit from more granular policies. For example, allow specific tools to cross domains only under certain conditions (e.g., time-based restrictions or user-specific overrides).
- **Documentation**
- The blog post references the MCP Trust Guide and MCP Security Scanner but does not provide direct links to their GitHub pages or installation instructions. Add these links for easier access.
- **Code Examples**
- The code snippets are helpful but could be expanded with unit tests or examples of expected input/output. This would make it easier for readers to understand how to implement the solutions.
- **OWASP Agentic Top 10 Mapping**
- The blog does a great job of mapping threats to OWASP Agentic Top 10 categories. Consider adding a summary table that lists each threat, its corresponding OWASP category, and the recommended defense.
- **Community Engagement**
- Encourage community contributions by adding a call-to-action for readers to share their own security practices or contribute to the MCP Security Scanner module.
Overall Assessment
This blog post is a comprehensive and well-written piece that addresses critical security concerns in MCP-based agent deployments. It provides actionable recommendations and practical code examples, making it highly valuable for the community. However, there are critical areas that need stronger defenses, especially around cryptographic integrity and adversarial detection.
Recommended Actions
- Integrate cryptographic integrity checks for provenance tags and human approvals.
- Enhance the sanitization pipeline with machine learning-based classifiers.
- Expand documentation with direct links and installation instructions for referenced tools.
- Provide a migration path for existing deployments to adapt to new security features.
Let me know if you need further clarification or additional feedback!
Closes #848
Drafted the MCP security blog post covering the threat landscape around AI agent tool calls — tool poisoning, rug-pull attacks, cross-server data leakage, and over-permissioned tools with concrete attack scenarios.
Includes six practical recommendations: tool allowlisting, definition fingerprinting, argument boundary enforcement, human-in-the-loop for sensitive ops, runtime monitoring, and trust domain isolation.
Happy to revise based on feedback!