Security auditor for AI agent configurations
Scans Claude Code setups for hardcoded secrets, permission misconfigs,
hook injection, MCP server risks, and agent prompt injection vectors.
The AI agent ecosystem is growing faster than its security tooling. In January 2026 alone:
- 12% of a major agent skill marketplace was malicious (341 of 2,857 community skills)
- A CVSS 8.8 CVE exposed 17,500+ internet-facing instances to one-click RCE
- The Moltbook breach compromised 1.5M API tokens across 770,000 agents
Developers install community skills, connect MCP servers, and configure hooks without any automated way to audit the security of their setup. AgentShield scans your .claude/ directory and flags vulnerabilities before they become exploits.
Built at the Claude Code Hackathon (Cerebral Valley x Anthropic, Feb 2026). Part of the Everything Claude Code ecosystem (42K+ stars).
# Scan your Claude Code config (no install required)
npx ecc-agentshield scan
# Or install globally
npm install -g ecc-agentshield
agentshield scan
That's it. AgentShield auto-discovers your ~/.claude/ directory, scans all config files, and prints a graded security report.
AgentShield Security Report
Grade: F (0/100)
Score Breakdown
Secrets      ░░░░░░░░░░░░░░░░░░░░ 0
Permissions  ░░░░░░░░░░░░░░░░░░░░ 0
Hooks        ░░░░░░░░░░░░░░░░░░░░ 0
MCP Servers  ░░░░░░░░░░░░░░░░░░░░ 0
Agents       ░░░░░░░░░░░░░░░░░░░░ 0
CRITICAL Hardcoded Anthropic API key
CLAUDE.md:13
Evidence: sk-ant-a...cdef
Fix: Replace with environment variable reference [auto-fixable]
CRITICAL Overly permissive allow rule: Bash(*)
settings.json
Evidence: Bash(*)
Fix: Restrict to specific commands: Bash(git *), Bash(npm *), Bash(node *)
Summary
Files scanned: 6
Findings: 73 total (19 critical, 29 high, 15 medium, 4 low, 6 info)
Auto-fixable: 8 (use --fix)
# Scan a specific directory
agentshield scan --path /path/to/.claude
# Auto-fix safe issues (replaces hardcoded secrets with env var references)
agentshield scan --fix
# JSON output for CI pipelines
agentshield scan --format json
# Generate an HTML security report
agentshield scan --format html > report.html
# Three-agent Opus 4.6 adversarial analysis (requires ANTHROPIC_API_KEY)
agentshield scan --opus --stream
# Generate a secure baseline config
agentshield init
102 rules across 5 categories, graded A-F with a 0-100 numeric score.
| Secrets check | Examples |
|---|---|
| API keys | Anthropic (sk-ant-), OpenAI (sk-proj-), AWS (AKIA), Google (AIza), Stripe (sk_test_/sk_live_) |
| Tokens | GitHub PATs (ghp_/github_pat_), Slack (xox[bprs]-), JWTs (eyJ...), Bearer tokens |
| Credentials | Hardcoded passwords, database connection strings (postgres/mongo/mysql/redis), private key material |
| Env leaks | Secrets passed through environment variables in configs, echo $SECRET in hooks |
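As a rough illustration of prefix-based secret detection, the sketch below scans text line by line and masks whatever it finds. The regexes, the `findSecrets` and `maskSecret` names, and the masking format (first 8 and last 4 characters, modelled on the sample report's `sk-ant-a...cdef`) are assumptions for illustration, not AgentShield's actual rules.

```typescript
// Illustrative patterns only; the real rule set may use different expressions.
interface SecretPattern {
  name: string;
  regex: RegExp;
}

const SECRET_PATTERNS: SecretPattern[] = [
  { name: "Anthropic API key", regex: /sk-ant-[A-Za-z0-9_-]{8,}/ },
  { name: "AWS access key ID", regex: /AKIA[0-9A-Z]{16}/ },
  { name: "GitHub PAT", regex: /ghp_[A-Za-z0-9]{36}/ },
  { name: "Google API key", regex: /AIza[0-9A-Za-z_-]{35}/ },
];

interface SecretFinding {
  name: string;
  line: number; // 1-based line number
  evidence: string; // masked value, safe to print
}

// Keep the first 8 and last 4 characters so a report never echoes the full secret.
function maskSecret(value: string): string {
  return value.length <= 12 ? "***" : value.slice(0, 8) + "..." + value.slice(-4);
}

// Test every pattern against every line and collect masked findings.
function findSecrets(text: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  text.split("\n").forEach((lineText, index) => {
    for (const pattern of SECRET_PATTERNS) {
      const match = pattern.regex.exec(lineText);
      if (match) {
        findings.push({
          name: pattern.name,
          line: index + 1,
          evidence: maskSecret(match[0]),
        });
      }
    }
  });
  return findings;
}
```

Matching on vendor prefixes keeps false positives low, at the cost of missing secrets that carry no recognizable prefix.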
| Permissions check | Examples |
|---|---|
| Wildcard access | Bash(*), Write(*), Edit(*): unrestricted tool permissions |
| Missing deny lists | No deny rules for rm -rf, sudo, chmod 777 |
| Dangerous flags | --dangerously-skip-permissions usage |
| Mutable tool exposure | All mutable tools (Write, Edit, Bash) allowed without scoping |
| Destructive git | git push --force, git reset --hard in allowed commands |
| Unrestricted network | curl *, wget, ssh *, scp * in allow list without scope |
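A minimal sketch of how checks like these could run over a parsed settings.json, assuming the file keeps its rules under `permissions.allow` and `permissions.deny`. The `auditPermissions` name and the severities for the git and deny-list checks are hypothetical; only the critical rating for `Bash(*)` comes from the sample report above.

```typescript
interface PermissionSettings {
  permissions?: { allow?: string[]; deny?: string[] };
}

interface PermissionFinding {
  severity: "critical" | "high" | "medium";
  message: string;
}

// Tools that can change state; a bare wildcard on any of them is unrestricted access.
const MUTABLE_TOOLS = ["Bash", "Write", "Edit"];

function auditPermissions(settings: PermissionSettings): PermissionFinding[] {
  const allow = settings.permissions?.allow ?? [];
  const deny = settings.permissions?.deny ?? [];
  const findings: PermissionFinding[] = [];

  for (const rule of allow) {
    // Exact wildcard match such as Bash(*); scoped rules like Bash(git *) pass.
    if (MUTABLE_TOOLS.some((tool) => rule === tool + "(*)")) {
      findings.push({
        severity: "critical",
        message: "Overly permissive allow rule: " + rule,
      });
    }
    // Assumed severity: destructive git commands in the allow list.
    if (/git push --force|git reset --hard/.test(rule)) {
      findings.push({
        severity: "high",
        message: "Destructive git command allowed: " + rule,
      });
    }
  }

  // Assumed severity: no deny rules at all.
  if (deny.length === 0) {
    findings.push({ severity: "medium", message: "No deny rules configured" });
  }
  return findings;
}
```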
| Hooks check | Examples |
|---|---|
| Command injection | ${file} interpolation in shell commands: attacker-controlled filenames become code |
| Data exfiltration | curl -X POST with variable interpolation sending data to external URLs |
| Silent errors | 2>/dev/null, \|\| true (failing security hooks pass silently) |
| Missing hooks | No PreToolUse hooks, no Stop hooks for session-end validation |
| Network exposure | Unthrottled network requests in hooks, sensitive file access without filtering |
| Session startup | SessionStart hooks that download and execute remote scripts |
| Package installs | Global npm install -g, pip install, gem install, cargo install in hooks |
| Container escape | Docker --privileged, --pid=host, --network=host, root volume mounts |
| Credential access | macOS Keychain, GNOME Keyring, /etc/shadow reads |
| Reverse shells | /dev/tcp, mkfifo + nc, Python/Perl socket shells |
| Clipboard access | pbcopy, xclip, xsel, wl-copy (exfiltration via clipboard) |
| Log tampering | journalctl --vacuum, rm /var/log, history -c (anti-forensics) |
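Several of the hook checks above reduce to pattern matching on the hook's shell command. The sketch below covers four of them. The check ids, the regexes, and the `auditHookCommand` helper are illustrative assumptions rather than the shipped rules.

```typescript
interface HookCheck {
  id: string;
  description: string;
  regex: RegExp;
}

// A handful of the patterns from the table, expressed as regexes (assumed forms).
const HOOK_CHECKS: HookCheck[] = [
  {
    id: "command-injection",
    description: "${file} interpolated into a shell command",
    regex: /\$\{file\}/,
  },
  {
    id: "silent-errors",
    description: "errors discarded with 2>/dev/null or || true",
    regex: /2>\s*\/dev\/null|\|\|\s*true\b/,
  },
  {
    id: "reverse-shell",
    description: "/dev/tcp redirection",
    regex: /\/dev\/tcp\//,
  },
  {
    id: "log-tampering",
    description: "shell history or journal wiped",
    regex: /history\s+-c|journalctl\s+--vacuum/,
  },
];

// Return the ids of every check the command trips, in table order.
function auditHookCommand(command: string): string[] {
  return HOOK_CHECKS.filter((check) => check.regex.test(command)).map(
    (check) => check.id,
  );
}
```

A formatter hook such as `prettier --write "${file}" 2>/dev/null || true` would trip both the command-injection and silent-errors checks in this sketch.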
| MCP server check | Examples |
|---|---|
| High-risk servers | Shell/command MCPs, filesystem with root access, database MCPs, browser automation |
| Supply chain | npx -y auto-install without confirmation (typosquatting vector) |
| Hardcoded secrets | API tokens in MCP environment config instead of env var references |
| Remote transport | MCP servers connecting to remote URLs (SSE/streamable HTTP) |
| Shell metacharacters | &&, \|, ; in MCP server command arguments |
| Missing metadata | No version pin, no description, excessive server count |
| Sensitive file args | .env, .pem, credentials.json passed as server arguments |
| Network exposure | Binding to 0.0.0.0 instead of localhost |
| Auto-approve | autoApprove settings that skip user confirmation for tool calls |
| Missing timeouts | High-risk servers without a timeout (resource exhaustion risk) |
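The following sketch applies a few of these checks to a single MCP server entry, assuming the common `command` / `args` / `env` shape. The issue ids, the `auditMcpServer` name, and the heuristic for spotting secret-looking environment keys are assumptions made for this example.

```typescript
interface McpServerConfig {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

// Return an id for each issue found in one server entry (ids are hypothetical).
function auditMcpServer(server: McpServerConfig): string[] {
  const args = server.args ?? [];
  const env = server.env ?? {};
  const issues: string[] = [];

  // npx -y installs whatever package name is given without prompting.
  if (server.command === "npx" && args.indexOf("-y") !== -1) {
    issues.push("npx-auto-install");
  }
  // Shell metacharacters in arguments suggest command chaining.
  if (args.some((arg) => /&&|[|;]/.test(arg))) {
    issues.push("shell-metacharacters");
  }
  // Binding to all interfaces exposes the server beyond localhost.
  if (args.some((arg) => arg.indexOf("0.0.0.0") !== -1)) {
    issues.push("binds-all-interfaces");
  }
  // Sensitive files handed to the server as arguments.
  if (args.some((arg) => /(\.env|\.pem|credentials\.json)$/.test(arg))) {
    issues.push("sensitive-file-arg");
  }
  // Secret-looking keys should hold a ${VAR} reference, not a literal value.
  for (const key of Object.keys(env)) {
    const value = env[key] ?? "";
    if (/TOKEN|KEY|SECRET|PASSWORD/i.test(key) && value.indexOf("${") !== 0) {
      issues.push("hardcoded-secret:" + key);
    }
  }
  return issues;
}
```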
| Agents check | Examples |
|---|---|
| Unrestricted tools | Agents with Bash access, no allowedTools restriction |
| Prompt injection surface | Agents processing external/user-provided content without defenses |
| Auto-run instructions | CLAUDE.md containing "Always run", "without asking", "automatically install" |
| Hidden instructions | Unicode zero-width characters, HTML comments, base64-encoded directives |
| URL execution | CLAUDE.md instructing agents to fetch and execute remote URLs |
| Time bombs | Delayed execution instructions triggered by time or absence conditions |
| Data harvesting | Bulk collection of passwords, credentials, or database dumps |
| Prompt reflection | ignore previous instructions, you are now, DAN jailbreak, fake system prompts |
| Output manipulation | always report ok, remove warnings from output, suppress security findings |
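To make the hidden-instruction and auto-run checks concrete, here is a small line-based scanner for a CLAUDE.md-style file. The finding kinds, the regexes, and the `scanInstructions` name are hypothetical. The sketch only detects HTML comments that open on the scanned line, so a real scanner would also need to follow multi-line comments and decode base64.

```typescript
interface InstructionFinding {
  kind: string;
  line: number; // 1-based line number
}

// Zero-width and byte-order-mark characters that can hide text from reviewers.
const ZERO_WIDTH = /[\u200B\u200C\u200D\u2060\uFEFF]/;
const HTML_COMMENT_OPEN = /<!--/;
const AUTO_RUN = /always run|without asking|automatically install/i;
const PROMPT_REFLECTION = /ignore (all )?previous instructions|you are now/i;

// Checks run in a fixed order per line, so findings come out deterministically.
function scanInstructions(markdown: string): InstructionFinding[] {
  const findings: InstructionFinding[] = [];
  markdown.split("\n").forEach((text, index) => {
    const line = index + 1;
    if (ZERO_WIDTH.test(text)) findings.push({ kind: "hidden-unicode", line });
    if (HTML_COMMENT_OPEN.test(text)) findings.push({ kind: "html-comment", line });
    if (AUTO_RUN.test(text)) findings.push({ kind: "auto-run", line });
    if (PROMPT_REFLECTION.test(text)) findings.push({ kind: "prompt-reflection", line });
  });
  return findings;
}
```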
The --fix flag automatically applies safe fixes:
- Replaces hardcoded secrets with ${ENV_VAR} references
- Tightens wildcard permissions (Bash(*) → scoped Bash(git *), Bash(npm *))
Only fixes marked auto: true are applied. Permission changes require human review.
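The gating on `auto: true` can be sketched as follows. Apart from the `auto` flag itself, the `Fix` shape (`description`, `before`, `after`) and the `applyAutoFixes` helper are hypothetical names chosen for this example, not the fixer's actual types.

```typescript
interface Fix {
  description: string;
  before: string; // exact text to replace
  after: string; // replacement text
  auto: boolean; // only auto fixes are applied without review
}

interface FixResult {
  content: string;
  applied: string[]; // descriptions of fixes that changed the content
  skipped: string[]; // descriptions of fixes left for human review
}

function applyAutoFixes(content: string, fixes: Fix[]): FixResult {
  const applied: string[] = [];
  const skipped: string[] = [];
  let current = content;

  for (const fix of fixes) {
    if (!fix.auto) {
      skipped.push(fix.description);
      continue;
    }
    if (fix.before.length > 0 && current.indexOf(fix.before) !== -1) {
      // split/join replaces every occurrence without regex escaping concerns.
      current = current.split(fix.before).join(fix.after);
      applied.push(fix.description);
    }
  }
  return { content: current, applied, skipped };
}
```

For a hardcoded key, `before` would hold the literal secret and `after` a reference such as `${ANTHROPIC_API_KEY}`, while a permission change carries `auto: false` and lands in `skipped`.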
agentshield init generates a hardened .claude/ directory with scoped permissions, safety hooks, and security best practices. Existing files are never overwritten.
Three-agent adversarial pipeline powered by Claude Opus 4.6:
- Red Team (Attacker): finds exploitable attack vectors and multi-step chains
- Blue Team (Defender): evaluates existing protections and recommends hardening
- Auditor: synthesizes both perspectives into a prioritized risk assessment
For example, the Attacker finds that curl hooks using ${file} interpolation, combined with Bash(*), form a command injection pivot. The Defender notes that no PreToolUse hooks exist to stop it. The Auditor chains both findings into a prioritized action list.
agentshield scan --opus # Red + Blue run in parallel
agentshield scan --opus --stream # Sequential with real-time output
agentshield scan --opus --stream -v # Verbose: see full agent reasoning
┌──────────────────────────────────────────────────────┐
│  Phase 1a: ATTACKER (Red Team)                       │
└──────────────────────────────────────────────────────┘
✓ Attacker analysis complete (4521 tokens)
┌──────────────────────────────────────────────────────┐
│  Phase 1b: DEFENDER (Blue Team)                      │
└──────────────────────────────────────────────────────┘
✓ Defender analysis complete (3892 tokens)
┌──────────────────────────────────────────────────────┐
│  Phase 2: AUDITOR                                    │
└──────────────────────────────────────────────────────┘
Risk Level: CRITICAL
Opus Score: ███░░░░░░░░░░░░░░░░░ 15/100
Requires ANTHROPIC_API_KEY environment variable.
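The control flow of the three-agent pipeline (Red and Blue in parallel by default, sequential when streaming, Auditor last) can be sketched with plain promises. The `PipelineAgents` interface and `runPipeline` function are hypothetical. In a real run each agent would call the model, which is left out here so the sketch stays self-contained.

```typescript
type Analyst = (config: string) => Promise<string>;

interface PipelineAgents {
  attacker: Analyst;
  defender: Analyst;
  // The auditor sees the config plus both earlier analyses.
  auditor: (config: string, red: string, blue: string) => Promise<string>;
}

async function runPipeline(
  config: string,
  agents: PipelineAgents,
  sequential: boolean = false,
): Promise<string> {
  let red: string;
  let blue: string;

  if (sequential) {
    // Streaming mode: one agent at a time so output can be shown in order.
    red = await agents.attacker(config);
    blue = await agents.defender(config);
  } else {
    // Default mode: attacker and defender do not depend on each other.
    [red, blue] = await Promise.all([
      agents.attacker(config),
      agents.defender(config),
    ]);
  }

  // Phase 2 always runs after both Phase 1 analyses are complete.
  return agents.auditor(config, red, blue);
}
```

Passing the agents in as functions keeps the orchestration testable with stubs that return canned strings.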
| Format | Flag | Use Case |
|---|---|---|
| Terminal | --format terminal (default) | Interactive use |
| JSON | --format json | CI pipelines, programmatic access |
| Markdown | --format markdown | Documentation, PRs |
| HTML | --format html | Self-contained shareable report (dark theme, all CSS inlined) |
- name: AgentShield Security Scan
  uses: affaan-m/agentshield@v1
  with:
    path: "."
    min-severity: "medium"
    fail-on-findings: "true"
Inputs:
| Input | Default | Description |
|---|---|---|
| path | . | Path to scan |
| min-severity | medium | Minimum severity: critical, high, medium, low, info |
| fail-on-findings | true | Fail the action if findings meet severity threshold |
| format | terminal | Output format |
Outputs: score (0-100), grade (A-F), total-findings, critical-count
The action writes a markdown job summary and emits GitHub annotations inline on affected files.
agentshield scan [options] Scan configuration directory
-p, --path <path> Path to scan (default: ~/.claude or cwd)
-f, --format <format> Output: terminal, json, markdown, html
--fix Auto-apply safe fixes
--opus Enable Opus 4.6 multi-agent analysis
--stream Stream Opus analysis in real-time
--min-severity <severity> Filter: critical, high, medium, low, info
-v, --verbose Show detailed output
agentshield init Generate secure baseline config
agentshield miniclaw start [opts] Launch MiniClaw secure agent server
-p, --port <port> Port (default: 3847)
-H, --hostname <host> Hostname (default: localhost)
--network <policy> Network: none, localhost, allowlist
--rate-limit <n> Max req/min per IP (default: 10)
--sandbox-root <path> Root path for sandboxes
--max-duration <ms> Max session duration (default: 300000)
| Category | Rules | Patterns | Severity Range |
|---|---|---|---|
| Secrets | 10 | 14 | Critical -- Medium |
| Permissions | 10 | -- | Critical -- Medium |
| Hooks | 34 | -- | Critical -- Low |
| MCP Servers | 23 | -- | Critical -- Info |
| Agents | 25 | -- | Critical -- Info |
| Total | 102 | 14 | |
src/
├── index.ts              CLI entry point (commander)
├── action.ts             GitHub Action entry point
├── types.ts              Type system + Zod schemas
├── scanner/
│   ├── discovery.ts      Config file discovery
│   └── index.ts          Scan orchestrator
├── rules/
│   ├── index.ts          Rule registry
│   ├── secrets.ts        Secret detection (10 rules, 14 patterns)
│   ├── permissions.ts    Permission audit (10 rules)
│   ├── mcp.ts            MCP server security (23 rules)
│   ├── hooks.ts          Hook analysis (34 rules)
│   └── agents.ts         Agent config review (25 rules)
├── reporter/
│   ├── score.ts          Scoring engine (A-F grades)
│   ├── terminal.ts       Color terminal output
│   ├── json.ts           JSON + Markdown output
│   └── html.ts           Self-contained HTML report
├── fixer/
│   ├── transforms.ts     Fix transforms (secret, permission, generic)
│   └── index.ts          Fix engine orchestrator
├── init/
│   └── index.ts          Secure config generator
├── opus/
│   ├── prompts.ts        Attacker/Defender/Auditor system prompts
│   ├── pipeline.ts       Three-agent Opus 4.6 pipeline
│   └── render.ts         Opus analysis rendering
└── miniclaw/
    ├── types.ts          Core type system (immutable, readonly)
    ├── sandbox.ts        Sandbox lifecycle + path validation
    ├── router.ts         Prompt sanitization + output filtering
    ├── tools.ts          Whitelist-based tool authorization
    ├── server.ts         HTTP server with rate limiting + CORS
    ├── dashboard.tsx     React dashboard component
    └── index.ts          Entry point and re-exports
MiniClaw is a minimal, sandboxed AI agent runtime bundled with AgentShield. Where typical agent platforms expose many attack surfaces (Telegram, Discord, email, community plugins), MiniClaw presents a single HTTP endpoint backed by an isolated sandbox.
# Start with secure defaults (localhost:3847, no network, safe tools only)
npx ecc-agentshield miniclaw start
# Custom configuration
npx ecc-agentshield miniclaw start --port 4000 --network localhost --rate-limit 20
Or use as a library:
import { startMiniClaw } from 'ecc-agentshield/miniclaw';
const { server, stop } = startMiniClaw();
// Listening on http://localhost:3847
Four independently enforced layers:
Request → [Rate Limit] → [CORS] → [Size Cap] → [Sanitize Prompt]
                                                       ↓
                                               [Tool Whitelist]
                                                       ↓
                                                 [Sandbox FS]
                                                       ↓
                                                [Filter Output] → Response
- Server: rate limiting (10 req/min/IP), CORS, 10KB request cap, localhost-only binding
- Prompt Router: strips 12+ injection pattern categories (system prompt overrides, identity reassignment, jailbreaks, data exfiltration URLs, zero-width Unicode, base64 payloads)
- Tool Whitelist: three tiers, Safe (read/search/list), Guarded (write/edit), and Restricted (bash/network, disabled by default)
- Sandbox: isolated filesystem per session, path traversal blocked, symlink escape detection, extension whitelist, 10MB file cap, 5-min timeout, no network by default
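As an illustration of the first layer, a per-IP limit of 10 requests per minute can be enforced with a fixed-window counter. This is a minimal sketch of one common approach. The `FixedWindowRateLimiter` class is hypothetical, and MiniClaw's server may use a different algorithm, such as a sliding window or token bucket.

```typescript
interface WindowState {
  windowStart: number; // timestamp (ms) when the current window opened
  count: number; // requests seen in the current window
}

class FixedWindowRateLimiter {
  private readonly hits = new Map<string, WindowState>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  // Defaults mirror the documented 10 requests per minute per IP.
  constructor(maxRequests: number = 10, windowMs: number = 60000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed; `now` is injectable for testing.
  allow(ip: string, now: number = Date.now()): boolean {
    const state = this.hits.get(ip);
    if (!state || now - state.windowStart >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.hits.set(ip, { windowStart: now, count: 1 });
      return true;
    }
    if (state.count >= this.maxRequests) {
      return false;
    }
    state.count += 1;
    return true;
  }
}
```

A fixed window is simple and cheap, but it lets a client send up to twice the limit across a window boundary, which a sliding window would prevent.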
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/prompt | Send a prompt |
| POST | /api/session | Create a sandboxed session |
| GET | /api/session | Session info |
| DELETE | /api/session/:id | Destroy session + cleanup |
| GET | /api/events/:sessionId | Security audit events |
| GET | /api/health | Health check |
MiniClaw has zero external runtime dependencies: it uses Node.js built-ins only (http, fs, path, crypto). The optional React dashboard requires React 18+ as a peer dependency.
npm install # Install dependencies
npm run dev # Development mode
npm test # Run tests (912 tests)
npm run test:coverage # Coverage report
npm run typecheck # Type check
npm run build # Build
npm run scan:demo # Demo scan against vulnerable examples
AgentShield is available through multiple channels:
| Channel | Use Case | Install |
|---|---|---|
| Standalone CLI | Direct scanning from your terminal | npm install -g ecc-agentshield or npx ecc-agentshield scan |
| GitHub Action | Automated security checks on PRs in CI/CD | uses: affaan-m/agentshield@v1 |
| ECC Plugin | Claude Code users via the ECC skill ecosystem | Install through Everything Claude Code |
| ECC Tools GitHub App | Integrated scanning across your GitHub org | Install at github.com/apps/ecc-tools |
License: MIT
Built by @affaanmustafa · Part of Everything Claude Code