pwnkit

Let autonomous AI agents hack you before attackers do.
Fully autonomous agentic pentesting framework.

Docs · Website · Blog · Benchmark · Triage

Fully autonomous agentic pentesting for web apps, AI/LLM apps, package ecosystems, and source code.

This README is the fast path. The detailed command reference, configuration, architecture notes, recipes, and benchmark breakdowns live in the docs site.

Quick Start

Standalone binary (zero deps)

curl -fsSL https://raw.githubusercontent.com/PwnKit-Labs/pwnkit/main/install.sh | bash

Downloads a self-contained pwnkit binary (~74 MB) for your platform from the latest GitHub Release: no Node, no Bun, no npm, no node_modules. It installs to ~/.pwnkit/bin/pwnkit. Set PWNKIT_INSTALL_DIR (for example, PWNKIT_INSTALL_DIR=/usr/local/bin) to change the location and PWNKIT_VERSION=vX.Y.Z to pin a version.
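Because the installer is piped into a shell, the environment variables have to reach that shell rather than curl. The sketch below shows the placement with a stand-in script so it can run offline; fake_installer is a hypothetical placeholder, not the real install.sh, and the real command is left as a comment.

```shell
# Stand-in for install.sh so this runs offline. It only echoes the directory
# it would use; the default mirrors the ~/.pwnkit/bin location stated above.
fake_installer='echo "install dir: ${PWNKIT_INSTALL_DIR:-$HOME/.pwnkit/bin}"'

# Put the assignment directly before the shell that runs the script,
# not before curl, so the script (not the download) sees it.
printf '%s\n' "$fake_installer" | PWNKIT_INSTALL_DIR=/usr/local/bin sh

# The same placement with the real installer (not executed here; vX.Y.Z is a placeholder):
# curl -fsSL https://raw.githubusercontent.com/PwnKit-Labs/pwnkit/main/install.sh \
#   | PWNKIT_INSTALL_DIR=/usr/local/bin PWNKIT_VERSION=vX.Y.Z bash
```

Exporting the variables beforehand (export PWNKIT_INSTALL_DIR=...) should work equally well, since the piped shell inherits the exported environment.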

Binaries ship for linux-x64, linux-arm64, darwin-arm64, and windows-x64. The interactive TUI (built with Bun) is compiled into the binary, so there is no extra install step. No prebuilt binary ships for Intel Macs; Intel Mac users should install Bun and compile from source.
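To check which of the listed targets applies to a machine before installing, something like the following can help. This is a minimal sketch: pwnkit_target is a hypothetical helper name, and the mapping from uname output to target names is an assumption, not a description of how install.sh detects the platform.

```shell
# Hypothetical helper: map `uname -s` / `uname -m` values to the release
# targets named above. Not taken from install.sh.
pwnkit_target() {
  os=$1    # e.g. output of `uname -s`
  arch=$2  # e.g. output of `uname -m`
  case "$os/$arch" in
    Linux/x86_64)                echo "linux-x64" ;;
    Linux/aarch64|Linux/arm64)   echo "linux-arm64" ;;
    Darwin/arm64)                echo "darwin-arm64" ;;
    Darwin/x86_64)               echo "build-from-source" ;;  # Intel Mac: no prebuilt binary
    MINGW*/x86_64|MSYS*/x86_64|CYGWIN*/x86_64) echo "windows-x64" ;;
    *)                           echo "unsupported" ;;
  esac
}

# Report the target for the current machine.
pwnkit_target "$(uname -s)" "$(uname -m)"
```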

Docker

docker run --rm -e OPENROUTER_API_KEY="$KEY" \
  ghcr.io/pwnkit-labs/pwnkit:latest scan --target https://example.com

If you use Azure OpenAI instead, also pass AZURE_OPENAI_BASE_URL and AZURE_OPENAI_MODEL. For the Responses API, the Azure base URL should include /openai/v1.
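A quick way to catch a base URL that lacks the /openai/v1 path before starting a container is a small shell check. This is only a sketch: azure_base_url_ok is a hypothetical helper, and the resource name in the URL is a placeholder.

```shell
# Hypothetical pre-flight check; the function name is illustrative only.
# Succeeds when the URL ends in /openai/v1 or continues below that path.
azure_base_url_ok() {
  case "$1" in
    */openai/v1|*/openai/v1/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Placeholder resource name, not a real endpoint.
AZURE_OPENAI_BASE_URL="https://my-resource.openai.azure.com/openai/v1"

if azure_base_url_ok "$AZURE_OPENAI_BASE_URL"; then
  echo "base URL includes /openai/v1"
else
  echo "base URL is missing /openai/v1" >&2
fi
```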

The image ships with Node 20, Playwright/Chromium, and the standard pentest toolbox (sqlmap, nmap, nikto, gobuster, ffuf, hydra, john, …) preinstalled.

Once installed

# Scan an AI / LLM endpoint
pwnkit scan --target https://example.com/api/chat

# Pentest a web app
pwnkit scan --target https://example.com --mode web

# White-box scan with source code access
pwnkit scan --target https://example.com --repo ./source

# Audit a package
pwnkit audit lodash

# Review source code
pwnkit review ./my-app

# Import and verify kernel crash reports
pwnkit ingest ./kernel-crashes --verify --output json

# Auto-detect — just give it a target
pwnkit https://example.com

The binary is named pwnkit when installed via install.sh and pwnkit-cli when installed via npm. Substitute whichever your install route used; everything else is identical. From v0.10.0 the npm package is a smart launcher that downloads the platform-specific binary on first run and caches it under ~/.pwnkit/cache/<version>/, so npx pwnkit-cli and npm i -g pwnkit-cli work without a separate install step.
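Scripts that need to run on machines with either install route can resolve the command name once. A minimal sketch, assuming the binary is on PATH (install.sh places it in ~/.pwnkit/bin, which may need adding); pwnkit_cmd is a hypothetical helper name.

```shell
# Hypothetical helper: pick whichever command name this machine has.
pwnkit_cmd() {
  if command -v pwnkit >/dev/null 2>&1; then
    echo "pwnkit"            # installed via install.sh
  elif command -v pwnkit-cli >/dev/null 2>&1; then
    echo "pwnkit-cli"        # installed via npm i -g pwnkit-cli
  else
    echo "npx pwnkit-cli"    # launcher downloads the binary on first run
  fi
}

# Show the command line that would be used; nothing is scanned here.
echo "would run: $(pwnkit_cmd) scan --target https://example.com"
```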

What It Does

scan targets AI / LLM apps, web apps, REST / OpenAPI APIs, and MCP servers.
audit installs and inspects packages across npm, pypi, cargo, and oci with ecosystem-specific prep, static analysis, and AI review.
review performs deep source-code security review on a local repo or Git URL.
ingest parses kernel crash reports and can validate them against reproducers, including a real QEMU kernel VM path that compiles and runs reproducers inside the guest when configured.
triage-data turns benchmark runs and verified findings into labeled JSONL for triage-model training.
cloud-sink can stream findings and final reports to an orchestrator with PWNKIT_CLOUD_SINK + PWNKIT_CLOUD_SCAN_ID.
dashboard, history, findings, and triage provide local persistence and review workflows.
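For the cloud sink, a wrapper script can confirm that both variables are present before launching a scan. This sketch assumes both are required together, which the list above implies but does not state; the value formats are not documented here, and cloud_sink_ready is a hypothetical helper, not part of pwnkit.

```shell
# Hypothetical pre-flight helper; not part of pwnkit itself.
# Succeeds only when both variables are set and non-empty.
cloud_sink_ready() {
  [ -n "${PWNKIT_CLOUD_SINK:-}" ] && [ -n "${PWNKIT_CLOUD_SCAN_ID:-}" ]
}

if cloud_sink_ready; then
  echo "cloud sink: both variables set"
else
  echo "cloud sink: not fully configured (PWNKIT_CLOUD_SINK and PWNKIT_CLOUD_SCAN_ID)"
fi
```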

Why It’s Different

Shell-first web pentesting. The agent uses bash, writes scripts, and chains tools like a human pentester instead of being trapped in a small HTTP-tool DSL.
Blind verification. Findings are independently re-exploited before they are reported.
Docs-backed benchmark transparency. The current benchmark details live in the docs and raw artifacts under packages/benchmark/results.

Docs

Snapshot

XBOW 99.0% aggregate (103/104) · 97.9% gpt-5.4 cohort (93/95) · $5.20/flag. Cybench 90.0% (36/40). AI/LLM regression 10/10.

The benchmark page is the canonical source. It separates the stable model-specific cohort from the rotation-volatile retained aggregate, lists the historical mixed publication line, and notes the remaining challenge-set mismatches.
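The percentages in the snapshot follow directly from the solved/total counts quoted next to them. A quick arithmetic check, using a hypothetical pct helper:

```shell
# Hypothetical helper: percentage of solved challenges, one decimal place.
pct() { awk -v s="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * s / t }'; }

pct 103 104   # XBOW aggregate          -> 99.0
pct 93 95     # XBOW gpt-5.4 cohort     -> 97.9
pct 36 40     # Cybench                 -> 90.0
```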

GitHub Action

- uses: PwnKit-Labs/pwnkit@main
  with:
    mode: review
    path: .
    format: sarif
  env:
    OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}

Development

git clone https://github.com/PwnKit-Labs/pwnkit.git
cd pwnkit
pnpm install
pnpm lint
pnpm test

See CONTRIBUTING.md.

Part of PwnKit Labs

Open-source adversarial security for the agentic AI era. pwnkit is one piece of the open-source PwnKit Labs stack:

pwnkit — AI agent pentester (detect)
foxguard — Rust security scanner (prevent)
opensoar — Python-native SOAR platform (respond)

License

Apache 2.0
