Redwood-style AI Control as an LLM proxy for production agentic deployments.
# Install uv (if needed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone and start everything
git clone https://github.com/LuthienResearch/luthien-proxy
cd luthien-proxy
# Configure API keys
cp .env.example .env
# Edit .env and add your keys:
# OPENAI_API_KEY=sk-proj-...
# ANTHROPIC_API_KEY=sk-ant-...
# Start the stack
./scripts/quick_start.sh

Launch your AI assistant through the proxy using the built-in scripts:
Claude Code:
./scripts/launch_claude_code.sh

Codex:
./scripts/launch_codex.sh

These scripts automatically configure the proxy settings. All requests now flow through the policy enforcement layer!
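The launch scripts are the easiest path, but any OpenAI-compatible client can be pointed at the gateway directly. A minimal sketch, assuming the default `PROXY_API_KEY` from `.env.example` (the model name is illustrative):

```python
# Send a chat completion through the proxy instead of directly to OpenAI.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # the Luthien gateway
    api_key="sk-luthien-dev-key",         # PROXY_API_KEY, not an upstream key
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Hello through the proxy!"}],
)
print(response.choices[0].message.content)
```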
When you first visit any admin page (Activity Monitor, Policy Config, or Debug views), you'll be redirected to:
http://localhost:8000/login
Default credentials (development):
- Admin API Key: `admin-dev-key`
After logging in, your session persists across pages. Click "Sign Out" on any admin page to log out.
Change the default `ADMIN_API_KEY` in your `.env` file before exposing the proxy to a network.
Open the Activity Monitor in your browser to see requests in real-time:
http://localhost:8000/activity/monitor
Watch as requests flow through, see policy decisions, and inspect before/after diffs.
Use the Policy Configuration UI to change policies without restart:
http://localhost:8000/policy-config
- Browse available policies (NoOp, AllCaps, DebugLogging, etc.)
- Click to select and activate
- Test immediately - changes take effect instantly
Create a new policy by subclassing SimpleJudgePolicy:
# src/luthien_proxy/policies/my_custom_policy.py
from luthien_proxy.policies.simple_judge_policy import SimpleJudgePolicy
class MyCustomPolicy(SimpleJudgePolicy):
    """Block dangerous commands before they execute."""

    RULES = [
        "Never allow 'rm -rf' commands",
        "Block requests to delete production data",
        "Prevent executing untrusted code",
    ]

# That's it! SimpleJudgePolicy handles the LLM judge logic for you.
# It evaluates requests, responses, and tool calls against your rules.

Restart the gateway and your policy appears in the Policy Config UI automatically.
- Gateway (OpenAI/Anthropic-compatible) at http://localhost:8000
- PostgreSQL and Redis fully configured
- Local LLM (Ollama) at http://localhost:11434
- Real-time monitoring at http://localhost:8000/activity/monitor
- Policy management UI at http://localhost:8000/policy-config
The gateway provides:
- OpenAI Chat Completions API (`/v1/chat/completions`)
- Anthropic Messages API (`/v1/messages`)
- Integrated policy enforcement via control plane
- Support for streaming and non-streaming requests (see the streaming sketch below)
- Hot-reload policy switching (no restart needed)
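Streaming flows through the same policy pipeline. A sketch with the OpenAI SDK, under the same assumptions as the earlier client example:

```python
# Stream a completion through the proxy; policies can act on chunks in flight.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-luthien-dev-key")

stream = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": "Stream me a haiku."}],
    stream=True,
)
for chunk in stream:
    # Some chunks (e.g., the final usage chunk) carry no content delta.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```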
- Docker
- Python 3.13+
- uv
# After code changes, restart the gateway
docker compose restart gateway
# Run unit tests
uv run pytest tests/unit_tests
# Run integration tests
uv run pytest tests/integration_tests
# Run e2e tests (slow, use sparingly)
uv run pytest -m e2e
# Test the gateway
./scripts/test_gateway.sh
# Format and lint
./scripts/format_all.sh
# Full dev checks (format + lint + tests + type check)
./scripts/dev_checks.sh
# Type check only
uv run pyright

The gateway supports OpenTelemetry for distributed tracing and log correlation.
By default, the gateway runs without the observability stack. To enable it:
# Start observability stack (Tempo, Loki, Promtail, Grafana)
./scripts/observability.sh up -d
# The gateway will automatically detect and use the observability stack
# Access Grafana at http://localhost:3000
# Username: admin, Password: admin

The observability stack is completely optional and does not affect core functionality.
- Distributed tracing with OpenTelemetry → Grafana Tempo
- Structured logging with trace context (trace_id, span_id)
- Log-trace correlation in Grafana
- Real-time activity feed at `/activity/monitor`
- Pre-built dashboard for traces and logs
OpenTelemetry is enabled by default. To configure the endpoint in .env:
# OpenTelemetry endpoint (enabled by default)
OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:4317
# Optional: customize service metadata
SERVICE_NAME=luthien-proxy
SERVICE_VERSION=2.0.0
ENVIRONMENT=development
# To disable tracing, set:
# OTEL_ENABLED=false

- Usage guide: dev/observability.md
- Conventions: dev/context/otel-conventions.md
- Dashboard: Import `observability/grafana-dashboards/luthien-traces.json` in Grafana
When observability is enabled:
- Grafana at http://localhost:3000 (dashboards and visualization)
- Tempo at http://localhost:3200 (trace storage and query)
- Loki at http://localhost:3100 (log aggregation)
Copy .env.example to .env and configure your environment:
# Upstream LLM Provider API Keys
OPENAI_API_KEY=your_openai_api_key_here
ANTHROPIC_API_KEY=your_anthropic_api_key_here
# Gateway Authentication
PROXY_API_KEY=sk-luthien-dev-key # API key for clients to access the proxy
ADMIN_API_KEY=admin-dev-key # API key for admin/policy management UI

# Database
DATABASE_URL=postgresql://luthien:password@db:5432/luthien_control
# Redis (for real-time activity streaming)
REDIS_URL=redis://redis:6379
# Gateway
GATEWAY_HOST=localhost
GATEWAY_PORT=8000

# Policy loading strategy
# Options: "db", "file", "db-fallback-file" (recommended), "file-fallback-db"
POLICY_SOURCE=db-fallback-file
# Path to YAML policy file (when POLICY_SOURCE includes "file")
POLICY_CONFIG=/app/config/policy_config.yaml

# OpenTelemetry tracing
OTEL_ENABLED=true # Toggle tracing
OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:4317 # OTLP endpoint
# Service metadata for distributed tracing
SERVICE_NAME=luthien-proxy
SERVICE_VERSION=2.0.0
ENVIRONMENT=development
# Grafana for viewing traces
GRAFANA_URL=http://localhost:3000

# Configuration for judge-based policies (ToolCallJudgePolicy, SimpleJudgePolicy)
LLM_JUDGE_MODEL=openai/gpt-4 # Model for judge
LLM_JUDGE_API_BASE=http://localhost:11434/v1 # API base URL
LLM_JUDGE_API_KEY=your_judge_api_key # API key for judge

See `.env.example` for all available options and defaults.
The gateway loads policies from POLICY_CONFIG (defaults to config/policy_config.yaml).
Example policy configuration:
policy:
  class: "luthien_proxy.policies.tool_call_judge_v3:ToolCallJudgeV3Policy"
  config:
    model: "ollama/gemma2:2b"
    api_base: "http://local-llm:11434"
    api_key: "ollama"
    probability_threshold: 0.6
    temperature: 0.0
    max_tokens: 256
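The `class` field is a `module:ClassName` reference. The gateway's actual loader is internal, but conceptually the reference resolves to a policy class like this (an illustrative sketch, not the real loader code):

```python
# Resolve a "package.module:ClassName" policy reference (illustrative only).
import importlib


def resolve_policy(ref: str):
    module_path, class_name = ref.split(":", 1)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


policy_cls = resolve_policy(
    "luthien_proxy.policies.tool_call_judge_v3:ToolCallJudgeV3Policy"
)
```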
Available policies in `src/luthien_proxy/policies/`:

- `noop_policy.py` - Pass-through (no filtering)
- `all_caps_policy.py` - Simple transformation example
- `debug_logging_policy.py` - Logs requests/responses for debugging
- `tool_call_judge_policy.py` - AI-based tool call safety evaluation
- `simple_policy.py` - Base class for custom policies
- `simple_judge_policy.py` - Base class for LLM-based rule enforcement
- Lint/format: `uv run ruff check` and `uv run ruff format`. Core rules enabled (E/F/I/D). Line length is 120; long-line lint (E501) is ignored to avoid churn after formatting.
- Editor setup (VS Code): Install the Ruff extension. In this repo, VS Code uses Ruff for both formatting and import organization via `.vscode/settings.json`.
- Type checking: `uv run pyright` (configured in `[tool.pyright]` within `pyproject.toml`).
- Tests: `uv run pytest -q` with coverage for `src/luthien_proxy/**` configured in `[tool.pytest.ini_options]`.
- Config consolidation: Ruff, Pytest, and Pyright live in `pyproject.toml` to avoid extra files.
The gateway integrates everything into a single FastAPI application:

- Gateway (`src/luthien_proxy/`): Unified FastAPI + LiteLLM integration
  - OpenAI Chat Completions API compatibility
  - Anthropic Messages API compatibility
  - Event-driven policy system with streaming support
  - OpenTelemetry instrumentation for observability
- Orchestration (`src/luthien_proxy/orchestration/`): Request processing coordination
  - `PolicyOrchestrator` coordinates the streaming pipeline
  - Real-time event publishing for UI updates
  - Trace context propagation
- Policy System (`src/luthien_proxy/policies/`): Event-driven policy framework
  - `SimplePolicy` - Base class for simple request/response policies
  - `SimpleJudgePolicy` - Base class for LLM-based rule enforcement
  - Examples: NoOpPolicy, AllCapsPolicy, DebugLoggingPolicy, ToolCallJudgePolicy
- Policy Core (`src/luthien_proxy/policy_core/`): Policy protocol and contexts
  - Policy protocol definitions
  - Request/response contexts for policy processing
  - Chunk builders for streaming
- Streaming (`src/luthien_proxy/streaming/`): Streaming support
  - Policy executor for stream processing
  - Client formatters for OpenAI/Anthropic formats
- UI (`src/luthien_proxy/ui/`): Real-time monitoring and debugging
  - `/activity/monitor` - Live activity feed
  - `/activity/live` - WebSocket activity stream
  - Debug endpoints for inspection
Documentation:
- Start here: Development docs index - Guide to all documentation
- Request processing architecture: dev/REQUEST_PROCESSING_ARCHITECTURE.md - How requests flow through the system
- Live policy updates: dev/LIVE_POLICY_DEMO.md - Switching policies without restart in Claude Code
- Observability: dev/observability.md - Tracing and monitoring
- Viewing traces: dev/VIEWING_TRACES_GUIDE.md - Using Grafana/Tempo
- Context files: dev/context/ - Architectural patterns, decisions, and gotchas
Gateway (http://localhost:8000)
API Endpoints:
- `POST /v1/chat/completions` — OpenAI Chat Completions API (streaming and non-streaming)
- `POST /v1/messages` — Anthropic Messages API (streaming and non-streaming)
- `GET /health` — Health check
UI Endpoints:
- `GET /activity/monitor` — Real-time activity monitor (HTML)
- `GET /activity/live` — WebSocket activity stream (JSON; see the sketch below)
- `GET /debug` — Debug information viewer
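The activity stream can also be consumed programmatically. A sketch using the third-party `websockets` package; the exact event schema and any auth on this endpoint aren't documented here, so treat it as illustrative:

```python
# Tail the live activity stream and pretty-print each JSON event.
import asyncio
import json

import websockets  # pip install websockets


async def tail_activity():
    async with websockets.connect("ws://localhost:8000/activity/live") as ws:
        async for message in ws:
            event = json.loads(message)
            print(json.dumps(event, indent=2))


asyncio.run(tail_activity())
```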
Authentication:
All API requests require the `Authorization: Bearer <PROXY_API_KEY>` header.
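For example, hitting the Anthropic-compatible endpoint directly with `requests` (default `PROXY_API_KEY` assumed; the model name is illustrative):

```python
# POST to the Anthropic Messages endpoint through the gateway.
import requests

resp = requests.post(
    "http://localhost:8000/v1/messages",
    headers={
        "Authorization": "Bearer sk-luthien-dev-key",  # PROXY_API_KEY
        "Content-Type": "application/json",
    },
    json={
        "model": "claude-3-5-sonnet-latest",  # illustrative model name
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello through the proxy!"}],
    },
)
resp.raise_for_status()
print(resp.json())
```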
Admin endpoints manage policies at runtime without requiring a restart. All admin requests require the `Authorization: Bearer <ADMIN_API_KEY>` header.
Get current policy:
curl http://localhost:8000/admin/policy/current \
-H "Authorization: Bearer admin-dev-key"Create a named policy instance:
curl -X POST http://localhost:8000/admin/policy/create \
-H "Content-Type: application/json" \
-H "Authorization: Bearer admin-dev-key" \
-d '{
  "name": "my-policy",
  "policy_class_ref": "luthien_proxy.policies.tool_call_judge_policy:ToolCallJudgePolicy",
  "config": {
    "model": "openai/gpt-4o-mini",
    "probability_threshold": 0.99,
    "temperature": 0.0,
    "max_tokens": 256
  }
}'

Activate a policy:
curl -X POST http://localhost:8000/admin/policy/activate \
-H "Content-Type: application/json" \
-H "Authorization: Bearer admin-dev-key" \
-d '{"name": "my-policy"}'List available policy classes:
curl http://localhost:8000/admin/policy/list \
-H "Authorization: Bearer admin-dev-key"List saved policy instances:
curl http://localhost:8000/admin/policy/instances \
-H "Authorization: Bearer admin-dev-key"The gateway uses an event-driven policy architecture with streaming support.
The gateway uses an event-driven policy architecture with streaming support.

- `src/luthien_proxy/policies/base_policy.py` - Abstract policy interface
- `src/luthien_proxy/policies/simple_policy.py` - Base class for custom policies
- `src/luthien_proxy/policies/simple_judge_policy.py` - Base class for LLM-based rule enforcement
- `src/luthien_proxy/orchestration/policy_orchestrator.py` - Policy orchestration
- `src/luthien_proxy/gateway_routes.py` - API endpoint handlers with policy integration
- `config/policy_config.yaml` - Policy configuration
Subclass `SimplePolicy` for basic request/response transformations, or `SimpleJudgePolicy` for LLM-based rule enforcement. See `src/luthien_proxy/policies/` for examples.
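As a rough illustration of the shape such a subclass takes — the hook name `process_response` below is hypothetical, not the documented interface, so check `src/luthien_proxy/policies/simple_policy.py` for the real method names:

```python
# Hypothetical sketch of a SimplePolicy subclass; the method name is
# illustrative, not the actual SimplePolicy interface.
from luthien_proxy.policies.simple_policy import SimplePolicy


class ShoutingPolicy(SimplePolicy):
    """AllCaps-style transformation: uppercase assistant output."""

    def process_response(self, text: str) -> str:  # hypothetical hook
        return text.upper()
```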
# Start the gateway
./scripts/quick_start.sh
# Run automated tests
./scripts/test_gateway.sh
# View logs
docker compose logs -f gateway

# Check service status
docker compose ps
# View gateway logs
docker compose logs gateway
# Restart gateway
docker compose restart gateway
# Full restart
docker compose down && ./scripts/quick_start.sh

- Check API key: Ensure the `Authorization: Bearer <PROXY_API_KEY>` header is set
- Check upstream credentials: Verify `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` in `.env`
- Check logs: `docker compose logs -f gateway`
# Ensure services are running
docker compose ps
# Check service health
curl http://localhost:8000/health
# View detailed logs
docker compose logs gateway | tail -50

# Check database is running
docker compose ps db
# Restart database
docker compose restart db
# Re-run migrations
docker compose run --rm db-migrations

Apache License 2.0