1 change: 0 additions & 1 deletion .gitignore
@@ -1,6 +1,5 @@
bin/
dist/
gen/
*.exe
.env.kontext
kontext
139 changes: 139 additions & 0 deletions ARCHITECTURE.md
@@ -0,0 +1,139 @@
# Kontext CLI

## The problem

AI coding agents (Claude Code, Cursor, Codex) run on your laptop with whatever credentials you have lying around — long-lived API keys in `.env` files, GitHub tokens in your shell, database passwords in your config. There's no scoping, no audit trail, no way for a team lead to see what agents are doing across the org.

## What the CLI does

One command:

```bash
kontext start --agent claude
```

This launches Claude Code, but with two things added:

1. **Scoped credentials** — instead of using whatever's in your shell, the agent gets short-lived tokens resolved from your Kontext account. They expire when the session ends.

2. **Telemetry** — every tool call (file edits, shell commands, API calls) is logged to the Kontext dashboard. The team sees who did what, when, and whether it was allowed.

## How it works (the 30-second version)

```
You run: kontext start --agent claude

1. CLI checks your identity (OIDC token in your system keychain)
2. CLI reads .env.kontext to see what credentials the project needs
3. CLI resolves each credential from the Kontext backend
4. CLI launches Claude Code with those credentials as env vars
5. Every tool call Claude makes gets logged to your team's dashboard
6. When you exit, credentials expire, session ends
```

## The codebase

### Three modes in one binary

The CLI is a single Go binary that runs in three modes:

- **`kontext start`** — the main command. Orchestrates everything, stays alive for the session.
- **`kontext hook`** — called by Claude Code automatically on every tool call. You never run this yourself.
- **`kontext login`** — one-time browser login. Stores your identity in the system keychain.

They're the same binary because Claude Code needs to spawn hook handlers by command name. One binary = no install issues.
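
As a rough illustration of how one binary can serve all three modes, the sketch below routes on the first argument using only the standard library. The `dispatch` function and its return values are hypothetical; the actual entry point in `cmd/kontext/main.go` may be organized differently (for example, with a CLI framework).

```go
package main

import (
	"fmt"
)

// dispatch maps the first CLI argument to one of the three modes.
// Hypothetical helper: it only illustrates the routing, not the real handlers.
func dispatch(args []string) (string, error) {
	if len(args) == 0 {
		return "", fmt.Errorf("usage: kontext <start|hook|login>")
	}
	switch args[0] {
	case "start", "hook", "login":
		return args[0], nil
	default:
		return "", fmt.Errorf("unknown command %q", args[0])
	}
}

func main() {
	// Fixed example invocations instead of os.Args, so the sketch is deterministic.
	examples := [][]string{
		{"start", "--agent", "claude"},
		{"hook", "--agent", "claude"},
		{"bogus"},
	}
	for _, args := range examples {
		mode, err := dispatch(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("mode:", mode)
	}
}
```

Because the agent invokes the hook by command name, the same executable found on `PATH` handles both the long-lived `start` session and each short-lived `hook` call.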

### Why Go

The hook handler (`kontext hook`) is spawned on every single tool call: every file edit, every shell command, every API request. A Node.js process typically needs 50-100ms to start, while a compiled Go binary starts in about 5ms. Over a session with hundreds of tool calls, the difference adds up.

Go also compiles to a single binary with no runtime dependencies. `brew install` and you're done.

### Project structure

```
cmd/kontext/main.go        - CLI entry point (start, login, hook commands)
internal/
  agent/                   - agent adapter interface
    claude/claude.go       - Claude Code hook I/O format
  auth/                    - OIDC login + keychain storage
  backend/                 - ConnectRPC client for the Kontext API
  credential/              - .env.kontext template parser
  hook/                    - hook event processor (stdin → evaluate → stdout)
  run/                     - the start command orchestrator
    hooks.go               - generates Claude Code hook config
  sidecar/                 - local Unix socket server
    protocol.go            - wire format for hook ↔ sidecar communication
gen/                       - generated protobuf code (from kontext-dev/proto)
```

### The sidecar — why it exists

When Claude Code makes a tool call, it spawns `kontext hook` as a new process. That process needs to log the event and get a policy decision. A network round trip to the backend on every call would add 100-300ms per tool call, which is too slow for an interactive agent.

The sidecar solves this. It's a small server that starts alongside Claude Code and listens on a Unix socket file. The hook handler connects to it locally (sub-millisecond), and the sidecar maintains a persistent connection to the backend.

```
Claude Code → spawns kontext hook → Unix socket → sidecar → backend
              (5ms)                 (0ms)         (already connected)
```

The sidecar also sends heartbeats every 30 seconds to keep the session alive in the dashboard.
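
The local hop between the hook process and the sidecar can be sketched with the standard `net` package. This is a minimal illustration, not the project's sidecar: the function names (`serve`, `ask`, `roundTrip`), the newline-delimited text exchange, and the always-allow reply are all assumptions made to keep the example short.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
)

// serve plays the sidecar: it accepts connections until the listener is closed
// and answers each one-line request with a one-line decision.
func serve(ln net.Listener) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return // listener closed
		}
		go func(c net.Conn) {
			defer c.Close()
			line, err := bufio.NewReader(c).ReadString('\n')
			if err != nil {
				return
			}
			// Hypothetical policy: allow everything and echo the tool name.
			fmt.Fprintf(c, "allow:%s", line)
		}(conn)
	}
}

// ask plays the short-lived hook process: dial, send one request, read one reply.
func ask(socketPath, tool string) (string, error) {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return "", err
	}
	defer conn.Close()
	if _, err := fmt.Fprintf(conn, "%s\n", tool); err != nil {
		return "", err
	}
	reply, err := bufio.NewReader(conn).ReadString('\n')
	return strings.TrimSpace(reply), err
}

// roundTrip starts a throwaway sidecar on a temp socket and sends it one request.
func roundTrip(tool string) (string, error) {
	dir, err := os.MkdirTemp("", "kontext-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	ln, err := net.Listen("unix", filepath.Join(dir, "kontext.sock"))
	if err != nil {
		return "", err
	}
	defer ln.Close()
	go serve(ln)
	return ask(ln.Addr().String(), tool)
}

func main() {
	reply, err := roundTrip("Bash")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(reply)
}
```

The point of the design is that only the sidecar holds the backend connection; each hook process pays for a local socket dial rather than a fresh network handshake.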

### Agent adapters

Each agent (Claude Code, Cursor, Codex) has a different format for hook events. The adapter translates:

```go
type Agent interface {
Name() string // "claude"
DecodeHookInput([]byte) (*HookEvent, error) // parse agent's JSON
EncodeAllow(*HookEvent, string) ([]byte, error) // format allow response
EncodeDeny(*HookEvent, string) ([]byte, error) // format deny response
}
```

Everything else — the sidecar, telemetry, credential resolution, policy evaluation — is shared. Adding a new agent is one file with four methods.
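
To make "one file with four methods" concrete, here is a hedged sketch of an adapter for an imaginary agent. The `Agent` interface is copied from above; everything else is an assumption: the `HookEvent` fields, the `demoAgent` type, and both JSON shapes are invented for illustration and do not describe Claude Code's or any other agent's real hook format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HookEvent is a hypothetical normalized event; the real struct is not shown here.
type HookEvent struct {
	ToolName  string
	ToolInput json.RawMessage
}

// Agent is the adapter interface from the section above.
type Agent interface {
	Name() string
	DecodeHookInput([]byte) (*HookEvent, error)
	EncodeAllow(*HookEvent, string) ([]byte, error)
	EncodeDeny(*HookEvent, string) ([]byte, error)
}

// demoAgent adapts an imaginary agent with a made-up JSON hook format.
type demoAgent struct{}

func (demoAgent) Name() string { return "demo" }

// DecodeHookInput parses the imaginary agent's JSON into the shared event type.
func (demoAgent) DecodeHookInput(b []byte) (*HookEvent, error) {
	var in struct {
		Tool  string          `json:"tool"`
		Input json.RawMessage `json:"input"`
	}
	if err := json.Unmarshal(b, &in); err != nil {
		return nil, fmt.Errorf("decode hook input: %w", err)
	}
	return &HookEvent{ToolName: in.Tool, ToolInput: in.Input}, nil
}

func (demoAgent) EncodeAllow(_ *HookEvent, reason string) ([]byte, error) {
	return encodeDecision("allow", reason)
}

func (demoAgent) EncodeDeny(_ *HookEvent, reason string) ([]byte, error) {
	return encodeDecision("deny", reason)
}

// encodeDecision builds the made-up response shape shared by allow and deny.
func encodeDecision(decision, reason string) ([]byte, error) {
	return json.Marshal(struct {
		Decision string `json:"decision"`
		Reason   string `json:"reason"`
	}{decision, reason})
}

// Compile-time check that demoAgent satisfies the interface.
var _ Agent = demoAgent{}

func main() {
	var a Agent = demoAgent{}
	ev, err := a.DecodeHookInput([]byte(`{"tool":"Bash","input":{"command":"ls"}}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, _ := a.EncodeDeny(ev, "blocked by policy")
	fmt.Println(ev.ToolName, string(out))
}
```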

### Credential injection

A `.env.kontext` file in the project declares what credentials the agent needs:

```
GITHUB_TOKEN={{kontext:github}}
STRIPE_KEY={{kontext:stripe}}
```

Before launching the agent, the CLI resolves each placeholder by calling the Kontext backend with the user's identity. The backend returns a short-lived credential (could be an OAuth token, could be an API key — the CLI doesn't distinguish). These become env vars in the agent's process.

The agent uses them naturally — `git push` reads `GITHUB_TOKEN`, `curl` reads `STRIPE_KEY`. No special SDK, no interception.
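
A minimal sketch of how the `{{kontext:<provider>}}` placeholders could be parsed, assuming one `NAME={{kontext:provider}}` entry per line. The `Placeholder` type, `parseTemplate` function, comment handling, and the choice to reject any other line are assumptions; the real parser in `internal/credential/` may accept more (for example, literal values).

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// Placeholder pairs an env var name with the provider it should be resolved from.
// Hypothetical type for this sketch.
type Placeholder struct {
	EnvVar   string
	Provider string
}

// placeholderRE matches lines such as GITHUB_TOKEN={{kontext:github}}.
var placeholderRE = regexp.MustCompile(`^([A-Za-z_][A-Za-z0-9_]*)=\{\{kontext:([^}]+)\}\}$`)

// parseTemplate extracts placeholders, skipping blank lines and # comments.
// Treating every other line as an error is an assumption of this sketch.
func parseTemplate(src string) ([]Placeholder, error) {
	var out []Placeholder
	sc := bufio.NewScanner(strings.NewReader(src))
	for n := 1; sc.Scan(); n++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		m := placeholderRE.FindStringSubmatch(line)
		if m == nil {
			return nil, fmt.Errorf("line %d: not a kontext placeholder: %q", n, line)
		}
		out = append(out, Placeholder{EnvVar: m[1], Provider: m[2]})
	}
	return out, sc.Err()
}

func main() {
	ps, err := parseTemplate("GITHUB_TOKEN={{kontext:github}}\nSTRIPE_KEY={{kontext:stripe}}\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, p := range ps {
		fmt.Printf("%s <- %s\n", p.EnvVar, p.Provider)
	}
}
```

Each parsed entry would then be exchanged for a short-lived credential and appended to the agent's environment as `EnvVar=value`.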

### Auth

No client secrets. The user logs in once via browser (`kontext login`), and a refresh token is stored in the system keychain (macOS Keychain / Linux secret service). Every `kontext start` loads and refreshes the token automatically. The backend verifies the JWT and knows who the user is and which org they belong to.

### Telemetry vs credentials — two separate things

The CLI has two backend integrations that are completely independent:

**Telemetry** — session lifecycle + hook events. Uses ConnectRPC (gRPC-compatible) with bidirectional streaming. The proto lives in `kontext-dev/proto`. This is what powers the dashboard.

**Credentials** — provider token resolution. Uses a plain REST endpoint (`POST /api/v1/credentials/exchange`). This is what populates the env vars.

They use different protocols because they have different needs. Telemetry needs streaming (hundreds of events per session over one connection). Credentials need a simple request/response (one call per provider at session start).

### What's working today

- `kontext login` — browser OIDC login, keychain storage, token refresh
- `kontext start --agent claude` — launches Claude Code, interactive `.env.kontext` setup on first run
- Agent adapter for Claude Code — full hook I/O encoding/decoding
- Sidecar with Unix socket — accepts hook connections, relays events
- Hook command — reads stdin, talks to sidecar, writes decision to stdout
- Settings generation — creates Claude Code hook config automatically

### What's blocked on the server

- **Telemetry** (#408) — needs ConnectRPC `AgentService` endpoint on the API + auth change to accept user bearer tokens
- **Credentials** (#410) — needs `POST /api/v1/credentials/exchange` endpoint authenticated with user tokens

Both are unblocked by the same server-side auth change: `UnifiedAuthGuard` learning to accept user OIDC tokens as bearer tokens, not just service account tokens.
113 changes: 88 additions & 25 deletions README.md
@@ -9,10 +9,11 @@ kontext start --agent claude
```

1. **Authenticates** — loads your identity from the system keyring (set up via `kontext login`)
2. **Resolves credentials** — reads `.env.kontext`, exchanges placeholders for short-lived tokens via Kontext
3. **Launches the agent** — spawns Claude Code with credentials injected as env vars
4. **Enforces policy** — every tool call is evaluated against your org's OpenFGA policy (via a local sidecar)
5. **Logs everything** — full audit trail streamed to the Kontext backend via gRPC
2. **Creates a session** — registers with the Kontext backend, visible in the dashboard
3. **Resolves credentials** — reads `.env.kontext`, exchanges placeholders for short-lived tokens
4. **Launches the agent** — spawns Claude Code with credentials injected as env vars + governance hooks
5. **Captures every action** — PreToolUse, PostToolUse, and UserPromptSubmit events streamed to the backend
6. **Tears down cleanly** — session disconnected, credentials expired, temp files removed

Credentials are ephemeral — scoped to the session, gone when it ends.

@@ -33,28 +34,25 @@ go build -o bin/kontext ./cmd/kontext
### First-time setup

```bash
kontext login
kontext start --agent claude
```

Opens a browser for OIDC authentication. Stores your refresh token in the system keyring (macOS Keychain / Linux secret service). No client IDs or secrets to manage.
On first run, the CLI handles everything interactively:
- No session? Opens browser for OIDC login, stores refresh token in system keyring
- No `.env.kontext`? Prompts for which providers the project needs, writes the file
- Provider not connected? Opens browser to the Kontext hosted connect flow

### Declare credentials

Create a `.env.kontext` file in your project:
The `.env.kontext` file declares what credentials the project needs:

```
GITHUB_TOKEN={{kontext:github}}
STRIPE_KEY={{kontext:stripe}}
DATABASE_URL={{kontext:postgres/prod-readonly}}
```

### Run

```bash
kontext start --agent claude
```

The CLI resolves each placeholder, injects the credentials as env vars, and launches Claude Code with governance hooks active.
Commit this to your repo — the team shares it.

### Supported agents

Expand All @@ -69,38 +67,103 @@ The CLI resolves each placeholder, injects the credentials as env vars, and laun
```
kontext start --agent claude
├── Auth: OIDC refresh token from keyring → ephemeral session token
├── Credentials: .env.kontext → ExchangeCredential RPC → env vars
├── Sidecar: Unix socket server for hook ↔ backend communication
├── Agent: spawn claude with injected env + hook config
├── Auth: OIDC refresh token from keyring
├── ConnectRPC: CreateSession → session in dashboard
├── Sidecar: Unix socket server (kontext.sock)
│ ├── Heartbeat loop (30s)
│ └── Async event ingestion via ConnectRPC
├── Hooks: settings.json → Claude Code --settings
├── Agent: spawn claude with injected env
│ │
│ ├── [PreToolUse] → hook binary → sidecar → policy eval → allow/deny
│ └── [PostToolUse] → hook binary → sidecar → audit log
│ ├── [PreToolUse] → kontext hook → sidecar → ingest
│ ├── [PostToolUse] → kontext hook → sidecar → ingest
│ └── [UserPromptSubmit] → kontext hook → sidecar → ingest
└── Backend: bidirectional gRPC stream (ProcessHookEvent, SyncPolicy)
└── On exit: EndSession → cleanup
```

**Hook handlers** are the compiled `kontext hook` binary — <5ms startup, communicates with the sidecar over a Unix socket. No per-hook HTTP requests.
### Hook flow (per tool call)

**Policy evaluation** uses OpenFGA tuples cached locally by the sidecar. The backend streams policy updates in real-time via `SyncPolicy`.
```
Claude Code fires PreToolUse
→ spawns: kontext hook --agent claude
→ hook reads stdin JSON (tool_name, tool_input)
→ hook connects to sidecar via KONTEXT_SOCKET (Unix socket)
→ sidecar returns allow/deny immediately
→ sidecar ingests event to backend asynchronously
→ hook writes decision JSON to stdout, exits
→ ~5ms total (Go binary, no runtime startup)
```

## Telemetry Strategy

The CLI separates **governance telemetry** from **developer observability**. These are distinct concerns with different backends and data models.

### Governance telemetry (built-in)

Session lifecycle and tool call events flow to the Kontext backend. This powers the dashboard — sessions, traces, audit trail.

| Event | Source | When |
|---|---|---|
| `session.begin` | CLI lifecycle | Agent launched |
| `session.end` | CLI lifecycle | Agent exited |
| `hook.pre_tool_call` | PreToolUse hook | Before every tool execution |
| `hook.post_tool_call` | PostToolUse hook | After every tool execution |
| `hook.user_prompt` | UserPromptSubmit hook | User submits a prompt |

Events are streamed to the backend via the ConnectRPC `ProcessHookEvent` bidirectional stream and stored in the `mcp_events` table.

**What governance telemetry captures:**
- What the agent tried to do (tool name + input)
- What happened (tool response)
- Whether it was allowed (policy decision)
- Who did it (session → user → org attribution)
- When (timestamps, duration)

**What governance telemetry does NOT capture:**
- LLM reasoning or thinking
- Token usage or cost
- Model parameters
- Conversation history
- Response quality

### Developer observability (external, future)

LLM-level observability — generation details, token costs, reasoning traces, conversation history — is a separate concern. It is not part of the governance pipeline.

For this, the CLI will optionally export OpenTelemetry spans to an external backend:
- **Langfuse** — open-source, has a native Claude Code integration, self-hostable
- **Dash0** — OTEL-native SaaS, cheap ($0.60/M spans), AI/agent-aware

This is additive — the governance pipeline works independently. OTEL export is planned but not yet implemented.

## Protocol

Service definitions: [`proto/kontext/agent/v1/agent.proto`](proto/kontext/agent/v1/agent.proto)

Uses [ConnectRPC](https://connectrpc.com/) (gRPC-compatible) for backend communication.
The CLI communicates with the Kontext backend exclusively via ConnectRPC using the generated stubs. Requires the server-side `AgentService` endpoint ([kontext-dev/kontext#408](https://github.com/kontext-dev/kontext/issues/408)).

### Sidecar wire protocol

Hook handlers communicate with the sidecar over a Unix socket using length-prefixed JSON (4-byte big-endian uint32 + JSON payload):

- `EvaluateRequest` — hook → sidecar: agent, hook_event, tool_name, tool_input, tool_response
- `EvaluateResult` — sidecar → hook: allowed (bool), reason (string)

## Development

```bash
# Build
go build -o bin/kontext ./cmd/kontext

# Generate protobuf (requires buf)
# Generate protobuf (requires buf + plugins)
buf generate

# Test
go test ./...

# Link for local use
ln -sf $(pwd)/bin/kontext ~/.local/bin/kontext
```

## License
5 changes: 5 additions & 0 deletions buf.gen.yaml
@@ -1,4 +1,9 @@
version: v2
managed:
enabled: true
override:
- file_option: go_package_prefix
value: github.com/kontext-dev/kontext-cli/gen
inputs:
- git_repo: https://github.com/kontext-dev/proto.git
branch: main