kontext-dev · tumberger · Apr 5, 2026
@@ -1,6 +1,5 @@
 bin/
 dist/
-gen/
 *.exe
 .env.kontext
 kontext
@@ -0,0 +1,139 @@
+# Kontext CLI
+
+## The problem
+
+AI coding agents (Claude Code, Cursor, Codex) run on your laptop with whatever credentials you have lying around — long-lived API keys in `.env` files, GitHub tokens in your shell, database passwords in your config. There's no scoping, no audit trail, no way for a team lead to see what agents are doing across the org.
+
+## What the CLI does
+
+One command:
+
+```bash
+kontext start --agent claude
+```
+
+This launches Claude Code, but with two things added:
+
+1. **Scoped credentials** — instead of using whatever's in your shell, the agent gets short-lived tokens resolved from your Kontext account. They expire when the session ends.
+
+2. **Telemetry** — every tool call (file edits, shell commands, API calls) is logged to the Kontext dashboard. The team sees who did what, when, and whether it was allowed.
+
+## How it works (the 30-second version)
+
+```
+You run: kontext start --agent claude
+
+1. CLI checks your identity (OIDC token in your system keychain)
+2. CLI reads .env.kontext to see what credentials the project needs
+3. CLI resolves each credential from the Kontext backend
+4. CLI launches Claude Code with those credentials as env vars
+5. Every tool call Claude makes gets logged to your team's dashboard
+6. When you exit, credentials expire, session ends
+```
+
+## The codebase
+
+### Three commands in one binary
+
+The CLI is a single Go binary that runs in three modes:
+
+- **`kontext start`** — the main command. Orchestrates everything, stays alive for the session.
+- **`kontext hook`** — called by Claude Code automatically on every tool call. You never run this yourself.
+- **`kontext login`** — one-time browser login. Stores your identity in the system keychain.
+
+They're the same binary because Claude Code needs to spawn hook handlers by command name. One binary = no install issues.
+
+### Why Go
+
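+The hook handler (`kontext hook`) is spawned on every tool call: every file edit, every shell command, every API request. Node.js typically takes 50-100ms to start; a Go binary takes about 5ms. Over a session with hundreds of tool calls the difference adds up: at 300 calls, for example, that is 15-30 seconds of startup overhead versus roughly 1.5 seconds.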
+
+Go also compiles to a single binary with no runtime dependencies. `brew install` and you're done.
+
+### Project structure
+
+```
+cmd/kontext/main.go         — CLI entry point (start, login, hook commands)
+internal/
+  agent/                    — agent adapter interface
+    claude/claude.go        — Claude Code hook I/O format
+  auth/                     — OIDC login + keychain storage
+  backend/                  — ConnectRPC client for the Kontext API
+  credential/               — .env.kontext template parser
+  hook/                     — hook event processor (stdin → evaluate → stdout)
+  run/                      — the start command orchestrator
+    hooks.go                — generates Claude Code hook config
+  sidecar/                  — local Unix socket server
+    protocol.go             — wire format for hook ↔ sidecar communication
+gen/                        — generated protobuf code (from kontext-dev/proto)
+```
+
+### The sidecar — why it exists
+
+When Claude Code makes a tool call, it spawns `kontext hook` as a new process. That process needs to log the event and get a policy decision. A network round trip to the backend on every call would add 100-300ms per tool call, which is too slow for an interactive agent.
+
+The sidecar solves this. It's a small server that starts alongside Claude Code and listens on a Unix socket file. The hook handler connects to it locally (sub-millisecond), and the sidecar maintains a persistent connection to the backend.
+
+```
+Claude Code → spawns kontext hook → Unix socket → sidecar → backend
+                 (5ms)              (<1ms)         (already connected)
+```
+
+The sidecar also sends heartbeats every 30 seconds to keep the session alive in the dashboard.
+
+### Agent adapters
+
+Each agent (Claude Code, Cursor, Codex) has a different format for hook events. The adapter translates:
+
+```go
+type Agent interface {
+    Name() string                                    // "claude"
+    DecodeHookInput([]byte) (*HookEvent, error)      // parse agent's JSON
+    EncodeAllow(*HookEvent, string) ([]byte, error)  // format allow response
+    EncodeDeny(*HookEvent, string) ([]byte, error)   // format deny response
+}
+```
+
+Everything else — the sidecar, telemetry, credential resolution, policy evaluation — is shared. Adding a new agent is one file with four methods.
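To make "one file with four methods" concrete, here is a sketch of an adapter for a made-up agent. The `HookEvent` fields, the `exampleAgent` type, and its JSON wire format are all invented for illustration; they are not the Claude Code format and not the repository's real types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// HookEvent stands in for the shared event type; its fields are assumed.
type HookEvent struct {
	Kind      string // e.g. "PreToolUse"
	ToolName  string
	ToolInput json.RawMessage
}

// Agent is the interface from the section above.
type Agent interface {
	Name() string
	DecodeHookInput([]byte) (*HookEvent, error)
	EncodeAllow(*HookEvent, string) ([]byte, error)
	EncodeDeny(*HookEvent, string) ([]byte, error)
}

// exampleAgent is a hypothetical agent with an invented JSON format.
type exampleAgent struct{}

type exampleInput struct {
	Event string          `json:"event"`
	Tool  string          `json:"tool"`
	Args  json.RawMessage `json:"args"`
}

type exampleOutput struct {
	Decision string `json:"decision"`
	Reason   string `json:"reason,omitempty"`
}

func (exampleAgent) Name() string { return "example" }

// DecodeHookInput maps the agent-specific JSON onto the shared event type.
func (exampleAgent) DecodeHookInput(b []byte) (*HookEvent, error) {
	var in exampleInput
	if err := json.Unmarshal(b, &in); err != nil {
		return nil, fmt.Errorf("decode example hook input: %w", err)
	}
	return &HookEvent{Kind: in.Event, ToolName: in.Tool, ToolInput: in.Args}, nil
}

func (exampleAgent) EncodeAllow(_ *HookEvent, reason string) ([]byte, error) {
	return json.Marshal(exampleOutput{Decision: "allow", Reason: reason})
}

func (exampleAgent) EncodeDeny(_ *HookEvent, reason string) ([]byte, error) {
	return json.Marshal(exampleOutput{Decision: "deny", Reason: reason})
}

// Compile-time check that exampleAgent satisfies Agent.
var _ Agent = exampleAgent{}

func main() {
	var a Agent = exampleAgent{}
	ev, err := a.DecodeHookInput([]byte(`{"event":"PreToolUse","tool":"Bash","args":{"command":"ls"}}`))
	if err != nil {
		panic(err)
	}
	out, _ := a.EncodeDeny(ev, "blocked by policy")
	fmt.Println(a.Name(), ev.ToolName, string(out))
}
```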
+
+### Credential injection
+
+A `.env.kontext` file in the project declares what credentials the agent needs:
+
+```
+GITHUB_TOKEN={{kontext:github}}
+STRIPE_KEY={{kontext:stripe}}
+```
+
+Before launching the agent, the CLI resolves each placeholder by calling the Kontext backend with the user's identity. The backend returns a short-lived credential (could be an OAuth token, could be an API key — the CLI doesn't distinguish). These become env vars in the agent's process.
+
+The agent uses them like any other env var: the `gh` CLI reads `GITHUB_TOKEN` directly, and a shell command can pass `$STRIPE_KEY` to `curl`. No special SDK, no interception.
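A rough sketch of how a `.env.kontext` template could be parsed into (env var, provider) pairs before resolution. The `Entry` type, the `ParseTemplate` function, the regular expression, and the handling of blank lines, `#` comments, and non-placeholder values are assumptions; the actual parser in `internal/credential/` may behave differently.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// Entry pairs an env var name with the provider reference inside the
// {{kontext:...}} placeholder. Hypothetical type.
type Entry struct {
	EnvVar   string
	Provider string
}

// placeholder matches values such as {{kontext:github}} or
// {{kontext:postgres/prod-readonly}}. The allowed character set is a guess.
var placeholder = regexp.MustCompile(`^\{\{kontext:([A-Za-z0-9_./-]+)\}\}$`)

// ParseTemplate reads NAME={{kontext:provider}} lines. Skipping blank
// lines and # comments, and rejecting plain values, are assumptions.
func ParseTemplate(src string) ([]Entry, error) {
	var entries []Entry
	sc := bufio.NewScanner(strings.NewReader(src))
	for n := 1; sc.Scan(); n++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		name, value, ok := strings.Cut(line, "=")
		if !ok {
			return nil, fmt.Errorf("line %d: expected NAME=VALUE", n)
		}
		m := placeholder.FindStringSubmatch(strings.TrimSpace(value))
		if m == nil {
			return nil, fmt.Errorf("line %d: value is not a {{kontext:...}} placeholder", n)
		}
		entries = append(entries, Entry{EnvVar: strings.TrimSpace(name), Provider: m[1]})
	}
	return entries, sc.Err()
}

func main() {
	entries, err := ParseTemplate("GITHUB_TOKEN={{kontext:github}}\nSTRIPE_KEY={{kontext:stripe}}\n")
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Printf("%s <- provider %q\n", e.EnvVar, e.Provider)
	}
}
```

Each resulting entry would then map to one credential exchange call, with the returned value set as `EnvVar` in the agent's environment.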
+
+### Auth
+
+No client secrets. The user logs in once via browser (`kontext login`), and a refresh token is stored in the system keychain (macOS Keychain / Linux secret service). Every `kontext start` loads and refreshes the token automatically. The backend verifies the JWT and knows who the user is and which org they belong to.
+
+### Telemetry vs credentials — two separate things
+
+The CLI has two backend integrations that are completely independent:
+
+**Telemetry** — session lifecycle + hook events. Uses ConnectRPC (gRPC-compatible) with bidirectional streaming. The proto lives in `kontext-dev/proto`. This is what powers the dashboard.
+
+**Credentials** — provider token resolution. Uses a plain REST endpoint (`POST /api/v1/credentials/exchange`). This is what populates the env vars.
+
+They use different protocols because they have different needs. Telemetry needs streaming (hundreds of events per session over one connection). Credentials need a simple request/response (one call per provider at session start).
+
+### What's working today
+
+- `kontext login` — browser OIDC login, keychain storage, token refresh
+- `kontext start --agent claude` — launches Claude Code, interactive `.env.kontext` setup on first run
+- Agent adapter for Claude Code — full hook I/O encoding/decoding
+- Sidecar with Unix socket — accepts hook connections, relays events
+- Hook command — reads stdin, talks to sidecar, writes decision to stdout
+- Settings generation — creates Claude Code hook config automatically
+
+### What's blocked on the server
+
+- **Telemetry** (#408) — needs ConnectRPC `AgentService` endpoint on the API + auth change to accept user bearer tokens
+- **Credentials** (#410) — needs `POST /api/v1/credentials/exchange` endpoint authenticated with user tokens
+
+Both are unblocked by the same server-side auth change: `UnifiedAuthGuard` learning to accept user OIDC tokens as bearer tokens, not just service account tokens.
@@ -9,10 +9,11 @@ kontext start --agent claude
 ```
 
 1. **Authenticates** — loads your identity from the system keyring (set up via `kontext login`)
-2. **Resolves credentials** — reads `.env.kontext`, exchanges placeholders for short-lived tokens via Kontext
-3. **Launches the agent** — spawns Claude Code with credentials injected as env vars
-4. **Enforces policy** — every tool call is evaluated against your org's OpenFGA policy (via a local sidecar)
-5. **Logs everything** — full audit trail streamed to the Kontext backend via gRPC
+2. **Creates a session** — registers with the Kontext backend, visible in the dashboard
+3. **Resolves credentials** — reads `.env.kontext`, exchanges placeholders for short-lived tokens
+4. **Launches the agent** — spawns Claude Code with credentials injected as env vars + governance hooks
+5. **Captures every action** — PreToolUse, PostToolUse, and UserPromptSubmit events streamed to the backend
+6. **Tears down cleanly** — session disconnected, credentials expired, temp files removed
 
 Credentials are ephemeral — scoped to the session, gone when it ends.
 
@@ -33,28 +34,25 @@ go build -o bin/kontext ./cmd/kontext
 ### First-time setup
 
 ```bash
-kontext login
+kontext start --agent claude
 ```
 
-Opens a browser for OIDC authentication. Stores your refresh token in the system keyring (macOS Keychain / Linux secret service). No client IDs or secrets to manage.
+On first run, the CLI handles everything interactively:
+- No session? Opens browser for OIDC login, stores refresh token in system keyring
+- No `.env.kontext`? Prompts for which providers the project needs, writes the file
+- Provider not connected? Opens browser to the Kontext hosted connect flow
 
 ### Declare credentials
 
-Create a `.env.kontext` file in your project:
+The `.env.kontext` file declares what credentials the project needs:
 
 ```
 GITHUB_TOKEN={{kontext:github}}
 STRIPE_KEY={{kontext:stripe}}
 DATABASE_URL={{kontext:postgres/prod-readonly}}
 ```
 
-### Run
-
-```bash
-kontext start --agent claude
-```
-
-The CLI resolves each placeholder, injects the credentials as env vars, and launches Claude Code with governance hooks active.
+Commit this to your repo — the team shares it.
 
 ### Supported agents
 
@@ -69,38 +67,103 @@ The CLI resolves each placeholder, injects the credentials as env vars, and laun
 ```
 kontext start --agent claude
   │
-  ├── Auth: OIDC refresh token from keyring → ephemeral session token
-  ├── Credentials: .env.kontext → ExchangeCredential RPC → env vars
-  ├── Sidecar: Unix socket server for hook ↔ backend communication
-  ├── Agent: spawn claude with injected env + hook config
+  ├── Auth: OIDC refresh token from keyring
+  ├── ConnectRPC: CreateSession → session in dashboard
+  ├── Sidecar: Unix socket server (kontext.sock)
+  │     ├── Heartbeat loop (30s)
+  │     └── Async event ingestion via ConnectRPC
+  ├── Hooks: settings.json → Claude Code --settings
+  ├── Agent: spawn claude with injected env
   │     │
-  │     ├── [PreToolUse]  → hook binary → sidecar → policy eval → allow/deny
-  │     └── [PostToolUse] → hook binary → sidecar → audit log
+  │     ├── [PreToolUse]        → kontext hook → sidecar → ingest
+  │     ├── [PostToolUse]       → kontext hook → sidecar → ingest
+  │     └── [UserPromptSubmit]  → kontext hook → sidecar → ingest
   │
-  └── Backend: bidirectional gRPC stream (ProcessHookEvent, SyncPolicy)
+  └── On exit: EndSession → cleanup
 ```
 
-**Hook handlers** are the compiled `kontext hook` binary — <5ms startup, communicates with the sidecar over a Unix socket. No per-hook HTTP requests.
+### Hook flow (per tool call)
 
-**Policy evaluation** uses OpenFGA tuples cached locally by the sidecar. The backend streams policy updates in real-time via `SyncPolicy`.
+```
+Claude Code fires PreToolUse
+  → spawns: kontext hook --agent claude
+  → hook reads stdin JSON (tool_name, tool_input)
+  → hook connects to sidecar via KONTEXT_SOCKET (Unix socket)
+  → sidecar returns allow/deny immediately
+  → sidecar ingests event to backend asynchronously
+  → hook writes decision JSON to stdout, exits
+  → ~5ms total (Go binary, no runtime startup)
+```
+
+## Telemetry Strategy
+
+The CLI separates **governance telemetry** from **developer observability**. These are distinct concerns with different backends and data models.
+
+### Governance telemetry (built-in)
+
+Session lifecycle and tool call events flow to the Kontext backend. This powers the dashboard — sessions, traces, audit trail.
+
+| Event | Source | When |
+|---|---|---|
+| `session.begin` | CLI lifecycle | Agent launched |
+| `session.end` | CLI lifecycle | Agent exited |
+| `hook.pre_tool_call` | PreToolUse hook | Before every tool execution |
+| `hook.post_tool_call` | PostToolUse hook | After every tool execution |
+| `hook.user_prompt` | UserPromptSubmit hook | User submits a prompt |
+
+Events are streamed to the backend via the ConnectRPC `ProcessHookEvent` bidirectional stream and stored in the `mcp_events` table.
+
+**What governance telemetry captures:**
+- What the agent tried to do (tool name + input)
+- What happened (tool response)
+- Whether it was allowed (policy decision)
+- Who did it (session → user → org attribution)
+- When (timestamps, duration)
+
+**What governance telemetry does NOT capture:**
+- LLM reasoning or thinking
+- Token usage or cost
+- Model parameters
+- Conversation history
+- Response quality
+
+### Developer observability (external, future)
+
+LLM-level observability — generation details, token costs, reasoning traces, conversation history — is a separate concern. It is not part of the governance pipeline.
+
+For this, the CLI will optionally export OpenTelemetry spans to an external backend:
+- **Langfuse** — open-source, has a native Claude Code integration, self-hostable
+- **Dash0** — OTEL-native SaaS, cheap ($0.60/M spans), AI/agent-aware
+
+This is additive — the governance pipeline works independently. OTEL export is planned but not yet implemented.
 
 ## Protocol
 
 Service definitions: [`proto/kontext/agent/v1/agent.proto`](proto/kontext/agent/v1/agent.proto)
 
-Uses [ConnectRPC](https://connectrpc.com/) (gRPC-compatible) for backend communication.
+The CLI communicates with the Kontext backend exclusively via ConnectRPC using the generated stubs. Requires the server-side `AgentService` endpoint ([kontext-dev/kontext#408](https://github.com/kontext-dev/kontext/issues/408)).
+
+### Sidecar wire protocol
+
+Hook handlers communicate with the sidecar over a Unix socket using length-prefixed JSON (4-byte big-endian uint32 + JSON payload):
+
+- `EvaluateRequest` — hook → sidecar: agent, hook_event, tool_name, tool_input, tool_response
+- `EvaluateResult` — sidecar → hook: allowed (bool), reason (string)
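The framing described above (a 4-byte big-endian length followed by a JSON payload) might look like the following. The struct field names, JSON tags, the `maxFrameSize` limit, and the `EncodeFrame`, `DecodeFrame`, and `RoundTrip` helpers are assumptions for illustration, not the contents of `protocol.go`.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/json"
	"fmt"
	"io"
)

// Message shapes follow the field list above; Go names and tags are assumed.
type EvaluateRequest struct {
	Agent        string          `json:"agent"`
	HookEvent    string          `json:"hook_event"`
	ToolName     string          `json:"tool_name"`
	ToolInput    json.RawMessage `json:"tool_input,omitempty"`
	ToolResponse json.RawMessage `json:"tool_response,omitempty"`
}

type EvaluateResult struct {
	Allowed bool   `json:"allowed"`
	Reason  string `json:"reason"`
}

// maxFrameSize is a hypothetical guard against oversized or corrupt frames.
const maxFrameSize = 1 << 20

// EncodeFrame returns the 4-byte big-endian length followed by the JSON payload.
func EncodeFrame(v any) ([]byte, error) {
	payload, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	if len(payload) > maxFrameSize {
		return nil, fmt.Errorf("frame too large: %d bytes", len(payload))
	}
	frame := make([]byte, 4+len(payload))
	binary.BigEndian.PutUint32(frame[:4], uint32(len(payload)))
	copy(frame[4:], payload)
	return frame, nil
}

// DecodeFrame reads one length-prefixed frame from r and unmarshals it into v.
func DecodeFrame(r io.Reader, v any) error {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return err
	}
	n := binary.BigEndian.Uint32(hdr[:])
	if n > maxFrameSize {
		return fmt.Errorf("frame too large: %d bytes", n)
	}
	payload := make([]byte, n)
	if _, err := io.ReadFull(r, payload); err != nil {
		return err
	}
	return json.Unmarshal(payload, v)
}

// RoundTrip encodes and decodes a request in memory, standing in for the socket.
func RoundTrip(req EvaluateRequest) (EvaluateRequest, error) {
	var out EvaluateRequest
	frame, err := EncodeFrame(req)
	if err != nil {
		return out, err
	}
	err = DecodeFrame(bytes.NewReader(frame), &out)
	return out, err
}

func main() {
	got, err := RoundTrip(EvaluateRequest{Agent: "claude", HookEvent: "PreToolUse", ToolName: "Bash"})
	if err != nil {
		panic(err)
	}
	fmt.Println(got.Agent, got.HookEvent, got.ToolName)
}
```

Because both helpers work on plain byte slices and `io.Reader`, the same code would apply to a `net.Conn` returned by dialing the Unix socket.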
 
 ## Development
 
 ```bash
 # Build
 go build -o bin/kontext ./cmd/kontext
 
-# Generate protobuf (requires buf)
+# Generate protobuf (requires buf + plugins)
 buf generate
 
 # Test
 go test ./...
+
+# Link for local use
+ln -sf "$(pwd)/bin/kontext" ~/.local/bin/kontext
 ```
 
 ## License

@@ -1,4 +1,9 @@
 version: v2
+managed:
+  enabled: true
+  override:
+    - file_option: go_package_prefix
+      value: github.com/kontext-dev/kontext-cli/gen
 inputs:
   - git_repo: https://github.com/kontext-dev/proto.git
     branch: main