feat(agentos): group-scoped encrypted git PATs + GAP source SHA sync
Let the SDK clone private GAP repos with a group-owned PAT, stored encrypted
on the AgentOS server and fetched with the SDK's API key.
Server:
- crypto/secret-box.ts — AES-256-GCM at rest, keyed by AGENTOS_CREDENTIALS_KEY,
with kid-based rotation (AGENTOS_CREDENTIALS_KEY_OLD). Fails closed.
- stores/git-credential-store.ts — git_credentials collection, unique
(ownerGroup, host) (re-upsert = rotation), normalizeGitHost, resolve().
- routes/git-credentials.ts — CRUD (git-credentials:manage/:read, ownership)
+ POST /resolve for the SDK's cak_ key (strictly group-scoped, no admin
bypass, no-store, token never logged). Mounted in the dashboard router.
- permissions + agentos-editor seed; ensureIndexes wiring.
- SHA sync: RegistryDoc.sourceSha/sourceSyncedAt, written from session_started
payload.agent_sha; drift since the last run is logged.
- tests: secret-box + git-credential-store.
SPA:
- api.gitCredentials client + GitCredential type; Settings -> Git Credentials
section (write-only secret) gated on git-credentials:read/:manage.
Ops:
- scripts/provision-keycloak.mjs (pnpm provision:keycloak) — idempotent realm /
roles / OIDC client / mappers / service-account provisioning.
- .gitignore: ignore *.env (keycloak.env etc.).
- CLAUDE.md: document auth/RBAC (§2.6b), git credentials + SHA sync (§2.6c),
new collections, and all new env vars.
Note: clone-with-PAT is wired for the SDK LOCAL/library substrate. The
remote (sandbox-at-CAS) and AgentOS->CAS dashboard paths reuse CAS's existing
gitToken field and are a follow-up.
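The SHA sync item in the server list above can be pictured with a minimal sketch. Only `sourceSha`, `sourceSyncedAt`, `session_started` and `payload.agent_sha` come from the commit message; the rest of the `RegistryDoc` shape, the `applySessionStarted` helper and its return type are hypothetical and stand in for whatever the real projection does.

```typescript
// Hypothetical sketch: only sourceSha, sourceSyncedAt and agent_sha are named in the commit.
interface RegistryDoc {
  name: string;
  sourceSha?: string;
  sourceSyncedAt?: Date;
}

interface SessionStartedPayload {
  agent_sha?: string;
}

interface SyncResult {
  doc: RegistryDoc;
  drifted: boolean; // true when a previously recorded SHA was replaced by a different one
}

function applySessionStarted(
  doc: RegistryDoc,
  payload: SessionStartedPayload,
  now: Date = new Date(),
): SyncResult {
  const sha = payload.agent_sha;
  // No SHA reported: leave the registry doc untouched.
  if (!sha) return { doc, drifted: false };

  const drifted = doc.sourceSha !== undefined && doc.sourceSha !== sha;
  if (drifted) {
    // Drift since the last run is only logged; it never blocks the session.
    console.log(`[sha-sync] ${doc.name}: ${doc.sourceSha} -> ${sha}`);
  }
  return { doc: { ...doc, sourceSha: sha, sourceSyncedAt: now }, drifted };
}
```

In this sketch the first run records the SHA without reporting drift, and a later run with a different SHA both updates the doc and logs the change.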
CLAUDE.md: 75 additions & 3 deletions
@@ -92,12 +92,17 @@ Optional GitHub repo **variables** (build-time baked into the SPA bundle):
The Mongo cluster is the source of truth for AgentOS. **Only the `agentos-server` connects to Mongo.** As of the post-0.2.1 dev build, the Python SDK no longer writes Mongo directly — it POSTs telemetry to the server's ingest endpoint (`AgentOSHttpSink` → `POST /agentos/api/ingest/events`), and the server owns all writes. (The old `AgentRegistrySink` + `MongoMessageSink` and the `motor` dep were removed — see the Python SDK history below.)
**Collections (database = the server's `MONGO_DATABASE`):**
-- `agent_registry` — one doc per registered agent (the server writes `source.type="library"` for harness-mode agents → AgentOS UI hides the chat-sandbox button for those, see commit `8d829b8`)
+- `agent_registry` — one doc per registered agent (the server writes `source.type="library"` for harness-mode agents → AgentOS UI hides the chat-sandbox button for those, see commit `8d829b8`). Also carries **ownership** (`ownerGroup`/`ownerUser`, see §2.6b) and **GAP source sync** (`sourceSha`/`sourceSyncedAt` — the commit SHA the SDK last loaded; the `session_started` projection updates it and logs drift, see §2.6c).
- `agent_logs` — one doc per conversation (one `ComputerAgent` instance = one log row, multi-turn collapses correctly since the 0.2.0 session-id refactor)
- `sessions` — ordered chat transcript (one doc per session_id, entries appended in order; **`session_started` is the sole creator** of the doc, so a dropped/reordered start can't stub it)
- `chat_sessions` — the session-index row (`{_id, agent, createdAt, lastMessageAt}`) the dashboard's session list + per-agent `sessionCount`/`lastActivity` read. The server projection writes this so library-mode sessions show up (the old Python sink omitted it).
- `slack_threads` — Slack-bot chat-channel state only; **not** written by the ingest projection (it was dead/legacy for library agents).
+- `roles` — the DB-backed RBAC map (`{_id: <Keycloak role name>, permissions[], builtin}`), editable in Settings→Roles; seeded with `agentos-admin`/`-editor`/`-viewer` (§2.6b).
+- `api_keys` — AgentOS-issued service keys (`cak_…`), stored hashed; each carries `roleIds` (capability) + `group` (tenancy). Validated by the harness via introspection; permissions resolve from the same `roles` map (§2.6b).
+- `git_credentials` — group-scoped git PATs (encrypted at rest), one per `(ownerGroup, host)`, used by the SDK to clone private GAP repos (§2.6c).
+
+Resources stamped with `ownerGroup`/`ownerUser` are **hard-isolated**: a non-admin sees only their own or their group's; admins (`*`) see all (§2.6b).
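The hard-isolation rule can be read as a simple visibility predicate. The following is a minimal sketch under stated assumptions: `ownerGroup`, `ownerUser` and the `*` admin permission come from the text, while the `Principal` shape and the `canSee`/`visibleTo` helpers are hypothetical. It also assumes that a resource with no owner stamp is hidden from non-admins, which the text does not state.

```typescript
// Hypothetical shapes: only ownerGroup, ownerUser and the "*" permission come from the text.
interface Principal {
  userId: string;
  groups: string[];
  permissions: string[];
}

interface Owned {
  ownerGroup?: string;
  ownerUser?: string;
}

function canSee(principal: Principal, resource: Owned): boolean {
  // Admins hold the wildcard permission and see everything.
  if (principal.permissions.includes("*")) return true;
  // Otherwise the resource must belong to the caller...
  if (resource.ownerUser !== undefined && resource.ownerUser === principal.userId) return true;
  // ...or to one of the caller's groups. Unstamped resources stay hidden (assumption).
  return resource.ownerGroup !== undefined && principal.groups.includes(resource.ownerGroup);
}

function visibleTo<T extends Owned>(principal: Principal, resources: T[]): T[] {
  return resources.filter((r) => canSee(principal, r));
}
```

Filtering at the store layer (for example as a query predicate) would be the natural place for such a rule, but where the real server enforces it is not shown here.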
**Credentials required:**
- On the **SDK** side: `AGENTOS_INGEST_URL` (e.g. `https://<host>/agentos/api/ingest/events`) + optional `AGENTOS_INGEST_TOKEN` (sent as `Authorization: Bearer …`). No Mongo creds.
@@ -107,6 +112,45 @@ The Mongo cluster is the source of truth for AgentOS. **Only the `agentos-server
+> The shared-password gate is gone. AgentOS now does real SSO + DB-backed RBAC + group ownership. Code lives under `packages/agentos-server/src/auth/`.
+
+**Authentication — Okta federated by Keycloak, BFF session.** The app speaks only OIDC to Keycloak (which brokers Okta). `agentos-server` is the confidential `agent-os-server-client`: it runs Authorization Code + PKCE server-side (`auth/oidc.ts`, `routes/auth.ts`), verifies tokens via JWKS (`jose`), and sets an **httpOnly `agentos_session` cookie** carrying a signed principal snapshot — no token ever reaches the browser. The SPA is SSO-only (`LoginPage`).
+
+**Token refresh (reactive).** The session cookie tracks the (short) access-token expiry; the server also holds a rotating refresh token in `agentos_refresh`. On a `401`, the SPA silently calls `POST /auth/refresh` (single-flight) and replays the failed request; a dead refresh token → SSO sign-in. So you stay logged in while active and only re-auth after Keycloak's SSO idle/max timeout.
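A single-flight refresh-and-replay wrapper could look roughly like this. It is a sketch, not the SPA's actual client: `FetchLike`, `createRefreshingFetch` and `onSignedOut` are hypothetical names, and only the `401` trigger, the `POST /auth/refresh` call, the single-flight behaviour and the replay come from the text.

```typescript
// Minimal fetch-like signature so the sketch does not depend on DOM typings.
type FetchLike = (url: string, init?: { method?: string }) => Promise<{ status: number }>;

function createRefreshingFetch(baseFetch: FetchLike, onSignedOut: () => void): FetchLike {
  // Shared promise so concurrent 401s trigger only one refresh call (single-flight).
  let inFlight: Promise<boolean> | null = null;

  const refresh = (): Promise<boolean> => {
    if (inFlight) return inFlight;
    const attempt = baseFetch("/auth/refresh", { method: "POST" })
      .then((res) => res.status >= 200 && res.status < 300)
      .catch(() => false)
      .finally(() => {
        inFlight = null; // allow a new refresh after this one settles
      });
    inFlight = attempt;
    return attempt;
  };

  return async (url, init) => {
    const first = await baseFetch(url, init);
    if (first.status !== 401) return first;

    const refreshed = await refresh();
    if (!refreshed) {
      onSignedOut(); // dead refresh token: hand over to SSO sign-in
      return first;
    }
    return baseFetch(url, init); // replay the original request once
  };
}
```

The replay happens at most once per request, so a second `401` after a successful refresh is returned to the caller instead of looping.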
+
+**Authorization — DB-backed roles.** Keycloak emits role *names* (`realm_access.roles`) + `groups`; AgentOS owns what each role *can do* via the `roles` collection (editable in Settings→Roles). `authenticate → resolvePermissions → authorize(perm)` gates every dashboard route. Permission catalog is code-defined (`auth/permissions.ts`).
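The last two steps of that chain can be sketched as pure functions over the `roles` map. The function names mirror the chain quoted above, but their signatures, the optional fallback role parameter, and the permission assignments used in the usage note are assumptions; the real middleware in `auth/` is not shown in this diff.

```typescript
// Mirrors the documented roles doc: {_id: <Keycloak role name>, permissions[], builtin}.
interface RoleDoc {
  _id: string;
  permissions: string[];
  builtin: boolean;
}

// Map role names from the token to the union of their permissions.
// Role names unknown to AgentOS are ignored; defaultRole is an optional fallback (assumption).
function resolvePermissions(
  tokenRoles: string[],
  roles: Map<string, RoleDoc>,
  defaultRole?: string,
): Set<string> {
  const known = tokenRoles.filter((name) => roles.has(name));
  const effective = known.length > 0 || defaultRole === undefined ? known : [defaultRole];
  const perms = new Set<string>();
  for (const name of effective) {
    for (const p of roles.get(name)?.permissions ?? []) perms.add(p);
  }
  return perms;
}

// "*" is the admin wildcard; anything else must match exactly.
function authorize(perms: Set<string>, required: string): boolean {
  return perms.has("*") || perms.has(required);
}
```

For example, a principal whose only known role grants `git-credentials:read` would pass `authorize(perms, "git-credentials:read")` and fail `authorize(perms, "git-credentials:manage")`.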
+
+**Three guards / trust boundaries** (`app.ts`): SERVICE `/agentos/api/ingest/*` (`requireIngestAuth`, fails open) + `/agentos/api/keys/*` (`requireIntrospectionAuth`, fails closed); DASHBOARD `/agentos/api/v1/*` (`authenticate`); OBS `/v1/*`. `cak_` API keys authenticate at the dashboard boundary too (→ service principal with `groups=[key.group]`).
+
+**Groups = read-only from Keycloak Admin API** (Settings→Groups). If a user's token lacks the `groups` claim, the server backfills groups from the Admin API at login/refresh (`auth/keycloak-admin.ts:listUserGroups`).
+- `AGENTOS_SESSION_SECRET` (HMAC for the signed cookies — **stable in prod**)
+- `AGENTOS_DEFAULT_ROLE` (e.g. `agentos-viewer`) — fallback when the token has no AgentOS role
+- `AGENTOS_BOOTSTRAP_ADMINS` (comma-sep emails granted `*` before role lookup — first-admin bring-up; remove after)
+- `AGENTOS_DEV_AUTH=1` — **local only** dev bypass injecting an admin principal; never in deployed envs
+- `KEYCLOAK_ADMIN_CLIENT_ID`/`SECRET` (defaults to the OIDC client) — service account needs `view-realm`/`view-users` for the Groups view + group backfill
+
+> **Provisioning:** `pnpm --filter @computeragent/agentos-server provision:keycloak` (`scripts/provision-keycloak.mjs`) idempotently creates the realm, the three realm roles, the OIDC client (+ secret), the **Group Membership** mapper, and the service-account roles. Run with `DRY_RUN=1` first. Needs a Keycloak master-admin user/pass (used once, never stored).
+
+### 2.6c Git credentials (private GAP repos) + SHA sync
+
+> So the SDK can clone **private** GAP repos. Code: `auth/.../crypto/secret-box.ts`, `stores/git-credential-store.ts`, `routes/git-credentials.ts`; SDK side in `computeragent-py` (`harness/git_credential_client.py`, `substrates/local.py`).
+
+- **Store.** A PAT is owned by a **group** and scoped to one **host** — one per `(ownerGroup, host)` in `git_credentials`, **AES-256-GCM encrypted at rest**. Managed in Settings→Git Credentials (perms `git-credentials:read`/`:manage`). The secret is write-only (never returned).
+- **Resolve.** The SDK calls `POST /agentos/api/v1/git-credentials/resolve` with its `cak_` key; the server returns the decrypted PAT for the key's group + the repo host (strictly group-scoped, no admin bypass). The SDK injects it via `GIT_CONFIG_*`/`http.<host>.extraHeader` so the token never lands in `argv`/URL; SSH URLs pass through. Miss/401 → unauthenticated clone fallback (public repos unaffected).
+- **SHA sync.** After cloning, the SDK runs `git rev-parse HEAD` and reports it as `agent_sha` on `session_started`; the projection writes `sourceSha`/`sourceSyncedAt` on the registry doc and logs any change. (Reactive — recorded on each run; the SDK already re-clones fresh, so the running agent is never stale.)
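To make the Resolve step concrete, here is one way the `GIT_CONFIG_*` injection could be built. The real SDK code is Python (`substrates/local.py`) and is not part of this diff, so this TypeScript sketch is only an illustration: `gitAuthEnv` is a hypothetical helper, and the HTTP Basic header with the `x-access-token` placeholder username is an assumption borrowed from common GitHub practice. The `GIT_CONFIG_COUNT`/`GIT_CONFIG_KEY_<n>`/`GIT_CONFIG_VALUE_<n>` variables themselves are standard Git (2.31 and later).

```typescript
import { Buffer } from "node:buffer";

// Builds extra environment variables for `git clone` so the PAT travels as a
// per-host HTTP header instead of appearing in argv or in the remote URL.
function gitAuthEnv(repoUrl: string, pat: string): Record<string, string> {
  // SSH-style remotes (git@host:org/repo.git, ssh://...) pass through untouched.
  if (!/^https?:\/\//i.test(repoUrl)) return {};

  const { protocol, host } = new URL(repoUrl);
  // Assumption: HTTP Basic with a placeholder username, as GitHub accepts for PATs.
  const basic = Buffer.from(`x-access-token:${pat}`, "utf8").toString("base64");
  return {
    GIT_CONFIG_COUNT: "1",
    // Scoping the key to the host URL keeps the header off every other remote.
    GIT_CONFIG_KEY_0: `http.${protocol}//${host}/.extraHeader`,
    GIT_CONFIG_VALUE_0: `Authorization: Basic ${basic}`,
  };
}
```

A caller would merge the result into the child process environment for the clone; an empty object simply means "clone unauthenticated", which matches the fallback described for a miss or a `401`.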
+
+**Required env:**
+- Server: `AGENTOS_CREDENTIALS_KEY` (base64 of 32 random bytes; **fail-closed** — credentials CRUD/resolve 503 without it). Optional `AGENTOS_CREDENTIALS_KEY_OLD` for rotation.
+- SDK: `AGENTOS_API_URL` (e.g. `https://<host>/agentos/api/v1`) + the same `cak_` key it already uses (`COMPUTERAGENT_HARNESS_TOKEN` / `AGENTOS_INGEST_TOKEN`). The key's role must include `git-credentials:read`.
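For readers unfamiliar with the pattern, AES-256-GCM sealing with a key id for rotation can be sketched with Node's built-in `crypto` module. This is not the contents of `secret-box.ts`: the envelope layout (`kid.iv.tag.ciphertext`, base64url), the `KeyRing` type and the `loadKey`/`seal`/`unseal` names are all hypothetical. Only the cipher, the 32-byte base64 key, the fail-closed behaviour and the idea of an old key kept for rotation come from the text.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

type KeyRing = Map<string, Buffer>; // key id (kid) -> 32-byte key

// Fail closed: refuse to operate without a well-formed 32-byte key.
function loadKey(b64: string | undefined): Buffer {
  if (!b64) throw new Error("credentials key is not configured");
  const key = Buffer.from(b64, "base64");
  if (key.length !== 32) throw new Error("credentials key must decode to 32 bytes");
  return key;
}

// Hypothetical envelope: "<kid>.<iv>.<tag>.<ciphertext>", each part base64url.
function seal(plaintext: string, kid: string, ring: KeyRing): string {
  const key = ring.get(kid);
  if (!key) throw new Error(`unknown key id: ${kid}`);
  const iv = randomBytes(12); // 96-bit nonce, the size recommended for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return [Buffer.from(kid, "utf8"), iv, tag, body].map((b) => b.toString("base64url")).join(".");
}

function unseal(sealed: string, ring: KeyRing): string {
  const parts = sealed.split(".");
  if (parts.length !== 4) throw new Error("malformed envelope");
  const [kid, iv, tag, body] = parts.map((p) => Buffer.from(p, "base64url"));
  // The kid picks the key, so values sealed under the old key stay readable during rotation.
  const key = ring.get(kid.toString("utf8"));
  if (!key) throw new Error("no key for this envelope");
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  // final() throws if the ciphertext or tag was tampered with.
  return Buffer.concat([decipher.update(body), decipher.final()]).toString("utf8");
}
```

Rotation in this sketch means keeping both keys in the ring, unsealing with whichever `kid` the envelope names, and re-sealing under the current one; the old key can be dropped once no stored envelope references it.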
+
+---
+
### 2.4 OpenTelemetry / New Relic
Every harness run emits GenAI-semconv spans + metrics through `OtelSink`. With env vars set, the sink ships out of process; without them it falls back to the console exporter.
|`CORS_ORIGIN`| empty | Comma-separated origins allowed to call the API (set to your SPA origin) |
|`NODE_ENV`| — |`production` enables secure cookies + tightens defaults |
|`COOKIE_SECURE`| derived from `NODE_ENV`| Force `true` / `false` explicitly |
-|`AGENTOS_SESSION_SECRET`| random per boot |Cookie-session secret. **Set to a stable value in prod** or sessions are invalidated on restart |
-|`API_AUTH_USER` + `API_AUTH_PASS`| unset |Basic-auth gate on the API. When unset the API is open (relies on network policy) |
+|`AGENTOS_SESSION_SECRET`| random per boot |HMAC secret for the signed BFF cookies (`agentos_session`/`agentos_refresh`). **Set to a stable value in prod** or every session is invalidated on restart |
+|`API_AUTH_USER` + `API_AUTH_PASS`| unset |**Legacy** — no longer gates the dashboard (SSO does, §2.6b). Now only used to build the Basic header for outbound loopback calls to the harness (`caAuthHeader`) |
|`AGENTOS_INGEST_TOKEN`| unset | Bearer token guarding `POST /agentos/api/ingest/events` (the Python SDK's telemetry ingest). When unset the route is **open** (anonymous writes to registry/logs/sessions) — set it on any network-exposed pod. The SDK must send the same value as `AGENTOS_INGEST_TOKEN`. |
+|**Auth / RBAC** (§2.6b) | — |`KEYCLOAK_ISSUER_URL`, `OIDC_CLIENT_ID`, `OIDC_CLIENT_SECRET` (+ optional `OIDC_AUDIENCE`/`OIDC_REDIRECT_URI`/`OIDC_POST_LOGOUT_URI`/`OIDC_ROLES_CLAIM`/`OIDC_GROUPS_CLAIM`); `AGENTOS_DEFAULT_ROLE`, `AGENTOS_BOOTSTRAP_ADMINS`, `AGENTOS_DEV_AUTH=1` (local only); `KEYCLOAK_ADMIN_CLIENT_ID`/`SECRET` for the Groups view + group backfill. Provision with `pnpm provision:keycloak`. |
+|**Git credentials** (§2.6c) | unset |`AGENTOS_CREDENTIALS_KEY` (base64 32B; **fail-closed** for credentials CRUD/resolve) + optional `AGENTOS_CREDENTIALS_KEY_OLD` for rotation. |
export AGENTOS_DEV_AUTH=1 # local only — admin principal, no Keycloak needed (§2.6b)
+# To exercise real SSO/RBAC locally instead, drop AGENTOS_DEV_AUTH and set the
+# KEYCLOAK_ISSUER_URL / OIDC_* vars (run `pnpm provision:keycloak` first).
+# For git-credentials locally: export AGENTOS_CREDENTIALS_KEY=$(openssl rand -base64 32)
cd packages/agentos-server && pnpm dev
# Terminal 3 — SPA
@@ -470,6 +533,8 @@ Chronological from earliest to latest. Each entry has the commit ref where relev
|`2756b9a`|`agentos-server`: dashboard API extracted into its own Express service; whole stack dockerized. |
|`af47a08`|`engine-claude-agent-sdk`: set `IS_SANDBOX=1` for the spawned Claude CLI (skips first-run telemetry prompts and treats the host as a sandbox). |
|`8d829b8`|`agentos`: introduced derived `liveChatCapable` field. SDK writes `source.type="library"` to `agent_registry` for harness-mode agents; UI checks `liveChatCapable` and hides the chat-sandbox button for those. Also strips model prefixes at every Mongo write site. |