Skip to content

feat: Google Workspace + Microsoft 365 identity console + Claude device-task dispatch#1053

Merged
ToddHebebrand merged 22 commits into
LanternOps:mainfrom
bdunncompany:feat/google-identity-device-tasks
Jun 12, 2026
Merged

feat: Google Workspace + Microsoft 365 identity console + Claude device-task dispatch#1053
ToddHebebrand merged 22 commits into
LanternOps:mainfrom
bdunncompany:feat/google-identity-device-tasks

Conversation

@bdunncompany

Copy link
Copy Markdown
Collaborator

Summary

Adds a Google Workspace + Microsoft 365 identity console (enter credentials in-app, then run helpdesk identity actions through the AI assistant) plus the Phase 3 Claude device-task dispatch groundwork. Both providers are feature-flagged dark by default (GOOGLE_WORKSPACE_ENABLED, M365_ENABLED).

Google Workspace

  • Org-scoped google_workspace_connections table (RLS, shape-1 org_id) + migration; connection UI (primary domain, super-admin to impersonate, service-account JSON) that does a live Directory call to verify domain-wide delegation before storing the (encrypted) key, and never returns it.
  • Identity tools: lookup, reset password, suspend, sign-out, update/alias, group add/remove + list, OU move, rename, 2SV reset, mail-delegate add/remove, license list/assign/remove, calendar share. Tier-gated (read vs mutating) and wired into the guardrail TOOL_PERMISSIONS + getToolTier.

Microsoft 365

  • Org-scoped m365_connections table (RLS) + migration; connection UI (tenant id, app/client id, client secret) verified via a live Graph call before storing the encrypted secret.
  • The existing m365_* tools are made backend-aware: direct app-only Graph when the org has a connection, else the existing Delegant broker. New m365DirectGraph maps the operations to Graph v1.0.

Claude device-task dispatch (Phase 3)

  • createSession can bind to a device; Fix-with-AI entry point.

Conformance to the in-flight tenant-isolation work

Tests

Route tests, tool gating/handler tests, schema + RLS, and the site-scope/deviceArgs/dead-perms contract tests are green locally (api vitest, web vitest, tsc, eslint).

Note

This is a large branch; happy to split into Google / M365 / device-dispatch PRs if that reviews better.

Comment thread apps/web/src/components/integrations/GoogleWorkspaceIntegration.tsx Fixed
@bdunncompany bdunncompany marked this pull request as draft June 1, 2026 18:11
@bdunncompany bdunncompany marked this pull request as ready for review June 1, 2026 18:21
@ToddHebebrand

Copy link
Copy Markdown
Collaborator

Full review — security + correctness + reliability + test coverage (4 passes; traced cross-file data flows on pr-1053, did not review diff-only). Net: the security/tenant-isolation posture is clean, but the broader review found 3 merge-blockers (one of which silently defeats a headline feature) plus error-handling and coverage gaps. Detail below; happy to see it merged once the blockers are addressed.

Security: PASS. Both new tables (google_workspace_connections, m365_connections) ship RLS enable+force+four org-isolation policies in their creating migration (direct-org_id shape → auto-discovered by the RLS contract test). Connection secrets are AES-256-GCM encrypted before storage and excluded from all response shapes; not logged. The createSession device-binding enforces both org and site axes (auth.canAccessSite), matching the #1047 fix. Egress is fixed-host (googleapis.com / graph.microsoft.com) with the M365 tenant id forced through a strict GUID check before the token URL. MFA + permission gates on mutations. Site-scope coverage contract: 7/7.

BLOCKER 1 (CRITICAL) — the M365 direct-Graph backend is unreachable in production. m365ToolDefinitions (services/aiAgentSdkTools.ts:524) early-returns [] unless DELEGANT_BASE_URL is set. So on a self-hosted instance with M365_ENABLED=true and a saved m365_connections row but no Delegant broker, the agent is never offered the M365 tools — the entire direct path (m365DirectGraph.ts + the "prefer direct backend" resolver) is dead code in the production tool path. The unit tests pass only because they call handlers directly and bypass the gate. Fix: gate on M365_ENABLED || DELEGANT_BASE_URL, mirroring how googleToolDefinitions gates on GOOGLE_WORKSPACE_ENABLED.

BLOCKER 2 (CRITICAL) — tenantCascade is incomplete; fails the cascade contract test. In services/tenantCascade.ts, the new m365_connections table (direct org_id) is missing from ORG_CASCADE_DELETE_ORDER, and google_workspace_connections (line 109) is placed out of alphabetical order (between deployments and device_boot_metrics). tenantCascade.integration.test.ts asserts both completeness (every org_id table is listed) and sorted order, so this fails CI. Functionally, GDPR org-erasure would skip the table holding the encrypted Azure client secret. Fix: add m365_connections and move google_workspace_connections to its sorted slot.

BLOCKER 3 (CRITICAL, reliability) — failed privileged mutations are recorded as successes. makeSessionAwareHandler infers tool success by JSON.parse-ing the handler return and checking for an error key. Two handlers defeat that by returning prose: (a) googleOffboardUserHandler (services/aiToolsGoogle.ts:818) always returns a prose summary, so an offboard where 0/6 steps succeeded is persisted as a successful Tier-3 mutation — a leaver who still has full access shows a green check in the audit trail; (b) the M365 errorTemplate (services/aiToolsM365.ts:121) returns prose and drops the structured code, so a Graph/Delegant auth_failed on m365_disable_user is also logged as success. Root cause: success is inferred from string shape. Fix: return a structured error envelope on failure (the Google handlers' errorString(code, …) convention), or pass an explicit success flag to the post-tool-use path.

HIGH — error masking on the M365 path. resolveUserId (services/aiToolsM365.ts:118) collapses every non-ok get_user result (auth_failed / forbidden / 5xx / unreachable) into null → "user not found." A misconfigured connection (expired secret, missing Graph grant) on m365_disable_user/m365_reset_password reports a phantom missing user instead of the real credential/permission failure. Propagate the underlying code instead of nulling it.

HIGH — google_set_forwarding swallows the create error. services/aiToolsGoogle.ts:560 computes normalizeGoogleError then discards it in an empty if body; Google accepts updateAutoForwarding to a pending-verification address, so forwarding is reported "enabled" while mail is not forwarded. Same empty-catch in the offboard forwarding step (line ~752). Surface the create error or verify verificationStatus before reporting success.

MEDIUM — direct-vs-Delegant precedence is silent. resolveContext prefers a direct connection and never consults a session's bound delegantM365ConnectionId; if both exist, the session silently retargets the org's own tenant with no signal. Honor an explicit session binding, or error on conflict.

Test gaps (privileged identity surface — worth closing before merge): no end-to-end approval-gating assertion (only the static tier map is tested — a refactor could make a destructive tool auto-execute with all tests still green); destructive tools aren't proven un-invokable as reads; google_restore_user has zero behavioral coverage; normalizeGoogleError isn't tested against a realistic Gaxios error body (secret-leak risk); m365DirectGraph.getToken fail-closed branches and M365 UPN-not-found on mutators are untested.

Positives: migrations are idempotent, no inner BEGIN/COMMIT, correct date-prefix ordering; routes export xxxRoutes and are mounted; pagination handled; the connection-create routes are genuinely fail-closed (live verify before storing, encrypt + null-row checks).

Status: Security PASS. 3 blockers (M365 tool-gate, tenantCascade completeness/order, silent-failure on privileged mutations) + 2 HIGH error-handling + test gaps. Given the size, the offered split (Google / M365-direct / device-dispatch) would isolate Blockers 1–2 to the M365 slice and make each piece reviewable on its own.

@ToddHebebrand ToddHebebrand left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Deep review (two-reviewer pass: backend/RLS/secrets + AI-tools/dispatch/UI). The feature is in good shape — this is a one-line bounce on a GDPR-cascade contract gap.

The whole tenancy/secret surface verified clean. Both new tables carry RLS correctly: google_workspace_connections and m365_connections each have org_id NOT NULL, ENABLE+FORCE ROW LEVEL SECURITY, and all four breeze_org_isolation_* policies in the creating migration (shape-1, auto-discovered by rls-coverage.integration.test.ts — no allowlist edit needed). Service-account keys / client secrets are encrypted at rest, never returned by toConnectionResponse, never logged. Every /google/connection and /m365/connection route self-applies authMiddleware + requirePermission(ORGS_READ/WRITE), with requireMfa() on mutations. Migrations are idempotent, correctly dated, no inner BEGIN/COMMIT. SSRF-safe (fixed Graph base, GUID-validated tenant id, OData single-quote escaping). On the AI side, the #991-class tier-resolution bug is fixed and test-guarded — Google tools are session-aware (intentionally out of the exec aiTools Map), getToolTier falls through to googleToolTiers, and aiAgentSdkTools.googlegating.test.ts asserts every google_* name resolves to a defined tier. Device-task dispatch (aiAgent.ts:63-86) validates same-org and the #1047 canAccessSite axis before binding; destructive tools (google_wipe_mobile_device, google_offboard_user, suspend/reset) are tier-3 → checkGuardrails forces requiresApproval.

Must-fix — new m365_connections missing from the org-cascade contract. apps/api/migrations/2026-06-01-m365-connections.sql:13 creates m365_connections with an org_id column, but ORG_CASCADE_DELETE_ORDER in apps/api/src/services/tenantCascade.ts does not list it — the entry near line 106 is delegant_m365_connections, a different (older, #991) table. tenantCascade.integration.test.ts cross-checks the list against every org_id column in information_schema and will fail CI on the omission, and functionally an org delete would orphan its M365 identity connection. This is the exact gap class #991 hit. Add 'm365_connections' alongside the other connection tables (google_workspace_connections is already correctly listed at :109).

Nice-to-have (non-blocking). GoogleWorkspaceIntegration.tsx and M365Integration.tsx POST-save / DELETE-disconnect via raw fetchWithAuth with inline setSaveState feedback — that's the CLAUDE.md-permitted inline-handler exception, but they're recorded nowhere (not in runActionAllowlist.ts / migration backlog), so they silently skip runAction's HTTP-200 {success:false} detection. Add them to the backlog so they're tracked, not invisible.

Everything else is approve-ready — re-request once m365_connections is in the cascade list.

…tion + wiring

google_workspace_connections table (org-scoped, RLS shape 1, FORCE RLS, 4 org
policies), idempotent migration (validated twice on local test DB), schema
barrel export, ORG_CASCADE_DELETE_ORDER registration, and GOOGLE_WORKSPACE_ENABLED
feature flag (default off). No client/route/tools yet — those are Phase 1b/2.

WIP checkpoint on a long integrated build; squashed before PR.
…nt tools

Phase 1b: @googleapis/admin + @googleapis/gmail + google-auth-library deps;
googleClient.ts (DWD JWT via each package's own auth namespace — admin
impersonates super-admin, gmail impersonates target user; least-privilege
scopes; typed error normalization); googleHelpers.ts (org-scoped connection
load + authorize + decrypt); routes/google.ts (GET/POST/DELETE /google/connection,
ORGS_WRITE+MFA gated, fail-closed live-Directory verify before storing the
encrypted SA key, never returns the key); mounted in index.ts.

Phase 2: aiToolsGoogle.ts (8 handlers — lookup[T1], reset_password, suspend,
restore, signout, set_forwarding, set_vacation, update_user[T3]); registered in
aiAgentSdkTools.ts via googleToolDefinitions gated on GOOGLE_WORKSPACE_ENABLED;
TOOL_TIERS + aiGuardrails RBAC ('google' resource) entries. All T3 writes ride
the existing tier-3 approval + audit path; require a reason.

Tests: 31 new (client, helpers, 8 handlers incl. error mapping + disposition,
gating). Existing aiGuardrails/m365 gating/session tests still green. API tsc
clean. No push (pre-deploy).
…th-AI UI

API: createSession now accepts an org-validated deviceId + approvalMode and
binds them to the ai_session (the 'task on this computer' scoping); route schema
extended. Cross-org device binding rejected ('Invalid device').

Web: aiStore.createSession takes an optional deviceId and sends approvalMode;
new startDeviceTask(deviceId, ctx) opens a device-scoped session. DeviceDetails
gets a 'Fix with AI' button (data-testid device-fix-with-ai) that launches it
with the device page-context.

Tests: aiAgent.deviceTask.test.ts (4, incl. cross-org rejection); existing
aiAgent.m365 binding unaffected. Web aiStore test green (node22); astro check 0
errors/0 warnings. Full touched-surface regression: 154/154 green. pnpm lockfile
updated for the @googleapis deps. No push (pre-deploy).
…he named Google list)

Adds calendar sharing, the last item Billy explicitly named that wasn't
already covered (aliases are handled in google_update_user). New
getCalendarClient impersonates the calendar OWNER with a narrow
calendar.acls scope; google_share_calendar (tier 3, reason+approval)
inserts an ACL rule (role: freeBusyReader/reader/writer/owner, default
reader) on the owner's primary or a named calendar.

- @googleapis/calendar@^15 dep
- CALENDAR_SCOPES added to the DWD scope CSV (operator must authorize)
- 4 handler tests + gating test bumped to 9 tools + scope-union test
- api tsc clean; 35/35 google unit tests green
… (cluster 1)

google_offboard_user: best-effort sequence over a departing user — optional
OOO + manager forwarding (no copy kept) run first while the account is active,
then OAuth-token revoke, remove-from-all-groups, SELECTIVE mobile account wipe
(admin_account_wipe, corporate data only — BYOD-safe, never a full device wipe),
sign-out, suspend last. Each step independent + reported.

google_wipe_mobile_device: STOLEN-DEVICE-ONLY full factory reset
(admin_remote_wipe), explicitly separate from offboard, erases the whole device.

Also fixes a latent fail-closed RBAC gap: google_share_calendar (Phase 4),
google_offboard_user and google_wipe_mobile_device were missing from
TOOL_PERMISSIONS, which would have blocked them at the guardrail.

New DWD scopes (scope-minimized): admin.directory.group, .group.member,
.device.mobile.action. Verified all API shapes against the installed type defs
+ Google docs (account_wipe vs remote_wipe enum). api tsc clean; 42/42 google
unit tests + 80/80 guardrails green.

Follow-up noted: add a TOOL_TIERS<->TOOL_PERMISSIONS coverage test.
…y-email (cluster 2)

google_security_drift (tier 1, read): pages the directory and buckets users
into no-2-step, super-admins, suspended, never-logged-in, and stale (>N days,
default 90). Pure computeSecurityDrift() is unit-tested. No new scope (uses
admin.directory.user).

google_email_report (tier 1): runs the drift report, renders HTML, and emails
it via the existing email service. Recipient is LOCKED to the connection's
admin address (no arbitrary recipient) so the agent cannot exfiltrate directory
data; errors cleanly when no email provider is configured.

Registered in aiAgentSdkTools (TOOL_TIERS + definitions), aiGuardrails
(TOOL_PERMISSIONS, resource google/read), gating test bumped to 13 tools.
api tsc clean; 127/127 google+guardrails unit tests green.
…raph backend + UI

Google (clusters 1-3 complete, 24 tools): group membership, OU move, rename,
2SV reset, mail delegation, license management (+@googleapis/licensing) + the
Google Workspace connection UI (Identity integrations tab).

M365: org-scoped m365_connections table + RLS migration, /m365/connection
connect route (M365_ENABLED flag), m365DirectGraph.ts (client-credentials Graph
backend reusing c2cM365 helpers), aiToolsM365 made backend-aware (direct Graph
when an org connection exists, Delegant broker otherwise), and the Microsoft 365
connection UI sub-tab. Existing Delegant tools/tests unchanged.

Verified: api 5455 passing, web 697 passing, astro check 0 errors, tsc clean.
Checkpoint on feature branch; not pushed.
Cover invokeDirect's Graph mapping for the direct M365 backend: get_user GET,
disable_user PATCH accountEnabled:false, reset_user_password PATCH
passwordProfile (+ generated temp password), list_groups, signin-activity
$filter, 403→forbidden, and the missing-userId bad_request guard (no Graph
call). 7/7 green.
…ntime)

The 24 google_* tools are session-aware (not in the aiTools execution
registry) and were not in m365ToolTiers, so getToolTier() returned undefined
for all of them. checkGuardrails then treated every Google tool as tier
undefined → 'Unknown tool' → blocked, so the entire Google Workspace suite
failed at runtime despite passing registration + handler unit tests (those
never exercise the checkGuardrails→getToolTier path).

Wire googleToolTiers into getToolTier's fallback chain. Add a regression test
asserting getToolTier resolves every google_* tool to its declared tier.

Verified: tsc clean; full api suite 5463 passing / 0 failures.
…-in-depth)

m365DirectGraph get_user_signin_activity interpolated the user id into an
OData $filter; encodeURIComponent covers URL transport but not OData literal
escaping. Double single-quotes so a quote in the id can't alter the filter.
Same-tenant read with the org's own token, so low severity. tsc + test green.
Covers the M365 connection route the credentials UI drives: GET never returns
the secret; POST verifies via Graph (acquireClientCredentialsToken +
testGraphAccess) and encrypts before storing, returning 201 with no secret in
the body; Graph-verify failure and token-acquisition failure both 400 with a
hint and do not encrypt/store; missing client secret is rejected 400. 7/7.
Symmetric to the m365 route tests: GET never returns the service-account key;
POST verifies via a live Directory call (getDirectoryClient.users.get) and
encrypts the key before storing, returning 201 with no key in the body; a
failed DWD verify and a malformed key both 400 (with hint on the former) and
do not encrypt/store; missing key rejected 400. 7/7, tsc clean.
index.ts applies no global auth to the /api/v1 group; each route file attaches
authMiddleware itself. google.ts and m365.ts were the only routers missing it,
so requirePermission saw no auth context and returned 401 for every caller even
with a valid Bearer. Add authMiddleware before the feature-flag gate, plus a
regression test in each suite that mounts the router without harness auth and
asserts authMiddleware is invoked.
…y connection forms

Collapsible step-by-step in the Google and M365 connection forms (service-account
JSON + domain-wide delegation scopes for Google; Entra app registration tenant/client
id + client secret + Graph app permissions for M365).
…s#1049 security contracts

- M365 tenant id: acquireClientCredentialsToken now requires the M365TenantId
  brand (Entra GUID). Validate-and-narrow via isM365TenantId at both call sites
  (POST /m365/connection and m365DirectGraph.getToken), rejecting non-GUID ids
  before they reach the token URL; update fixtures/mocks + add a reject test.
- aiAgent.createSession device binding: honor LanternOps#1047's auth.canAccessSite so a
  site-restricted caller cannot bind a session to an out-of-site device within
  an accessible org. Add site-restricted reject + allow tests.
Avoids Trivy's gcp-service-account secret rule false-positive on the example
'{ "type": "service_account", ... }' literal; use a descriptive placeholder.
…chable M365 gate, cascade

Addresses Todd's review of the Google/M365 identity console:
- M365 tool gate: m365ToolDefinitions advertises tools on M365_ENABLED OR
  DELEGANT_BASE_URL (was Delegant-only), so the direct app-only Graph path is
  reachable in production.
- tenantCascade: add m365_connections, sort google_workspace_connections — both
  org_id tables now covered by GDPR org-erasure; cascade contract passes.
- Privileged mutations no longer log failure as success: googleOffboardUserHandler
  returns a structured error envelope when any step fails; m365 errorTemplate
  returns errorString(code,...) instead of prose; resolveUserId propagates the
  underlying error (404 -> not-found, auth/forbidden/5xx -> the real failure)
  rather than collapsing every non-ok to phantom 'user not found'.
- google_set_forwarding: verify the destination's verificationStatus and surface
  a real create failure (via a create-or-read probe) instead of reporting
  forwarding 'enabled' when it can't deliver; same fix in the offboard step.
- Add google_disable_forwarding to turn forwarding back off (optionally remove
  the address). Wired into tiers + guardrails + SDK registry.
- Tests for all the above (forwarding pending/failure/already-exists, disable,
  offboard envelope, m365 auth-vs-not-found).
@bdunncompany bdunncompany force-pushed the feat/google-identity-device-tasks branch from cba9559 to 14d2581 Compare June 2, 2026 00:13
@bdunncompany

Copy link
Copy Markdown
Collaborator Author

Thanks for the thorough review — all 3 blockers + both HIGH addressed, plus a turn-it-off companion.

Blocker 1 — M365 direct path unreachable. m365ToolDefinitions now gates on M365_ENABLED || DELEGANT_BASE_URL (mirrors googleToolDefinitions), so the direct app-only Graph path is advertised on an M365_ENABLED instance with a saved connection and no broker.

Blocker 2 — tenantCascade. Added m365_connections; moved google_workspace_connections to its sorted slot. Both org_id tables are now in ORG_CASCADE_DELETE_ORDER; alphabetical + completeness checks pass.

Blocker 3 — failed mutations logged as success. Root cause was success inferred from string shape. Fixed at the source: googleOffboardUserHandler returns a structured errorString('offboard_incomplete', …) when any step fails; the M365 errorTemplate returns errorString(e.code, …) instead of prose so the code survives and the post-tool-use path sees the failure.

HIGH — resolveUserId masking. Now returns a discriminated result; callers map a 404 (not_found) to "user not found" and surface auth_failed/forbidden/5xx as the real error instead of a phantom missing user.

HIGH — google_set_forwarding swallowed the create error. Reworked via a shared enableAutoForwarding helper: it reads the destination's verificationStatus and, on a create throw, probes with a read (the address may already exist — Gmail 409s, which normalizeGoogleError can't distinguish by code) and only surfaces the create error when the address is genuinely absent. A non-accepted status now returns a forwarding_pending_verification envelope rather than reporting "enabled." Same fix applied to the offboard forwarding step.

New — google_disable_forwarding to turn forwarding back off (optionally removing the address), wired into tiers + guardrails + the SDK registry.

Tests added for: forwarding pending-verification, real create failure, already-exists/read-status, disable, the offboard error envelope, and m365 auth-failure-vs-not-found. Note the Gmail/Graph mutation paths are covered by unit mocks (I don't have a live test tenant for them).

Did not split into 3 PRs — happy to if you'd prefer, but since the blockers are fixed and you've already reviewed the whole thing, a delta re-review seemed lighter than carving out the shared tenantCascade/SDK-registry/session machinery. Your call. Re-requesting review.

@ToddHebebrand

Copy link
Copy Markdown
Collaborator

Status: verified merge-ready — holding for the release after the in-flight security release. This isn't a code concern; it's purely release sequencing, so nothing more is needed from you right now.

Re-review of the fix-up (14d258108) is clean. The one must-fix from the prior pass is resolved — m365_connections is now in ORG_CASCADE_DELETE_ORDER (tenantCascade.ts:161), and you also relocated google_workspace_connections to its correct alphabetical slot (:147). A fresh full pass over the whole +4525/−35 diff (backend/RLS/secrets, AI-tools/dispatch, web/UI) turned up zero must-fix: both new tables carry org_id + RLS + all four policies idempotently; secrets are encrypted, stripped from responses, never logged; routes self-apply authMiddleware + requireMfa; SSRF surface is locked (fixed Graph base, GUID-gated tenant id, OData escaping, report recipient pinned to conn.adminEmail); tool-tier resolution holds across all 25 tools with the three lists in agreement; destructive tools are tier-3/approval-gated and auto_approve can't bypass; device-task dispatch validates org and the #1047 site axis. The bonus hardening in the fix-up (google_disable_forwarding, forwarding pending-verification, offboard FAILED envelope, the M365_ENABLED OR DELEGANT_BASE_URL gating) all checks out.

The Trivy CRITICAL on the service-account field is a stale false positive — already remediated by a163049e3 (plain placeholder), and all Trivy/Gitleaks/Security-Audit jobs pass; the only red is the non-required Smoke integration flakiness.

A few non-blocking nits I'll hand off when we pick this back up (no action now): the unused _auth in resolveContext (worth a one-line comment noting the deliberate divergence from the M365 auth.orgId path), a redundant second m365_connections select in m365DirectGraph.getToken, IntegrationsPage.tsx:431 using useState instead of the hash convention for the sub-tab, and CreateAiSessionRequest not declaring deviceId/approvalMode (route tests pass).

Will merge once the security release is out. Leaving the existing review state as-is until then.

bdunncompany and others added 3 commits June 2, 2026 14:00
… 500)

"Fix with AI" on a device dispatches POST /ai/sessions with only { deviceId }
and no orgId. createSession resolved the session org as
`options.orgId ?? auth.orgId ?? auth.accessibleOrgIds?.[0]`, so a partner /
multi-org caller (auth.orgId === null) bound the session to accessibleOrgIds[0]
— an org unrelated to the device. The SECURITY-CRITICAL cross-org check
(dev.orgId !== orgId) then rejected the dispatch, and the bare throw surfaced
as HTTP 500 "Invalid device". For an MSP admin every Fix-with-AI click failed.

Fix: resolve the device up front and anchor the session org to the device's
org when the caller can access it (auth.canAccessOrg), before falling back to
auth.orgId / accessibleOrgIds[0]. The opaque cross-org/site checks are
unchanged: a device the caller cannot reach still falls through to the same
"Invalid device" rejection (no existence leak), and an explicit orgId that
doesn't own the device is still rejected. Also map "Invalid device" to 400
instead of 500 in the route.

Tests (TDD): multi-org caller binds to the device's org; explicit-orgId
mismatch still rejected; route maps Invalid device -> 400. Verified live: the
repro now returns 201 bound to the device's org; bogus device returns 400.

Found via live E2E (FEATURE_TEST_LOG 2026-06-01); unit tests passed orgId
explicitly so they missed it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 7b026ad219c0b2ab933f8ca20e7665d32a246ac2)
ToddHebebrand added a commit that referenced this pull request Jun 10, 2026
#1182)

## Summary

Addresses **L-5** of the #917 hardening cluster. Partner unsuspend
(`apps/api/src/routes/admin/abuse.ts`) did:

```ts
UPDATE users SET status='active' WHERE partner_id=? AND status='disabled'
```

— re-enabling **every** disabled user under the partner, regardless of
why they were disabled. So a suspend → unsuspend cycle would silently
resurrect users an operator had deliberately disabled for **compromise,
off-boarding, or a manual admin action**.

## Fix — record why a user is disabled

- **Migration** `2026-06-09-users-disabled-reason.sql`: adds a nullable
`users.disabled_reason` (NULL = "disabled for some non-suspension
reason") and backfills users currently disabled under a
still-`suspended` partner as `'partner_suspended'`, so in-flight
suspensions still unsuspend correctly. Idempotent; adding a column to
the already-RLS-forced `users` table needs no policy change.
- **Suspend** stamps `disabled_reason='partner_suspended'` on the users
it disables, and now skips users already `disabled` so it never
re-stamps an other-reason disable. Session/JWT/OAuth/remote-session
revocation still covers **all** partner users (via the separate
`affectedUserIds` set), so the suspension's security posture is
unchanged.
- **Unsuspend** re-enables only `status='disabled' AND
disabled_reason='partner_suspended'`, clearing the marker. Users
disabled for any other reason stay disabled.

The two user-state queries are extracted into small exported helpers
(`disablePartnerUsersForSuspension`, `reEnableSuspensionDisabledUsers`)
so the handler and the integration test share the exact SQL.

## Tests
- New `partnerUnsuspendScope.integration.test.ts` (real Postgres) proves
the row-filtering semantics end-to-end: a user disabled for another
reason is **not** re-stamped by suspend and is **left disabled** after
unsuspend, while suspension-disabled users are restored.
- Existing abuse unit suite (26) and the `autoMigrate` ordering guard
(43) green; `tsc` + lint clean.

> Note: a local `db:check-drift` against the shared dev DB flags
`2026-06-01-google-workspace-connections` / `m365-connections` ledger
entries with no files — those are from the still-open #1053 branch (not
on `main`), unrelated to this PR. The new migration applies cleanly on a
fresh DB (verified via the integration runner's `autoMigrate`).

Addresses **L-5** of #917 (other sub-items tracked separately).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ToddHebebrand and others added 2 commits June 11, 2026 19:42
# Conflicts:
#	apps/api/package.json
#	apps/web/src/components/devices/DeviceDetails.tsx
#	pnpm-lock.yaml
Trivy blocking scan flagged HIGH CVE-2026-48068/48069 in @grpc/grpc-js
1.14.3 (malformed-request server crash), a transitive dep of
firebase-admin. Freshly-published CVE; main's last scan passed only
because it ran before the vuln DB updated. Pin to the patched 1.14.4 via
the existing pnpm.overrides convention.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ToddHebebrand ToddHebebrand merged commit b393386 into LanternOps:main Jun 12, 2026
33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants