feat: add GitHub dispatch worker and harden agent session flows #2081

Merged
yottahmd merged 10 commits into main from feat/gci
May 3, 2026

@yottahmd
Collaborator

@yottahmd yottahmd commented May 2, 2026

Summary

  • add scheduler-side GitHub dispatch polling, tracked-job persistence, and shared webhook runtime-param handling
  • expose machine-authenticated Cloud credentials for dispatch API calls and harden the related license client and manager paths
  • fix agent session UI action persistence, polling-driven root and delegate navigation, and OpenAI Codex tool-call replay handling

Testing

  • go test ./internal/service/scheduler ./internal/license ./internal/agent ./internal/llm/providers/openaicodex -count=1
  • pnpm exec vitest run src/features/agent/hooks/__tests__/useAgentChat.test.tsx

Summary by CodeRabbit

Release Notes

  • New Features

    • Added GitHub Dispatch integration for processing dispatch jobs with polling and status tracking
    • Implemented webhook runtime parameter formatting for consistent environment variable handling
    • Enhanced agent sessions with UI action navigation support and replay functionality
  • Bug Fixes

    • Fixed LLM tool call ID field handling to conditionally include ID only when present
  • Tests

    • Added comprehensive test coverage for GitHub Dispatch workflows, webhook parameters, and session UI actions

@coderabbitai

coderabbitai Bot commented May 2, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c1dc9c10-bcda-4989-8a71-b7eaa7b2d340

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR introduces a GitHub Dispatch job-polling scheduler worker that continuously fetches dispatch jobs from a cloud license service, enqueues corresponding DAG runs with webhook-derived runtime parameters, tracks job state on disk, and reports completion status back to the cloud. Supporting changes add cloud client methods, license credential structures, webhook parameter formatting, session message persistence for UI actions, and frontend navigation handling for UI-action messages.

Changes

GitHub Dispatch Feature

| Layer / File(s) | Summary |
|---|---|
| **Cloud Licensing Foundation**<br>`internal/license/activation.go`, `internal/license/client.go`, `internal/license/manager.go` | `CloudMachineCredentials` struct added. `CloudClient` gains `PullGitHubDispatch`, `AcceptGitHubDispatch`, and `FinishGitHubDispatch` methods with request/response types. `Manager` exposes `ActivationData()` and `CloudMachineCredentials()` to retrieve persisted credentials. |
| **Webhook Runtime Parameters**<br>`internal/core/webhook_params.go` | `BuildWebhookRuntimeParams` formats webhook payload/headers/extras into space-delimited `KEY=value` syntax with sorted non-empty extras entries. |
| **Job Persistence**<br>`internal/persis/filegithubdispatch/store.go` | File-backed store for `TrackedJob` records (job ID, DAG name, run ID, phase, update time) under `tracked.json`. Provides `Upsert`, `Delete`, `List` with atomic file writes and sorted retrieval. |
| **Scheduler Worker & Job Processing**<br>`internal/service/scheduler/github_dispatch.go` | `NewGitHubDispatchWorker` wires cloud client, license manager, DAG/queue stores, and tracker into a worker. Concurrent `pullLoop` fetches jobs, routes cancel commands to abort handler, enqueues DAG runs with webhook params, and persists tracking. `reportLoop` monitors DAG run completion, finalizes GitHub dispatch with terminal status, and cleans up tracking. Helpers extract GitHub metadata (actor, event, repo, SHA, PR/issue, release, workflow) from dispatch payload and build runtime parameters. |
| **DAG Execution Queueing**<br>`internal/service/scheduler/enqueue_webhook.go` | `EnqueueWebhookRun` checks for duplicate attempts, rehydrates and clones execution DAG, generates webhook-specific log/artifact paths, creates queued DAG run attempt with metadata, enqueues to queue with low priority, and defers rollback on error. |
| **Session Message Handling**<br>`internal/agent/session.go` | `createEmitUIActionFunc` now locks `sm.mu`, increments `sm.sequenceID`, appends UIAction message to `sm.messages` list, publishes to subscribers, and persists via `sm.onMessage` callback. |
| **Frontend Navigation**<br>`ui/src/features/agent/hooks/useAgentChat.ts` | `useAgentChat` hook adds per-session refs to track handled UI-action IDs and hydration state. New `consumeNavigateUIActions` effect filters streamed messages for `ui_action.navigate` entries and triggers navigation, preventing duplicates and allowing controlled initial replay. Delegate snapshot application now includes navigation-action consumption. |
| **Scheduler Integration & Wiring**<br>`internal/cmd/context.go`, `internal/service/scheduler/scheduler.go`, `internal/service/frontend/api/v1/webhooks.go` | `NewScheduler` conditionally creates GitHub dispatch tracker and worker when `LicenseManager` is non-nil. Scheduler adds `githubDispatch` runner field and launches it in a startup goroutine. `webhooks.go` delegates parameter building to `core.BuildWebhookRuntimeParams`. |
| **Comprehensive Test Coverage**<br>`internal/license/client_test.go`, `internal/license/manager_test.go`, `internal/persis/filegithubdispatch/store_test.go`, `internal/service/scheduler/github_dispatch_test.go`, `internal/agent/session_test.go`, `ui/src/features/agent/hooks/__tests__/useAgentChat.test.tsx` | Tests validate cloud client dispatch endpoints (pull/accept/finish), license credential resolution, job persistence (upsert/list/delete/missing file), GitHub dispatch worker end-to-end (process job, report status, cancel handling, error continuation), session UIAction storage, and frontend navigation replay during fallback polling and delegate reopening. |
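
The webhook runtime-parameter formatting described above can be illustrated with a minimal sketch. `buildParams` is a hypothetical stand-in, not the PR's `BuildWebhookRuntimeParams`; it assumes that "non-empty" means both key and value are set, and that values are double-quoted, as the review's note about serialized params such as `GITHUB_EVENT_NAME="pull_request"` suggests.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// buildParams is a hypothetical helper (not the PR's implementation).
// It renders extras as space-delimited KEY="value" pairs, sorted by key,
// and skips entries whose key or value is empty (an assumption).
func buildParams(extras map[string]string) string {
	keys := make([]string, 0, len(extras))
	for k, v := range extras {
		if k == "" || v == "" {
			continue
		}
		keys = append(keys, k)
	}
	// Sorting makes the output deterministic despite random map iteration order.
	sort.Strings(keys)

	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+strconv.Quote(extras[k]))
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(buildParams(map[string]string{
		"GITHUB_PR_NUMBER":  "42",
		"GITHUB_EVENT_NAME": "pull_request",
		"GITHUB_SHA":        "", // dropped: empty value
	}))
}
```

Sorting the keys is what lets tests assert on substrings of the parameter string without depending on map iteration order.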

OpenAI Codex Tool-Call ID Fix

| Layer / File(s) | Summary |
|---|---|
| **Conditional ID Emission**<br>`internal/llm/providers/openaicodex/codex.go` | `convertMessages` now includes the per-item `"id"` field in function-call payloads only when `splitToolCallID` returns a non-empty `itemID`, preventing empty-string IDs in the OpenAI Codex API payload. |
| **Test Coverage**<br>`internal/llm/providers/openaicodex/codex_test.go` | `TestConvertMessages_OmitsEmptyFunctionCallItemID` asserts that function-call payloads omit the `"id"` field when the tool call item ID is empty. |
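
A minimal sketch of the conditional `"id"` emission follows. `splitID` is a hypothetical stand-in for `splitToolCallID`, and the `callID|itemID` encoding and the payload field names other than `"id"` are assumptions made only for illustration; the source does not show the real format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// splitID is a hypothetical stand-in for splitToolCallID. The
// "callID|itemID" encoding is assumed for this sketch only.
func splitID(combined string) (callID, itemID string) {
	callID, itemID, _ = strings.Cut(combined, "|")
	return callID, itemID
}

// functionCallItem builds a function-call payload item and adds "id"
// only when the item ID is non-empty, so no empty-string ID is sent.
func functionCallItem(toolCallID, name, args string) map[string]any {
	callID, itemID := splitID(toolCallID)
	item := map[string]any{
		"type":      "function_call", // illustrative field names
		"call_id":   callID,
		"name":      name,
		"arguments": args,
	}
	if itemID != "" {
		item["id"] = itemID
	}
	return item
}

func main() {
	for _, id := range []string{"call_1|fc_1", "call_2"} {
		b, err := json.Marshal(functionCallItem(id, "run", "{}"))
		if err != nil {
			panic(err)
		}
		fmt.Println(string(b))
	}
}
```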

Sequence Diagram

sequenceDiagram
    participant Scheduler
    participant CloudClient
    participant GitHubDispatchWorker
    participant DAGStore
    participant Queue
    participant Tracker as Tracker Store
    participant DAGRunStore
    
    Scheduler->>GitHubDispatchWorker: Start()
    Note over GitHubDispatchWorker: pullLoop & reportLoop concurrent
    
    GitHubDispatchWorker->>CloudClient: PullGitHubDispatch()
    alt Job Available
        CloudClient-->>GitHubDispatchWorker: GitHubDispatchJob
        GitHubDispatchWorker->>DAGStore: Fetch DAG config
        DAGStore-->>GitHubDispatchWorker: DAG definition
        GitHubDispatchWorker->>Queue: EnqueueWebhookRun<br/>(with runtime params)
        GitHubDispatchWorker->>Tracker: Upsert(JobID, pending_accept)
        GitHubDispatchWorker->>CloudClient: AcceptGitHubDispatch()
        Tracker->>Tracker: Update phase: accepted
    else No Job
        CloudClient-->>GitHubDispatchWorker: nil
        Note over GitHubDispatchWorker: Sleep (idleDelay)
    end
    
    par Report Loop
        GitHubDispatchWorker->>Tracker: List()
        Tracker-->>GitHubDispatchWorker: [TrackedJob]
        GitHubDispatchWorker->>DAGRunStore: GetAttempt(dagRunID)
        alt DAG Complete
            DAGRunStore-->>GitHubDispatchWorker: Terminal Status
            GitHubDispatchWorker->>CloudClient: FinishGitHubDispatch<br/>(with status)
            GitHubDispatchWorker->>Tracker: Delete(JobID)
        else DAG Active
            Note over GitHubDispatchWorker: Skip, check again
        end
    end
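
The pull branch of the diagram above can be sketched as a single loop iteration. The interfaces, the `job` type, and `pullOnce` are all hypothetical and far simpler than the PR's worker; only the ordering (pull, enqueue, track as `pending_accept`, accept, track as `accepted`) is taken from the diagram.

```go
package main

import (
	"context"
	"fmt"
)

type job struct{ ID, DAGName string }

// dispatchClient and tracker are hypothetical, narrowed-down interfaces.
type dispatchClient interface {
	Pull(ctx context.Context) (*job, error)
	Accept(ctx context.Context, jobID string) error
}

type tracker interface {
	Upsert(jobID, phase string) error
}

// pullOnce handles at most one job and reports whether one was processed.
// A caller would sleep for an idle delay when it returns false.
func pullOnce(ctx context.Context, c dispatchClient, t tracker, enqueue func(*job) error) (bool, error) {
	j, err := c.Pull(ctx)
	if err != nil {
		return false, fmt.Errorf("pull: %w", err)
	}
	if j == nil {
		return false, nil // no job available
	}
	if err := enqueue(j); err != nil {
		return false, fmt.Errorf("enqueue: %w", err)
	}
	// Persist before accepting so a crash cannot lose track of the job.
	if err := t.Upsert(j.ID, "pending_accept"); err != nil {
		return false, fmt.Errorf("track: %w", err)
	}
	if err := c.Accept(ctx, j.ID); err != nil {
		return false, fmt.Errorf("accept: %w", err)
	}
	return true, t.Upsert(j.ID, "accepted")
}

// fakeClient and memTracker are in-memory test doubles.
type fakeClient struct{ jobs []*job }

func (f *fakeClient) Pull(context.Context) (*job, error) {
	if len(f.jobs) == 0 {
		return nil, nil
	}
	j := f.jobs[0]
	f.jobs = f.jobs[1:]
	return j, nil
}

func (f *fakeClient) Accept(context.Context, string) error { return nil }

type memTracker struct{ phases map[string][]string }

func (m *memTracker) Upsert(id, phase string) error {
	m.phases[id] = append(m.phases[id], phase)
	return nil
}

// drain calls pullOnce until the fake queue is empty and returns the
// phases recorded per job ID.
func drain(jobs ...*job) (map[string][]string, error) {
	c := &fakeClient{jobs: jobs}
	t := &memTracker{phases: map[string][]string{}}
	for {
		ok, err := pullOnce(context.Background(), c, t, func(*job) error { return nil })
		if err != nil {
			return nil, err
		}
		if !ok {
			return t.phases, nil
		}
	}
}

func main() {
	phases, err := drain(&job{ID: "j1", DAGName: "build"})
	fmt.Println(phases, err)
}
```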

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 10.26% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the two main areas of change: adding GitHub dispatch worker functionality to the scheduler and hardening agent session flows, both of which are central to this changeset. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/license/client.go (1)

145-259: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Treat empty 2xx dispatch responses as errors, not no-ops.

doJSONOptional() currently collapses 204 No Content and a 200 response with an empty body into the same false, nil, which lets PullGitHubDispatch() treat malformed API responses as an empty queue. Preserve the 204/no-content distinction and fail when a 2xx dispatch response is missing id.

🔧 Suggested fix
 func (c *CloudClient) doJSONOptional(ctx context.Context, method, path string, reqBody, respBody any) (bool, error) {
@@
 	if resp.StatusCode == http.StatusNoContent {
 		return false, nil
 	}
@@
-	if respBody == nil || len(respData) == 0 {
-		return false, nil
+	if respBody == nil {
+		return true, nil
+	}
+	if len(respData) == 0 {
+		return true, nil
 	}
 	if err := json.Unmarshal(respData, respBody); err != nil {
 		return false, fmt.Errorf("failed to unmarshal response: %w", err)
 	}
 	return true, nil
 }
 
 func (c *CloudClient) PullGitHubDispatch(ctx context.Context, req PullGitHubDispatchRequest) (*GitHubDispatchJob, error) {
 	var resp GitHubDispatchJob
 	ok, err := c.doJSONOptional(ctx, http.MethodPost, "/api/v1/github/dispatch/pull", req, &resp)
 	if err != nil {
 		return nil, err
 	}
-	if !ok || resp.ID == "" {
-		return nil, nil
+	if !ok {
+		return nil, nil
+	}
+	if resp.ID == "" {
+		return nil, fmt.Errorf("github dispatch response missing id")
 	}
 	return &resp, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/license/client.go` around lines 145 - 259, doJSONOptional currently
treats an empty 2xx response the same as 204 No Content and lets callers (like
PullGitHubDispatch) silently treat malformed responses as no-ops; change
doJSONOptional so only an explicit http.StatusNoContent (204) returns (false,
nil), but any other 2xx with an empty body returns (false, error) (e.g.
fmt.Errorf("empty response body for %s", path) or a CloudError) and still
unmarshal non-empty bodies as before, and update PullGitHubDispatch to return an
error when doJSONOptional succeeds but resp.ID == "" (rather than returning
nil,nil) so missing ID is treated as a failed/malformed response.
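
The behavior requested in this prompt can be sketched in isolation. `decodeOptional` and `parsePull` are hypothetical helpers that model only the response handling, assuming the status has already been checked to be 2xx; they are not the client's actual methods.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

type dispatchJob struct {
	ID string `json:"id"`
}

var errEmptyBody = errors.New("empty response body")

// decodeOptional returns (false, nil) only for an explicit 204. Any other
// success response with an empty body is treated as malformed.
func decodeOptional(status int, body []byte, out any) (bool, error) {
	if status == http.StatusNoContent {
		return false, nil
	}
	if len(body) == 0 {
		return false, errEmptyBody
	}
	if err := json.Unmarshal(body, out); err != nil {
		return false, fmt.Errorf("failed to unmarshal response: %w", err)
	}
	return true, nil
}

// parsePull maps a pull response to a job: nil means the queue is empty,
// an error means the response was malformed.
func parsePull(status int, body []byte) (*dispatchJob, error) {
	var j dispatchJob
	ok, err := decodeOptional(status, body, &j)
	if err != nil {
		return nil, err
	}
	if !ok {
		return nil, nil
	}
	if j.ID == "" {
		return nil, errors.New("github dispatch response missing id")
	}
	return &j, nil
}

func main() {
	fmt.Println(parsePull(http.StatusNoContent, nil))
	fmt.Println(parsePull(http.StatusOK, nil))
	fmt.Println(parsePull(http.StatusOK, []byte(`{}`)))
}
```

Keeping the 204 check separate from the empty-body check is what lets a caller tell "no work available" apart from "the server sent something broken".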
🧹 Nitpick comments (1)
internal/persis/filegithubdispatch/store.go (1)

138-141: ⚡ Quick win

Consider syncing the tracker directory after rename for full crash durability.

At Line 138, the file data is synced, but the directory entry update is not. A sudden crash can still lose tracked.json metadata even after successful Rename.

💡 Suggested durability hardening
-	if err := os.Rename(tmpName, filepath.Join(s.dir, trackerFile)); err != nil {
+	finalPath := filepath.Join(s.dir, trackerFile)
+	if err := os.Rename(tmpName, finalPath); err != nil {
 		return fmt.Errorf("rename tracker temp file: %w", err)
 	}
+	dir, err := os.Open(s.dir)
+	if err != nil {
+		return fmt.Errorf("open tracker dir for sync: %w", err)
+	}
+	defer dir.Close()
+	if err := dir.Sync(); err != nil {
+		return fmt.Errorf("sync tracker dir: %w", err)
+	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/persis/filegithubdispatch/store.go` around lines 138 - 141, The
rename succeeds but the containing directory is not fsynced, risking loss of
tracked.json metadata on a crash; after the os.Rename(tmpName,
filepath.Join(s.dir, trackerFile)) call in the function that uses s.dir, open
the directory (os.Open(s.dir)), call Sync() on that *os.File, handle and return
any error (e.g., wrap with "sync tracker dir after rename"), and Close the
directory file; ensure this directory-sync step occurs only after the rename and
before returning nil to provide full crash durability for trackerFile.
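
A self-contained sketch of the write, fsync, rename, then directory-fsync sequence is shown below. `writeFileAtomic` and `roundTrip` are hypothetical helpers, not the store's code, and the directory `Sync` assumes a POSIX-like filesystem; it may not be supported on every platform.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic writes data to dir/name via a temp file and rename,
// then fsyncs the directory so the new directory entry is durable too.
func writeFileAtomic(dir, name string, data []byte) (err error) {
	tmp, err := os.CreateTemp(dir, name+".tmp-*")
	if err != nil {
		return fmt.Errorf("create temp file: %w", err)
	}
	tmpName := tmp.Name()
	defer func() {
		if err != nil {
			_ = os.Remove(tmpName) // best-effort cleanup on failure
		}
	}()

	if _, err = tmp.Write(data); err != nil {
		_ = tmp.Close()
		return fmt.Errorf("write temp file: %w", err)
	}
	if err = tmp.Sync(); err != nil { // flush file contents
		_ = tmp.Close()
		return fmt.Errorf("sync temp file: %w", err)
	}
	if err = tmp.Close(); err != nil {
		return fmt.Errorf("close temp file: %w", err)
	}
	if err = os.Rename(tmpName, filepath.Join(dir, name)); err != nil {
		return fmt.Errorf("rename temp file: %w", err)
	}

	d, err := os.Open(dir)
	if err != nil {
		return fmt.Errorf("open dir for sync: %w", err)
	}
	defer d.Close()
	if err = d.Sync(); err != nil { // flush the rename itself
		return fmt.Errorf("sync dir: %w", err)
	}
	return nil
}

// roundTrip writes data into a fresh temp directory and reads it back.
func roundTrip(data []byte) (string, error) {
	dir, err := os.MkdirTemp("", "tracker-*")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	if err := writeFileAtomic(dir, "tracked.json", data); err != nil {
		return "", err
	}
	got, err := os.ReadFile(filepath.Join(dir, "tracked.json"))
	return string(got), err
}

func main() {
	fmt.Println(roundTrip([]byte(`[]`)))
}
```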
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/license/client.go`:
- Around line 157-163: AcceptGitHubDispatch and FinishGitHubDispatch interpolate
jobID directly into the request path, allowing '/' or '?' in jobID to alter the
URL; call url.PathEscape(jobID) before concatenating into the path and use the
escaped value when invoking doJSONAllowNoContent (e.g., replace
"/api/v1/github/dispatch/"+jobID+"/accept" with
"/api/v1/github/dispatch/"+escapedJobID+"/accept" and similarly for the finish
path).

In `@internal/license/manager.go`:
- Around line 80-106: CloudMachineCredentials currently can return persisted
credentials from m.state even when the Manager is a zero-value or running in a
non-heartbeat/cloud mode; to fix this, early-return nil when m is nil or m.state
is nil and also validate the manager's active source/mode before exposing creds:
add a guard at the top of CloudMachineCredentials that checks m != nil, m.state
!= nil, and that the manager's source/mode (e.g. a method or field such as
m.Source(), m.Mode(), or m.activeSource) indicates the heartbeat/cloud
credential source is active — if not, return nil, nil; keep the existing
Claims() and ActivationData() checks (claims.ID, activation.ServerID,
activation.HeartbeatSecret) afterward.

In `@internal/service/scheduler/enqueue_webhook.go`:
- Around line 36-42: The idempotency check currently treats any non-nil error
from dagRunStore.FindAttempt as "not found"; change it to explicitly handle the
not-found case by using errors.Is(err, exec.ErrDAGRunIDNotFound) and only
proceed to create a new attempt when that specific error occurs, while returning
other errors immediately; keep the existing behavior of logging and returning
nil when FindAttempt returns nil (attempt exists). Locate the call to
dagRunStore.FindAttempt(ctx, dagRun) and replace the broad err==nil/else flow
with explicit branching: if err==nil -> logger.Info(...) and return nil; else if
errors.Is(err, exec.ErrDAGRunIDNotFound) -> continue to create attempt; else ->
return the error. Ensure you import/use errors and reference
exec.ErrDAGRunIDNotFound exactly as in scheduler.go.

In `@internal/service/scheduler/github_dispatch_test.go`:
- Around line 120-121: The assertions in github_dispatch_test.go check unquoted
substrings against status.Params; update the expected fragments to include
quotes so they match the runtime-serialized params (e.g., change checks against
GITHUB_EVENT_NAME=pull_request and GITHUB_PR_NUMBER=42 to match
GITHUB_EVENT_NAME="pull_request" and GITHUB_PR_NUMBER="42"). Locate the
assertions that reference status.Params (the two assert.Contains calls) and
replace the expected substrings with the quoted forms to ensure the test
compares against the actual serialized param values.
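
Two of the inline findings above, escaping the job ID and branching on the not-found error, can be sketched together. `acceptPath`, `shouldCreateAttempt`, and `errRunNotFound` (standing in for `exec.ErrDAGRunIDNotFound`) are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// acceptPath escapes jobID so that '/' or '?' inside it cannot change
// the structure of the request URL.
func acceptPath(jobID string) string {
	return "/api/v1/github/dispatch/" + url.PathEscape(jobID) + "/accept"
}

// Sentinel errors for the sketch; errRunNotFound stands in for
// exec.ErrDAGRunIDNotFound.
var (
	errRunNotFound = errors.New("dag-run ID not found")
	errStoreDown   = errors.New("store unavailable")
)

// shouldCreateAttempt interprets the error from a lookup such as
// FindAttempt: nil means the attempt already exists, not-found means a
// new attempt may be created, and anything else is returned to the caller.
func shouldCreateAttempt(findErr error) (bool, error) {
	switch {
	case findErr == nil:
		return false, nil
	case errors.Is(findErr, errRunNotFound):
		return true, nil
	default:
		return false, findErr
	}
}

func main() {
	fmt.Println(acceptPath("a/b?c"))
	// errors.Is also matches when the sentinel is wrapped.
	fmt.Println(shouldCreateAttempt(fmt.Errorf("find attempt: %w", errRunNotFound)))
	fmt.Println(shouldCreateAttempt(errStoreDown))
}
```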

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 213e443d-1d6d-4150-86fc-a0d678966a82

📥 Commits

Reviewing files that changed from the base of the PR and between 366758f and 1e7d773.

📒 Files selected for processing (21)
  • internal/agent/session.go
  • internal/agent/session_test.go
  • internal/cmd/context.go
  • internal/core/webhook_params.go
  • internal/core/webhook_params_test.go
  • internal/license/activation.go
  • internal/license/client.go
  • internal/license/client_test.go
  • internal/license/manager.go
  • internal/license/manager_test.go
  • internal/llm/providers/openaicodex/codex.go
  • internal/llm/providers/openaicodex/codex_test.go
  • internal/persis/filegithubdispatch/store.go
  • internal/persis/filegithubdispatch/store_test.go
  • internal/service/frontend/api/v1/webhooks.go
  • internal/service/scheduler/enqueue_webhook.go
  • internal/service/scheduler/github_dispatch.go
  • internal/service/scheduler/github_dispatch_test.go
  • internal/service/scheduler/scheduler.go
  • ui/src/features/agent/hooks/__tests__/useAgentChat.test.tsx
  • ui/src/features/agent/hooks/useAgentChat.ts

@yottahmd
Collaborator Author

yottahmd commented May 2, 2026

Follow-up commit f9fd53a93 addresses the remaining valid CodeRabbit feedback from the review body as well: the cloud client now rejects empty non-204 success bodies and missing dispatch IDs, and the file-backed GitHub dispatch tracker now fsyncs the parent directory after rename for durability. Local verification reran the touched Go packages and the agent chat Vitest suite on this commit.

@yottahmd yottahmd merged commit aa763bf into main May 3, 2026
11 checks passed
@yottahmd yottahmd deleted the feat/gci branch May 3, 2026 06:22