feat: add task lifecycle diagnostics by glittercowboy · Pull Request #120 · gsd-build/daemon

glittercowboy · 2026-04-30T17:20:41Z

Summary

Adds local task lifecycle diagnostics and pretty log filtering for daemon logs.
Supervises Pi task execution with attempt-aware lifecycle events, deadline hooks, cleanup events, and terminal failure metadata.
Exposes attempt-scoped plan capability metadata to the Pi extension.
Consumes protocol-go v0.32.0.

Dependency

Depends on protocol-go v0.32.0, which was tagged from protocol-go#41 (feat: add task attempt lifecycle protocol).

Verification

go test ./internal/pi ./internal/session ./internal/logging
go test ./...
go build -o gsd-cloud .

Post-merge

Push the next daemon release tag and verify the release workflow.

Summary by CodeRabbit

Release Notes

New Features
- Enhanced logs command with filtering by session, task, and time range; added JSON and formatted timeline output options.
- Improved lifecycle logging with automatic secret redaction in prompt previews.
- Added attempt tracking to task execution with better timeout detection and reporting.
Chores
- Updated protocol dependency to latest version.

coderabbitai · 2026-04-30T17:21:09Z

📝 Walkthrough

Walkthrough

The PR adds lifecycle telemetry infrastructure spanning CLI logging tools, event logging/serialization, and turn-level execution supervision. New code enables structured log filtering and rendering via command-line, lifecycle event construction with prompt preview redaction, executor lifecycle hook callbacks, and turn supervisor orchestration tracking task phases with deadline-based failure detection.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Logging CLI: `cmd/logs.go`, `cmd/logs_filter.go`, `cmd/logs_pretty.go`, `cmd/logs_test.go` | Adds structured log viewing with filtering by session/task/time/level, JSON parsing, and pretty timeline rendering with ANSI coloring. Validates mutual exclusivity of targeting flags and formats timestamps as `HH:MM:SS` with phase labels and optional metadata (task ID, failure code, retryable marker). |
| Lifecycle Event Infrastructure: `internal/logging/lifecycle.go`, `internal/logging/lifecycle_test.go` | Defines `TaskLifecycleLogEvent` and `TaskLifecycleLogInput` for structured lifecycle logging. Implements `PromptPreview` to normalize whitespace, redact secrets matching a compiled regex, and truncate by word/character count without exceeding either constraint. |
| Executor Lifecycle Hooks: `internal/pi/executor.go`, `internal/pi/executor_test.go`, `internal/pi/worker.go` | Adds `LifecycleHooks` callback mechanism to `Executor` and wires invocations for process start, prompt write, first event visibility, and cleanup phases. Extends plan capability environment injection with `GSD_PLAN_CAPABILITY_ID` and `GSD_PLAN_CAPABILITY_ATTEMPT_ID`. Updates `streamPiEvents` signature to accept lifecycle hooks parameter. |
| Plan Capability Detection: `internal/pi/extension/plan-tools.js` | Extends `hasPlanCapability` gate to require `GSD_PLAN_CAPABILITY_ATTEMPT_ID` environment variable in addition to existing URL/token/expiry checks for plan tool registration. |
| Session Actor Lifecycle Telemetry: `internal/session/actor.go` | Extends `taskContext` with attempt-aware fields (`AttemptID`, `AttemptNumber`, `TurnKind`, `Deadlines`) and propagates them into protocol messages. Adds lifecycle sink that logs events via `NewTaskLifecycleLog` and optionally sends `protocol.TaskLifecycle` messages to relay. Populates `FailureCode` and `Retryable` from task errors and wires executor through `TurnSupervisor` with lifecycle/tool hooks. |
| Turn Supervision: `internal/session/turn_supervisor.go`, `internal/session/turn_supervisor_test.go` | Introduces `TurnSupervisor` to orchestrate per-turn lifecycle phases (task started, prompt written, first/visible events, tool start/finish, timeouts). Tracks deadlines (process start, prompt write, first event, tool idle, etc.), emits phase events enriched with attempt/turn metadata, and synthesizes `turnFailureError` with `TurnResult` on timeout or failure. Provides `TurnHooks` callbacks to wire into executor behavior. |
| Dependency Update: `go.mod` | Updates `github.com/gsd-build/protocol-go` from `v0.29.1` to `v0.32.0`. |
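
The `PromptPreview` behavior summarized in the table (normalize whitespace, redact secrets, truncate by word and character count) can be sketched roughly as follows. This is a minimal illustration, not the code from `internal/logging/lifecycle.go`: the function name `promptPreview`, its parameters, the `[REDACTED]` placeholder, and the `secretPattern` regex are all assumptions, since the PR page does not show the actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// secretPattern is an illustrative assumption; the regex used in the PR is not shown.
var secretPattern = regexp.MustCompile(`(?i)\b(?:api[_-]?key|token|password)\s*[=:]\s*\S+`)

// promptPreview (hypothetical) normalizes whitespace, redacts secret-looking
// substrings, and truncates so that neither maxWords nor maxChars is exceeded.
func promptPreview(prompt string, maxWords, maxChars int) string {
	if maxWords < 0 {
		maxWords = 0
	}
	if maxChars < 0 {
		maxChars = 0
	}

	// strings.Fields splits on any run of whitespace, so joining with a
	// single space collapses newlines, tabs, and repeated spaces.
	normalized := strings.Join(strings.Fields(prompt), " ")

	// Redact before truncating so a secret is never partially cut and leaked.
	redacted := secretPattern.ReplaceAllString(normalized, "[REDACTED]")

	// Apply the word limit first.
	words := strings.Fields(redacted)
	if len(words) > maxWords {
		words = words[:maxWords]
	}
	preview := strings.Join(words, " ")

	// Then apply the character limit, counting runes rather than bytes.
	if runes := []rune(preview); len(runes) > maxChars {
		preview = strings.TrimSpace(string(runes[:maxChars]))
	}
	return preview
}

func main() {
	fmt.Println(promptPreview("deploy   with\n token=abc123 now", 10, 100))
}
```

Redacting before truncation is a deliberate ordering in this sketch: truncating first could split a secret and leave a fragment that no longer matches the pattern.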

Sequence Diagram

sequenceDiagram
    participant Client as Caller
    participant TS as TurnSupervisor
    participant Executor as Pi Executor
    participant Sink as Lifecycle Sink
    
    Client->>TS: Run(context, runFunc)
    activate TS
    
    TS->>TS: recordStartTime()
    TS->>Sink: Phase("task_started", fields)
    
    TS->>Client: runFunc(context, TurnHooks)
    activate Client
    
    Client->>Executor: Run(...)
    activate Executor
    
    Executor->>Executor: hooks.PromptWritten()
    Executor->>TS: PromptWritten via TurnHooks
    deactivate Executor
    TS->>TS: scheduleDeadline(FirstEvent)
    TS->>Sink: Phase("prompt_written", fields)
    
    Executor->>Executor: hooks.FirstEventSeen()
    Executor->>TS: FirstEventSeen via TurnHooks
    TS->>Sink: Phase("first_event_seen", fields)
    TS->>TS: scheduleDeadline(ToolIdle)
    
    Executor->>Executor: hooks.ToolStarted()
    Executor->>TS: ToolStarted(id, name) via TurnHooks
    TS->>Sink: Phase("tool_started", fields)
    
    Executor->>Executor: hooks.ToolFinished()
    Executor->>TS: ToolFinished(id, name) via TurnHooks
    TS->>Sink: Phase("tool_finished", fields)
    
    Client-->>TS: return
    deactivate Client
    
    TS->>Sink: Phase("task_timed_out", fields)
    TS->>TS: Result() = {FailureCode, Retryable}
    TS-->>Client: return error
    deactivate TS

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat: emit file_activity events on pi write/edit completion #60 — Modifies internal/pi/executor.go and streamPiEvents to add lifecycle/tool-execution callbacks, directly overlapping with executor lifecycle hook wiring.
Add plan runtime commit tool #110 — Modifies plan-capability code (internal/session/actor.go, plan-tools.js, environment injection) with overlapping changes to actor wiring and capability detection.
Add shared browser session daemon bridge #83 — Changes Executor.Run behavior and lifecycle hook injection in internal/pi/executor.go, directly overlapping with this PR's executor modifications.

Poem

🐰 Hops through lifecycles, phase by phase,
Events gathered, redacted in place,
Supervisor watches each turn with care,
Deadlines ticking in the twilight air,
Pretty timelines bloom in color bright! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 5.13%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add task lifecycle diagnostics' clearly and concisely describes the main feature being introduced across the changeset: adding task lifecycle tracking, supervision, and diagnostic capabilities including pretty log filtering, lifecycle event hooks, and deadline management. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/logs_pretty.go`:
- Around line 47-49: The code currently treats any mode != colorNever the same
and calls colorizePrettyLine(), causing ANSI escapes even when piping; change
the condition around the call in the pretty-printing path so that
colorizePrettyLine(line, event.Phase) is only called when mode==colorAlways OR
when mode==colorAuto AND the output is a TTY (use an isTerminal/isTTY check on
the writer e.g., os.Stdout or the logger output). Update the branch using the
variables mode, colorNever and the function colorizePrettyLine to perform that
TTY check (or introduce a small helper like isTerminal()) so color=auto disables
ANSI when output is not a terminal.

In `@cmd/logs.go`:
- Around line 156-158: When logsOptions.lastTask is true and you call
latestTaskID(lines), handle the empty-string case instead of leaving
filter.TaskID blank: check the return value of latestTaskID(lines) and if it's
an empty string, return an error (or exit with a clear message) indicating "no
task-scoped entries found for --last-task" rather than assigning "" to
filter.TaskID (which causes filterLogLines() to match everything); update the
branch that currently sets filter.TaskID to call latestTaskID(lines) and bail
out with a clear error or non-zero exit when the result is empty.
- Around line 107-116: hasStructuredMode() currently treats color and noColor as
selectors for structured rendering; remove opts.color != "auto" and opts.noColor
from the hasStructuredMode() boolean expression so color flags do not change the
mode, and add validation in the command's option parsing/validation (where
logsOptionsState is validated) to return an error if --color/--no-color is
provided without --pretty (i.e., require opts.pretty when opts.color != "auto"
or opts.noColor is true) so color flags only affect pretty output once
structured/pretty mode is explicitly selected.

In `@internal/session/turn_supervisor.go`:
- Around line 75-125: The one-shot timers started in TurnHooks (the FirstEvent
timer in PromptWritten and the ToolIdle timer in ToolStarted) must be tracked on
the TurnSupervisor (add fields like firstEventTimer and toolIdleTimer) and
cancelled when the turn advances or finishes: start and store the timers instead
of calling time.AfterFunc inline in PromptWritten/ToolStarted, call Stop() from
ToolFinished and from the cleanup path just before Run() returns or when the
supervisor is marked complete, and clear the timer fields when stopped. Timers
created with time.AfterFunc have no channel to drain, and Stop() returns false
if the callback has already started, so the callback should also check the
supervisor's completion state; ensure PromptWritten/ToolStarted/ToolFinished and
the post-run return path consistently stop any outstanding timers so stale
timers cannot call s.timeout after the turn has already completed.


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ef1b9ae3-cc18-4907-bb40-da4259db2122

📥 Commits

Reviewing files that changed from the base of the PR and between 156be4a and 5913ff0.

⛔ Files ignored due to path filters (1)

go.sum is excluded by !**/*.sum

📒 Files selected for processing (14)

cmd/logs.go
cmd/logs_filter.go
cmd/logs_pretty.go
cmd/logs_test.go
go.mod
internal/logging/lifecycle.go
internal/logging/lifecycle_test.go
internal/pi/executor.go
internal/pi/executor_test.go
internal/pi/extension/plan-tools.js
internal/pi/worker.go
internal/session/actor.go
internal/session/turn_supervisor.go
internal/session/turn_supervisor_test.go

coderabbitai · 2026-04-30T17:28:36Z

+		if mode != colorNever {
+			line = colorizePrettyLine(line, event.Phase)
+		}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

color=auto currently behaves like always.

Every mode except never runs colorizePrettyLine(), so piping pretty logs to a file or another command still injects ANSI escapes. auto needs a TTY check before deciding whether to colorize.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cmd/logs_pretty.go` around lines 47 - 49, The code currently treats any mode != colorNever the same and calls colorizePrettyLine(), causing ANSI escapes even when piping; change the condition around the call in the pretty-printing path so that colorizePrettyLine(line, event.Phase) is only called when mode==colorAlways OR when mode==colorAuto AND the output is a TTY (use an isTerminal/isTTY check on the writer e.g., os.Stdout or the logger output). Update the branch using the variables mode, colorNever and the function colorizePrettyLine to perform that TTY check (or introduce a small helper like isTerminal()) so color=auto disables ANSI when output is not a terminal.
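
One way the suggested TTY check could look is sketched below. The names `colorMode`, `isTerminal`, and `shouldColorize` are hypothetical, and the constant values are assumed; only `colorNever`, `colorAlways`, and `colorAuto` are named in the review. The terminal check uses only the standard library and treats any character device as a terminal, which is an approximation (it also matches devices such as `/dev/null`).

```go
package main

import (
	"fmt"
	"os"
)

// colorMode mirrors the three modes named in the review; the type and the
// string values are assumptions for illustration.
type colorMode string

const (
	colorAuto   colorMode = "auto"
	colorAlways colorMode = "always"
	colorNever  colorMode = "never"
)

// isTerminal (hypothetical helper) reports whether f looks like a character
// device. This is a standard-library approximation of a TTY check.
func isTerminal(f *os.File) bool {
	info, err := f.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

// shouldColorize decides whether ANSI escapes should be emitted:
// always for "always", never for "never", and only on a TTY for "auto".
func shouldColorize(mode colorMode, outputIsTTY bool) bool {
	switch mode {
	case colorAlways:
		return true
	case colorAuto:
		return outputIsTTY
	default:
		return false
	}
}

func main() {
	tty := isTerminal(os.Stdout)
	fmt.Println("stdout is a terminal:", tty)
	fmt.Println("colorize in auto mode:", shouldColorize(colorAuto, tty))
}
```

Keeping the decision in a pure function that takes the TTY result as a parameter makes the three modes straightforward to unit test without a real terminal.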

coderabbitai · 2026-04-30T17:28:36Z

+func (opts logsOptionsState) hasStructuredMode() bool {
+	return opts.sessionID != "" ||
+		opts.taskID != "" ||
+		opts.lastTask ||
+		opts.since > 0 ||
+		opts.level != "" ||
+		opts.pretty ||
+		opts.json ||
+		opts.color != "auto" ||
+		opts.noColor


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Keep color flags from changing the command mode.

hasStructuredMode() treats --color and --no-color as selectors for structured rendering, so gsd-cloud logs --no-color stops tailing the raw file and switches to a one-shot pretty snapshot. These flags should only modify pretty output once structured mode is already selected, or validation should require --pretty.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cmd/logs.go` around lines 107 - 116, hasStructuredMode() currently treats color and noColor as selectors for structured rendering; remove opts.color != "auto" and opts.noColor from the hasStructuredMode() boolean expression so color flags do not change the mode, and add validation in the command's option parsing/validation (where logsOptionsState is validated) to return an error if --color/--no-color is provided without --pretty (i.e., require opts.pretty when opts.color != "auto" or opts.noColor is true) so color flags only affect pretty output once structured/pretty mode is explicitly selected.
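
A sketch of the suggested change follows, assuming the field names from the diff above. The field types, the `validateColorFlags` helper, and the error message are assumptions introduced for illustration, not code from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// logsOptionsState reproduces only the fields visible in the diff;
// the field types are assumptions.
type logsOptionsState struct {
	sessionID string
	taskID    string
	lastTask  bool
	since     time.Duration
	level     string
	pretty    bool
	json      bool
	color     string
	noColor   bool
}

// hasStructuredMode no longer consults the color flags, so passing
// --color or --no-color alone does not switch away from raw tailing.
func (opts logsOptionsState) hasStructuredMode() bool {
	return opts.sessionID != "" ||
		opts.taskID != "" ||
		opts.lastTask ||
		opts.since > 0 ||
		opts.level != "" ||
		opts.pretty ||
		opts.json
}

// validateColorFlags (hypothetical helper) rejects color flags that are
// supplied without --pretty, since they only affect pretty output.
func (opts logsOptionsState) validateColorFlags() error {
	if (opts.color != "auto" || opts.noColor) && !opts.pretty {
		return errors.New("--color and --no-color require --pretty")
	}
	return nil
}

func main() {
	opts := logsOptionsState{color: "auto", noColor: true}
	fmt.Println("structured mode:", opts.hasStructuredMode())
	fmt.Println("validation error:", opts.validateColorFlags())
}
```

The sketch assumes the `color` field defaults to `"auto"` when the flag is not given; with a different default, the comparison in `validateColorFlags` would need to change.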

coderabbitai · 2026-04-30T17:28:36Z

+	if logsOptions.lastTask {
+		filter.TaskID = latestTaskID(lines)
+	}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Don't fall back to the entire log when --last-task finds nothing.

If latestTaskID(lines) returns "", filter.TaskID stays empty and filterLogLines() passes every event through. On a log file with no task-scoped entries yet, --last-task would silently dump the full log instead of returning no matches or a clear error.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@cmd/logs.go` around lines 156 - 158, When logsOptions.lastTask is true and you call latestTaskID(lines), handle the empty-string case instead of leaving filter.TaskID blank: check the return value of latestTaskID(lines) and if it's an empty string, return an error (or exit with a clear message) indicating "no task-scoped entries found for --last-task" rather than assigning "" to filter.TaskID (which causes filterLogLines() to match everything); update the branch that currently sets filter.TaskID to call latestTaskID(lines) and bail out with a clear error or non-zero exit when the result is empty.
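
The empty-result guard could be sketched as below. The `logLine` type, the body of `latestTaskID`, the `resolveLastTask` helper, and the `errNoTaskEntries` value are stand-ins invented for this example; the PR's real `latestTaskID` and line representation are not shown on this page.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoTaskEntries is a hypothetical sentinel error for the empty case.
var errNoTaskEntries = errors.New("no task-scoped entries found for --last-task")

// logLine is a stand-in for a parsed log entry; only TaskID matters here.
type logLine struct {
	TaskID string
}

// latestTaskID is an assumed re-implementation: it returns the task ID of
// the most recent entry that has one, or "" if no entry is task-scoped.
func latestTaskID(lines []logLine) string {
	for i := len(lines) - 1; i >= 0; i-- {
		if lines[i].TaskID != "" {
			return lines[i].TaskID
		}
	}
	return ""
}

// resolveLastTask (hypothetical helper) turns the empty result into an error
// instead of letting an empty filter match every line.
func resolveLastTask(lines []logLine) (string, error) {
	id := latestTaskID(lines)
	if id == "" {
		return "", errNoTaskEntries
	}
	return id, nil
}

func main() {
	_, err := resolveLastTask([]logLine{{}, {}})
	fmt.Println("error:", err)

	id, _ := resolveLastTask([]logLine{{TaskID: "t1"}, {}, {TaskID: "t2"}, {}})
	fmt.Println("latest task:", id)
}
```

With this shape, the caller assigns `filter.TaskID` only when the error is nil, so an empty task ID never reaches the filter.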

coderabbitai · 2026-04-30T17:28:36Z

+	hooks := TurnHooks{
+		PromptWritten: func() {
+			s.emit("prompt_written", nil)
+			if s.opts.Deadlines.FirstEvent > 0 {
+				time.AfterFunc(s.opts.Deadlines.FirstEvent, func() {
+					s.mu.Lock()
+					shouldTimeout := !s.firstEvent && s.result.FailureCode == ""
+					s.mu.Unlock()
+					if shouldTimeout {
+						s.timeout("no_first_event_timeout", true)
+					}
+				})
+			}
+		},
+		FirstEventSeen: func() {
+			s.mu.Lock()
+			s.firstEvent = true
+			s.mu.Unlock()
+			s.emit("first_event_seen", nil)
+		},
+		FirstVisibleEventSeen: func() {
+			s.mu.Lock()
+			s.firstVisible = true
+			s.retrySafe = false
+			s.mu.Unlock()
+			s.emit("first_visible_event_seen", nil)
+		},
+		ToolStarted: func(toolCallID string, toolName string) {
+			s.mu.Lock()
+			s.retrySafe = false
+			s.mu.Unlock()
+			s.emit("tool_started", map[string]any{"toolCallId": toolCallID, "toolName": toolName})
+			if s.opts.Deadlines.ToolIdle > 0 {
+				time.AfterFunc(s.opts.Deadlines.ToolIdle, func() {
+					s.timeout("tool_idle_timeout", false)
+				})
+			}
+		},
+		ToolFinished: func(toolCallID string, toolName string) {
+			s.emit("tool_finished", map[string]any{"toolCallId": toolCallID, "toolName": toolName})
+		},
+	}
+
+	err := run(ctx, hooks)
+	s.mu.Lock()
+	hasFailure := s.result.FailureCode != ""
+	s.mu.Unlock()
+	if hasFailure {
+		return turnFailureError{result: s.Result()}
+	}
+	return err


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Cancel outstanding deadline timers when the turn advances or exits.

ToolStarted() arms a one-shot ToolIdle timer, but ToolFinished() never stops it and Run() never marks the supervisor as complete before returning. That means a turn can start a tool, finish it quickly, keep streaming for longer than ToolIdle, and still get canceled by the stale timer. The same late-fire problem exists for the FirstEvent timer after run() has already returned. Track these timers on TurnSupervisor and stop them on state transitions / final completion before returning.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@internal/session/turn_supervisor.go` around lines 75 - 125, The one-shot timers started in TurnHooks (the FirstEvent timer in PromptWritten and the ToolIdle timer in ToolStarted) must be tracked on the TurnSupervisor (add fields like firstEventTimer and toolIdleTimer) and cancelled when the turn advances or finishes: start and store the timers instead of calling time.AfterFunc inline in PromptWritten/ToolStarted, call Stop() from ToolFinished and from the cleanup path just before Run() returns or when the supervisor is marked complete, and clear the timer fields when stopped. Timers created with time.AfterFunc have no channel to drain, and Stop() returns false if the callback has already started, so the callback should also check the supervisor's completion state; ensure PromptWritten/ToolStarted/ToolFinished and the post-run return path consistently stop any outstanding timers so stale timers cannot call s.timeout after the turn has already completed.
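
The timer-tracking approach described in this comment might look roughly like the sketch below. It is a pared-down stand-in, not the PR's `TurnSupervisor`: the `supervisor` type, its fields, the generation counter `toolGen`, and the `simulate` helper are all assumptions, and only the ToolIdle timer is modeled. The generation counter covers the case where `Stop()` returns false because the callback is already running and waiting for the lock.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// supervisor is a simplified, hypothetical stand-in for TurnSupervisor.
type supervisor struct {
	mu            sync.Mutex
	done          bool
	failureCode   string
	toolIdleTimer *time.Timer
	toolGen       int // bumped whenever the tool-idle timer is invalidated
}

// stopToolTimerLocked stops and clears the timer. The caller must hold mu.
func (s *supervisor) stopToolTimerLocked() {
	s.toolGen++ // invalidates a callback that is already racing for the lock
	if s.toolIdleTimer != nil {
		s.toolIdleTimer.Stop()
		s.toolIdleTimer = nil
	}
}

// toolStarted arms a fresh tool-idle timer and stores it on the supervisor.
func (s *supervisor) toolStarted(idle time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.stopToolTimerLocked()
	gen := s.toolGen
	s.toolIdleTimer = time.AfterFunc(idle, func() { s.toolIdleExpired(gen) })
}

// toolIdleExpired records a timeout only if the timer is still current.
func (s *supervisor) toolIdleExpired(gen int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.done || s.failureCode != "" || gen != s.toolGen {
		return
	}
	s.failureCode = "tool_idle_timeout"
}

// toolFinished cancels the outstanding tool-idle timer.
func (s *supervisor) toolFinished() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.stopToolTimerLocked()
}

// finish marks the turn complete and cancels any remaining timer.
func (s *supervisor) finish() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.done = true
	s.stopToolTimerLocked()
}

func (s *supervisor) code() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.failureCode
}

// simulate starts a tool, optionally finishes it right away, waits, then
// completes the turn and returns the recorded failure code.
func simulate(idleMs, waitMs int, finishTool bool) string {
	s := &supervisor{}
	s.toolStarted(time.Duration(idleMs) * time.Millisecond)
	if finishTool {
		s.toolFinished()
	}
	time.Sleep(time.Duration(waitMs) * time.Millisecond)
	s.finish()
	return s.code()
}

func main() {
	fmt.Printf("tool finished quickly: %q\n", simulate(10, 50, true))
	fmt.Printf("tool never finished:   %q\n", simulate(10, 100, false))
}
```

A FirstEvent timer could follow the same pattern with its own field and generation counter, stopped from the first-event hook and from `finish`.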

glittercowboy added 4 commits April 30, 2026 10:40

feat: add pretty task log diagnostics

1b1d6c8

feat: supervise task lifecycle attempts

1185df0

feat: expose plan tools by attempt

6967dc1

chore: consume task lifecycle protocol

5913ff0

coderabbitai Bot reviewed Apr 30, 2026


glittercowboy merged commit 42f9c78 into main Apr 30, 2026
2 checks passed

glittercowboy deleted the codex/turn-lifecycle-daemon branch April 30, 2026 17:40
