
GH-2222: Expand dispatcher recovery + add periodic loop#2227

Closed
alekspetrov wants to merge 1 commit into main from pilot/GH-2222


@alekspetrov
Collaborator

Summary

Automated PR created by Pilot for task GH-2222.

Closes #2222
Supersedes #2225

Changes

  • Add StaleRunningThreshold, StaleQueuedThreshold, StaleRecoveryInterval to DispatcherConfig (StaleTaskDuration kept as backwards-compat alias)
  • Rewrite recoverStaleTasks() to recover both running and queued orphans, marking them failed (not re-queued — re-queuing without a worker recreates the orphan)
  • Change Start() to accept context.Context and launch runStaleRecoveryLoop goroutine that ticks every StaleRecoveryInterval
  • Add summary log "stale recovery complete, reset N tasks" on every pass (even when 0) for diagnosability
  • Update all Start() callers in main.go and tests
  • Add tests: TestRecoverStaleTasks_QueuedAndRunning, TestRecoverStaleTasks_RespectsThresholds, TestRunStaleRecoveryLoop_Periodic, TestQueueTask_AfterRecovery

Test plan

  • go build ./... passes
  • go test ./internal/executor/... — all new and existing tests pass
  • go test ./... — full suite passes

…ecutor/`)

- Add StaleRunningThreshold, StaleQueuedThreshold, StaleRecoveryInterval
  to DispatcherConfig (StaleTaskDuration kept as backwards-compat alias)
- Rewrite recoverStaleTasks() to recover both running and queued orphans,
  marking them failed (not re-queued — re-queuing without a worker just
  recreates the orphan)
- Change Start() to accept context.Context and launch runStaleRecoveryLoop
  goroutine that ticks every StaleRecoveryInterval
- Add summary log "stale recovery complete, reset N tasks" on every pass
  (even when 0) for diagnosability
- Update all Start() callers in main.go and tests
- Add tests: TestRecoverStaleTasks_QueuedAndRunning,
  TestRecoverStaleTasks_RespectsThresholds, TestRunStaleRecoveryLoop_Periodic,
  TestQueueTask_AfterRecovery
@alekspetrov alekspetrov added the pilot Pilot AI will work on this label Apr 7, 2026
@alekspetrov alekspetrov closed this Apr 7, 2026
@alekspetrov alekspetrov mentioned this pull request Apr 7, 2026
1 task
@alekspetrov alekspetrov deleted the pilot/GH-2222 branch April 7, 2026 12:05

Labels

pilot Pilot AI will work on this


Development

Successfully merging this pull request may close these issues.

Expand dispatcher recovery + add periodic loop (internal/executor/) - In `i...

1 participant