Description:
When the agent loop encounters a critical error (e.g., panic, infinite loop detection) and breaks out of its Run() loop, the main Go process keeps running. Because the process never exits, it produces no exit code at all, so Docker's restart: unless-stopped policy is never triggered.
This leaves the agent in a "Zombie State":
• The container is Up.
• The HTTP server/Gateway is active.
• The Agent is removed from the Router.
• Incoming messages result in agent not found.
Logs:
Agent loop breaks/crashes ...
Later requests fail:
time=2026-03-02T12:25:04.670+07:00 level=WARN msg="inbound: agent not found" agent=default channel=telegram
Root Cause:
• In internal/agent/loop.go (or similar), critical errors use break to exit the loop but do not signal the main application to shut down.
Proposed Solution:
• Implement a Health Check mechanism. If the main agent loop dies, the application should call os.Exit(1).
• Or, propagate the context cancellation up to main.go to trigger a graceful shutdown, allowing the orchestrator (Docker/K8s) to restart the pod/container.