You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Running many sub-agents concurrently (reported with ~13 sub-agents running + 2 background bash jobs) freezes the entire TUI — it stops redrawing and stops responding to input. Observed on v0.8.65 (YOLO mode), but the deeper root cause still exists on main.
Repro
Kick off enough work that ~10+ sub-agents run at once (panel shows Agents 13 running / 13, Bash jobs: 2 running).
Screen locks up; input and redraw stall.
Root-cause analysis (file:line)
The TUI event loop and the per-turn monitors contend on a single shared, write-locked event receiver, and high-volume agent progress events make it worse.
Lock contention on the shared event receiver (primary). rx_event is an Arc<RwLock<mpsc::Receiver<Event>>>. Multiple tasks take the exclusive write lock:
TUI event/render loop drains under the write lock — crates/tui/src/tui/ui.rs:1808
One monitor_turn task per sub-agent (so ~13 at once) also takes the write lock and blocks on recv().await — crates/tui/src/core/engine/runtime_threads.rs:2850
Exec stream handler contends for the same lock — crates/tui/src/core/main.rs:7067(verify path)
With ~13 monitors holding/queuing the writer lock, tokio RwLock writer-fairness starves the render loop from acquiring it → no redraw, no input drain → freeze.
Bounded event channel (256) under burst. mpsc::channel(256) — crates/tui/src/core/engine.rs:824. Many agents emitting rapid AgentProgress (try_send, crates/tui/src/tools/subagent/mod.rs ~6158) can fill the buffer; events drop or senders stall, compounding the stall.
Remove the RwLock around rx_event — there should be a single owner draining it; give monitors their own event stream/side-channel instead of contending with the render loop.
Or move monitor_turn off the render-critical lock entirely.
Increase event channel capacity and/or coalesce AgentProgress events before send.
Add/raise a concurrent sub-agent cap (semaphore) so a burst can't spawn an unbounded number of contending monitors.
Environment
codewhale version: 0.8.65 (observed); root cause present on main (0.8.66)
Summary
Running many sub-agents concurrently (reported with ~13 sub-agents running + 2 background bash jobs) freezes the entire TUI — it stops redrawing and stops responding to input. Observed on v0.8.65 (YOLO mode), but the deeper root cause still exists on
main.Repro
Agents 13 running / 13,Bash jobs: 2 running).Root-cause analysis (file:line)
The TUI event loop and the per-turn monitors contend on a single shared, write-locked event receiver, and high-volume agent progress events make it worse.
Lock contention on the shared event receiver (primary).
rx_eventis anArc<RwLock<mpsc::Receiver<Event>>>. Multiple tasks take the exclusive write lock:crates/tui/src/tui/ui.rs:1808monitor_turntask per sub-agent (so ~13 at once) also takes the write lock and blocks onrecv().await—crates/tui/src/core/engine/runtime_threads.rs:2850crates/tui/src/core/main.rs:7067(verify path)With ~13 monitors holding/queuing the writer lock, tokio
RwLockwriter-fairness starves the render loop from acquiring it → no redraw, no input drain → freeze.Bounded event channel (256) under burst.
mpsc::channel(256)—crates/tui/src/core/engine.rs:824. Many agents emitting rapidAgentProgress(try_send,crates/tui/src/tools/subagent/mod.rs~6158) can fill the buffer; events drop or senders stall, compounding the stall.Redraw storm (partially mitigated on main).
AgentProgressredraws are throttled (~100ms) atcrates/tui/src/tui/ui.rs:2729-2747(from v0.8.58: Fix live-subagent freeze under load — 4+ concurrent subagents block TUI input/rendering #3033). This helps but does not remove the lock-contention root cause; v0.8.65 users without/with the throttle still freeze under enough agents.Suggested fixes
RwLockaroundrx_event— there should be a single owner draining it; give monitors their own event stream/side-channel instead of contending with the render loop.monitor_turnoff the render-critical lock entirely.AgentProgressevents before send.Environment
main(0.8.66)