Summary
A panic inside Core::process() (spawned via fire-and-forget tokio::spawn at lib.rs:269-273) is silently consumed because the JoinHandle is dropped without being awaited. The panicking task never calls notify() on any of the four barriers (execute_block_barrier, merklize_barrier, seal_barrier, make_canonical_barrier), causing all subsequent blocks to hang forever.
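The failure mode can be modelled without the real pipeline. The following is a minimal sketch, not the code in lib.rs: it uses std::thread in place of tokio::spawn so that it is self-contained, and the Barrier type, spawn_block, and wait_for are hypothetical stand-ins for illustration. It shows that a worker which panics before notifying leaves the waiter with nothing to do but time out.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for one pipeline barrier: a slot plus a condvar.
type Barrier = Arc<(Mutex<Option<u64>>, Condvar)>;

/// Spawns a worker for `height` and drops its JoinHandle (fire-and-forget).
/// If `panic_first` is set, the worker panics before it notifies the barrier,
/// and nobody ever observes that panic.
fn spawn_block(barrier: &Barrier, height: u64, panic_first: bool) {
    let barrier = Arc::clone(barrier);
    let _ = thread::spawn(move || {
        // Simulated production-path assertion.
        assert!(!panic_first, "simulated invariant failure");
        let (slot, cv) = &*barrier;
        *slot.lock().unwrap() = Some(height);
        cv.notify_all();
    });
}

/// Waits until the barrier holds `height`, or gives up after `timeout`.
/// Returns true if the notification arrived in time.
fn wait_for(barrier: &Barrier, height: u64, timeout: Duration) -> bool {
    let (slot, cv) = &**barrier;
    let guard = slot.lock().unwrap();
    let (_guard, result) = cv
        .wait_timeout_while(guard, timeout, |v| *v != Some(height))
        .unwrap();
    !result.timed_out()
}
```

With panic_first set to false, wait_for returns true; with it set to true, the notification never arrives and every waiter runs into its timeout. With tokio the mechanics are analogous: a panic in a spawned task is only reported through its JoinHandle, so dropping the handle discards it.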
Related sub-issues
- Bare .unwrap() on barrier waits (L476, L478, L496, L522): If a prior block panicked, these unwraps cascade into further panics, amplifying the deadlock across all in-flight blocks.
- seal_barrier not closed on shutdown (L250-254): When the ordered-block channel closes, run() closes three barriers but omits seal_barrier, leaving any task waiting on it permanently hung.
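Both sub-issues come down to how a waiter learns that its notification will never arrive. The sketch below is an assumption-laden illustration, not the barrier implementation used in the crate: ClosableBarrier, WaitError, Barriers, and close_all are hypothetical names, and the barrier is simplified to a single slot. Only the four field names are taken from this report. It shows a wait that returns an error instead of forcing an unwrap, and a shutdown helper that closes all four barriers.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Why a wait ended without a value (hypothetical error type).
#[derive(Debug, PartialEq)]
enum WaitError {
    Closed,
    TimedOut,
}

#[derive(Default)]
struct State {
    value: Option<u64>,
    closed: bool,
}

/// Simplified single-slot barrier that can be closed to wake all waiters.
#[derive(Default)]
struct ClosableBarrier {
    state: Mutex<State>,
    cv: Condvar,
}

impl ClosableBarrier {
    fn notify(&self, v: u64) {
        // Recover the guard even if another thread panicked while holding it.
        let mut s = self.state.lock().unwrap_or_else(|e| e.into_inner());
        s.value = Some(v);
        self.cv.notify_all();
    }

    fn close(&self) {
        let mut s = self.state.lock().unwrap_or_else(|e| e.into_inner());
        s.closed = true;
        self.cv.notify_all();
    }

    /// Returns the value, or an error the caller can handle without panicking.
    fn wait(&self, timeout: Duration) -> Result<u64, WaitError> {
        let guard = self.state.lock().unwrap_or_else(|e| e.into_inner());
        let (s, _) = self
            .cv
            .wait_timeout_while(guard, timeout, |s| s.value.is_none() && !s.closed)
            .unwrap_or_else(|e| e.into_inner());
        match s.value {
            Some(v) => Ok(v),
            None if s.closed => Err(WaitError::Closed),
            None => Err(WaitError::TimedOut),
        }
    }
}

/// The four barriers named in this report.
#[derive(Default)]
struct Barriers {
    execute_block_barrier: ClosableBarrier,
    merklize_barrier: ClosableBarrier,
    seal_barrier: ClosableBarrier,
    make_canonical_barrier: ClosableBarrier,
}

impl Barriers {
    /// Shutdown path: close every barrier, including seal_barrier.
    fn close_all(&self) {
        self.execute_block_barrier.close();
        self.merklize_barrier.close();
        self.seal_barrier.close();
        self.make_canonical_barrier.close();
    }
}
```

A caller would match on the Result from wait and abandon the block on Closed or TimedOut rather than unwrapping, so one failed block does not turn into a chain of panics in the blocks behind it.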
Reproduction
- Trigger any assert!/assert_eq! failure inside process() (e.g., epoch mismatch at L401, execute_height invariant at L461).
- Observe that no subsequent blocks are processed: they all hang on execute_block_barrier.wait_timeout.
Impact
- Severity: Critical
- Complete pipeline halt with no recovery path other than node restart.
- Multiple production assert!/assert_eq! calls exist in non-#[cfg(debug_assertions)] paths (L401, L459, L461, L700, L778, L945), making this triggerable.
Suggested investigation areas
- Await or JoinSet-manage the spawned tasks to propagate panics.
- Convert production-path assert! to graceful error handling.
- Add seal_barrier.close() to the shutdown path.
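The first two suggestions can be sketched together. This is an illustration under stated assumptions rather than a proposed patch: it uses std::thread so that it runs without extra dependencies, and ProcessError, check_epoch, spawn_process, and join_supervised are hypothetical names. With tokio, the corresponding step would be to await the JoinHandle, or to collect tasks in a tokio::task::JoinSet, and inspect the returned JoinError.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical error type covering both expected and unexpected failures.
#[derive(Debug, PartialEq)]
enum ProcessError {
    EpochMismatch { expected: u64, got: u64 },
    Panicked(String),
}

/// Returns an error where an assert_eq! on the epoch would otherwise panic.
fn check_epoch(expected: u64, got: u64) -> Result<(), ProcessError> {
    if expected != got {
        return Err(ProcessError::EpochMismatch { expected, got });
    }
    Ok(())
}

/// Spawns the work and hands the JoinHandle back instead of dropping it.
/// `hidden_bug` simulates an invariant failure that still panics.
fn spawn_process(
    expected: u64,
    got: u64,
    hidden_bug: bool,
) -> JoinHandle<Result<u64, ProcessError>> {
    thread::spawn(move || -> Result<u64, ProcessError> {
        check_epoch(expected, got)?;
        if hidden_bug {
            panic!("unexpected invariant failure");
        }
        Ok(got)
    })
}

/// Joins the task and converts a panic into an ordinary error value.
fn join_supervised(
    handle: JoinHandle<Result<u64, ProcessError>>,
) -> Result<u64, ProcessError> {
    match handle.join() {
        Ok(result) => result,
        Err(payload) => {
            // Panic payloads are usually &str or String; fall back otherwise.
            let msg = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_else(|| "non-string panic payload".to_string());
            Err(ProcessError::Panicked(msg))
        }
    }
}
```

Once every failure reaches the supervisor as a value, the supervisor can decide what to do with the barriers, for example closing them so that waiting blocks fail fast instead of hanging.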
Files
crates/pipe-exec-layer-ext-v2/execute/src/lib.rs