Summary
#2709 made collect_traces tolerate a missing trace-N-<hash>.jsonl for transactions that produce a receipt but no TransactionEnd tracer event (e.g. EIP-3607 collisions, CREATE-into-non-empty failures during process_transaction). The fix appends an in-memory TransactionTraces(traces=[]) placeholder per missing file.
The same placeholder is not written to disk, which causes --verify-traces to report spurious divergences for these tests. Reproducer: any test where a tx's TransactionEnd event never fires (the same set #2709 was written for).
Reproducer
Trio of tests that exercise the path:
tests/ported_static/stCreateTest/test_transaction_collision_to_empty_but_code.py
tests/ported_static/stCreateTest/test_transaction_collision_to_empty_but_nonce.py
tests/ported_static/stEIP3607/test_init_colliding_with_non_empty_account.py
Both runs (baseline dump + current live run) use post-#2709 code:
# 1. Dump baseline
TMPDIR=./.tmp uv run fill \
tests/ported_static/stCreateTest/test_transaction_collision_to_empty_but_code.py \
--evm-dump-dir .tmp/baseline --traces \
-n 4 --output .tmp/fix_baseline --clean
# 2. Verify against the freshly-dumped baseline
TMPDIR=./.tmp uv run fill \
tests/ported_static/stCreateTest/test_transaction_collision_to_empty_but_code.py \
--evm-dump-dir .tmp/current --traces \
--verify-traces .tmp/baseline \
--verify-traces-comparator exact-no-stack \
--verify-traces-json .tmp/report.json \
-n 4 --output .tmp/fix_current --clean
Result: Total: 24, Equivalent: 0, Divergent: 24, even though both runs use identical inputs and identical code.
Root cause
The two sides of the comparison go through different code paths:
| Side | Source | Output for these tests |
| --- | --- | --- |
| baseline_traces_list | _load_traces_from_dump_dir(dump_dir), which globs trace-*.jsonl from disk | [Traces(root=[])]: 0 TransactionTraces (no trace files exist) |
| current_traces_list | t8n.get_traces() → collect_traces(...), in-memory | [Traces(root=[TransactionTraces(traces=[])])]: 1 empty TransactionTraces (post-#2709 padding) |
The compare_traces outer length check (len(baseline.root) != len(current.root)) then fires.
#2709 only patched the in-memory collect_traces. Neither output_traces (src/ethereum_spec_tools/evm_tools/t8n/evm_trace/eip3155.py) nor the --evm-dump-dir plumbing (shutil.copy in transition_tool.py::collect_traces) emits anything for the missing-trace case, so the placeholder never reaches disk.
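The asymmetry can be reproduced in isolation. The sketch below is not the real code: TransactionTraces is a simplified stand-in for the actual model, and load_from_dump_dir and collect_in_memory are hypothetical helpers that only mimic the two code paths described above (glob from disk vs. pad in memory).

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class TransactionTraces:
    """Simplified stand-in for the real model: one entry per transaction."""

    traces: list = field(default_factory=list)


def load_from_dump_dir(dump_dir: Path) -> list:
    """Hypothetical loader: one TransactionTraces per trace-*.jsonl on disk."""
    loaded = []
    for path in sorted(dump_dir.glob("trace-*.jsonl")):
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
        loaded.append(TransactionTraces(traces=lines))
    return loaded


def collect_in_memory(temp_dir: Path, expected_names: list) -> list:
    """Hypothetical collector: pads with an empty entry when a file is missing."""
    collected = []
    for name in expected_names:
        path = temp_dir / name
        if not path.exists():
            # Mirrors the #2709 behaviour: placeholder in memory only.
            collected.append(TransactionTraces(traces=[]))
            continue
        collected.append(TransactionTraces(traces=path.read_text().splitlines()))
    return collected


with tempfile.TemporaryDirectory() as tmp:
    empty_dir = Path(tmp)  # the tracer never wrote a file for this tx
    baseline = load_from_dump_dir(empty_dir)
    current = collect_in_memory(empty_dir, ["trace-0-0xabc.jsonl"])

# The outer length check sees 0 entries on one side and 1 on the other.
length_mismatch = len(baseline) != len(current)
print(len(baseline), len(current), length_mismatch)
```

Nothing about the transaction differs between the two sides; only the route the data takes does.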
Proposed fix
Make the on-disk representation symmetric with the in-memory one. Two equivalent approaches:
Option A — write empty placeholder files
In transition_tool.py::collect_traces, when the missing-file branch (the #2709 branch) runs and debug_output_path is set, also write an empty trace-{i}-{hash}.jsonl to the dump dir. Then _load_traces_from_dump_dir reads it as a zero-line TransactionTraces and matches the in-memory shape.
    if not trace_file_path.exists():
        traces.append(TransactionTraces(traces=[]))
        if debug_output_path:
            (Path(debug_output_path) / trace_file_name).write_text("")
        continue
Trivial, ~3 lines. Doesn't change any other comparator behaviour.
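A minimal round-trip sketch of Option A, assuming the loader skips blank lines and therefore treats a zero-byte file as an empty entry (not verified here against the real _load_traces_from_dump_dir). write_placeholder and count_entries_per_file are hypothetical names used only for illustration.

```python
import tempfile
from pathlib import Path


def write_placeholder(dump_dir: Path, trace_file_name: str) -> Path:
    """Create an empty trace file so the dump dir still records the transaction."""
    placeholder = dump_dir / trace_file_name
    placeholder.write_text("")
    return placeholder


def count_entries_per_file(dump_dir: Path) -> list:
    """Hypothetical loader shape: number of trace lines per trace-*.jsonl file."""
    return [
        len([ln for ln in path.read_text().splitlines() if ln.strip()])
        for path in sorted(dump_dir.glob("trace-*.jsonl"))
    ]


with tempfile.TemporaryDirectory() as tmp:
    dump_dir = Path(tmp)
    before = count_entries_per_file(dump_dir)  # no files: zero transactions
    write_placeholder(dump_dir, "trace-0-0xabc.jsonl")
    after = count_entries_per_file(dump_dir)  # one transaction, zero trace lines

print(before, after)
```

With the placeholder on disk, the loader sees one transaction with no trace lines, which is the same shape as the in-memory padding.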
Option B — have _load_traces_from_dump_dir reconcile from output/result.json
When loading a dump dir, also read output/result.json to count receipts; if there are more receipts than trace-*.jsonl files, pad the result with TransactionTraces(traces=[]) for the missing ones.
More invasive (touches the loader), but doesn't change the on-disk format and has no risk of empty files confusing other tooling.
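A rough sketch of what the Option B reconciliation could look like. Several things here are assumptions rather than facts about the codebase: that result.json exposes a "receipts" list, that the number in trace-<index>-<hash>.jsonl lines up with the receipt position, and the helper name load_padded itself. Plain lists of lines stand in for TransactionTraces.

```python
import json
import tempfile
from pathlib import Path


def load_padded(dump_dir: Path) -> list:
    """Return one list of trace lines per receipt, padding missing files with []."""
    result = json.loads((dump_dir / "output" / "result.json").read_text())
    receipt_count = len(result.get("receipts", []))  # assumed key name

    by_index = {}
    for path in dump_dir.glob("trace-*.jsonl"):
        # Assumed name layout: trace-<index>-<hash>.jsonl
        index = int(path.name.split("-")[1])
        by_index[index] = [ln for ln in path.read_text().splitlines() if ln.strip()]

    # Never drop a trace file, even if its index exceeds the receipt count.
    total = max([receipt_count] + [i + 1 for i in by_index])
    return [by_index.get(i, []) for i in range(total)]


with tempfile.TemporaryDirectory() as tmp:
    dump_dir = Path(tmp)
    (dump_dir / "output").mkdir()
    # Two receipts, but only the second transaction produced a trace file.
    (dump_dir / "output" / "result.json").write_text(json.dumps({"receipts": [{}, {}]}))
    (dump_dir / "trace-1-0xdef.jsonl").write_text('{"pc": 0}\n')
    padded = load_padded(dump_dir)

print(padded)
```

If the file index does not map to receipt position (for example when rejected transactions consume an index), matching on the transaction hash in the file name would be needed instead.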
Either approach makes the verify-traces comparison symmetric. Option A is the smallest diff and pairs naturally with the existing #2709 patch.
Why it matters
Currently the only workaround is to put every test that fails during execution into an allowlist (e.g. EEST's static-port FORCE_HARDCODED_TESTS) so that the test passes via baseline-equivalent state assertions and never actually exercises the verify-traces path. That defeats the point of trace verification for an entire class of tests.
Context
Found while wiring up --verify-traces on PR #2695. After rebasing onto a forks/amsterdam that contains #2709, the three STRUCTURAL tests above pass fill but report 78 spurious divergences in the trace report.