verify-traces: asymmetry between in-memory and disk-loaded traces for failed-during-execution txs (extend #2709) #2758

@leolara

Summary

#2709 made collect_traces tolerate a missing trace-N-<hash>.jsonl for transactions that produce a receipt but no TransactionEnd tracer event (e.g. EIP-3607 collisions, CREATE-into-non-empty failures during process_transaction). The fix appends an in-memory TransactionTraces(traces=[]) placeholder per missing file.

The same placeholder is not written to disk, which causes --verify-traces to report spurious divergences for these tests. Reproducer: any test where a tx's TransactionEnd event never fires (the same set #2709 was written for).

Reproducer

Trio of tests that exercise the path:

  • tests/ported_static/stCreateTest/test_transaction_collision_to_empty_but_code.py
  • tests/ported_static/stCreateTest/test_transaction_collision_to_empty_but_nonce.py
  • tests/ported_static/stEIP3607/test_init_colliding_with_non_empty_account.py

Both runs (baseline dump + current live run) use post-#2709 code:

# 1. Dump baseline
TMPDIR=./.tmp uv run fill \
  tests/ported_static/stCreateTest/test_transaction_collision_to_empty_but_code.py \
  --evm-dump-dir .tmp/baseline --traces \
  -n 4 --output .tmp/fix_baseline --clean

# 2. Verify against the freshly-dumped baseline
TMPDIR=./.tmp uv run fill \
  tests/ported_static/stCreateTest/test_transaction_collision_to_empty_but_code.py \
  --evm-dump-dir .tmp/current --traces \
  --verify-traces .tmp/baseline \
  --verify-traces-comparator exact-no-stack \
  --verify-traces-json .tmp/report.json \
  -n 4 --output .tmp/fix_current --clean

Result: Total: 24, Equivalent: 0, Divergent: 24, even though the inputs and the code are identical for both runs.

Root cause

The two sides of the comparison go through different code paths:

| Side | Source | Output for these tests |
| --- | --- | --- |
| baseline_traces_list | _load_traces_from_dump_dir(dump_dir): globs trace-*.jsonl from disk | [Traces(root=[])]: 0 TransactionTraces (no trace files exist) |
| current_traces_list | t8n.get_traces(), collect_traces(...): in-memory | [Traces(root=[TransactionTraces(traces=[])])]: 1 empty TransactionTraces (post-#2709 padding) |

The compare_traces outer length check (len(baseline.root) != len(current.root)) then fires.
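The mismatch can be reproduced in isolation. The sketch below uses stand-in dataclasses named after the real models (their actual definitions are not reproduced here) and a hypothetical helper, outer_length_matches, that only mirrors the length check quoted above:

```python
from dataclasses import dataclass, field


@dataclass
class TransactionTraces:
    """Stand-in for the real model: the trace lines of one transaction."""

    traces: list = field(default_factory=list)


@dataclass
class Traces:
    """Stand-in for the real model: one entry per transaction."""

    root: list = field(default_factory=list)


def outer_length_matches(baseline: Traces, current: Traces) -> bool:
    """Hypothetical helper mirroring the outer length check."""
    return len(baseline.root) == len(current.root)


# Disk side: no trace-*.jsonl file exists, so nothing is loaded.
baseline = Traces(root=[])
# In-memory side: the #2709 branch appended one empty placeholder.
current = Traces(root=[TransactionTraces(traces=[])])

lengths_match = outer_length_matches(baseline, current)
print(lengths_match)  # False
```

Both sides describe the same transaction, yet the shapes differ by one element, which is enough for the comparison to report a divergence.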

#2709 only patched the in-memory collect_traces. Neither output_traces (src/ethereum_spec_tools/evm_tools/t8n/evm_trace/eip3155.py) nor the --evm-dump-dir plumbing (shutil.copy in transition_tool.py::collect_traces) emits anything for the missing-trace case, so the placeholder never reaches disk.

Proposed fix

Make the on-disk representation symmetric with the in-memory one. Two equivalent approaches:

Option A — write empty placeholder files

In transition_tool.py::collect_traces, when the missing-file branch (the #2709 branch) runs and debug_output_path is set, also write an empty trace-{i}-{hash}.jsonl to the dump dir. Then _load_traces_from_dump_dir reads it as a zero-line TransactionTraces and matches the in-memory shape.

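# Inside the per-transaction loop of collect_traces (the #2709 branch).
if not trace_file_path.exists():
    # In-memory placeholder added by #2709.
    traces.append(TransactionTraces(traces=[]))
    if debug_output_path:
        # Proposed: mirror the placeholder on disk as an empty trace file.
        (Path(debug_output_path) / trace_file_name).write_text("")
    continue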

Trivial, ~3 lines. Doesn't change any other comparator behaviour.
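A minimal round-trip sketch of why an empty file is sufficient. load_trace_files is a hypothetical simplification of _load_traces_from_dump_dir (assumed here to glob trace-*.jsonl and produce one entry per file), and the file name trace-0-0xabc.jsonl is made up for illustration:

```python
import tempfile
from pathlib import Path


def load_trace_files(dump_dir: Path) -> list:
    """Hypothetical loader: one list of non-empty lines per trace file."""
    loaded = []
    # sorted() is lexicographic; the real loader's ordering is not modelled.
    for path in sorted(dump_dir.glob("trace-*.jsonl")):
        lines = [ln for ln in path.read_text().splitlines() if ln.strip()]
        loaded.append(lines)
    return loaded


with tempfile.TemporaryDirectory() as tmp:
    dump_dir = Path(tmp)
    # Today: nothing is written for the failed transaction.
    before = load_trace_files(dump_dir)
    # Option A: an empty placeholder file is written alongside the dump.
    (dump_dir / "trace-0-0xabc.jsonl").write_text("")
    after = load_trace_files(dump_dir)

print(before, after)  # [] [[]]
```

With the placeholder present, the loader yields one empty entry for the transaction, which is the same shape the in-memory path produces.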

Option B — have _load_traces_from_dump_dir reconcile from output/result.json

When loading a dump dir, also read output/result.json to count receipts; if there are more receipts than trace-*.jsonl files, pad the result with TransactionTraces(traces=[]) for the missing ones.

More invasive (touches the loader), but doesn't change the on-disk format and has no risk of empty files confusing other tooling.
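A rough sketch of the reconciliation, under several assumptions that are not confirmed by the code base: that output/result.json carries a "receipts" list, that trace files sit directly in the dump dir, and that the N in trace-N-<hash>.jsonl is the transaction's position among the receipts. load_padded is a hypothetical name, not the existing loader:

```python
import json
import re
import tempfile
from pathlib import Path

TRACE_NAME = re.compile(r"trace-(\d+)-.*\.jsonl")


def load_padded(dump_dir: Path) -> list:
    """Return one trace list per receipt, padding missing files with []."""
    result = json.loads((dump_dir / "output" / "result.json").read_text())
    receipt_count = len(result.get("receipts", []))  # assumed key

    by_index = {}
    for path in dump_dir.glob("trace-*.jsonl"):
        match = TRACE_NAME.fullmatch(path.name)
        if match:
            lines = path.read_text().splitlines()
            parsed = [json.loads(ln) for ln in lines if ln.strip()]
            by_index[int(match.group(1))] = parsed

    # Any receipt index without a trace file gets an empty placeholder.
    return [by_index.get(i, []) for i in range(receipt_count)]


with tempfile.TemporaryDirectory() as tmp:
    dump_dir = Path(tmp)
    (dump_dir / "output").mkdir()
    receipts = [{"transactionHash": "0xaa"}, {"transactionHash": "0xbb"}]
    (dump_dir / "output" / "result.json").write_text(
        json.dumps({"receipts": receipts})
    )
    # Only the second transaction produced a trace file.
    (dump_dir / "trace-1-0xbb.jsonl").write_text('{"pc": 0}\n')
    padded = load_padded(dump_dir)

print(padded)  # [[], [{'pc': 0}]]
```

Keying by index rather than appending padding at the end keeps the placeholder in the failed transaction's slot. If the index in the file name does not line up with the receipt position, the mapping would need to go through the hash instead.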

Either approach makes the verify-traces comparison symmetric. Option A is the smallest diff and pairs naturally with the existing #2709 patch.

Why it matters

Currently, the only workaround is to put every test that fails during execution into an allowlist (e.g. EEST's static-port FORCE_HARDCODED_TESTS) so that the test passes via baseline-equivalent state assertions without ever exercising the verify-traces path. That defeats the point of trace verification for an entire class of tests.

Context

Found while wiring up --verify-traces on PR #2695. After rebasing onto a forks/amsterdam containing #2709, the 3 STRUCTURAL tests above pass fill but report 78 spurious divergences in the trace report.
