
fix(pseudo-peer): cache block hashes during header serving #120

Closed

AlliedToasters wants to merge 1 commit into hl-archive-node:node-builder from AlliedToasters:fix/pseudo-peer-hash-cache


Conversation

@AlliedToasters

Summary

  • Cache block hash→number mappings during GetBlockHeaders responses so that subsequent GetBlockBodies requests (which arrive by hash) resolve instantly from the cache
  • Increase the blockhash LRU cache limit from 1M to 15M entries to cover full chain ranges
  • Replace the rate-limited public RPC fallback (fallback_to_official_rpc) with a hard error, since the RPC fallback masked the underlying cache population issue
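
As a minimal sketch of the fallback change (all names here are hypothetical stand-ins, not the repository's actual API), the idea is to turn a silent RPC fallback into a hard error so missing local data surfaces immediately:

```rust
/// Hypothetical resolver sketch: `local` stands in for a lookup against
/// the local block source. Before this change, a `None` here would fall
/// back to the rate-limited public RPC; now it is a hard error, since
/// all blocks should be available locally.
fn resolve_block(local: Option<u64>) -> Result<u64, String> {
    local.ok_or_else(|| "block not found in local source; refusing RPC fallback".to_string())
}
```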

Problem

During the Bodies stage, the main node sends GetBlockBodies requests containing block hashes. The pseudo peer needs to resolve these hashes to block numbers to fetch the data from its block source. Previously, the GetBlockHeaders handler did not cache hash→number mappings, so the Bodies handler triggered slow backfill scans — fetching and hashing blocks sequentially to find the target hash. This blocked the single-threaded pseudo peer event loop for minutes at a time, causing the main node's protocol breach timeout (~120s) to disconnect the peer.

The result was a connect/disconnect cycle every ~45 seconds during the Bodies stage, with the RPC fallback exhausting its daily rate limit and making backfills even slower.
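
To illustrate why the misses were so costly (toy code with hypothetical names; the real peer hashes RLP-encoded headers, not integers), a cache miss forced an O(n) scan over blocks, while a warm cache answers the same lookup in O(1):

```rust
use std::collections::HashMap;

/// Toy stand-in for hashing a block header; illustration only.
fn block_hash(number: u64) -> u64 {
    number.wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// Pre-fix behavior (sketch): on a cache miss, fetch and hash blocks
/// sequentially until the requested hash is found. O(n) per request,
/// and it blocks the single-threaded event loop while it runs.
fn resolve_by_backfill_scan(target: u64, tip: u64) -> Option<u64> {
    (0..=tip).find(|&n| block_hash(n) == target)
}

/// With the cache pre-populated during header serving, the same
/// resolution is a single O(1) map read.
fn resolve_from_cache(cache: &HashMap<u64, u64>, target: u64) -> Option<u64> {
    cache.get(&target).copied()
}
```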

Fix

The GetBlockHeaders handler now computes and caches the hash for every header it serves. Since the main node always requests headers before bodies for a given block range, the cache is pre-populated by the time GetBlockBodies arrives. The LRU cache is also increased to 15M entries to avoid eviction across large chain ranges.
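
A rough sketch of the fixed flow, with hypothetical names and a simple FIFO-evicting map standing in for the real LRU cache (whose limit the patch raises from 1M to 15M entries):

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal bounded hash->number cache. Note: this sketch evicts in FIFO
/// order, not true LRU order, to keep the example short.
struct HashCache {
    map: HashMap<u64, u64>,
    order: VecDeque<u64>,
    cap: usize,
}

impl HashCache {
    fn new(cap: usize) -> Self {
        Self { map: HashMap::new(), order: VecDeque::new(), cap }
    }

    fn insert(&mut self, hash: u64, number: u64) {
        if self.map.insert(hash, number).is_none() {
            self.order.push_back(hash);
            if self.order.len() > self.cap {
                if let Some(old) = self.order.pop_front() {
                    self.map.remove(&old);
                }
            }
        }
    }

    fn get(&self, hash: &u64) -> Option<u64> {
        self.map.get(hash).copied()
    }
}

/// Toy stand-in for hashing a block header; illustration only.
fn toy_header_hash(number: u64) -> u64 {
    number.wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// Sketch of the fixed handler: while serving a GetBlockHeaders range,
/// compute each header's hash and record hash->number, so the later
/// GetBlockBodies request (which carries only hashes) hits the cache.
fn serve_headers(cache: &mut HashCache, start: u64, count: u64) -> Vec<u64> {
    (start..start + count)
        .map(|n| {
            let h = toy_header_hash(n);
            cache.insert(h, n);
            h
        })
        .collect()
}
```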

Test plan

  • Tested on HyperEVM testnet: Bodies stage completed ~11M blocks in ~3 minutes with zero disconnects (vs previous ~45s disconnect cycle)
  • Zero hash backfill scans triggered during sync
  • cargo check passes

Fixes #109

🤖 Generated with Claude Code

… Bodies disconnect

The GetBlockHeaders handler now caches hash→number mappings for every
header it serves, and the blockhash LRU cache limit is increased from
1M to 15M entries.

Previously, GetBlockBodies requests (which arrive by hash) frequently
missed the cache and triggered slow backfill scans that blocked the
single-threaded pseudo peer event loop. The main node's protocol breach
timeout then disconnected the unresponsive peer every ~45 seconds.

With this fix the Bodies stage completes without disconnects — tested
on testnet syncing ~11M blocks in ~3 minutes.

Also replaces the rate-limited public RPC fallback with a hard error,
since all blocks should be available locally and the RPC fallback masked
the underlying cache population issue.

Fixes hl-archive-node#109
@sprites0
Collaborator

sprites0 commented Mar 4, 2026

This was due to a different subtle bug in cache warming and I'll fix it in #122. Thanks for the contribution!

@sprites0 closed this Mar 4, 2026


Development

Merging this pull request may close the linked issue:

  • For initial sync + Bodies stage, --local-ingest-dir reads excessive amount of file repeatedly
