feat: combine generate-state-bloat and init-state into single flow #3345

Open
yongkangc wants to merge 8 commits into main from yk/combine-state-bloat

@yongkangc (Contributor) commented Mar 26, 2026

Closes RETH-665

Adds a `tempo generate-state-bloat` CLI command that derives TIP20 storage
slots and writes them directly into the database via ETL collectors,
bypassing the intermediate binary dump file.

Extracts `StorageLoader` from `init_state.rs` to share ETL collection,
genesis merge, DB writes, and trie computation between both commands.
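
The sharing described above can be pictured as two entry producers feeding one sink. This is only a rough sketch under stated assumptions: `SharedLoader`, `load_from_dump`, and `load_generated` are hypothetical names, not the PR's actual `StorageLoader` API. The in-memory `BTreeMap` stands in for the ETL collector and the database, and the counter-based slot derivation is a placeholder, not how TIP20 slots are really derived.

```rust
use std::collections::BTreeMap;

/// Hypothetical entry shape: (account address, storage slot) -> value.
type SlotKey = ([u8; 20], [u8; 32]);
type SlotValue = [u8; 32];

/// Stand-in for the shared loader. The BTreeMap plays the role of the
/// ETL collector plus the database write; the real code does much more
/// (genesis merge, trie computation).
#[derive(Default)]
struct SharedLoader {
    collected: BTreeMap<SlotKey, SlotValue>,
}

impl SharedLoader {
    /// Accept one entry from either producer.
    fn collect(&mut self, key: SlotKey, value: SlotValue) {
        self.collected.insert(key, value);
    }

    /// Finish loading and report how many entries would be written.
    fn finish(self) -> usize {
        self.collected.len()
    }
}

/// Producer 1 (assumed shape): entries already decoded from a binary dump.
fn load_from_dump(loader: &mut SharedLoader, dump: &[(SlotKey, SlotValue)]) {
    for (key, value) in dump {
        loader.collect(*key, *value);
    }
}

/// Producer 2 (assumed shape): entries derived on the fly.
/// The slot is just a big-endian counter here, purely for illustration.
fn load_generated(loader: &mut SharedLoader, token: [u8; 20], count: u64) {
    for i in 0..count {
        let mut slot = [0u8; 32];
        slot[24..].copy_from_slice(&i.to_be_bytes());
        let mut value = [0u8; 32];
        value[31] = 1; // arbitrary non-zero placeholder value
        loader.collect((token, slot), value);
    }
}
```

Either producer can be swapped out without touching the sink, which is the property the refactor appears to be after.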

Before

flowchart LR
    A1[bench.yml] -->|nu tempo.nu bench --bloat N| A2[tempo.nu]
    A2 -->|cargo run -p tempo-xtask generate-state-bloat --out file.bin| A3[xtask]
    A3 -->|writes| A4[.bin file]
    A2 -->|tempo init-from-binary-dump file.bin| A5[tempo CLI]
    A4 -->|reads| A5
    A5 -->|ETL → DB| A6[(Database)]

After

flowchart LR
    B1[bench.yml] -->|nu tempo.nu bench --bloat N| B2[tempo.nu]
    B2 -->|tempo generate-state-bloat --size N| B3[tempo CLI]
    B3 -->|derive → ETL → DB| B4[(Database)]

Adds `tempo generate-state-bloat` CLI command that derives TIP20 storage
slots and writes them directly into the database via ETL collectors,
bypassing the intermediate binary dump file.

Extracts `StorageLoader` from init_state.rs to share ETL collection,
genesis merge, DB writes, and trie computation between both commands.

Closes RETH-665

Amp-Thread-ID: https://ampcode.com/threads/T-019d2ae1-3209-7268-9621-014b6a132778
ETL's par_sort_unstable_by does not preserve insertion order for equal
keys. Add a 1-byte priority suffix (0x00 for genesis, 0x01 for dump)
so dump entries deterministically win over genesis for overlapping slots.

Also: increase WORKER_CHUNK_SIZE 100→4096 for the single hash worker,
remove unused Address from slot_bytes, make log_collection_progress
private.
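
The priority suffix can be illustrated with a small standalone sketch. It assumes fixed-length slot keys, so that appending one byte keeps entries for the same slot adjacent after sorting, and a `u64` value for brevity. `etl_key` and `merge_sorted` are hypothetical helpers, not functions from the PR or from reth's ETL crate.

```rust
/// Priority suffix bytes, as described in the commit message.
const PRIORITY_GENESIS: u8 = 0x00;
const PRIORITY_DUMP: u8 = 0x01;

/// Build a sort key: the slot key followed by a 1-byte priority.
/// Assumes all slot keys have the same length.
fn etl_key(slot_key: &[u8], priority: u8) -> Vec<u8> {
    let mut key = Vec::with_capacity(slot_key.len() + 1);
    key.extend_from_slice(slot_key);
    key.push(priority);
    key
}

/// Sort with an unstable sort, then keep one value per slot.
/// Because the priority byte is part of the key, a dump entry always
/// sorts after the genesis entry for the same slot, so "last one wins"
/// is deterministic even though the sort itself is not stable.
fn merge_sorted(mut entries: Vec<(Vec<u8>, u64)>) -> Vec<(Vec<u8>, u64)> {
    entries.sort_unstable_by(|a, b| a.0.cmp(&b.0));

    let mut merged: Vec<(Vec<u8>, u64)> = Vec::new();
    for (key, value) in entries {
        // Strip the priority byte to recover the slot key.
        let slot = &key[..key.len() - 1];
        if let Some(last) = merged.last_mut() {
            if last.0.as_slice() == slot {
                // Same slot seen again with a higher priority: overwrite.
                last.1 = value;
                continue;
            }
        }
        merged.push((slot.to_vec(), value));
    }
    merged
}
```

Without the suffix, two entries for the same slot would compare equal and their relative order after an unstable sort would be unspecified.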

Amp-Thread-ID: https://ampcode.com/threads/T-019d2d59-8dcd-70de-a88f-4b09b1684c15
Comment on lines -138 to -143
// Process blocks from binary file
loop {
    // Read next block header; EOF means no more blocks.
    let mut header_buf = [0u8; 40];
    match reader.read_exact(&mut header_buf) {
        Ok(()) => {}
Contributor Author

This chunk has been moved over to `InitFromBinaryDump`.

@yongkangc force-pushed the yk/combine-state-bloat branch from 9d5952f to 27a7ed6 on March 27, 2026 04:30
@decofe (Member) commented Mar 27, 2026

cc @decofe

✅ Benchmark complete! View job

Bench Comparison: 27a7ed6 vs 27a7ed6

Configuration

  • Bloat: 1000 MiB
  • Preset: tip20
  • Target TPS: 10000
  • Duration: 300s
  • Snapshot: schelk
  • Baseline blocks: 594
  • Feature blocks: 594

Results

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| Latency Mean [ms] | 1000.0 | 1000.0 | 0.0% |
| Latency Std Dev [ms] | 261.3 | 137.2 | -47.5% |
| Latency P50 [ms] | 1000.0 | 1000.0 | 0.0% |
| Latency P90 [ms] | 1004.0 | 1004.0 | 0.0% |
| Latency P99 [ms] | 2016.0 | 1692.0 | -16.1% |
| TPS | 9262.0 | 9264.0 | 0.0% |
| Mgas/s | 451.3 | 451.5 | 0.0% |

Per-Run Details

| Run | Blocks | Total Tx | Success | Failed | P50 Latency | TPS | Mgas/s |
|---|---|---|---|---|---|---|---|
| baseline-1 | 297 | 2742457 | 2742457 | 0 | 1000.0 | 9265.0 | 451.5 |
| feature-1 | 297 | 2742452 | 2742452 | 0 | 1000.0 | 9265.0 | 451.5 |
| feature-2 | 297 | 2741849 | 2741849 | 0 | 1000.0 | 9263.0 | 451.4 |
| baseline-2 | 297 | 2740292 | 2740292 | 0 | 1000.0 | 9258.0 | 451.1 |

Observability

@yongkangc force-pushed the yk/combine-state-bloat branch from dfbf7d3 to ee34c92 on March 27, 2026 05:07
Comment on lines +116 to +122
Self::GenerateStateBloat(cmd) => {
    let runtime = runner.runtime();
    runner.run_blocking_until_ctrl_c(
        cmd.execute::<tempo_node::node::TempoNode>(runtime),
    )?;
    Ok(())
}
@yongkangc (Contributor, Author) commented Mar 27, 2026

I think this might be needed in the future.

@yongkangc force-pushed the yk/combine-state-bloat branch from ee34c92 to 62c2806 on March 27, 2026 05:14
Comment on lines -1416 to -1422
# Generate bloat file
let bloat_file = $"($abs_localnet)/state_bloat.bin"
if $bloat > 0 {
    print $"Generating state bloat \(($bloat) MiB\)..."
    let token_args = ($TIP20_TOKEN_IDS | each { |id| ["--token" $"($id)"] } | flatten)
    cargo run -p tempo-xtask --profile $profile -- generate-state-bloat --size $bloat --out $bloat_file ...$token_args
}
Contributor Author

We don't need this anymore because we now generate a shared `.bin` file from the baseline worktree's xtask, which both the baseline and feature sides then load via `init-from-binary-dump`.

@yongkangc force-pushed the yk/combine-state-bloat branch from 62c2806 to e29b1da on March 27, 2026 05:23
Move GenerateStateBloat from a separate file into init_state.rs so both
state-init commands share the same module. StorageLoader and LoadStats
become private to the module.

Amp-Thread-ID: https://ampcode.com/threads/T-019d2d83-eba4-740c-8fea-04cbf1cad24f
@yongkangc force-pushed the yk/combine-state-bloat branch from e29b1da to aa52f7b on March 27, 2026 05:36
@yongkangc marked this pull request as ready for review March 27, 2026 05:43
Copilot AI review requested due to automatic review settings March 27, 2026 05:43

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa52f7b3ee

Copilot AI left a comment

Pull request overview

This PR streamlines the “state bloat” workflow by adding a `tempo generate-state-bloat` CLI subcommand that derives TIP20 storage entries and writes them directly into the DB via ETL, and by refactoring shared loading logic so both the legacy binary-dump loader and the new direct generator reuse the same pipeline.

Changes:

  • Add `tempo generate-state-bloat` subcommand and wire it into the CLI.
  • Refactor init-state logic into a shared `StorageLoader` that handles ETL, genesis merge, DB writes, and trie/state-root computation.
  • Update `tempo.nu` bench/dev flows to generate bloat directly into the database (no intermediate `.bin`).

Reviewed changes

Copilot reviewed 1 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
| File | Description |
|---|---|
| `tempo.nu` | Switch bench/dev scripts from binary dump generation + `init-from-binary-dump` to `generate-state-bloat` writing directly into DB. |
| `bin/tempo/src/tempo_cmd.rs` | Expose the new `GenerateStateBloat` subcommand and execute it via the CLI runner. |
| `bin/tempo/src/init_state.rs` | Introduce `StorageLoader`; keep binary-dump loader; implement direct bloat generation with parallel address derivation and ETL-based DB writes. |
| `bin/tempo/Cargo.toml` | Add dependencies needed for mnemonic/BIP32 derivation and parallelism (`alloy-signer`, `coins-bip32`, `rayon`, etc.). |
| `Cargo.lock` | Lockfile updates for the new dependencies. |
Comments suppressed due to low confidence (1)

tempo.nu:101

  • The early-return condition uses ($datadir)/db existence as a proxy for “bloat already loaded”, but tempo init alone will also create that path. If a user previously initialized without bloat and then reruns with --bloat, this will incorrectly skip generation and print a misleading message. Consider either checking for a dedicated marker (e.g., a file in the datadir/meta) or at least change the message to indicate you’re skipping because the DB already exists (not because bloat is present).
    # Skip if this node already has a database with bloat loaded
    if ($db_path | path exists) {
        print $"State bloat already loaded into ($datadir | path basename)"
        return
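
The dedicated-marker idea from the suppressed comment could look roughly like the following. It is written in Rust rather than Nushell for illustration only; the marker file name `state-bloat.done` and both helper functions are hypothetical and not part of the PR or of `tempo.nu`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical marker file name; nothing in the PR defines this.
const BLOAT_MARKER: &str = "state-bloat.done";

fn marker_path(datadir: &Path) -> PathBuf {
    datadir.join(BLOAT_MARKER)
}

/// True only if a previous run explicitly recorded that bloat was loaded.
/// The mere existence of `<datadir>/db` is not treated as evidence.
fn bloat_already_loaded(datadir: &Path) -> bool {
    marker_path(datadir).is_file()
}

/// Record a successful bloat load, storing the requested size for reference.
fn mark_bloat_loaded(datadir: &Path, size_mib: u64) -> io::Result<()> {
    fs::create_dir_all(datadir)?;
    fs::write(marker_path(datadir), format!("{size_mib}\n"))
}
```

With a check like this, a datadir that was initialized without bloat would not be mistaken for one that already has it.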

@yongkangc requested a review from shekhirin March 27, 2026 07:34
Skip zero-value entries in the shared loader so direct generation matches init-from-binary-dump. Restore the shared binary-dump path for comparison benches so older refs still initialize and both sides start from identical prebuilt state.
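
A minimal sketch of the zero-value skip, assuming 32-byte slots and values: `StorageEntry` and `skip_zero_values` are hypothetical names, and the real loader presumably applies this check while collecting rather than on a materialized list.

```rust
/// Hypothetical entry shape: (storage slot, value), both 32 bytes.
type StorageEntry = ([u8; 32], [u8; 32]);

/// A value is "zero" when every byte is zero.
fn is_zero(value: &[u8; 32]) -> bool {
    value.iter().all(|b| *b == 0)
}

/// Drop entries whose value is all zeroes, so that both loading paths
/// end up writing the same set of slots.
fn skip_zero_values(entries: impl IntoIterator<Item = StorageEntry>) -> Vec<StorageEntry> {
    entries
        .into_iter()
        .filter(|(_, value)| !is_zero(value))
        .collect()
}
```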

Co-authored-by: YK <46377366+yongkangc@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d2e4d-4b5c-716b-a13f-c57cc63c2438
Co-authored-by: YK <46377366+yongkangc@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d2e63-4b27-7416-ab6c-28cf05adde46
@yongkangc (Contributor, Author) commented:

@decofe bench preset=tip20 duration=300 bloat=1 tps=10000

Refresh stale loader comments after the direct generate-state-bloat path replaced the old dump-only flow.

Co-Authored-By: YK <46377366+yongkangc@users.noreply.github.com>

Amp-Thread-ID: https://ampcode.com/threads/T-019d2e92-432c-731c-b396-8438d71b1c6e
@tempoxyz deleted a comment from decofe Mar 27, 2026
Co-authored-by: YK <46377366+yongkangc@users.noreply.github.com>
Amp-Thread-ID: https://ampcode.com/threads/T-019d2e9b-1515-7459-888c-42f7502cc533
@decofe (Member) commented Mar 27, 2026

cc @yongkangc

✅ Benchmark complete! View job

Bench Comparison: 401bb01 vs 80973ce

Configuration

  • Bloat: 1000 MiB
  • Preset: tip20
  • Target TPS: 10000
  • Duration: 300s
  • Snapshot: schelk
  • Baseline blocks: 593
  • Feature blocks: 594

Results

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| Latency Mean [ms] | 1001.2 | 1000.0 | -0.1% |
| Latency Std Dev [ms] | 340.0 | 175.6 | -48.4% |
| Latency P50 [ms] | 1000.0 | 1000.0 | 0.0% |
| Latency P90 [ms] | 1004.0 | 1004.0 | 0.0% |
| Latency P99 [ms] | 1860.0 | 1866.0 | 0.3% |
| TPS | 9210.0 | 9261.0 | 0.6% |
| Mgas/s | 448.8 | 451.3 | 0.6% |

Per-Run Details

| Run | Blocks | Total Tx | Success | Failed | P50 Latency | TPS | Mgas/s |
|---|---|---|---|---|---|---|---|
| baseline-1 | 296 | 2733239 | 2733239 | 0 | 1000.0 | 9265.0 | 451.5 |
| feature-1 | 297 | 2742269 | 2742269 | 0 | 1000.0 | 9264.0 | 451.4 |
| feature-2 | 297 | 2740123 | 2740123 | 0 | 1000.0 | 9257.0 | 451.1 |
| baseline-2 | 297 | 2715658 | 2715658 | 0 | 1000.0 | 9154.0 | 446.0 |

Observability

@tempoxyz deleted a comment from decofe Mar 28, 2026

3 participants