Skip to content

LedgerDB: prune on garbage collection instead of on every change #1513

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

amesgen
Copy link
Member

@amesgen amesgen commented May 19, 2025

This is in preparation for #1424

Currently, we prune the LedgerDB (ie remove all but the last k+1 states) every time we adopt a longer chain. This means that we can not rely on the fact that other threads (like the copyAndSnapshot ChainDB background) actually observe all immutable ledger states, just as described in the caveats of our Watcher abstraction.

However, a predictable ledger snapshotting rule (#1424) requires this property; otherwise, when the node is under high load and/or we are adopting multiple blocks in quick succession, the node might not be able to create a snapshot for its desired block.

This PR changes this fact: Now, when adopting new blocks, the LedgerDB is not immediately pruned. Instead, the copyAndSnapshot ChainDB thread will periodically (on every new immutable block) wake up and (in particular) garbage collect the LedgerDB based on a slot number.

Also, this makes the semantics more consistent with the existing garbage collection of previously-applied blocks in the LedgerDB, and also with how the ChainDB works, where we also don't immediately delete blocks from the VolatileDB once they are buried beneath k+1 blocks.

See #1513 (comment) for benchmarks demonstrating that the peak memory usage does not increase while syncing (where we now briefly might hold more than k+1 ledger states in memory).

@amesgen amesgen changed the base branch from cardano-node-10.4-backports to main May 20, 2025 15:03
@amesgen amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch 2 times, most recently from 8b48bb3 to 045f1cc Compare May 20, 2025 15:15
@amesgen amesgen changed the base branch from main to amesgen/v2-ledgerseq-close May 20, 2025 15:15
@amesgen amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch 4 times, most recently from 13e5533 to 68402ed Compare May 20, 2025 17:25
Copy link
Contributor

@jasagredo jasagredo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@amesgen
Copy link
Member Author

amesgen commented Jun 5, 2025

Sync benchmarks are looking good (mainnet, first 1e6 slots/blocks):

plot

LMDB benchmark (of course, this is a bit degenerate as Byron doesn't have tables, but this still serves as a regression test for the DbChangelog aspects which are touched by this PR).
plot

Note that baf3e7f is crucial; otherwise, there is a significant (2x) regression in max heap size.

Base automatically changed from amesgen/v2-ledgerseq-close to main June 5, 2025 21:18
@amesgen amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch 2 times, most recently from 2e01b1c to b9e25f5 Compare June 10, 2025 15:47
@amesgen amesgen changed the base branch from main to amesgen/ledgerdb-v2-locking June 10, 2025 15:49
@amesgen amesgen force-pushed the amesgen/ledgerdb-v2-locking branch from 19faf20 to 4010598 Compare June 10, 2025 17:54
@amesgen amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch from b9e25f5 to 894940c Compare June 10, 2025 18:09
Base automatically changed from amesgen/ledgerdb-v2-locking to main June 11, 2025 09:07
amesgen added 2 commits June 29, 2025 21:55
It is not necessary to perform the garbage collection of the LedgerDB and the
map of invalid blocks in the same STM transaction. In the past, this was
important, but it is not anymore, see
#1507.
This is an optimization to reduce the maximum memory usage (more relevant with
the in-memory backend), see the added commit and the benchmark in the pull
request.
@amesgen amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch from 894940c to a8fa7e2 Compare June 30, 2025 08:11
@amesgen amesgen marked this pull request as ready for review June 30, 2025 08:22
@amesgen amesgen moved this from 🏗 In progress to 👀 In review in Consensus Team Backlog Jun 30, 2025
LedgerDbPruneAll
| -- | Prune to only keep the last @k@ states.
LedgerDbPruneKeeping SecurityParam
| -- | Prune such that all (non-anchor) states are older than the given slot.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| -- | Prune such that all (non-anchor) states are older than the given slot.
| -- | Prune such that all (non-anchor) states are younger than the given slot.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out this inversion in a couple of places!

Changed, using "not older" instead of "younger" to account for the case of equality (we want to keep states with the same slot as the argument slot).

-- hold the last \(k\) in-memory ledger states. This data type is impemented
-- using the /finger tree/ data structure and has the following time
-- hold (at least) the last \(k\) in-memory ledger states. This data type is
-- impemented using the /finger tree/ data structure and has the following time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- impemented using the /finger tree/ data structure and has the following time
-- implemented using the /finger tree/ data structure and has the following time

where
DbChangelog{changelogStates} = dblog

-- The anchor of @vol'@ might still have a tip slot larger than @slot@, which
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- The anchor of @vol'@ might still have a tip slot larger than @slot@, which
-- The anchor of @vol'@ might still have a tip slot smaller than @slot@, which

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or older

LedgerDbPruneBeforeSlot slot ->
(closeButHead before, LedgerSeq after)
where
-- The anchor of @vol'@ might still have a tip slot larger than @slot@,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- The anchor of @vol'@ might still have a tip slot larger than @slot@,
-- The anchor of @vol'@ might still have a tip slot older than @slot@,

amesgen added 9 commits June 30, 2025 13:51
regarding the previous few commits
For consistency with V1. This only makes a difference if there are non-pruned
states.

Also, a very small benefit is that we get (very slightly) faster replay on node
startup.
This is used in db-analyser only, where everything happens synchronously in a
single thread, so it is fine to immediately prune.

V1 already does this.
@amesgen amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch from a8fa7e2 to b503dc3 Compare June 30, 2025 11:52
-- ^ Garbage collect references to old blocks that have been previously
-- applied and committed.
, garbageCollect :: SlotNo -> m ()
-- ^ Garbage collect references to old state that is older than the given
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- ^ Garbage collect references to old state that is older than the given
-- ^ Garbage collect references to old states that are older than the given

data LedgerDbPrune
= -- | Prune all states, keeping only the current tip.
LedgerDbPruneAll
| -- | Prune to only keep the last @k@ states.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't LedgerDbPruneKeeping redundant, now that we have LedgerDbPruneBeforeSlot? Or rather, could this be replaced with the new value? (But I understand this is a different concern that the current PR should not address).

. readTVar
$ ldbChangelog env
where
k = unNonZero $ maxRollbacks $ ledgerDbCfgSecParam $ ldbCfg env
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense, in a different PR, to add the value of k directly to the configuration environment, or is this indirection not costly enough to justify this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also see this pattern a couple of times, which might be another justification why we might want to add k to the environment.

Right ImmutableTip -> rollbackTo immTip
Right (SpecificPoint pt) -> rollbackTo pt
Left n -> do
let rollbackMax = maxRollback dblog `min` k
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it sound/safe that the db-changelog reports a maximum rollback larger than k?


-- See the Haddocks above as for why we garbage-collect the LedgerDB already
-- here (instead of as part of the scheduled GC).
whenJust (withOriginToMaybe gcSlotNo) $ LedgerDB.garbageCollect cdbLedgerDB
Copy link
Contributor

@geo2a geo2a Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here suggests that we perform garbage-collection on the LedgerDB as part of the scheduled GC of ChainDB. However, when I look at ChainDB.Impl.Background.garbageCollect, I only see a call to the VolatileDB GC.

My question is: are we performing garbage collection on the LedgerDB as part of the ChainDB GC? If yes, could you point me to the place in the code where that happens?

Copy link
Contributor

@geo2a geo2a Jul 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, now I see that changing this is the entire point of e414367!! Please disregard my previous comment.

@@ -0,0 +1,6 @@
### Breaking

- Changed pruning of immutable ledger states to happen on LedgerDB/ChainDB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changelog entry here seems misleading, but please correct me if I'm wrong.

If I understand correctly, we are now pruning the LedgerDB in copyAndSnapshotRunner, i.e. when moving blocks from VolatileDB to ImmutableDB. In the scheduled ChainDB GCs, we will not prune the LedgerDB at all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 👀 In review
Development

Successfully merging this pull request may close these issues.

4 participants