LedgerDB: prune on garbage collection instead of on every change #1513

amesgen · 2025-05-19T12:26:24Z

This is in preparation for #1424

Currently, we prune the LedgerDB (i.e. remove all but the last k+1 states) every time we adopt a longer chain. This means that we cannot rely on other threads (like the copyAndSnapshot ChainDB background thread) actually observing all immutable ledger states, just as described in the caveats of our Watcher abstraction.

However, a predictable ledger snapshotting rule (#1424) requires this property; otherwise, when the node is under high load and/or we are adopting multiple blocks in quick succession, the node might not be able to create a snapshot for its desired block.

This PR changes that: when adopting new blocks, the LedgerDB is no longer pruned immediately. Instead, the copyAndSnapshot ChainDB thread will periodically (on every new immutable block) wake up and (in particular) garbage collect the LedgerDB based on a slot number.

Also, this makes the semantics more consistent with the existing garbage collection of previously-applied blocks in the LedgerDB, and also with how the ChainDB works, where we also don't immediately delete blocks from the VolatileDB once they are buried beneath k+1 blocks.
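
To make the change in timing concrete, here is a minimal sketch of the deferred pruning described above. It is a toy model, not the actual implementation: `ToyLedgerDb`, `extend` and this pure `garbageCollect` are hypothetical, and each ledger state is represented only by the slot of its tip.

```haskell
import Data.Word (Word64)

-- Hypothetical, simplified stand-in; not the real ouroboros-consensus type.
newtype SlotNo = SlotNo Word64
  deriving (Eq, Ord, Show)

-- | Toy LedgerDB: an anchor state followed by further states, oldest first.
data ToyLedgerDb = ToyLedgerDb
  { anchor :: SlotNo
  , states :: [SlotNo]
  }
  deriving (Eq, Show)

-- | Adopting a block only appends a state; nothing is pruned here.
extend :: SlotNo -> ToyLedgerDb -> ToyLedgerDb
extend s db = db{states = states db ++ [s]}

-- | Garbage collection drops the states older than the given slot.
-- The newest dropped state becomes the new anchor, so the anchor may
-- still be older than the slot.
garbageCollect :: SlotNo -> ToyLedgerDb -> ToyLedgerDb
garbageCollect slot db =
  case span (< slot) (states db) of
    ([], _) -> db
    (old, keep) -> ToyLedgerDb{anchor = last old, states = keep}
```

In this model, calling `extend` several times in a row keeps every intermediate state around until the next `garbageCollect`, which is the property a background thread needs in order to observe all immutable states.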

See #1513 (comment) for benchmarks demonstrating that the peak memory usage does not increase while syncing (where we now briefly might hold more than k+1 ledger states in memory).

jasagredo

Looks good.

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2.hs

amesgen · 2025-06-05T14:17:03Z

Sync benchmarks are looking good (mainnet, first 1e6 slots/blocks):

LMDB benchmark (of course, this is a bit degenerate as Byron doesn't have tables, but this still serves as a regression test for the DbChangelog aspects which are touched by this PR).

Note that baf3e7f is crucial; otherwise, there is a significant (2x) regression in max heap size.

jasagredo · 2025-06-30T10:40:34Z

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/API.hs

+    LedgerDbPruneAll
+  | -- | Prune to only keep the last @k@ states.
+    LedgerDbPruneKeeping SecurityParam
+  | -- | Prune such that all (non-anchor) states are older than the given slot.


Suggested change

-  | -- | Prune such that all (non-anchor) states are older than the given slot.
+  | -- | Prune such that all (non-anchor) states are younger than the given slot.

Thanks for pointing out this inversion in a couple of places!

Changed, using "not older" instead of "younger" to account for the case of equality (we want to keep states with the same slot as the argument slot).
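
The equality case can be illustrated with a small sketch. The constructor names are taken from the diff, but everything else is an assumption for illustration: `pruneToy` is hypothetical, `SecurityParam` is simplified to wrap an `Int`, and the LedgerDB is modelled as a plain list of non-anchor tip slots (oldest first).

```haskell
import Data.Word (Word64)

-- Simplified stand-ins; the real types differ.
newtype SlotNo = SlotNo Word64
  deriving (Eq, Ord, Show)

newtype SecurityParam = SecurityParam Int

data LedgerDbPrune
  = -- | Prune all states, keeping only the current tip (as the anchor).
    LedgerDbPruneAll
  | -- | Prune to only keep the last @k@ states.
    LedgerDbPruneKeeping SecurityParam
  | -- | Prune such that all (non-anchor) states are not older than the slot.
    LedgerDbPruneBeforeSlot SlotNo

-- | Hypothetical helper: the non-anchor states that survive pruning.
pruneToy :: LedgerDbPrune -> [SlotNo] -> [SlotNo]
pruneToy policy sts = case policy of
  LedgerDbPruneAll -> []
  LedgerDbPruneKeeping (SecurityParam k) -> drop (length sts - k) sts
  -- Strict comparison: a state whose slot equals the argument is kept.
  LedgerDbPruneBeforeSlot slot -> dropWhile (< slot) sts
```

With states at slots 3, 5 and 7, `LedgerDbPruneBeforeSlot (SlotNo 5)` keeps the states at slots 5 and 7 in this model, which is why "not older" describes the result more precisely than "younger".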

jasagredo · 2025-06-30T10:42:54Z

...ros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/DbChangelog.hs

-- hold the last \(k\) in-memory ledger states. This data type is impemented
-- using the /finger tree/ data structure and has the following time
+-- hold (at least) the last \(k\) in-memory ledger states. This data type is
+-- impemented using the /finger tree/ data structure and has the following time


Suggested change

--- impemented using the /finger tree/ data structure and has the following time
+-- implemented using the /finger tree/ data structure and has the following time

jasagredo · 2025-06-30T10:43:39Z

...ros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1/DbChangelog.hs

+ where
+  DbChangelog{changelogStates} = dblog
+
+  -- The anchor of @vol'@ might still have a tip slot larger than @slot@, which


Suggested change

-  -- The anchor of @vol'@ might still have a tip slot larger than @slot@, which
+  -- The anchor of @vol'@ might still have a tip slot smaller than @slot@, which

jasagredo · 2025-06-30T10:44:57Z

...boros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V2/LedgerSeq.hs

+  LedgerDbPruneBeforeSlot slot ->
+    (closeButHead before, LedgerSeq after)
+   where
+    -- The anchor of @vol'@ might still have a tip slot larger than @slot@,


Suggested change

-    -- The anchor of @vol'@ might still have a tip slot larger than @slot@,
+    -- The anchor of @vol'@ might still have a tip slot older than @slot@,

dnadales · 2025-06-30T11:54:29Z

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/API.hs

-  -- ^ Garbage collect references to old blocks that have been previously
-  -- applied and committed.
+  , garbageCollect :: SlotNo -> m ()
+  -- ^ Garbage collect references to old state that is older than the given


Suggested change

-  -- ^ Garbage collect references to old state that is older than the given
+  -- ^ Garbage collect references to old states that are older than the given

dnadales · 2025-06-30T11:56:31Z

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/API.hs

+data LedgerDbPrune
+  = -- | Prune all states, keeping only the current tip.
+    LedgerDbPruneAll
+  | -- | Prune to only keep the last @k@ states.


Isn't LedgerDbPruneKeeping redundant, now that we have LedgerDbPruneBeforeSlot? Or rather, could this be replaced with the new value? (But I understand this is a different concern that the current PR should not address).

dnadales · 2025-06-30T12:04:04Z

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1.hs

+    . readTVar
+    $ ldbChangelog env
+ where
+  k = unNonZero $ maxRollbacks $ ledgerDbCfgSecParam $ ldbCfg env


Would it make sense, in a different PR, to add the value of k directly to the configuration environment, or is this indirection not costly enough to justify this?

This pattern also appears in a couple of other places, which is another argument for adding k to the environment.

dnadales · 2025-06-30T12:09:46Z

ouroboros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/LedgerDB/V1.hs

+    Right ImmutableTip -> rollbackTo immTip
+    Right (SpecificPoint pt) -> rollbackTo pt
+    Left n -> do
+      let rollbackMax = maxRollback dblog `min` k


Is it sound/safe that the db-changelog reports a maximum rollback larger than k?
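
For readers following along, the capping in the quoted line can be sketched as below. This is a toy illustration under assumptions, not the code in the PR: `rollbackToy` is hypothetical and the changelog is modelled as a plain list of states (oldest first) that may be longer than k.

```haskell
import Data.Word (Word64)

-- | Hypothetical helper: roll back @n@ states from a toy changelog.
-- Even if more than @k@ states are held, rollbacks deeper than @k@
-- are rejected, mirroring @maxRollback dblog `min` k@ in the diff.
rollbackToy :: Word64 -> Word64 -> [s] -> Maybe [s]
rollbackToy k n sts
  | n > rollbackMax = Nothing
  | otherwise = Just (take (length sts - fromIntegral n) sts)
 where
  -- Number of states we are willing to undo: what is held, capped at k.
  rollbackMax = fromIntegral (length sts) `min` k
```

In this model, a changelog holding five states with k = 2 accepts a rollback of two states and rejects a rollback of three.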

geo2a · 2025-07-01T08:00:47Z

...ros-consensus/src/ouroboros-consensus/Ouroboros/Consensus/Storage/ChainDB/Impl/Background.hs


+    -- See the Haddocks above as for why we garbage-collect the LedgerDB already
+    -- here (instead of as part of the scheduled GC).
+    whenJust (withOriginToMaybe gcSlotNo) $ LedgerDB.garbageCollect cdbLedgerDB


The comment here suggests that we perform garbage-collection on the LedgerDB as part of the scheduled GC of ChainDB. However, when I look at ChainDB.Impl.Background.garbageCollect, I only see a call to the VolatileDB GC.

My question is: are we performing garbage collection on the LedgerDB as part of the ChainDB GC? If yes, could you point me to the place in the code where that happens?

oops, now I see that changing this is the entire point of e414367!! Please disregard my previous comment.
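
As a reading aid for the guarded call in the quoted diff, here is a minimal sketch of how it behaves. The definitions are local stand-ins for the helpers named in the diff (an assumption for illustration; the real ones may differ in detail), and `gcLedgerDbIfPossible` is a hypothetical name.

```haskell
-- Toy stand-in: a value that is either the origin or at some point @t@.
data WithOrigin t = Origin | At t
  deriving (Eq, Show)

withOriginToMaybe :: WithOrigin t -> Maybe t
withOriginToMaybe Origin = Nothing
withOriginToMaybe (At t) = Just t

-- | Run the action only if a value is present.
whenJust :: Applicative f => Maybe a -> (a -> f ()) -> f ()
whenJust (Just x) f = f x
whenJust Nothing _ = pure ()

-- | Hypothetical wrapper: garbage collect only when a GC slot exists,
-- i.e. do nothing while the GC slot is still the origin.
gcLedgerDbIfPossible ::
  Applicative f => WithOrigin slot -> (slot -> f ()) -> f ()
gcLedgerDbIfPossible gcSlotNo gc =
  whenJust (withOriginToMaybe gcSlotNo) gc
```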

geo2a · 2025-07-01T08:08:53Z

...ros-consensus/changelog.d/20250626_193647_alexander.esgen_ledgerdb_garbage_collect_states.md

@@ -0,0 +1,6 @@
+### Breaking
+
+- Changed pruning of immutable ledger states to happen on LedgerDB/ChainDB


The changelog entry here seems misleading, but please correct me if I'm wrong.

If I understand correctly, we are now pruning the LedgerDB in copyAndSnapshotRunner, i.e. when moving blocks from VolatileDB to ImmutableDB. In the scheduled ChainDB GCs, we will not prune the LedgerDB at all.

amesgen changed the base branch from cardano-node-10.4-backports to main May 20, 2025 15:03

amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch 2 times, most recently from 8b48bb3 to 045f1cc Compare May 20, 2025 15:15

amesgen changed the base branch from main to amesgen/v2-ledgerseq-close May 20, 2025 15:15

amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch 4 times, most recently from 13e5533 to 68402ed Compare May 20, 2025 17:25

jasagredo approved these changes May 21, 2025

amesgen force-pushed the amesgen/v2-ledgerseq-close branch from 981971e to 0c5b137 Compare May 28, 2025 12:00

amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch from 68402ed to 4d6fd67 Compare May 28, 2025 12:00

amesgen mentioned this pull request May 28, 2025

LedgerDB.V2: make sure to actually close handles #1516

Merged

jasagredo mentioned this pull request May 29, 2025

Consensus release for node 10.6 #1541

Closed

jasagredo added this to Consensus Team Backlog Jun 5, 2025

jasagredo moved this to 🏗 In progress in Consensus Team Backlog Jun 5, 2025

jasagredo assigned amesgen Jun 5, 2025

amesgen mentioned this pull request Jun 5, 2025

LedgerDB V2: prevent race conditions between using (duplicating) and closing LedgerTableHandle s #1551

Closed

amesgen force-pushed the amesgen/v2-ledgerseq-close branch from 0c5b137 to 7900088 Compare June 5, 2025 11:28

amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch from 4d6fd67 to 7049fd4 Compare June 5, 2025 14:16

Base automatically changed from amesgen/v2-ledgerseq-close to main June 5, 2025 21:18

amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch 2 times, most recently from 2e01b1c to b9e25f5 Compare June 10, 2025 15:47

amesgen changed the base branch from main to amesgen/ledgerdb-v2-locking June 10, 2025 15:49

amesgen force-pushed the amesgen/ledgerdb-v2-locking branch from 19faf20 to 4010598 Compare June 10, 2025 17:54

amesgen mentioned this pull request Jun 10, 2025

LedgerDB.V2: opportunistically reduce lock contention when closing a Forker #1557

Open

amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch from b9e25f5 to 894940c Compare June 10, 2025 18:09

Base automatically changed from amesgen/ledgerdb-v2-locking to main June 11, 2025 09:07

amesgen added 2 commits June 29, 2025 21:55

LedgerDB.garbageCollect: allow (non-STM) effectful cleanup

7a91a13

It is not necessary to perform the garbage collection of the LedgerDB and the map of invalid blocks in the same STM transaction. In the past, this was important, but it is not anymore, see #1507.

ChainDB: garbage-collect LedgerDB more promptly

e414367

This is an optimization to reduce the maximum memory usage (more relevant with the in-memory backend), see the added commit and the benchmark in the pull request.

amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch from 894940c to a8fa7e2 Compare June 30, 2025 08:11

amesgen marked this pull request as ready for review June 30, 2025 08:22

amesgen requested review from nfrisby, fraser-iohk, dnadales and geo2a as code owners June 30, 2025 08:22

amesgen mentioned this pull request Jun 30, 2025

LedgerDB: implement predictable snapshotting #1575

Open

amesgen moved this from 🏗 In progress to 👀 In review in Consensus Team Backlog Jun 30, 2025

jasagredo reviewed Jun 30, 2025

amesgen added 9 commits June 30, 2025 13:51

LedgerDB: introduce slot-based pruning

82af130

LedgerDB.V1: prune on garbage collection instead of on every change

764baa5

LedgerDB.V1: adapt queries for DbChangelog of length >k

906313b

LedgerDB.V2: prune on garbage collection instead of on every change

91a4171

LedgerDB.V2: adapt queries for DbChangelog of length >k

b907781

LedgerDB.garbageCollect: update documentation

ac281d4

regarding the previous few commits

LedgerDB.V2: take snapshot at immutable tip

825e53d

For consistency with V1. This only makes a difference if there are non-pruned states. Also, a very small benefit is that we get (very slightly) faster replay on node startup.

LedgerDB.V2.TestInternals: prune LedgerSeq

c281a55

This is used in db-analyser only, where everything happens synchronously in a single thread, so it is fine to immediately prune. V1 already does this.

Add changelogs

b503dc3

amesgen force-pushed the amesgen/ledgerdb-garbage-collect-states branch from a8fa7e2 to b503dc3 Compare June 30, 2025 11:52

dnadales approved these changes Jun 30, 2025

geo2a reviewed Jul 1, 2025

geo2a approved these changes Jul 1, 2025
