
leverage lazy bytestring for 'Serialise' CBOR-in-CBOR. #5138


Open

KtorZ wants to merge 1 commit into main from lazy-serialised
Conversation

@KtorZ (Contributor) commented Jun 10, 2025

Description

The main use case for this type lies within the StateQuery protocol, where it can be particularly useful to obtain a plain CBOR response. This matters especially for large query results such as DebugNewEpochState (multiple GB on mainnet). The network library goes to great lengths to serialise and deserialise bytes lazily throughout; those efforts are unfortunately defeated by this implementation, which evaluates the entire ByteString when decoding and encoding. So on a machine running both a server and a client that use this library, we would pay twice the cost of fully evaluating the entire response in memory, instead of leveraging lazy I/O as, I believe, was originally intended.

Note that 'decode' and 'encode' here rely on the default implementation for lazy ByteString in Codec.Serialise, which does the right thing: it encodes lazy bytestrings as indefinite-length sequences of byte chunks, effectively preserving laziness in both directions.
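
For illustration, here is a minimal, hedged sketch of how such an indefinite-length byte string can be assembled from a lazy ByteString using cborg primitives (the helper name encodeLazyChunked is ours, not part of any library):

  import qualified Codec.CBOR.Encoding as Enc
  import qualified Data.ByteString.Lazy as Lazy

  -- Emit one definite-length chunk per strict chunk of the lazy input,
  -- wrapped between an indefinite-length header and a break token.
  encodeLazyChunked :: Lazy.ByteString -> Enc.Encoding
  encodeLazyChunked bs =
       Enc.encodeBytesIndef                        -- start indefinite-length byte string (major type 2)
    <> foldMap Enc.encodeBytes (Lazy.toChunks bs)  -- one definite chunk per strict chunk
    <> Enc.encodeBreak                             -- terminate the sequence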

Checklist

Quality

  • Commit sequence makes sense and commits have useful messages, see ref.
  • New tests are added and existing tests are updated.
  • Self-reviewed the PR.

Maintenance

  • Linked an issue or added the PR to the current sprint of ouroboros-network project.
  • Added labels.
  • Updated changelog files.
  • The documentation has been properly updated, see ref.

@KtorZ requested a review from a team as a code owner on June 10, 2025 at 19:14
github-project-automation (bot) moved this to In Progress in Ouroboros Network on Jun 10, 2025
@KtorZ force-pushed the lazy-serialised branch from 1c8fa69 to f928427 on June 10, 2025 at 19:15
   instance Serialise (Serialised a) where
     encode (Serialised bs) = mconcat [
           Enc.encodeTag 24
  -      , Enc.encodeBytes (Lazy.toStrict bs)
  +      , encode bs
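
For context, a sketch of what the full patched instance could look like; the decode side is an assumption based on the description's note that both directions defer to the default lazy ByteString instance:

  import Codec.Serialise (Serialise (..))
  import qualified Codec.CBOR.Decoding as Dec
  import qualified Codec.CBOR.Encoding as Enc
  import qualified Data.ByteString.Lazy as Lazy

  -- Simplified stand-in for the real newtype in ouroboros-network.
  newtype Serialised a = Serialised Lazy.ByteString

  instance Serialise (Serialised a) where
    encode (Serialised bs) = mconcat
      [ Enc.encodeTag 24    -- tag 24: embedded CBOR data item
      , encode bs           -- lazy ByteString instance; no Lazy.toStrict
      ]
    decode = do
      tag <- Dec.decodeTag
      if tag == 24
        then Serialised <$> decode  -- lazy ByteString instance
        else fail "Serialised: expected tag 24 (CBOR-in-CBOR)"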
Contributor commented:

Nice, so we avoid lazy to strict conversion, which indeed is costly.

The only problem with it is that we're modifying the serialisation, which won't work across different versions (it breaks backwards compatibility).

I think we should:

  • add a SerialisedV2 newtype wrapper,
  • bump the version of the NodeToClient mini-protocol,
  • use it in the LocalStateQuery mini-protocol if the negotiated version allows for it (sketched below).
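
A hypothetical sketch of that version gate, reusing the imports and newtype from the instance sketch above (the constructor NodeToClientV_New and the function name are invented for illustration; the real NodeToClientVersion enumeration would gain a new constructor):

  -- Hypothetical: pick the encoding based on the negotiated version.
  encodeSerialised :: NodeToClientVersion -> Serialised a -> Enc.Encoding
  encodeSerialised version (Serialised bs)
    | version >= NodeToClientV_New =
        Enc.encodeTag 24 <> encode bs                           -- lazy, indefinite chunks
    | otherwise =
        Enc.encodeTag 24 <> Enc.encodeBytes (Lazy.toStrict bs)  -- legacy definite-length bytes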

An alternative is to modify the instance without changing the encoding (so it remains backwards compatible).

@nfrisby do you agree? I think this type is mostly used in ouroboros-consensus.

@amesgen (Member) commented Jun 11, 2025:

FTR, the ticket for making decoding in this instance lazy is #5114, which we recently talked about in the context of incremental ledger serialization (EDIT: encoding indeed is difficult to do non-incrementally, as that is only possible if one knows the size upfront [1]).

Also note that this instance (via (un)wrapCBORinCBOR) is used for sending e.g. txs/headers/blocks via the N2C/N2N protocols, so all of these would also have to be patched for backwards compatibility (across all implementations that do anything with txs/headers/blocks). Therefore,

> An alternative is to modify the instance without changing the encoding (so it remains backwards compatible).

sounds simpler to me.

Footnotes

  1. It doesn't really matter for txs/headers/blocks as we do know the size there, but it is annoying for the LSQ stuff which motivates this PR.

@KtorZ (Author) commented:

> The only problem with it is that we're modifying serialisation

Indeed, but that's also a desired feature IMO, as it should prevent evaluating the entire bytestring in one chunk on the server, which can instead stream the response to clients. (I am really seeing this in the context of the state-query protocol.)

I like the idea of making this version-specific, although the version here should be driven by the NodeToClient version in the case of the state query protocols. I believe the Serialised type is also used elsewhere, where version constraints may be different; but all of that can very likely be resolved through a type class or a type family (see the hypothetical sketch below).
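
Purely as a hypothetical illustration of the type-class idea (every name below is invented):

  -- Hypothetical class mapping each protocol's own version type to the
  -- single question that matters for this instance.
  class HasLazySerialised version where
    lazySerialisedAllowed :: version -> Bool

  -- e.g. an instance for the state query protocols could be driven by
  -- the NodeToClient version, along the lines of:
  --   instance HasLazySerialised NodeToClientVersion where
  --     lazySerialisedAllowed v = v >= NodeToClientV_New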

@KtorZ (Author) commented Jun 11, 2025:

> An alternative is to modify the instance without changing the encoding (so it remains backwards compatible).

While this might be possible for decode, I don't think it is possible for encode, as we can't know the length of the bytestring when we begin serialising. So we would need to at least partially evaluate it to know how many chunks there are if we want to use definite-length CBOR structures.

For decoding, we might still be able to decode by chunks once we have parsed the CBOR header type and know the expected size of the ByteString.

Having said that, doing this even for just decode at least solves the problem on the receiving end. So that's half of the problem already solved :)
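
A sketch of that chunk-by-chunk decoding using cborg's token-level primitives (the helper name decodeChunkedBytes is ours); it accepts both the legacy definite-length encoding and the indefinite-length one:

  import qualified Codec.CBOR.Decoding as Dec
  import qualified Data.ByteString.Lazy as Lazy

  decodeChunkedBytes :: Dec.Decoder s Lazy.ByteString
  decodeChunkedBytes = do
    ty <- Dec.peekTokenType
    case ty of
      Dec.TypeBytes      -> Lazy.fromStrict <$> Dec.decodeBytes
      Dec.TypeBytesIndef -> do
        Dec.decodeBytesIndef            -- consume the indefinite-length header
        Lazy.fromChunks <$> chunks
      _                  -> fail "expected a byte string (major type 2)"
    where
      chunks = do
        done <- Dec.decodeBreakOr       -- True once the break byte is reached
        if done
          then pure []
          else (:) <$> Dec.decodeBytes <*> chunks

Note that cborg's Decoder is incremental on the input side, so chunks can be fed as they arrive over the wire, even though the final lazy ByteString is only returned once the term is complete.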

Contributor commented:

From RFC 8949 (section 3.4.5.1):

> Sometimes it is beneficial to carry an embedded CBOR data item that is not meant to be decoded immediately at the time the enclosing data item is being decoded. Tag number 24 (CBOR data item) can be used to tag the embedded byte string as a single data item encoded in CBOR format. Contained items that aren't byte strings are invalid.

It seems that what you're proposing in this patch is not valid CBOR; see encodeChunked.

@KtorZ (Author) commented Jun 12, 2025:

I don't think you're reading this right: encodeChunked produces a single data item, a byte string (major type 2).

Whether it is indefinite- or definite-length doesn't change the fact that it's a single CBOR data item.
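
Concretely, in hex, both of these are a single data item under tag 24:

  d8 18 43 01 02 03           # tag(24), definite-length byte string of 3 bytes
  d8 18 5f 42 01 02 41 03 ff  # tag(24), indefinite-length byte string: two chunks, then the "break" code

(0x5f opens an indefinite-length byte string and 0xff closes it; everything in between must be definite-length chunks of major type 2.)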

Contributor commented:

Yes, you're right.

I think we all agree on adding SerialisedV2 to ouroboros-network and a new NodeToClientVersion.

KtorZ added a commit to CardanoSolutions/ogmios that referenced this pull request Jun 12, 2025
  This query used to exist, but was removed and replaced with rewardsProvenance', which provides far less insightful results. I figured we could re-implement the old query by actually fetching the NewEpochState and re-running the same calculations.

  The downside is obviously the time and resources it takes to fetch that state. So I ended up writing a partial CBOR decoder for it that skips the entire UTxO set (the largest part of the object), since it isn't actually required for the calculations. However, this turned out not so useful, because the results are unfortunately encoded and decoded as a strict bytestring at the moment instead of leveraging lazy I/O. So we get no benefit from having a lazy decoder.

  I opened a PR to address the issue in ouroboros-network:

  IntersectMBO/ouroboros-network#5138

  So, hopefully, this wasn't all for nothing.
Labels: none yet
Projects: Ouroboros Network (Status: In Progress)
3 participants