7 changes: 2 additions & 5 deletions in ouroboros-network-api/src/Ouroboros/Network/Block.hs

```diff
@@ -473,16 +473,13 @@ fromSerialised dec (Serialised payload) =
 --
 -- TODO: replace with encodeEmbeddedCBOR from cborg-0.2.4 once
 -- it is available, since that will be faster.
---
--- TODO: Avoid converting to a strict ByteString, as that requires copying O(n)
--- in case the lazy ByteString consists of more than one chunks.
 instance Serialise (Serialised a) where
-  encode (Serialised bs) = mconcat [
-        Enc.encodeTag 24
-      , Enc.encodeBytes (Lazy.toStrict bs)
+  encode (Serialised bs) = encode bs
```
**Contributor:**
Nice, so we avoid lazy to strict conversion, which indeed is costly.

The only problem with it is that we're modifying serialisation, which won't work across different versions (backwards compatibility).

I think we should:

  • add a SerialisedV2 newtype wrapper,
  • bump the version of the NodeToClient protocol,
  • use it in the LocalStateQuery mini-protocol if the negotiated version allows for it.

An alternative is to modify the instance without changing the encoding (so it remains backwards compatible).

@nfrisby do you agree? I think this type is mostly used in ouroboros-consensus.
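
A minimal sketch of the versioning idea above, in plain Haskell. The SerialisedV2 name is quoted from the comment; the concrete version numbers, the WireFormat type, and the wireFormatFor helper are hypothetical illustrations, not the actual ouroboros-network API:

```haskell
import qualified Data.ByteString.Lazy as Lazy

-- Hypothetical newtype: same payload as Serialised, but intended to be
-- encoded without the CBOR-in-CBOR tag-24 wrapper (and thus without
-- the Lazy.toStrict copy).
newtype SerialisedV2 a = SerialisedV2 { unSerialisedV2 :: Lazy.ByteString }
  deriving (Eq, Show)

-- Hypothetical NodeToClient versions; the real type has many more constructors.
data NodeToClientVersion = NodeToClientV_19 | NodeToClientV_20
  deriving (Eq, Ord, Show)

data WireFormat = TaggedCBORinCBOR | RawBytes
  deriving (Eq, Show)

-- Pick the encoding from the negotiated version, as the comment suggests:
-- older clients keep the tag-24 format, newer ones get the direct one.
wireFormatFor :: NodeToClientVersion -> WireFormat
wireFormatFor v
  | v >= NodeToClientV_20 = RawBytes
  | otherwise             = TaggedCBORinCBOR
```

The point of the indirection is that both peers derive the same WireFormat from the handshake, so neither side needs to guess which encoding is on the wire.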

**@amesgen** (Member), Jun 11, 2025:
FTR, the ticket for making decoding in this instance lazy is #5114, which we recently talked about in the context of incremental ledger serialization (EDIT: encoding is indeed difficult to do incrementally, as that is only possible if one knows the size upfront [1]).

Also note that this instance (via (un)wrapCBORinCBOR) is used for sending e.g. txs/headers/blocks via the N2C/N2N protocols, so all of these would also have to be patched for backwards compatibility (across all implementations that do anything with txs/headers/blocks). Therefore,

> An alternative is to modify the instance without changing the encoding (so it remains backwards compatible).

sounds simpler to me.

Footnotes

[1] It doesn't really matter for txs/headers/blocks, as we do know the size there, but it is annoying for the LSQ stuff which motivates this PR.

**Contributor (Author):**

> The only problem with it is that we're modifying serialisation

Indeed, but that's also a desired feature IMO, as it should prevent evaluating the entire bytestring in one chunk on the server, which can instead stream the response to clients. (I am really seeing this in the context of the state-query protocol.)

I like the idea of making this version-specific, although the version here should be driven by the NodeToClient version in the case of the state-query protocol. I believe the Serialised type is also used elsewhere, where version constraints may be different, but all that can very likely be resolved through a type class or a type family.

**@KtorZ** (Author), Jun 11, 2025:
> An alternative is to modify the instance without changing the encoding (so it remains backwards compatible).

While this might be possible for decode, I don't think it is possible for encode, as we can't know the length of the bytestring when we begin serialising. So we need to at least partially evaluate it to know how many chunks there are if we want to use definite CBOR structures.

For decoding, we might still be able to decode by chunks once we have parsed the CBOR header and know the expected size of the ByteString.

Having said that, doing it for just decode at least solves the problem on the receiving end. So that's half of the problem already solved :)
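
The decode-side compatibility point can be illustrated at the byte level. The sketch below is plain Haskell over [Word8], not the real cborg decoder; decodeShortBytes and decodeCompat are hypothetical names, and only definite-length byte strings shorter than 24 bytes are handled. It shows why a decoder that tolerates an optional tag-24 prefix accepts both the old and the new wire format:

```haskell
import Data.Word (Word8)

-- Old wire format: tag 24 (bytes 0xD8 0x18) followed by a byte string.
-- New wire format: the byte string alone.
-- A decoder that skips an optional tag-24 prefix accepts both.

-- Parse a definite-length CBOR byte string (major type 2, header 0x40+len)
-- whose length fits in the additional-info field (< 24); enough for a sketch.
decodeShortBytes :: [Word8] -> Maybe [Word8]
decodeShortBytes (hdr : rest)
  | hdr >= 0x40 && hdr < 0x40 + 24
  , n <- fromIntegral (hdr - 0x40)
  , length rest >= n
  = Just (take n rest)
decodeShortBytes _ = Nothing

-- Accept either encoding: strip the tag-24 prefix when present.
decodeCompat :: [Word8] -> Maybe [Word8]
decodeCompat (0xD8 : 0x18 : rest) = decodeShortBytes rest  -- old: CBOR-in-CBOR
decodeCompat ws                   = decodeShortBytes ws    -- new: plain bytes
```

In the real instance the same shape would use cborg's token peeking rather than raw bytes, but the compatibility argument is the same.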

**Contributor:**

From RFC 8949 (Section 3.4.5.1):

> Sometimes it is beneficial to carry an embedded CBOR data item that is not meant to be decoded immediately at the time the enclosing data item is being decoded. Tag number 24 (CBOR data item) can be used to tag the embedded byte string as a single data item encoded in CBOR format. Contained items that aren't byte strings are invalid.

It seems that what you're proposing in this patch is not valid CBOR; see encodeChunked.

**@KtorZ** (Author), Jun 12, 2025:
I don't think you're reading this right: encodeChunked produces a single data item, namely a byte string (major type 2).

Whether it is indefinite- or definite-length doesn't change the fact that it's a single CBOR data item.
**Contributor:**

Yes, you're right.

I think we all agree on adding SerialisedV2 to ouroboros-network and a new NodeToClientVersion.

```diff
-      ]
 
-  decode = do
-    tag <- Dec.decodeTag
-    when (tag /= 24) $ fail "expected tag 24 (CBOR-in-CBOR)"
-    Serialised . Lazy.fromStrict <$> Dec.decodeBytes
+  decode = Serialised <$> decode
```