Gossipsub partial messages — full implementation#473
Open
lucassaldanha wants to merge 31 commits into
Step 1 of the partial-messages extension: plumb SubOpts.requestsPartial / SubOpts.supportsSendingPartial through subscribe announcements in both directions, and track the per-peer-per-topic receive state.
- AbstractRouter parses the flags with the spec-mandated coercion (supportsSendingPartial := requestsPartial || supportsSendingPartial) and zeroes them on subscribe=false.
- New enqueueSubscribe hook unifies outbound subscribe enqueueing so GossipRouter can attach per-topic flags in a single override.
- GossipRouter exposes setTopicPartialFlags(topic, ...) to configure flags advertised for a locally-subscribed topic, and stores inbound flags in a new PartialSubscriptionState (plain HashMap on the pubsub event loop). State is cleaned on peer disconnect, topic unsubscribe, and per-peer unsubscribe.
- Outbound unsubscribe MUST NOT carry partial flags; enforced at the SubscriptionPart wire-build site.
No routing behaviour changes yet. See docs/partial-messages.md §4.5, §5, §6.1 for context.
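The coercion and zeroing rules in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `PartialSubFlags` shape and the `parseInboundFlags` helper are assumptions, and only the two flag names and the rule itself come from the commit text.

```kotlin
// Hypothetical shape: the flag names follow SubOpts in the commit text,
// everything else here is an assumption made for illustration.
data class PartialSubFlags(
    val requestsPartial: Boolean,
    val supportsSendingPartial: Boolean
) {
    companion object {
        val NONE = PartialSubFlags(requestsPartial = false, supportsSendingPartial = false)

        // supportsSendingPartial := requestsPartial || supportsSendingPartial
        fun coerce(requestsPartial: Boolean, supportsSendingPartial: Boolean) =
            PartialSubFlags(requestsPartial, requestsPartial || supportsSendingPartial)
    }
}

// Receive side (hypothetical helper): flags are zeroed when subscribe=false,
// and coerced otherwise.
fun parseInboundFlags(
    subscribe: Boolean,
    requestsPartial: Boolean,
    supportsSendingPartial: Boolean
): PartialSubFlags =
    if (subscribe) PartialSubFlags.coerce(requestsPartial, supportsSendingPartial)
    else PartialSubFlags.NONE
```

With this sketch, a peer that announces `requestsPartial=true, supportsSendingPartial=false` is recorded as supporting both, and any flags on an unsubscribe are discarded.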
…t addSubscription overload
- `PartialSubscriptionWireTest`: route reads of `partialSubscriptionState`
through `submitOnEventThread { ... }.join()`. The state container is
not thread-safe; direct access from the JUnit thread races the event
loop and can surface as `ConcurrentModificationException` or stale
reads. Two helpers (`peerFlagsOnEventLoop`, `snapshotPartialStateOnEventLoop`)
establish the happens-before barrier.
- `RpcPartsQueue`: remove the 2-arg `addSubscription(topic, status)`
default overload. The remaining 4-arg abstract method is the single
source of truth; `addSubscribe` / `addUnsubscribe` remain the
convenience entry points.
- `PartialSubFlags.coerce(requestsPartial, supportsSendingPartial)`: single
source of truth for the spec coercion rule
`supportsSendingPartial := requestsPartial || supportsSendingPartial`. Used
from `GossipRouter.setTopicPartialFlags` for the outbound side. AbstractRouter
keeps the inline expression for the receive side to avoid a reverse layering
dependency (pubsub -> gossip); a comment notes the rule is applied on both sides.
- `PartialSubscriptionState.setPeerFlags`: document that passing
`PartialSubFlags.NONE` (or any equivalent all-false flags) is treated as a
removal. Makes the set-sometimes-deletes invariant explicit for readers.
- `AbstractRouter.handleMessageSubscriptions`: add Kdoc now that the method is
`protected open`. Documents the "call super" contract for overrides
(GossipRouter relies on this to keep peersTopics and partialSubscriptionState
in sync) and the flag-normalisation precondition.
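The event-thread read pattern described in this commit can be sketched as below. The class and its members are hypothetical stand-ins (only the name `submitOnEventThread` appears in the commit text, and its real signature may differ); the point is that the test thread never touches the live `HashMap`, only a copy made on the owning thread.

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.function.Supplier

// Hypothetical stand-in for a router whose state is a plain HashMap that must
// only be touched from its single event-loop thread.
class EventLoopOwnedState {
    private val eventLoop: ExecutorService = Executors.newSingleThreadExecutor { r ->
        Thread(r, "event-loop").apply { isDaemon = true }
    }
    private val requestsPartialByPeer = HashMap<String, Boolean>()

    // Runs the block on the event thread and returns a future for its result.
    fun <T> submitOnEventThread(block: () -> T): CompletableFuture<T> =
        CompletableFuture.supplyAsync(Supplier { block() }, eventLoop)

    fun setFlag(peer: String, requestsPartial: Boolean): CompletableFuture<Unit> =
        submitOnEventThread<Unit> { requestsPartialByPeer[peer] = requestsPartial }

    // Test-side helper: copies the map on the event thread; join() makes the
    // copy safely visible to the calling thread.
    fun snapshotOnEventLoop(): Map<String, Boolean> =
        submitOnEventThread { HashMap(requestsPartialByPeer) }.join()

    fun shutdown() = eventLoop.shutdown()
}
```

A test would call `snapshotOnEventLoop()` and assert on the returned copy rather than reading the map field directly.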
Introduces the public partial-messages API surface and the internal state management layer required before any routing logic lands:
Public API (io.libp2p.pubsub.gossip.partialmessages):
- PartialMessagesHandler<PeerState>: onIncomingRpc + onEmitGossip; PartialMessagesPeerFeedback passed per-call (resolves open question from design doc §9)
- PublishAction<PeerState> / PublishActionsFn<PeerState>
- PartialMessagesPeerFeedback interface + FeedbackKind enum
Internal state management:
- GroupId: content-equality ByteArray wrapper for use as map key
- GroupState<PeerState>: per-(topic,groupId) container with mutable TTL and app-opaque peerStates
- PartialGroupStateStore<PeerState>: TTL countdown, GC on ttl≤0 or empty peerStates, DoS caps (255/topic, 8/topic/peer, matching go-libp2p defaults), onPeerDisconnected, onTopicUnsubscribed
- PartialMessagesAdapter (internal interface) / PartialMessagesAdapterImpl<PeerState>: erases PeerState at the GossipRouter boundary via a single @Suppress("UNCHECKED_CAST") in the builder
Wiring:
- GossipRouterBuilder: partialMessagesHandler field; build-time error if PARTIAL_MESSAGES extension enabled without a handler
- GossipRouter: internal var partialMessages: PartialMessagesAdapter?
No routing changes in this step.
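A content-equality wrapper is needed because a raw `ByteArray` uses identity-based `equals`/`hashCode` and so cannot serve as a map key by content. A minimal sketch of such a wrapper might look like this; the defensive copies and the `toByteArray` accessor are assumptions, not necessarily what the PR's `GroupId` does.

```kotlin
// Sketch of a content-equality ByteArray wrapper usable as a map key.
class GroupId(bytes: ByteArray) {
    // Defensive copy so later mutation of the caller's array cannot change the key.
    private val bytes: ByteArray = bytes.copyOf()

    fun toByteArray(): ByteArray = bytes.copyOf()

    override fun equals(other: Any?): Boolean =
        other is GroupId && bytes.contentEquals(other.bytes)

    override fun hashCode(): Int = bytes.contentHashCode()
}
```

Two `GroupId` instances built from separate arrays with the same bytes then resolve to the same `HashMap` entry.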
Replaces the stub in GossipRouter.processPartialMessageExtension with the full flow: drop RPCs missing topicID or groupID, then delegate to PartialMessagesAdapterImpl which gets-or-creates the GroupState (with DoS cap enforcement) and calls handler.onIncomingRpc with the live peerStates map.
Adds the outbound path for the partial-messages extension:
- GossipRpcPartsQueue: addPartialMessage queues a PartialMessagePart; takeMerged caps at 1 per RPC (proto field is optional, not repeated).
- PartialMessagesAdapter: publishPartial invokes the client's PublishActionsFn, enforces the spec MUST (omit partialMessage when peer supports but did not request), updates nextPeerState atomically, and calls back via enqueueFn.
- GossipRouter: publishPartial looks up PeerHandler by PeerId, routes through GossipRpcPartsQueue (not a direct send), and flushes pending.
- Gossip facade: publishPartial submits to the event thread and returns CompletableFuture<Unit>.
Tests: PartialMessagesOutboundRpcTest (5 wire-level) and 6 new unit tests in PartialMessagesAdapterImplTest.
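The spec MUST enforced by the adapter can be illustrated with a small decision function. Everything here is a hypothetical sketch: the types, the `partsMetadata` field name, and the choice to send nothing to a peer with neither flag set are assumptions, and the real adapter works on protobuf parts rather than these classes.

```kotlin
// Hypothetical per-peer, per-topic flags as learned from subscribe announcements.
data class PeerTopicFlags(val requestsPartial: Boolean, val supportsSendingPartial: Boolean)

// Hypothetical outbound RPC body: payload plus metadata, either may be absent.
class PartialRpcSketch(val partialMessage: ByteArray?, val partsMetadata: ByteArray?)

// Applies the omission rule before the RPC is enqueued.
fun applyOmissionRule(
    flags: PeerTopicFlags,
    payload: ByteArray?,
    metadata: ByteArray?
): PartialRpcSketch? = when {
    // Peer asked for partial messages: payload may be included.
    flags.requestsPartial -> PartialRpcSketch(payload, metadata)
    // Peer supports sending but did not request: partialMessage MUST be omitted.
    flags.supportsSendingPartial -> PartialRpcSketch(null, metadata)
    // Assumption: a peer with neither flag receives no partial RPC at all.
    else -> null
}
```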
Exercises the full stack built in steps 1-4 (SubOpts flag plumbing, ControlExtensions handshake, inbound handler dispatch, group-state tracking, and outbound publishPartial) over real TCP/Noise/Mplex using a trivial ByteArray bitmap as PeerState. Four tests:
- Unidirectional partial RPC (payload + metadata delivered to handler)
- Bidirectional round-trip between two hosts
- nextPeerState persisted and visible to subsequent decide() calls
- Spec MUST: partialMessage omitted when peer supports but did not request
Both hosts bind to port 0 to avoid conflicts with other tests.
Per §5.2 of the partial-messages spec: when we have requested partial messages for a topic T and a peer supports sending partial for T, skip IDONTWANT to that peer. Sending IDONTWANT is redundant — the peer is expected to deliver partial RPCs instead of full messages. Also extends PubsubProtocol.supportsIDontWant() to include Gossip_V_1_3, since v1.3 is a superset of v1.2 and both IDONTWANT and the Extensions control message must coexist for partial-message sessions.
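The §5.2 condition reduces to a two-flag predicate. The sketch below is illustrative only; the function name and parameters are hypothetical, and the real router presumably also checks protocol version and other IDONTWANT preconditions that are not shown.

```kotlin
// Hypothetical predicate: IDONTWANT is skipped only when we requested partial
// messages for the topic AND the peer supports sending partial for it.
fun shouldSendIDontWant(
    weRequestedPartial: Boolean,
    peerSupportsSendingPartial: Boolean
): Boolean = !(weRequestedPartial && peerSupportsSendingPartial)
```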
Three one-line additions to GossipRouter that call the already-implemented PartialMessagesAdapter hooks at the right lifecycle points:
- heartbeat() → partialMessages?.onHeartbeat()
  Decrements TTL on all live groups each heartbeat and garbage-collects groups whose TTL has reached zero or whose peerStates map is empty.
- onPeerDisconnected() → partialMessages?.onPeerDisconnected(peer.peerId)
  Removes the disconnected peer from every group's peerStates; GCs groups that become empty as a result.
- unsubscribe() → partialMessages?.onTopicUnsubscribed(topic)
  Drops all group state for the topic when we leave it.
TTL is already reset on every publishPartial call via getOrCreateLocalGroup.
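The three lifecycle hooks can be pictured with a toy store. This is a simplified sketch under stated assumptions: string keys replace `PeerId` and `GroupId`, the DoS caps are omitted, and the class names are hypothetical; only the hook names and the GC conditions come from the commit text.

```kotlin
// Hypothetical group container: mutable TTL plus opaque per-peer state.
class GroupStateSketch(var ttl: Int) {
    val peerStates = HashMap<String, ByteArray>()
}

// Toy store keyed by (topic, groupId); not the PR's PartialGroupStateStore.
class GroupStoreSketch(private val initialTtl: Int) {
    private val groups = HashMap<Pair<String, String>, GroupStateSketch>()

    // Creates the group if absent and resets its TTL on every access.
    fun getOrCreate(topic: String, groupId: String): GroupStateSketch {
        val group = groups.getOrPut(topic to groupId) { GroupStateSketch(initialTtl) }
        group.ttl = initialTtl
        return group
    }

    // Decrement every TTL, then drop expired or empty groups.
    fun onHeartbeat() {
        groups.values.forEach { it.ttl -= 1 }
        groups.values.removeIf { it.ttl <= 0 || it.peerStates.isEmpty() }
    }

    // Remove the peer everywhere, then drop groups left without any peer state.
    fun onPeerDisconnected(peer: String) {
        groups.values.forEach { it.peerStates.remove(peer) }
        groups.values.removeIf { it.peerStates.isEmpty() }
    }

    // Drop all group state for the topic.
    fun onTopicUnsubscribed(topic: String) {
        groups.keys.removeIf { it.first == topic }
    }

    fun size(): Int = groups.size
}
```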
…rs (step 9)
§5.3 of the partial-messages spec: during the gossipsub heartbeat lazy-push, peers where we support sending partial AND they requested partial are excluded from IHAVE targets. Instead, handler.onEmitGossip is invoked once per locally-initiated (topic, groupId) group with the collected partial peers so the client can drive per-part delivery via publishPartial.
- Add onEmitGossip(topic, partialPeers) to PartialMessagesAdapter and implement it in PartialMessagesAdapterImpl: iterates locally-initiated groups for the topic and calls handler.onEmitGossip for each.
- Modify GossipRouter.emitGossip: when the adapter is set and we advertise supportsSendingPartial for the topic, partition the selected gossip targets into fullPeers (IHAVE) and partialPeers (onEmitGossip). Only fullPeers receive IHAVE.
- Add PartialMessagesEmitGossipTest covering: IHAVE suppression for partial peers, onEmitGossip invocation for locally-initiated groups, no call for peer-initiated groups, and fallback to normal IHAVE when extension is off or flags unset.
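The target split described for `emitGossip` can be sketched with Kotlin's `partition`. The function and parameter names below are hypothetical, peers are plain strings, and the fallback branch reflects the "flags unset" case from the test list; the real method also handles mCache contents and IHAVE construction.

```kotlin
// Returns (fullPeers, partialPeers). Hypothetical helper for illustration.
fun partitionGossipTargets(
    targets: List<String>,
    weSupportSendingPartial: Boolean,
    peerRequestsPartial: (String) -> Boolean
): Pair<List<String>, List<String>> {
    // Without our own supportsSendingPartial flag, every target stays an IHAVE target.
    if (!weSupportSendingPartial) return targets to emptyList()
    val (partialPeers, fullPeers) = targets.partition(peerRequestsPartial)
    return fullPeers to partialPeers
}
```

Only the first list would receive IHAVE; the second would be handed to `onEmitGossip`.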
…es (step 10)
Wire partial-messages support through SimGossipRouter/Builder so the simulator can create networks with both partial-capable and non-partial peers.
New tests:
- PartialMessagesMixedPeerTest: 3-node E2E test (A partial, B partial, C full-only) verifying full-message suppression for partial peers, partial RPC delivery, and that non-partial senders still reach partial-capable peers unconditionally.
- PartialMessagesSimTest: simulator scenario with a 4-peer star topology confirming full-message suppression and non-partial propagation in the gossip mesh.
Refactoring:
- GossipRouterBuilder: move router.partialMessages assignment from createGossipRouter() to build() so subclasses inherit the wiring without duplicating it; expose buildGossipExtensionsConfig() as protected.
- SimGossipRouter: accept gossipExtensionsConfig as constructor parameter (defaults to GossipExtensionsConfig() for backward compatibility).
- SimGossipRouterBuilder: pass buildGossipExtensionsConfig() to SimGossipRouter.
sendControlExtensions was called after super.onPeerActive, which had already flushed the pending queue with subscriptions. Peers such as go-libp2p record extension support only from the very first RPC they receive and treat a subsequent extensions message as a protocol violation, leaving partialMessages=false for the peer. Moving sendControlExtensions before super.onPeerActive ensures extensions are queued alongside subscriptions and flushed together in a single first-contact RPC.
…ssip with empty mCache
- Relax early-exit guard so onEmitGossip fires even when mCache has no full messages (partial-only publishers never populate mCache)
- Build partialCandidates from all topic peers (not just non-mesh ihave candidates) so mesh peers participate in partial message groups
- Guard shuffledMessageIds and IHAVE loop behind ids.isNotEmpty() check
- Only call onEmitGossip when partialCandidates is non-empty
Replace NopPartialMessagesFeedback with RouterBackedPartialMessagesFeedback so that reportFeedback(INVALID) calls notifyRouterMisbehavior on the GossipRouterEventBroadcaster, enabling peer scoring to penalize misbehaving partial-message senders.
…ion and are subscribed
…tate
Change PartialMessagesHandler.onIncomingRpc return type from Unit to PeerState? so handlers can update the library-managed per-peer state on inbound RPCs. PartialMessagesAdapterImpl now stores the returned value into groupState.peerStates[from] when non-null. All existing handler implementations updated to return null (no state change). New test asserts that state written by the first callback is visible to the second.
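The "store when non-null" behaviour can be sketched as below. The handler interface here is a deliberately reduced, hypothetical stand-in (the real `onIncomingRpc` takes more arguments, including feedback and the live `peerStates` map), and peers are plain strings.

```kotlin
// Hypothetical, simplified handler: returns the next state for the sender,
// or null to leave the stored state unchanged.
fun interface IncomingHandlerSketch<S : Any> {
    fun onIncomingRpc(from: String, current: S?, payload: ByteArray): S?
}

// Adapter-side dispatch: persist the returned state only when it is non-null.
fun <S : Any> dispatchIncoming(
    peerStates: MutableMap<String, S>,
    handler: IncomingHandlerSketch<S>,
    from: String,
    payload: ByteArray
) {
    val next = handler.onIncomingRpc(from, peerStates[from], payload)
    if (next != null) {
        peerStates[from] = next
    }
}
```

A handler that counts received bytes, for example, sees on the second call the total it returned from the first.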
…s extension is disabled
…table types unsafe
…tial type mismatches
…lifecycle tests
Tests were broken by the subscription guard added in fix 02aa9f7 which drops partial RPCs for unsubscribed topics. Both tests must subscribe before sending the inbound partial RPC to exercise the handler path.
…ccurate semantics
Summary
Full implementation of the Gossipsub partial-messages extension
(libp2p/specs#685, spec pin 6b6203ee). Design rationale and responsibility
boundary are in docs/partial-messages.md.
This is the umbrella PR. Each step was developed and reviewed independently;
the step PRs are listed below for per-step review history.
Testing
This has all been tested in two ways:
Next steps:
After this PR is merged, we will be able to test Teku on a live test network for partial messages.
What's in this PR
Wire protocol (Steps 1, 3, 4)
- `SubOpts` partial-messages flags (`requestsPartial` / `supportsSendingPartial`) sent and parsed on every subscribe announcement. Coercion rule (`supportsSendingPartial := requestsPartial || supportsSendingPartial`) applied on receive; flags ignored on `subscribe=false`.
- `RPC.partial` dispatch: capability validation, group-state creation/update, `onIncomingRpc` callback, DoS caps.
- `publishPartial(topic, groupId, PublishActionsFn)` on the `Gossip` facade, routed through `GossipRpcPartsQueue`. Enforces the spec MUST: omit `partialMessage` when peer supports but did not request.
Client API (Step 2)
- `PartialMessagesHandler<PeerState>`: `onIncomingRpc` and `onEmitGossip` callbacks.
- `PublishAction<PeerState>` / `PublishActionsFn<PeerState>`: atomic per-peer state updates on publish.
- `PartialMessagesPeerFeedback`: side-channel peer-scoring hook.
- `PartialGroupStateStore`: per-`(topic, groupId)` state container with TTL countdown and DoS caps (255 / 8 per go-libp2p defaults).
Routing rules (Steps 6, 7, 9)
- Full-message forwarding (`broadcastInbound` + `broadcastOutbound`): skip the full message to peers that support partial at the node level and requested partial for the topic.
- IDONTWANT: when we have requested partial for the topic, skip peers that support sending partial for the topic.
- Heartbeat gossip: partial targets are partitioned out; `handler.onEmitGossip` is called once per locally-initiated group instead of enqueuing IHAVE.
Lifecycle (Step 8)
Tests (Steps 5, 10)
- `PartialMessagesEndToEndTest`: two-host bitmap-based round-trip over TCP/Noise/Mplex.
- `PartialMessagesMixedPeerTest`: three-host test (A partial, B partial, C full-only) verifying suppression, partial RPC delivery, and non-partial sender behaviour.
- `PartialMessagesSimTest`: four-peer star simulator scenario confirming suppression and non-partial propagation in a gossip mesh.
Step PRs (review history)
Tracking issue: #435