Skip to content

Gossipsub partial messages — full implementation#473

Open
lucassaldanha wants to merge 31 commits into
libp2p:developfrom
lucassaldanha:feature/partial-messages-support
Open

Gossipsub partial messages — full implementation#473
lucassaldanha wants to merge 31 commits into
libp2p:developfrom
lucassaldanha:feature/partial-messages-support

Conversation

@lucassaldanha

@lucassaldanha lucassaldanha commented Apr 29, 2026

Copy link
Copy Markdown
Collaborator

Summary

Full implementation of the Gossipsub partial-messages extension
(libp2p/specs#685, spec pin
6b6203ee). Design rationale and responsibility boundary are in
docs/partial-messages.md.

This is the umbrella PR. Each step was developed and reviewed independently;
the step PRs are listed below for per-step review history.

Testing

This has all been tested in two ways:

  1. Using https://github.com/libp2p/test-plans/tree/master/gossipsub-interop; particularly the example here for partial messages.
  2. With Teku (after changes to use partial messages logic), I was able to successfully run a kurtosis network between Teku, Prysm and Lighthouse. We ere able to both send and receive partial messages, including eager-push of partial messages.

Next steps:
After this PR is merged, we will be able to test Teku on a live test network for partial messages.


What's in this PR

Wire protocol (Steps 1, 3, 4)

  • Per-topic SubOpts partial-messages flags (requestsPartial /
    supportsSendingPartial) sent and parsed on every subscribe announcement.
    Coercion rule (supportsSendingPartial := requestsPartial || supportsSendingPartial)
    applied on receive; flags ignored on subscribe=false.
  • Full inbound RPC.partial dispatch: capability validation, group-state
    creation/update, onIncomingRpc callback, DoS caps.
  • Outbound publishPartial(topic, groupId, PublishActionsFn) on the Gossip
    facade, routed through GossipRpcPartsQueue. Enforces the spec MUST: omit
    partialMessage when peer supports but did not request.

Client API (Step 2)

  • PartialMessagesHandler<PeerState>onIncomingRpc and onEmitGossip
    callbacks.
  • PublishAction<PeerState> / PublishActionsFn<PeerState> — atomic
    per-peer state updates on publish.
  • PartialMessagesPeerFeedback — side-channel peer-scoring hook.
  • PartialGroupStateStore — per-(topic, groupId) state container with TTL
    countdown and DoS caps (255 / 8 per go-libp2p defaults).

Routing rules (Steps 6, 7, 9)

  • Full-message suppression (broadcastInbound + broadcastOutbound): skip
    full message to peers that support partial at the node level and requested
    partial for the topic.
  • IDONTWANT suppression: skip IDONTWANT to peers we request partial from and
    that support sending partial for the topic.
  • IHAVE replacement: during heartbeat lazy-push, partial-capable gossip
    targets are partitioned out; handler.onEmitGossip is called once per
    locally-initiated group instead of enqueuing IHAVE.

Lifecycle (Step 8)

  • Heartbeat: decrement TTLs, GC expired groups.
  • Peer disconnect: purge that peer's state from all groups.
  • Topic unsubscribe: drop all group state for that topic.

Tests (Steps 5, 10)

  • PartialMessagesEndToEndTest — two-host bitmap-based round-trip over
    TCP/Noise/Mplex.
  • PartialMessagesMixedPeerTest — three-host test (A partial, B partial, C
    full-only) verifying suppression, partial RPC delivery, and non-partial
    sender behaviour.
  • PartialMessagesSimTest — four-peer star simulator scenario confirming
    suppression and non-partial propagation in a gossip mesh.
  • Unit tests for each routing rule, lifecycle hook, and wire encoding.

Step PRs (review history)

Step PR Description
1 #460 Per-topic SubOpts flag plumbing
2 #461 PartialMessagesHandler API and GroupState container
3 #463 Inbound RPC.partial dispatch
4 #465 Outbound publishPartial
5 #466 End-to-end integration test
6 #467 Full-message suppression
7 #468 IDONTWANT suppression
8 #469 Heartbeat TTL GC + cleanup hooks
9 #470 IHAVE replacement with onEmitGossip
10 #471 Mixed-peer interop test + simulator scenario

Tracking issue: #435

Step 1 of the partial-messages extension: plumb SubOpts.requestsPartial /
SubOpts.supportsSendingPartial through subscribe announcements in both
directions, and track the per-peer-per-topic receive state.

- AbstractRouter parses the flags with the spec-mandated coercion
  (supportsSendingPartial := requestsPartial || supportsSendingPartial)
  and zeroes them on subscribe=false.
- New enqueueSubscribe hook unifies outbound subscribe enqueueing so
  GossipRouter can attach per-topic flags in a single override.
- GossipRouter exposes setTopicPartialFlags(topic, ...) to configure
  flags advertised for a locally-subscribed topic, and stores inbound
  flags in a new PartialSubscriptionState (plain HashMap on the pubsub
  event loop). State is cleaned on peer disconnect, topic unsubscribe,
  and per-peer unsubscribe.
- Outbound unsubscribe MUST NOT carry partial flags; enforced at the
  SubscriptionPart wire-build site.

No routing behaviour changes yet. See docs/partial-messages.md §4.5,
§5, §6.1 for context.
…t addSubscription overload

- `PartialSubscriptionWireTest`: route reads of `partialSubscriptionState`
  through `submitOnEventThread { ... }.join()`. The state container is
  not thread-safe; direct access from the JUnit thread races the event
  loop and can surface as `ConcurrentModificationException` or stale
  reads. Two helpers (`peerFlagsOnEventLoop`, `snapshotPartialStateOnEventLoop`)
  establish the happens-before barrier.

- `RpcPartsQueue`: remove the 2-arg `addSubscription(topic, status)`
  default overload. The remaining 4-arg abstract method is the single
  source of truth; `addSubscribe` / `addUnsubscribe` remain the
  convenience entry points.
- `PartialSubFlags.coerce(requestsPartial, supportsSendingPartial)`:
  single source of truth for the spec coercion rule
  `supportsSendingPartial := requestsPartial || supportsSendingPartial`.
  Used from `GossipRouter.setTopicPartialFlags` for the outbound side.
  AbstractRouter keeps the inline expression for the receive side to
  avoid a reverse layering dependency (pubsub -> gossip); a comment
  notes the rule is applied on both sides.

- `PartialSubscriptionState.setPeerFlags`: document that passing
  `PartialSubFlags.NONE` (or any equivalent all-false flags) is
  treated as a removal. Makes the set-sometimes-deletes invariant
  explicit for readers.

- `AbstractRouter.handleMessageSubscriptions`: add Kdoc now that the
  method is `protected open`. Documents the "call super" contract
  for overrides (GossipRouter relies on this to keep peersTopics and
  partialSubscriptionState in sync) and the flag-normalisation
  precondition.
Introduces the public partial-messages API surface and the internal
state management layer required before any routing logic lands:

Public API (io.libp2p.pubsub.gossip.partialmessages):
- PartialMessagesHandler<PeerState> — onIncomingRpc + onEmitGossip;
  PartialMessagesPeerFeedback passed per-call (resolves open question
  from design doc §9)
- PublishAction<PeerState> / PublishActionsFn<PeerState>
- PartialMessagesPeerFeedback interface + FeedbackKind enum

Internal state management:
- GroupId — content-equality ByteArray wrapper for use as map key
- GroupState<PeerState> — per-(topic,groupId) container with mutable
  TTL and app-opaque peerStates
- PartialGroupStateStore<PeerState> — TTL countdown, GC on ttl≤0 or
  empty peerStates, DoS caps (255/topic, 8/topic/peer, matching
  go-libp2p defaults), onPeerDisconnected, onTopicUnsubscribed
- PartialMessagesAdapter (internal interface) /
  PartialMessagesAdapterImpl<PeerState> — erases PeerState at the
  GossipRouter boundary via a single @Suppress("UNCHECKED_CAST")
  in the builder

Wiring:
- GossipRouterBuilder: partialMessagesHandler field; build-time error
  if PARTIAL_MESSAGES extension enabled without a handler
- GossipRouter: internal var partialMessages: PartialMessagesAdapter?

No routing changes in this step.
Replaces the stub in GossipRouter.processPartialMessageExtension with
the full flow: drop RPCs missing topicID or groupID, then delegate to
PartialMessagesAdapterImpl which gets-or-creates the GroupState (with
DoS cap enforcement) and calls handler.onIncomingRpc with the live
peerStates map.
Adds the outbound path for the partial-messages extension:

- GossipRpcPartsQueue: addPartialMessage queues a PartialMessagePart;
  takeMerged caps at 1 per RPC (proto field is optional, not repeated).
- PartialMessagesAdapter: publishPartial invokes the client's
  PublishActionsFn, enforces the spec MUST (omit partialMessage when peer
  supports but did not request), updates nextPeerState atomically, and
  calls back via enqueueFn.
- GossipRouter: publishPartial looks up PeerHandler by PeerId, routes
  through GossipRpcPartsQueue (not a direct send), and flushes pending.
- Gossip facade: publishPartial submits to the event thread and returns
  CompletableFuture<Unit>.

Tests: PartialMessagesOutboundRpcTest (5 wire-level) and 6 new unit
tests in PartialMessagesAdapterImplTest.
Exercises the full stack built in steps 1-4 — SubOpts flag plumbing,
ControlExtensions handshake, inbound handler dispatch, group-state
tracking, and outbound publishPartial — over real TCP/Noise/Mplex
using a trivial ByteArray bitmap as PeerState.

Four tests:
- Unidirectional partial RPC (payload + metadata delivered to handler)
- Bidirectional round-trip between two hosts
- nextPeerState persisted and visible to subsequent decide() calls
- Spec MUST: partialMessage omitted when peer supports but did not request

Both hosts bind to port 0 to avoid conflicts with other tests.
Per §5.2 of the partial-messages spec: when we have requested partial
messages for a topic T and a peer supports sending partial for T, skip
IDONTWANT to that peer. Sending IDONTWANT is redundant — the peer is
expected to deliver partial RPCs instead of full messages.

Also extends PubsubProtocol.supportsIDontWant() to include Gossip_V_1_3,
since v1.3 is a superset of v1.2 and both IDONTWANT and the Extensions
control message must coexist for partial-message sessions.
Three one-line additions to GossipRouter that call the already-implemented
PartialMessagesAdapter hooks at the right lifecycle points:

- heartbeat()          → partialMessages?.onHeartbeat()
  Decrements TTL on all live groups each heartbeat and garbage-collects
  groups whose TTL has reached zero or whose peerStates map is empty.

- onPeerDisconnected() → partialMessages?.onPeerDisconnected(peer.peerId)
  Removes the disconnected peer from every group's peerStates; GCs groups
  that become empty as a result.

- unsubscribe()        → partialMessages?.onTopicUnsubscribed(topic)
  Drops all group state for the topic when we leave it.

TTL is already reset on every publishPartial call via getOrCreateLocalGroup.
…rs (step 9)

§5.3 of the partial-messages spec: during the gossipsub heartbeat lazy-push,
peers where we support sending partial AND they requested partial are excluded
from IHAVE targets. Instead, handler.onEmitGossip is invoked once per
locally-initiated (topic, groupId) group with the collected partial peers so
the client can drive per-part delivery via publishPartial.

- Add onEmitGossip(topic, partialPeers) to PartialMessagesAdapter and implement
  it in PartialMessagesAdapterImpl: iterates locally-initiated groups for the
  topic and calls handler.onEmitGossip for each.
- Modify GossipRouter.emitGossip: when the adapter is set and we advertise
  supportsSendingPartial for the topic, partition the selected gossip targets into
  fullPeers (IHAVE) and partialPeers (onEmitGossip). Only fullPeers receive IHAVE.
- Add PartialMessagesEmitGossipTest covering: IHAVE suppression for partial peers,
  onEmitGossip invocation for locally-initiated groups, no call for peer-initiated
  groups, and fallback to normal IHAVE when extension is off or flags unset.
…es (step 10)

Wire partial-messages support through SimGossipRouter/Builder so the simulator
can create networks with both partial-capable and non-partial peers.

New tests:
- PartialMessagesMixedPeerTest: 3-node E2E test (A partial, B partial, C full-only)
  verifying full-message suppression for partial peers, partial RPC delivery, and
  that non-partial senders still reach partial-capable peers unconditionally.
- PartialMessagesSimTest: simulator scenario with a 4-peer star topology confirming
  full-message suppression and non-partial propagation in the gossip mesh.

Refactoring:
- GossipRouterBuilder: move router.partialMessages assignment from
  createGossipRouter() to build() so subclasses inherit the wiring without
  duplicating it; expose buildGossipExtensionsConfig() as protected.
- SimGossipRouter: accept gossipExtensionsConfig as constructor parameter
  (defaults to GossipExtensionsConfig() for backward compatibility).
- SimGossipRouterBuilder: pass buildGossipExtensionsConfig() to SimGossipRouter.
@lucassaldanha lucassaldanha force-pushed the feature/partial-messages-support branch from 44b09c4 to b391e04 Compare April 29, 2026 08:54
lucassaldanha and others added 16 commits May 1, 2026 14:30
sendControlExtensions was called after super.onPeerActive, which had
already flushed the pending queue with subscriptions. Peers such as
go-libp2p record extension support only from the very first RPC they
receive and treat a subsequent extensions message as a protocol
violation, leaving partialMessages=false for the peer.

Moving sendControlExtensions before super.onPeerActive ensures extensions
are queued alongside subscriptions and flushed together in a single
first-contact RPC.
…ssip with empty mCache

- Relax early-exit guard so onEmitGossip fires even when mCache has no
  full messages (partial-only publishers never populate mCache)
- Build partialCandidates from all topic peers (not just non-mesh ihave
  candidates) so mesh peers participate in partial message groups
- Guard shuffledMessageIds and IHAVE loop behind ids.isNotEmpty() check
- Only call onEmitGossip when partialCandidates is non-empty
Replace NopPartialMessagesFeedback with RouterBackedPartialMessagesFeedback
so that reportFeedback(INVALID) calls notifyRouterMisbehavior on the
GossipRouterEventBroadcaster, enabling peer scoring to penalize misbehaving
partial-message senders.
…tate

Change PartialMessagesHandler.onIncomingRpc return type from Unit to
PeerState? so handlers can update the library-managed per-peer state on
inbound RPCs. PartialMessagesAdapterImpl now stores the returned value
into groupState.peerStates[from] when non-null. All existing handler
implementations updated to return null (no state change). New test
asserts that state written by the first callback is visible to the second.
…lifecycle tests

Tests were broken by the subscription guard added in fix 02aa9f7 which drops
partial RPCs for unsubscribed topics. Both tests must subscribe before sending
the inbound partial RPC to exercise the handler path.
@lucassaldanha lucassaldanha changed the title Gossipsub partial messages — full implementation (Steps 1–10) Gossipsub partial messages — full implementation May 13, 2026
@lucassaldanha lucassaldanha marked this pull request as ready for review May 13, 2026 02:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant