Conversation

@timwu20 timwu20 commented Dec 6, 2025

Description

Introduces Speculative Availability chunk requests within the Availability Distribution subsystem. On every ActiveLeavesUpdate, the Availability Distribution subsystem now calls Prospective Parachains to gather backable candidates, so that it can start fetching their erasure-coded chunks from the backing group before the candidates are actually backed on chain. The feature is currently enabled by running the node with the --speculative-availability CLI flag.
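
To make the flow concrete, here is a minimal, self-contained Rust sketch of that sequence (active-leaves update -> ask Prospective Parachains for backable candidates -> spawn speculative chunk fetches). Every type, function name, and signature in it is an illustrative placeholder, not the actual polkadot-sdk subsystem API.

```rust
// A minimal sketch of the speculative fetch flow, with stand-in types only.

#[derive(Clone, Debug)]
struct ActiveLeaf {
    hash: [u8; 32],
}

#[derive(Clone, Debug)]
struct BackableCandidate {
    candidate_hash: [u8; 32],
    backing_group: u32,
}

#[derive(Debug)]
struct FetchTask {
    candidate_hash: [u8; 32],
    backing_group: u32,
    // "scheduled" = fetch started speculatively, before the candidate is backed on chain.
    origin: &'static str,
}

/// Stand-in for the request_backable_candidates helper; in the real subsystem
/// this is a query to Prospective Parachains, not a local function.
fn query_backable_candidates(_leaf: &ActiveLeaf) -> Vec<BackableCandidate> {
    vec![BackableCandidate { candidate_hash: [1; 32], backing_group: 3 }]
}

/// On every active-leaves update, start chunk fetches for backable candidates
/// before they are backed on chain, but only when the feature is enabled.
fn on_active_leaves_update(leaf: &ActiveLeaf, speculative_availability: bool) -> Vec<FetchTask> {
    if !speculative_availability {
        return Vec::new();
    }
    query_backable_candidates(leaf)
        .into_iter()
        .map(|c| FetchTask {
            candidate_hash: c.candidate_hash,
            backing_group: c.backing_group,
            origin: "scheduled",
        })
        .collect()
}

fn main() {
    // Pretend the node was started with --speculative-availability.
    let leaf = ActiveLeaf { hash: [0; 32] };
    println!("leaf {:02x?}", &leaf.hash[..4]);
    for task in on_active_leaves_update(&leaf, true) {
        println!("speculative fetch: {:?}", task);
    }
}
```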

Review Notes

Note: This PR is a reimplementation of PR#9444 with fewer changes to the handling of fetch tasks, a smaller overall diff, and an added CLI flag to enable the feature.

  • Moves request_backable_candidates out of the Provisioner subsystem into the subsystem-util crate so it can also be used by the Availability Distribution subsystem.
  • Introduces CoreInfo and CoreInfoOrigin private types in the availability-distribution::requester module. Fetch tasks are now created from CoreInfo instances instead of directly from available cores. CoreInfo instances constructed from the backable candidates returned by Prospective Parachains carry the CoreInfoOrigin::Scheduled origin, while CoreInfo instances created from candidates that have already been backed on chain carry the CoreInfoOrigin::Occupied origin.
  • Modifies the Availability Store subsystem to accept a new AvailabilityStoreMessage::NoteBackableCandidates message type.
    • The Availability Store subsystem handles this message by writing the metadata needed to pass validation when the actual chunk for the associated candidate hash is later accepted. During development it was discovered that chunks from scheduled/early fetch requests were not being persisted because no metadata had previously been stored for the candidate (a toy sketch of this behavior follows the list).
  • Includes a Zombienet test that uses two parachains supporting elastic scaling. Assertions are made on the polkadot_parachain_fetched_chunks_total[origin="scheduled"] metric.
  • --speculative-availability CLI flag support.
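
As a toy illustration of the CoreInfoOrigin split and the availability-store metadata point above, here is a self-contained sketch. Only the behavior described in these notes is taken from the PR; every type, field, and method name is a placeholder rather than the real implementation.

```rust
// Toy model of the described behavior; not the real availability-store code.
use std::collections::HashMap;

type CandidateHash = [u8; 32];

/// Where a CoreInfo (and therefore a fetch task) came from.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CoreInfoOrigin {
    /// Backable candidate reported by Prospective Parachains; the fetch starts
    /// before the candidate is backed on chain.
    Scheduled,
    /// Candidate that already occupies a core on chain.
    Occupied,
}

/// Toy stand-in for the availability store.
#[derive(Default)]
struct ToyAvailabilityStore {
    /// Candidates we hold metadata for; chunks for unknown candidates are dropped.
    meta: HashMap<CandidateHash, ()>,
    chunks: HashMap<CandidateHash, Vec<Vec<u8>>>,
}

impl ToyAvailabilityStore {
    /// Analogue of handling NoteBackableCandidates: record metadata up front so
    /// that chunks arriving from speculative fetches pass validation.
    fn note_backable_candidate(&mut self, candidate: CandidateHash) {
        self.meta.entry(candidate).or_insert(());
    }

    /// Analogue of storing a fetched chunk: without previously noted metadata
    /// the chunk is rejected, which is the failure observed during development.
    fn store_chunk(&mut self, candidate: CandidateHash, chunk: Vec<u8>) -> bool {
        if !self.meta.contains_key(&candidate) {
            return false; // no metadata yet -> chunk not persisted
        }
        self.chunks.entry(candidate).or_default().push(chunk);
        true
    }
}

fn main() {
    let candidate: CandidateHash = [7; 32];
    let mut store = ToyAvailabilityStore::default();

    // A "scheduled" fetch arriving before metadata is noted: the chunk is dropped.
    assert!(!store.store_chunk(candidate, vec![0xAA]));

    // After the requester notes the backable candidate, chunks persist.
    store.note_backable_candidate(candidate);
    assert!(store.store_chunk(candidate, vec![0xAA]));

    println!("origins: {:?} / {:?}", CoreInfoOrigin::Scheduled, CoreInfoOrigin::Occupied);
}
```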

TODO

  • Benchmark with speculative availability enabled.

Checklist

  • My PR includes a detailed description as outlined in the "Description" and its two subsections above.
  • My PR follows the labeling requirements of this project (at minimum one label for T required)
    • External contributors: Use /cmd label <label-name> to add labels
    • Maintainers can also add labels manually
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

@timwu20 timwu20 marked this pull request as draft December 6, 2025 05:20

cla-bot-2021 bot commented Dec 6, 2025

User @timwu20, please sign the CLA here.

timwu20 commented Dec 6, 2025

/cmd prdoc

github-actions bot commented Dec 6, 2025

Command "prdoc" has failed ❌! See logs here

timwu20 commented Dec 6, 2025

/cmd label T0-node T8-polkadot

github-actions bot commented Dec 6, 2025

Command "label T0-node T8-polkadot" has failed ❌! See logs here

@timwu20 timwu20 marked this pull request as ready for review December 7, 2025 20:32

@haikoschol haikoschol left a comment

lgtm, just some comment/logging nits

@eskimor eskimor left a comment

Quick first pass, will have a closer look tomorrow.

@eskimor eskimor left a comment

There is one edge case as described by Axay on his PR: If the core does not immediately get occupied on the next leaf, we would drop the task and refetch the same thing if the core gets occupied later.

Just highlighting mostly, maybe worth documenting that this is a known limitation. A fix is not terribly hard, but likely still overkill. (Fix being to look into the ancestry, just as we do with occupied cores - the tricky part is just to keep it cheap, as fetching all the data from other subsystems is relatively heavy ... but yeah, not worth it. Let's just add a comment describing this limitation.)


if let Some(session_info) = session_info {
    let num_validators =
        session_info.validator_groups.iter().fold(0usize, |mut acc, group| {

Member

Session info also contains the full list of validators, no reason to accumulate group counts.

Author

I just lifted this from earlier in the file here. Is there another way to get the number of validators from the localized polkadot_availability_distribution::requester::session_cache::SessionInfo?

Author

Added a num_validators field to the localized SessionInfo in fb004b2.
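
For context on this thread, a small self-contained illustration of the two ways to obtain the validator count discussed here: folding over validator_groups (the pattern in the excerpt above) versus carrying a precomputed num_validators. The SessionInfo below is a placeholder, not the localized session_cache::SessionInfo.

```rust
// Placeholder SessionInfo, not the localized session_cache::SessionInfo.

type ValidatorIndex = u32;

struct SessionInfo {
    validator_groups: Vec<Vec<ValidatorIndex>>,
    /// Precomputed total so callers don't have to re-accumulate group sizes,
    /// mirroring the num_validators field added in fb004b2.
    num_validators: usize,
}

impl SessionInfo {
    fn new(validator_groups: Vec<Vec<ValidatorIndex>>) -> Self {
        // The previous pattern: fold over the groups and sum their lengths.
        let num_validators =
            validator_groups.iter().fold(0usize, |acc, group| acc + group.len());
        Self { validator_groups, num_validators }
    }
}

fn main() {
    let info = SessionInfo::new(vec![vec![0, 1, 2], vec![3, 4]]);
    assert_eq!(info.num_validators, 5);
    assert_eq!(info.validator_groups.len(), 2);
    println!("num_validators = {}", info.num_validators);
}
```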

@github-actions

Review required! Latest push from author must always be reviewed

@timwu20 timwu20 requested a review from eskimor December 12, 2025 16:58

timwu20 commented Dec 19, 2025

There is one edge case as described by Axay on his PR: If the core does not immediately get occupied on the next leaf, we would drop the task and refetch the same thing if the core gets occupied later.

Just highlighting mostly, maybe worth documenting that this is a known limitation. A fix is not terribly hard, but likely still overkill. (Fix being to look into the ancestry, just as we do with occupied cores - the tricky part is just to keep it cheap, as fetching all the data from other subsystems is relatively heavy ... but yeah, not worth it. Let's just add a comment describing this limitation.)

Added a note about this limitation in fb004b2.
