Enrich Forest Archive with receipts/events/tipset mapping #6802
Summary
There's an ask from the ecosystem to enrich the Forest Archive with additional information:
- events,
- message receipts,
- epoch to tipset mapping.
Based on recent snapshots, the disk space overhead would be minimal.
Details
Events and message receipts
# calibnet:
❯ forest-tool archive info lite_2000_3580931.forest.car.zst
CAR format: ForestCARv1.zst
Snapshot version: 1
Network: calibnet
Epoch: 3580931
State-roots: 2000
Messages sets: 2000
Receipts: 49662
Receipts size: 3 MiB
Events: 70489
Events size: 19.6 MiB
Head Tipset: bafy2bzaceb3mmhd7nzfr6iybwtx7yskpca4khenx6w4swd6aii5er7jxj2ml2
bafy2bzacecbc4kdarzogsylixrzxq5hoo7j73b4n25nivbdmshlq5zvq55dlo
bafy2bzaceacjm5kyd6ye43ssvg3wprgxnbjvowwlagwzt3dcl6zu6qcqpkr4y
Index size: 356.8 MiB
# mainnet:
❯ forest-tool archive info lite_2000_5881637.forest.car.zst
CAR format: ForestCARv1.zst
Snapshot version: 1
Network: mainnet
Epoch: 5881637
State-roots: 2000
Messages sets: 2000
Receipts: 32145
Receipts size: 1.4 MiB
Events: 78323
Events size: 14 MiB
Head Tipset: bafy2bzaceczzxkwgk57humvkuvmrdcaofjmyqq73rdpuox4oo5qzliabgbhd2
bafy2bzacecdkeuaizhozax2w2xzw36oaaoldcuglwoyxyrzrvssgd2roqli7e
bafy2bzacea4lzsq6bw7nndrjveemixa7j5qnrkhfr6tqva4ky3tvhobgt57g6
bafy2bzacebkgc26qitf7rhjn5a2coega7aegshtnc6i5konfglg6dfolvf6k4
Index size: 1.91 GiB
On mainnet, this translates to roughly 1.4 MiB for receipts and 14 MiB for events (uncompressed) per 2000 epochs.
Warning
Forest won't be able to backfill receipts and events for the entire archive; only post-FVM epochs. This shouldn't be a big deal but shout if I'm wrong. The epoch-to-tipset mapping does not have this limitation as it only requires chain data, not state recomputation.
Backfilling receipts and events is only possible for post-FVM epochs (Skyr @ 1960320), so ~4M epochs would total ~3 GiB for receipts and ~30 GiB for events (uncompressed, assuming roughly the same load).
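The extrapolation above can be reproduced with a few lines of arithmetic. This is only a sketch: the `Sample` type and `extrapolate_gib` function are made up for illustration, and it assumes the load per epoch stays the same as in the sampled 2000-epoch mainnet snapshot.

```rust
/// Sizes (in MiB) measured over a fixed window of epochs.
#[derive(Debug, Clone, Copy)]
struct Sample {
    epochs: u64,
    receipts_mib: f64,
    events_mib: f64,
}

/// Linearly extrapolates a sample to `target_epochs`.
/// Returns `(receipts, events)` in GiB.
fn extrapolate_gib(sample: Sample, target_epochs: u64) -> (f64, f64) {
    let scale = target_epochs as f64 / sample.epochs as f64;
    let to_gib = |mib: f64| mib * scale / 1024.0;
    (to_gib(sample.receipts_mib), to_gib(sample.events_mib))
}

fn main() {
    // Figures taken from the mainnet `forest-tool archive info` output.
    let mainnet = Sample {
        epochs: 2000,
        receipts_mib: 1.4,
        events_mib: 14.0,
    };
    // Head epoch of the sampled snapshot minus the Skyr upgrade epoch.
    let post_fvm_epochs = 5_881_637 - 1_960_320;
    let (receipts, events) = extrapolate_gib(mainnet, post_fvm_epochs);
    println!("receipts: {receipts:.1} GiB, events: {events:.1} GiB");
}
```

The result lands slightly below the rounded ~3 GiB and ~30 GiB figures, which leaves some headroom for busier periods.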
Epoch to tipset key mapping
Estimated size: 180–224 MiB uncompressed on mainnet (assuming 4 blocks/tipset over 5.9M epochs).
Based on experiments in #6827, it'd be 1 GiB compressed. An alternative is to use a skip list; with a skip length of 10, we'd end up at ~158 MiB on mainnet.
Additional considerations:
- this mapping must be created strictly for finalized tipsets
- this could replace the existing checkpoints mechanism in Forest (which is effectively a poor-man's version of this mapping)
This would allow for faster epoch-to-tipset lookups.
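To illustrate how a sparse (skip-list-like) mapping could still serve arbitrary epochs, here is a minimal sketch. It is not Forest code: `Tipset`, `TipsetKey` (a plain `String` here), `lookup` and `demo_chain` are all hypothetical stand-ins, and the in-memory `HashMap` replaces the real chain store. The idea is to jump to the nearest indexed tipset at or above the target epoch and then walk parent links down, so the walk is bounded by the skip length plus any null epochs.

```rust
use std::collections::{BTreeMap, HashMap};

type Epoch = i64;
/// Stand-in for a real tipset key (a non-empty list of block CIDs).
type TipsetKey = String;

/// Minimal tipset model: its epoch and the key of its parent tipset.
struct Tipset {
    epoch: Epoch,
    parent: Option<TipsetKey>,
}

/// Resolves `target` using a sparse epoch -> key index.
/// For a null epoch, the closest older tipset is returned.
/// Returns `None` if the target lies above the last indexed epoch.
fn lookup(
    index: &BTreeMap<Epoch, TipsetKey>,
    chain: &HashMap<TipsetKey, Tipset>,
    target: Epoch,
) -> Option<TipsetKey> {
    // Nearest indexed tipset at or above the target epoch.
    let (_, start) = index.range(target..).next()?;
    let mut key = start.clone();
    loop {
        let ts = chain.get(&key)?;
        if ts.epoch <= target {
            return Some(key);
        }
        // Step one tipset back towards genesis.
        key = ts.parent.clone()?;
    }
}

/// Builds a toy chain for epochs 0..=20 where epoch 13 is a null epoch,
/// indexing every 10th epoch.
fn demo_chain() -> (HashMap<TipsetKey, Tipset>, BTreeMap<Epoch, TipsetKey>) {
    let mut chain = HashMap::new();
    let mut index = BTreeMap::new();
    let mut parent: Option<TipsetKey> = None;
    for epoch in (0..=20).filter(|e| *e != 13) {
        let key = format!("ts{epoch}");
        chain.insert(key.clone(), Tipset { epoch, parent: parent.clone() });
        if epoch % 10 == 0 {
            index.insert(epoch, key.clone());
        }
        parent = Some(key);
    }
    (chain, index)
}

fn main() {
    let (chain, index) = demo_chain();
    for target in [10, 13, 17, 21] {
        println!("epoch {target}: {:?}", lookup(&index, &chain, target));
    }
}
```

Because the index only covers finalized tipsets, lookups above the last indexed epoch return `None` in this sketch and would need to fall back to walking from the current head.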
Approach
Backfilling
Theoretically, we could backfill this data in existing diff snapshots but this would be cumbersome. A better approach is to create additional diff CARs that only contain the extra data (as suggested by ribasushi), rather than replacing the ~20 TB archive. This could be facilitated with a subcommand that would wrap this logic.
It probably doesn't make sense to add the epoch-to-tipset mapping for archival snapshots; perhaps we could add it to the newly generated lite snapshots.
Latest snapshots
Receipts and events could be added to latest snapshots without any format changes. The disk space overhead will be minimal (<20 MiB for mainnet snapshot). Initial snapshots might not have all events, but eventually (after ~1 day) they should all be there. Lotus import with receipts and events has been confirmed to work.
The epoch-to-tipset-key mapping would live under a new field in the snapshot format v2 header (see https://github.com/filecoin-project/FIPs/blob/master/FRCs/frc-0108.md#v2-specification). This change is backwards compatible.
forest/src/chain/snapshot_format.rs
Lines 45 to 55 in 0f47e3d
/// Defined in <https://github.com/filecoin-project/FIPs/blob/98e33b9fa306959aa0131519eb4cc155522b2081/FRCs/frc-0108.md#snapshotmetadata>
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, derive_more::Constructor)]
#[serde(rename_all = "PascalCase")]
pub struct FilecoinSnapshotMetadata {
    /// Snapshot version
    pub version: FilecoinSnapshotVersion,
    /// Chain head tipset key
    pub head_tipset_key: NonEmpty<Cid>,
    /// F3 snapshot `CID`
    pub f3_data: Option<Cid>,
}
Additional Links & Resources
Discussion in Slack https://filecoinproject.slack.com/archives/C027XAH72TD/p1773397075551599