Open
Description
Objective
Clarify bootstrapping requirements and constraints well enough to align on "correct" behavior and to realize "low-hanging" optimization opportunities.
Origin Document
Questions surfaced while working on #732 & #694.
Goals
- Clarify bootstrapping "success"/"failure" conditions
- Reduce time to bootstrap (or fail) in router implementations
- Consider how bootstrapping status is signaled to other modules (esp. if we plan on removing the FSM)
- Account for TTL-based nature of libp2p peerstore
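One possible shape for the status signaling goal, independent of the FSM, is a value that other modules can block on until bootstrapping finishes. This is only a sketch to anchor discussion: `bootstrapStatus`, `finish`, and `wait` are hypothetical names, not existing APIs in the codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// bootstrapStatus is a hypothetical one-shot signal for the bootstrapping outcome.
type bootstrapStatus struct {
	once sync.Once
	done chan struct{}
	err  error
}

func newBootstrapStatus() *bootstrapStatus {
	return &bootstrapStatus{done: make(chan struct{})}
}

// finish records the outcome exactly once and unblocks all waiters.
// Later calls are ignored.
func (s *bootstrapStatus) finish(err error) {
	s.once.Do(func() {
		s.err = err
		close(s.done)
	})
}

// wait blocks until bootstrapping has finished, then returns its outcome
// (nil on success).
func (s *bootstrapStatus) wait() error {
	<-s.done
	return s.err
}

func main() {
	status := newBootstrapStatus()

	// The P2P module would call finish once bootstrapping succeeds or fails.
	go status.finish(nil)

	// Any other module can block on the outcome.
	fmt.Println("bootstrapped:", status.wait() == nil)
}
```

Closing a channel wakes every waiter at once, so any number of modules can wait on the same status without the P2P module knowing about them.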
Legend
flowchart
a[State description]
subgraph next[Next state description]
nest[Nested state description]
end
other[Other state]
cond{Condition}
act([Action])
a --> next
next --> cond
cond --"condition value"--> other
cond --"alternative value"--> act
act --"result"--> other
Flowchart
flowchart
start[Node Startup]
subgraph persStart[Persistence Module Start]
hasState{Does this node already\nhave some state?}
gen(["Genesis hydration\n(initial staked actor\nidentities\nadded to state)"])
end
hasState --"NO"--> gen
subgraph p2pStart[P2P Module Start]
subgraph l[Libp2p Host Setup]
ll([Libp2p host\nlistening])
end
subgraph sar[Staked Actor Router Setup]
sarHandle([Staked actor router\nprotocol handler\nregistered])
end
subgraph usar[Unstaked actor Router Setup]
usarDisc([Unstaked actor DHT peer\ndiscovery start])
usarHandle([Unstaked actor router\nprotocol handler\nregistered])
usarGossip([Unstaked actor router\nGossipsub setup])
end
end
l --> sar
sar --> usar
bs[P2P Bootstrapping Start]
isBSNode{Is this node\nconfigured as a\nbootstrap node?}
start --> persStart
persStart --> p2pStart
p2pStart --> bs
bs --> isBSNode
isBSNode --"NO"--> bsProg
isBSNode --"YES"--> bsBSNode
firstBS --> bsReachable
bsReachable --"YES"--> rpc
subgraph bsProg[RPC bootstrapping]
firstBS[Considering first configured bootstrap node]
rpc([Get staked peers from\n`rpcPeerstoreProvider`\nusing bootstrap node])
isStaked{Is this node a\nstaked actor?}
firstPeer[Considering first peer]
nextPeer[Considering next peer]
peerReachable{Is the current\npeer healthy?}
morePeers{Are there more peers?}
bsReachable{"is the current\nbootstrap node healthy?"}
addStaked([Add peer to\nstaked actor router])
addUnstaked([Add peer to\nunstaked actor router])
addLibp2p([Add peer to libp2p host])
con([Attempt to connect])
minbs{are >= 3 peers\nconnected?}
bsRetry(["Retry"])
bsAttempts{"Max attempts\nreached for this\npeer?"}
moreBS{Are there more\nconfigured\nbootstrap nodes?}
nextBS[Considering next\nbootstrap node]
end
minbs --"NO"--> bsFail
minbs --"YES"--> bsDone
bsReachable --"NO"--> nextBS
rpc --> firstPeer
firstPeer --> peerReachable
peerReachable --"YES"--> isStaked
isStaked --"YES"--> addStaked
isStaked --"NO"--> addUnstaked
addStaked --> addUnstaked
addUnstaked --> addLibp2p
peerReachable --"NO"--> nextPeer
nextPeer --> peerReachable
addLibp2p --> con
con --"success"--> morePeers
morePeers --"NO"--> moreBS
morePeers --"YES"--> nextPeer
con --"error"--> bsAttempts
nextBS --> bsReachable
bsFail --> nodeFail
bsAttempts --"NO"--> bsRetry
bsAttempts --"YES"--> nextPeer
bsRetry --> con
moreBS --"NO"--> minbs
moreBS --"YES"--> nextBS
subgraph bsBSNode[Bootstrap node setup]
gps["Get staked peers from\n`persistencePeerstoreProvider`\n(last known state; possibly genesis\nor a snapshot)"]
addAllStaked([Add peers to\nstaked actor router])
addAllUnstaked([Add peers to\nunstaked actor router])
addAllLibp2p([Add peers to libp2p\nhost peerstore])
isStaked2{Is this node a\nstaked actor?}
end
gps --> isStaked2
isStaked2 --"YES"--> addAllStaked
isStaked2 --"NO"--> addAllUnstaked
addAllStaked --> addAllUnstaked
addAllUnstaked --> addAllLibp2p
addAllLibp2p --> bsDone
bsFail["P2P Bootstrapping Failure"]
nodeFail["Node Startup Failure"]
bsDone["P2P Bootstrapping Success"]
fsm["State Machine Transition\n(`P2P_IsBootstrapped`)"]
rest[...]
bsDone --> fsm
fsm --> rest
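The RPC bootstrapping branch of the flowchart above can be read as the loop below. This is a minimal sketch under stated assumptions, not the router implementation: `bootstrapNode`, `bootstrap`, `maxDialAttempts`, and the injected `dial` function are hypothetical, the retry limit of 2 is an arbitrary placeholder, the peer health check is folded into the dial attempt, peers reported by more than one bootstrap node are counted once, and the router and peerstore additions are omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// minConnectedPeers mirrors the ">= 3 peers connected?" check in the flowchart.
const minConnectedPeers = 3

// maxDialAttempts is a hypothetical per-peer retry limit; the flowchart does not fix a value.
const maxDialAttempts = 2

// bootstrapNode is a hypothetical stand-in for a configured bootstrap node.
type bootstrapNode struct {
	healthy bool
	peers   []string // staked peers it would return over RPC
}

var errBootstrapFailed = errors.New("p2p bootstrapping failed: too few connected peers")

// bootstrap visits every configured bootstrap node, attempts to connect to each
// peer it reports, and succeeds only if enough distinct peers end up connected.
// dial is injected so that the sketch has no network dependency.
func bootstrap(nodes []bootstrapNode, dial func(peer string) error) (int, error) {
	connected := make(map[string]struct{})
	for _, node := range nodes {
		if !node.healthy {
			continue // consider the next bootstrap node
		}
		for _, peer := range node.peers {
			if _, ok := connected[peer]; ok {
				continue // already connected via another bootstrap node
			}
			for attempt := 0; attempt < maxDialAttempts; attempt++ {
				if err := dial(peer); err == nil {
					connected[peer] = struct{}{}
					break
				}
			}
		}
	}
	if len(connected) < minConnectedPeers {
		return len(connected), errBootstrapFailed
	}
	return len(connected), nil
}

func main() {
	nodes := []bootstrapNode{
		{healthy: false, peers: []string{"p0"}},
		{healthy: true, peers: []string{"p1", "p2", "p3", "p4"}},
	}
	// Simulated dialer: "p4" is never reachable.
	dial := func(peer string) error {
		if peer == "p4" {
			return errors.New("unreachable")
		}
		return nil
	}
	n, err := bootstrap(nodes, dial)
	fmt.Println(n, err) // prints "3 <nil>"
}
```

Written this way, the success condition is a single comparison at the end of the loop, which is the part the deliverables below propose to pin down.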
Deliverable
- Determine success/failure bootstrapping condition(s) (e.g. when a quorum of known bootstrap nodes is (un)reachable)
- Update P2P docs to describe bootstrapping success and failure scenarios
- Update router bootstrapping implementations accordingly
- Support simultaneous dialing of bootstrap nodes with some "max concurrency" (see: [P2P] chore: concurrent bootstrapping #694)
- Design bootstrap status signaling mechanism / convention
- Ensure libp2p peerstore network addresses expire & renew appropriately
  - Consider using `AddressTTL` as default
  - Should staked actor network addresses expire? (see: `PermanentAddrTTL`)
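For the "max concurrency" deliverable, one common Go pattern is a buffered channel used as a semaphore. The sketch below is an illustration only, not the approach taken in #694: `dialAll`, `errDialFailed`, the injected `dial` function, and the example addresses are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDialFailed is a hypothetical error used by the simulated dialer below.
var errDialFailed = errors.New("dial failed")

// dialAll dials every address concurrently, with at most maxConcurrency dials
// in flight, and returns the number of successful dials.
func dialAll(addrs []string, maxConcurrency int, dial func(addr string) error) int {
	if maxConcurrency < 1 {
		maxConcurrency = 1
	}
	sem := make(chan struct{}, maxConcurrency)
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		succeeded int
	)
	for _, addr := range addrs {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrency dials are in flight
		go func(addr string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			if err := dial(addr); err == nil {
				mu.Lock()
				succeeded++
				mu.Unlock()
			}
		}(addr)
	}
	wg.Wait()
	return succeeded
}

func main() {
	// Placeholder addresses; "node3" is simulated as unreachable.
	addrs := []string{"node1:42069", "node2:42069", "node3:42069"}
	ok := dialAll(addrs, 2, func(addr string) error {
		if addr == "node3:42069" {
			return errDialFailed
		}
		return nil
	})
	fmt.Println(ok) // prints 2
}
```

With a success count available as soon as all dials return, the time to bootstrap (or fail) is bounded by the slowest batch of dials and no longer by the sum of all dial timeouts.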
Non-goals / Non-deliverables
- Remove or replace the state machine module
General issue deliverables
- Update the appropriate CHANGELOG(s)
- Update any relevant local/global README(s)
- Update relevant source code tree explanations
- Add or update any relevant or supporting mermaid diagrams
Testing Methodology
- All tests: `make test_all`
- LocalNet: verify a `LocalNet` is still functioning correctly by following the instructions at docs/development/README.md
- k8s LocalNet: verify a `k8s LocalNet` is still functioning correctly by following the instructions here
Creator: @bryanchriswhite
Co-Owners:
Metadata
Labels: p2p (P2P specific changes)
Status: In Progress