[P2P] Router bootstrapping #859

@bryanchriswhite

Objective

Clarify bootstrapping requirements and constraints well enough to align on "correct" behavior and to realize "low-hanging" optimization opportunities.

Origin Document

Questions surfaced while working on #732 & #694.

Goals

  • Clarify bootstrapping "success"/"failure" conditions
  • Reduce time to bootstrap (or fail) in router implementations
  • Consider how bootstrapping status is signaled to other modules (esp. if we plan on removing the FSM)
  • Account for the TTL-based nature of the libp2p peerstore
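
One possible shape for the status signal, in case the FSM transition is eventually replaced: a one-shot value that other modules can block on. This is only a sketch; `BootstrapStatus`, `Finish`, `Done`, and `Err` are hypothetical names and do not exist in the codebase.

```go
package main

import "sync"

// BootstrapStatus is a hypothetical one-shot signal that other modules can
// wait on instead of observing an FSM transition.
type BootstrapStatus struct {
	once sync.Once
	done chan struct{}
	err  error
}

// NewBootstrapStatus returns a status that has not finished yet.
func NewBootstrapStatus() *BootstrapStatus {
	return &BootstrapStatus{done: make(chan struct{})}
}

// Finish records the bootstrapping outcome; only the first call has any effect.
func (s *BootstrapStatus) Finish(err error) {
	s.once.Do(func() {
		s.err = err
		close(s.done)
	})
}

// Done returns a channel that is closed once bootstrapping has finished.
func (s *BootstrapStatus) Done() <-chan struct{} { return s.done }

// Err blocks until bootstrapping has finished and returns the outcome;
// nil means bootstrapping succeeded.
func (s *BootstrapStatus) Err() error {
	<-s.done
	return s.err
}
```

A consumer would `select` on `Done()` alongside its own shutdown or timeout channel and then inspect `Err()`. Closing a channel (rather than sending on it) lets any number of modules observe the same result.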

Legend

flowchart

a[State description]
subgraph next[Next state description]
	nest[Nested state description]
end
other[Other state]

cond{Condition}
act([Action])

a --> next
next --> cond
cond --"condition value"--> other


cond --"alternative value"--> act
act --"result"--> other

Flowchart

flowchart

start[Node Startup]
subgraph persStart[Persistence Module Start]
    hasState{Does this node already\nhave some state?}
    gen(["Genesis hydration\n(initial staked actor\nidentities added to state)"])
end

hasState --"NO"--> gen


subgraph p2pStart[P2P Module Start]
    subgraph l[Libp2p Host Setup]
        ll([Libp2p host\nlistening])
    end
    subgraph sar[Staked Actor Router Setup]
        sarHandle([Staked actor router\nprotocol handler\nregistered])
    end
    subgraph usar[Unstaked actor Router Setup]
        usarDisc([Unstaked actor DHT peer\ndiscovery start])
        usarHandle([Unstaked actor router\nprotocol handler\nregistered])
        usarGossip([Unstaked actor router\nGossipsub setup])
    end
end

l --> sar
sar --> usar

bs[P2P Bootstrapping Start]

isBSNode{Is this node\nconfigured as a\nbootstrap node?}

start --> persStart
persStart --> p2pStart
p2pStart --> bs
bs --> isBSNode
isBSNode --"NO"--> bsProg
isBSNode --"YES"--> bsBSNode

firstBS --> bsReachable
bsReachable --"YES"--> rpc

subgraph bsProg[RPC bootstrapping]
    firstBS[Considering first configured bootstrap node]
    rpc([Get staked peers from\n`rpcPeerstoreProvider`\nusing bootstrap node])
    isStaked{Is this node a\nstaked actor?}

    firstPeer[Considering first peer]
    nextPeer[Considering next peer]
    peerReachable{Is the current\npeer healthy?}
    morePeers{Are there more peers?}

    bsReachable{"Is the current\nbootstrap node healthy?"}
    
    addStaked([Add peer to\nstaked actor router])
    addUnstaked([Add peer to\nunstaked actor router])
    addLibp2p([Add peer to libp2p host])
    con([Attempt to connect])

    minbs{Are >= 3 peers\nconnected?}
    bsRetry(["Retry"])
    bsAttempts{"Max attempts\nreached for this\npeer?"}

    moreBS{Are there more\nconfigured\nbootstrap nodes?}
    nextBS[Considering next\nbootstrap node]
end

minbs --"NO"--> bsFail
minbs --"YES"--> bsDone

bsReachable --"NO"--> nextBS
rpc --> firstPeer


firstPeer --> peerReachable
peerReachable --"YES"--> isStaked
isStaked --"YES"--> addStaked
isStaked --"NO"--> addUnstaked
addStaked --> addUnstaked
addUnstaked --> addLibp2p
peerReachable --"NO"--> nextPeer
nextPeer --> peerReachable
addLibp2p --> con
con --"success"--> morePeers
morePeers --"NO"--> moreBS
morePeers --"YES"--> nextPeer
con --"error"--> bsAttempts

nextBS --> bsReachable


bsFail --> nodeFail
bsAttempts --"NO"--> bsRetry
bsAttempts --"YES"--> nextPeer

bsRetry --> con

moreBS --"NO"--> minbs
moreBS --"YES"--> nextBS

subgraph bsBSNode[Bootstrap node setup]
    gps["Get staked peers from\n`persistencePeerstoreProvider`\n(last known state; possibly genesis\nor a snapshot)"]
    addAllStaked([Add peers to\nstaked actor router])
    addAllUnstaked([Add peers to\nunstaked actor router])
    addAllLibp2p([Add peers to libp2p\nhost peerstore])
    isStaked2{Is this node a\nstaked actor?}
end

gps --> isStaked2
isStaked2 --"YES"--> addAllStaked
isStaked2 --"NO"--> addAllUnstaked
addAllStaked --> addAllUnstaked
addAllUnstaked --> addAllLibp2p
addAllLibp2p --> bsDone

bsFail["P2P Bootstrapping Failure"]
nodeFail["Node Startup Failure"]

bsDone["P2P Bootstrapping Success"]

fsm["State Machine Transition\n(`P2P_IsBootstrapped`)"]
rest[...]
bsDone --> fsm
fsm --> rest
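
The "RPC bootstrapping" subgraph above reads roughly as the loop below. This is a sketch, not the current implementation: the `Hooks` callbacks are hypothetical stand-ins for `rpcPeerstoreProvider`, the health checks, and dialing; adding the peer to the routers and the libp2p host is folded into `Connect`; and `maxDialAttempts = 2` and de-duplicating peers by ID are assumptions, since the flowchart does not specify either.

```go
package main

import "errors"

// Peer is a hypothetical stand-in for a staked peer returned by a bootstrap node.
type Peer struct{ ID string }

// Hooks bundles the operations the flowchart treats as black boxes.
// All of these names are illustrative; none come from the codebase.
type Hooks struct {
	BootstrapHealthy func(bootstrapAddr string) bool
	FetchStakedPeers func(bootstrapAddr string) []Peer
	PeerHealthy      func(p Peer) bool
	Connect          func(p Peer) error
}

const (
	minConnectedPeers = 3 // threshold from the ">= 3 peers connected?" node
	maxDialAttempts   = 2 // assumed value; the flowchart does not fix a number
)

// ErrBootstrapFailed corresponds to the "P2P Bootstrapping Failure" state.
var ErrBootstrapFailed = errors.New("p2p bootstrapping failed: too few connected peers")

// Bootstrap walks the configured bootstrap nodes in order and returns the
// peers it managed to connect to, or ErrBootstrapFailed if there were too few.
func Bootstrap(bootstrapAddrs []string, h Hooks) ([]Peer, error) {
	connected := make(map[string]Peer)
	for _, addr := range bootstrapAddrs {
		if !h.BootstrapHealthy(addr) {
			continue // unhealthy bootstrap node: consider the next one
		}
		for _, p := range h.FetchStakedPeers(addr) {
			if _, seen := connected[p.ID]; seen || !h.PeerHealthy(p) {
				continue // already connected or unhealthy: consider the next peer
			}
			// Retry the dial until it succeeds or the attempts are exhausted.
			for attempt := 0; attempt < maxDialAttempts; attempt++ {
				if err := h.Connect(p); err == nil {
					connected[p.ID] = p
					break
				}
			}
		}
	}
	if len(connected) < minConnectedPeers {
		return nil, ErrBootstrapFailed
	}
	peers := make([]Peer, 0, len(connected))
	for _, p := range connected {
		peers = append(peers, p)
	}
	return peers, nil
}
```

Written this way, the cost of the sequential flow is visible: every unhealthy node and every failed dial is paid for in series before the success check runs.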

Deliverable

  • Determine the bootstrapping success/failure condition(s) (e.g. when a quorum of known bootstrap nodes is (un)reachable)
  • Update P2P docs to describe bootstrapping success and failure scenarios
  • Update router bootstrapping implementations accordingly
  • Support simultaneous dialing of bootstrap nodes with some "max concurrency" (see: [P2P] chore: concurrent bootstrapping #694)
  • Design bootstrap status signaling mechanism / convention
  • Ensure libp2p peerstore network addresses expire & renew appropriately
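
For the concurrent dialing deliverable, a minimal sketch of bounded concurrency using a buffered channel as a semaphore. `DialFunc`, `DialAll`, and `QuorumReached` are hypothetical names; a real dial would go through the libp2p host, and the quorum threshold is exactly what this issue still needs to decide.

```go
package main

import "sync"

// DialFunc is a hypothetical dial hook; a real router would connect through
// its libp2p host instead.
type DialFunc func(addr string) error

// DialAll dials every address with at most maxConcurrency dials in flight
// and returns the addresses that connected successfully (in completion order).
func DialAll(addrs []string, maxConcurrency int, dial DialFunc) []string {
	if maxConcurrency < 1 {
		maxConcurrency = 1
	}
	sem := make(chan struct{}, maxConcurrency) // counting semaphore
	var (
		mu        sync.Mutex
		wg        sync.WaitGroup
		connected []string
	)
	for _, addr := range addrs {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxConcurrency dials are running
		go func(addr string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot once this dial returns
			if err := dial(addr); err == nil {
				mu.Lock()
				connected = append(connected, addr)
				mu.Unlock()
			}
		}(addr)
	}
	wg.Wait()
	return connected
}

// QuorumReached reports whether enough peers connected for bootstrapping to
// count as a success. The threshold is an assumption left to the caller.
func QuorumReached(connected, minPeers int) bool {
	return connected >= minPeers
}
```

With this shape, time to bootstrap (or fail) is bounded by roughly `len(addrs) / maxConcurrency` dial timeouts rather than `len(addrs)`. An early exit once the quorum is reached would reduce it further, but is left out here to keep the sketch small.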

Non-goals / Non-deliverables

  • Remove or replace the state machine module

General issue deliverables

  • Update the appropriate CHANGELOG(s)
  • Update any relevant local/global README(s)
  • Update relevant source code tree explanations
  • Add or update any relevant or supporting mermaid diagrams

Testing Methodology

  • All tests: make test_all
  • LocalNet: verify a LocalNet is still functioning correctly by following the instructions at docs/development/README.md
  • k8s LocalNet: verify a k8s LocalNet is still functioning correctly by following the instructions here

Creator: @bryanchriswhite
Co-Owners:

Metadata

Labels: p2p (P2P specific changes)
Status: In Progress
