Add bestpath fanout config capabilities #524

Open: wants to merge 4 commits into base: main

Conversation

taspelund
Contributor

Adds an API endpoint for /bestpath/fanout and methods to get/set the fanout value. This defines the maximum number of ECMP paths that can be selected by the RIB during bestpath calculation.

Fixes: #400

Signed-off-by: Trey Aspelund <[email protected]>
@taspelund taspelund requested a review from rcgoodfellow July 21, 2025 19:40
@taspelund taspelund self-assigned this Jul 21, 2025
@taspelund taspelund added bgp Border Gateway Protocol mgd Maghemite daemon static Static Routing rust Pull requests that update rust code labels Jul 21, 2025
Collaborator

@rcgoodfellow rcgoodfellow left a comment

Thanks Trey. Overall looks good. Just a few follow ups.

/// Update the fanout setting.
Update {
    /// Maximum number of equal-cost paths for ECMP forwarding
    fanout: NonZeroU8,
Collaborator

I think we have a practical limit of something like 8 in Dendrite. Is that right @Nieuwejaar?

Contributor Author

@taspelund taspelund Jul 21, 2025

My thought behind a u8 was that it would be a large enough value to give us a lot of headroom for a while. Unless our switches will be doing 32 uplinks * 8 VLANs per uplink, we wouldn't realistically hit the u8 max.

I agree that the Oxide system should impose limits based on what the switch can handle.
Does that limit belong in maghemite, dpd, or in Nexus?

I think it would make sense for a hard limit to live in dpd (just as a defensive practice) and for Nexus to restrict the upper bounds to dpd's limit at the API level.

This is not a hill I'm interested in dying on though, so if you feel strongly about having a check in maghemite as well, I'm happy to add one.
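If a check in maghemite does turn out to be wanted, it could be as small as the sketch below. This is only an illustration: `ASSUMED_MAX_FANOUT`, `FanoutTooLarge`, and `validate_fanout` are hypothetical names, and the value 8 is just the "something like 8" figure mentioned above, not a confirmed Dendrite limit.

```rust
use std::num::NonZeroU8;

/// Hypothetical upper bound. The real limit would come from whatever the
/// switch (dpd) can actually program; 8 is only the figure floated above.
pub const ASSUMED_MAX_FANOUT: u8 = 8;

/// Hypothetical error returned when a requested fanout exceeds the limit.
#[derive(Debug, PartialEq, Eq)]
pub struct FanoutTooLarge {
    pub requested: u8,
    pub max: u8,
}

/// Accept the requested fanout only if it does not exceed `max`.
/// Zero is already ruled out by the `NonZeroU8` type.
pub fn validate_fanout(
    requested: NonZeroU8,
    max: u8,
) -> Result<NonZeroU8, FanoutTooLarge> {
    if requested.get() > max {
        Err(FanoutTooLarge {
            requested: requested.get(),
            max,
        })
    } else {
        Ok(requested)
    }
}
```

Keeping the bound as a parameter rather than baking it into the type would let dpd, Nexus, or maghemite each pass in whichever limit ends up being authoritative.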

#[derive(Debug, Deserialize, Serialize, JsonSchema)]
pub struct BestpathFanoutRequest {
    /// Maximum number of equal-cost paths for ECMP forwarding
    pub fanout: NonZeroU8,
Collaborator

Same comment for this and the NonZeroU8 from mgadm on practical limits for fanout.

Signed-off-by: Trey Aspelund <[email protected]>
@taspelund
Contributor Author

I realized during this last round of testing that the handler for the "update" API should trigger a re-run of bestpath.
Otherwise the RIB will not be updated with the new number of allowed paths (regardless of whether the new value has increased or decreased).
e.g.

treyaspelund@Tallon-IV 07:45:24 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bestpath fanout read
2

treyaspelund@Tallon-IV 07:46:33 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bgp status selected 65100
Static Routes
=============
Prefix          Nexthop  RIB Priority
100.64.10.0/22  4.4.4.3  1
                4.4.4.4  1
                4.4.4.5  1

treyaspelund@Tallon-IV 07:46:35 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm static add-v4-route 100.64.10.0/22 4.4.4.6

treyaspelund@Tallon-IV 07:46:38 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bgp status selected 65100
Static Routes
=============
Prefix          Nexthop  RIB Priority
100.64.10.0/22  4.4.4.3  1
                4.4.4.4  1

treyaspelund@Tallon-IV 07:46:40 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bestpath fanout update 4
Updated bestpath fanout to: 4

treyaspelund@Tallon-IV 07:47:00 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bgp status selected 65100
Static Routes
=============
Prefix          Nexthop  RIB Priority
100.64.10.0/22  4.4.4.3  1
                4.4.4.4  1

I'll write a method that walks the RIB and triggers a re-run of bestpath for each route, perhaps taking a closure to allow the caller to optimize which routes need to be kicked.
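Roughly the shape I have in mind, as a toy sketch rather than the actual maghemite code. `ToyRib`, `rerun_bestpath`, and `set_fanout` are made-up names for illustration, and the "bestpath" here is a stand-in that treats every path for a prefix as equal-cost and simply keeps the first `fanout` of them.

```rust
use std::collections::BTreeMap;

/// Toy RIB for illustration only (not maghemite's real data structures).
/// Prefixes and nexthops are plain strings to keep the sketch small.
pub struct ToyRib {
    /// Maximum number of paths selected per prefix.
    pub fanout: usize,
    /// Every known path per prefix.
    pub candidates: BTreeMap<String, Vec<String>>,
    /// Paths currently selected per prefix.
    pub selected: BTreeMap<String, Vec<String>>,
}

impl ToyRib {
    /// Walk the RIB and re-run the stand-in bestpath for every prefix the
    /// caller's predicate accepts. Returns how many prefixes were re-run.
    pub fn rerun_bestpath<F>(&mut self, mut should_run: F) -> usize
    where
        F: FnMut(&str, &[String]) -> bool,
    {
        let mut reran = 0;
        for (prefix, paths) in &self.candidates {
            if !should_run(prefix.as_str(), paths.as_slice()) {
                continue;
            }
            // Stand-in for bestpath: keep at most `fanout` paths.
            let best: Vec<String> =
                paths.iter().take(self.fanout).cloned().collect();
            self.selected.insert(prefix.clone(), best);
            reran += 1;
        }
        reran
    }

    /// Update the fanout and kick every prefix, which is the simplest
    /// policy an update handler could use.
    pub fn set_fanout(&mut self, fanout: usize) -> usize {
        self.fanout = fanout;
        self.rerun_bestpath(|_, _| true)
    }
}
```

The predicate is the hook for later optimization: a caller that knows which prefixes can be affected passes a narrower closure, and nothing else about the walk has to change.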

@taspelund
Contributor Author

taspelund commented Jul 22, 2025

New behavior observed w/ the bestpath helper invoked by the fanout update handler:

treyaspelund@Tallon-IV 08:24:58 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bestpath fanout read
2

treyaspelund@Tallon-IV 08:25:02 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bgp status selected 65100
Static Routes
=============
Prefix          Nexthop  RIB Priority
100.64.10.0/22  4.4.4.3  1
                4.4.4.4  1

treyaspelund@Tallon-IV 08:25:04 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bestpath fanout update 4
Updated bestpath fanout to: 4

treyaspelund@Tallon-IV 08:25:11 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bgp status selected 65100
Static Routes
=============
Prefix          Nexthop  RIB Priority
100.64.10.0/22  4.4.4.3  1
                4.4.4.4  1
                4.4.4.5  1
                4.4.4.6  1

treyaspelund@Tallon-IV 08:25:12 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bestpath fanout update 1
Updated bestpath fanout to: 1

treyaspelund@Tallon-IV 08:25:14 PM | ~/git/maghemite  trey/bgp_rib_knobs
‣ ./target/debug/mgadm bgp status selected 65100
Static Routes
=============
Prefix          Nexthop  RIB Priority
100.64.10.0/22  4.4.4.3  1

Currently I'm just having the API handler pass a closure that always evaluates to true, so we'll run bestpath against every prefix upon a fanout update. However, I made the helper function generic so we have the option of triggering bestpath based on more granular/optimal conditions without requiring a whole lot of other changes.

I'm sure at a large enough RIB scale we wouldn't want to run bestpath for all prefixes, but as an initial implementation I think it's probably fine. Maybe something worth revisiting once we start doing scale testing.
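As one example of a more granular condition, here is a hedged sketch of a predicate that skips prefixes a fanout change cannot affect. `fanout_change_may_affect` is a hypothetical helper, not something in this PR, and it assumes the fanout influences selection only by capping how many paths are kept.

```rust
/// Hypothetical predicate: could changing the fanout from `old` to `new`
/// alter the selected set for a prefix that has `path_count` known paths?
///
/// Assumption: bestpath selects min(fanout, number of equally-best paths),
/// and the number of equally-best paths never exceeds `path_count`. Under
/// that assumption a prefix with `path_count <= min(old, new)` is selected
/// identically before and after the change, so it can be skipped.
pub fn fanout_change_may_affect(
    path_count: usize,
    old: usize,
    new: usize,
) -> bool {
    old != new && path_count > old.min(new)
}
```

This is conservative: it can return true for a prefix whose selection ends up unchanged, but under the stated assumption it should not return false for one that needs a re-run.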

Labels
bgp Border Gateway Protocol mgd Maghemite daemon rust Pull requests that update rust code static Static Routing
Development

Successfully merging this pull request may close these issues.

bestpath: make BESTPATH_FANOUT configurable
2 participants