@dypet
Contributor

@dypet dypet commented Nov 14, 2025

What I did
Set flow_reconcile_pending, activate_role_pending, and brainsplit_recover_pending back to false after receiving the appropriate notification. For flow_reconcile_pending and activate_role_pending, that is when the controller sends operation approval. For brainsplit_recover_pending, the flag is cleared once the DPU enters a stable state (active or standby) again.
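
The approval-driven part of this change can be sketched roughly as follows. This is a minimal illustration, not the actual orchagent code: the `Operation` enum, the `FieldValues` alias, and every function name here are hypothetical, and only the two flag names come from this PR.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration; not the real orchagent types.
enum class Operation { ActivateRole, FlowReconcile };
using FieldValues = std::map<std::string, std::string>;

// Map an operation to the pending flag it controls.
inline std::string pendingFlagFor(Operation op)
{
    return op == Operation::ActivateRole ? "activate_role_pending"
                                         : "flow_reconcile_pending";
}

// The DPU requested an operation: raise the flag until the controller answers.
inline void onOperationRequested(FieldValues &state, Operation op)
{
    state[pendingFlagFor(op)] = "true";
}

// The controller approved the operation: lower the flag again.
// This is the reset step that the PR description says was missing.
inline void onOperationApproved(FieldValues &state, Operation op)
{
    state[pendingFlagFor(op)] = "false";
}

// Usage example: the flag goes true on request and back to false on approval.
inline void approvalDemo()
{
    FieldValues state;
    onOperationRequested(state, Operation::ActivateRole);
    assert(state["activate_role_pending"] == "true");
    onOperationApproved(state, Operation::ActivateRole);
    assert(state["activate_role_pending"] == "false");
}
```

Keeping the flag-name lookup in one helper means the request and approval paths cannot drift apart on which field they touch.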

Why I did it
flow_reconcile_pending, activate_role_pending, and brainsplit_recover_pending were never reset to false after being set to true for the first time.

How I verified it
Tested on an HA SmartSwitch testbed: after sending operation approval, activate_role_pending and the related pending flags read false. To test brain split, I shut the communication channel between the DPUs, waited for both to enter the standalone state, and then re-enabled the channel, which caused brain split. To recover, I set one DPU to admin down/dead and then back to admin up, after which the DPUs paired into active/standby again.

Details if related

@dypet dypet requested a review from prsunny as a code owner November 14, 2025 18:49
@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

else if (in(ha_scope_event[i].ha_state, {SAI_DASH_HA_STATE_ACTIVE,
                                         SAI_DASH_HA_STATE_STANDBY}))
{
    fvs.push_back({"brainsplit_recover_pending", "false"});

I am not sure what the ha_state is when it reports brain split. Have you tested brain split?

Contributor Author

Brain split is reported when the DPU-to-DPU connection recovers and both DPUs are in the standalone state.

I see. So if they transition to active or standby, it means they have recovered from brain-split state.
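
The reasoning in this thread can be sketched as a small state replay. This is only an illustration under stated assumptions: `HaState` is a hypothetical stand-in for the SAI HA state enum, the `in()` template is an assumed shape for the membership helper used in the diff (its real definition is not shown here), and `replay()` is an invented name.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical stand-in for the SAI HA state enum; values are illustrative only.
enum class HaState { Dead, Standalone, Active, Standby };

// Assumed shape of a membership helper like the in() call in the diff.
template <typename T>
bool in(const T &value, std::initializer_list<T> candidates)
{
    return std::find(candidates.begin(), candidates.end(), value) != candidates.end();
}

// Replay a sequence of HA states and return the final flag value.
// The flag starts at "true", as if brain split had already been reported.
inline std::string replay(const std::vector<HaState> &states)
{
    std::string brainsplit_recover_pending = "true";
    for (HaState s : states)
    {
        // Reaching active or standby means the pair has re-formed.
        if (in(s, {HaState::Active, HaState::Standby}))
        {
            brainsplit_recover_pending = "false";
        }
    }
    return brainsplit_recover_pending;
}

// Usage example mirroring the recovery procedure from the PR description:
// standalone, then admin down/dead, then paired again as standby.
inline void brainSplitDemo()
{
    assert(replay({HaState::Standalone}) == "true");
    assert(replay({HaState::Standalone, HaState::Dead, HaState::Standby}) == "false");
}
```

Under these assumptions the flag stays set while a DPU remains standalone or dead, and clears only on the transition the reviewer describes.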

@yue-fred-gao yue-fred-gao left a comment

lgtm

Contributor

@zjswhhh zjswhhh left a comment

lgtm

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@prsunny prsunny changed the title from "Set pending flags back to false." to "[SmartSwitch-HA] Set pending flags back to false." Nov 24, 2025
@prsunny prsunny merged commit 4c6457e into sonic-net:master Nov 25, 2025
15 checks passed
@mssonicbld
Collaborator

Cherry-pick PR to 202511: #4021

5 participants