Skip to content

Conversation

@divyachandralekha
Copy link
Contributor

@divyachandralekha divyachandralekha commented Nov 14, 2025

What I did
VLAN membership is now correctly populated in STPd when MSTP is enabled
VLAN member port updates handled dynamically after MSTP enable
PVST/MST mode switch logic corrected
Fixed structure alignment issues

How I verified it
Verified MSTP CIST instance convergence
Verified PVST convergence

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

@divyachandralekha divyachandralekha marked this pull request as ready for review November 19, 2025 11:48
@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@divyachandralekha
Copy link
Contributor Author

@skulambi-cisco , Please review

@divyachandralekha
Copy link
Contributor Author

@muhammadalihussnain
Please review

Comment on lines 146 to 216

/* Send once when PVST/MSTP is enabled first time */
if ((l2ProtoEnabled == L2_MSTP) && (old_l2ProtoEnabled == L2_NONE))
{
SWSS_LOG_INFO("Populating VLANs for STP mode change: %d -> %d", old_l2ProtoEnabled, l2ProtoEnabled);

// Get all existing VLANs from state table instead of iterating 1-4095
vector<string> vlanKeys;
m_stateVlanTable.getKeys(vlanKeys);

SWSS_LOG_INFO("Found %zu configured VLANs to populate", vlanKeys.size());

for (auto vlanKey : vlanKeys)
{
STP_MST_VLAN_PORT_MAP *vlan_msg = NULL;
vector<PORT_ATTR> port_list;
uint16_t portCnt = 0;
int len;

// Extract VLAN ID from key (e.g., "Vlan100" -> 100)
string vlanIdStr = vlanKey.substr(4); // Remove Vlan prefix

int vlan_id;
try
{
vlan_id = stoi(vlanIdStr);
}
catch (const std::exception& e)
{
SWSS_LOG_ERROR("Invalid VLAN ID in key '%s': %s", vlanKey.c_str(), e.what());
continue;
}


// Get all port members for this VLAN
portCnt = (uint16_t)getAllVlanMem(vlanKey, port_list);
if (!portCnt)
continue;

SWSS_LOG_DEBUG("VLAN %d has %d ports", vlan_id, portCnt);
len = (int)(sizeof(STP_MST_VLAN_PORT_MAP) + (portCnt * sizeof(PORT_LIST)));

vlan_msg = (STP_MST_VLAN_PORT_MAP *)calloc(1, len);
if (!vlan_msg)
{
SWSS_LOG_ERROR("calloc failed for vlan %d", vlan_id);
// Continue processing other VLANs instead of returning
continue;
}

// Populate port list
PORT_LIST *attr = vlan_msg->port_list;
for (size_t i = 0; i < port_list.size(); i++)
{
snprintf(attr[i].intf_name, IFNAMSIZ, "%s", port_list[i].intf_name);
attr[i].tagging_mode = port_list[i].mode;
SWSS_LOG_DEBUG(" Port[%zu]: %s Mode %d", i, port_list[i].intf_name, port_list[i].mode);
}

// Fill in VLAN message
vlan_msg->port_count = portCnt;
vlan_msg->vlan_id = (uint16_t)vlan_id;
vlan_msg->stp_mode = l2ProtoEnabled;
vlan_msg->add = true;

// Send message to STP daemon
sendMsgStpd(STP_MST_VLAN_PORT_LIST_CONFIG, len, (void *)vlan_msg, (L2_PROTO_MODE)vlan_msg->stp_mode);

free(vlan_msg);
}
}
Copy link
Contributor

@vganesan-nokia vganesan-nokia Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good for populating the vlan member list when the MSTP is enabled after the state db vlan table update is complete. Please consider the scenarios of updating the vlan members after the MSTP is enabled or during boot up, the MSTP global enable may be done before all vlan members are updated fully in the state db.

So in addition to this, we may also need to handle vlan member update to send add/remove port info in STP_MST_VLAN_PORT_LIST_CONFIG message for single port updates (add or remove). i.e., we may have to do something like enhancing doVlanMemUpdateTask() to send this message for MSTP (if the protocol mode is MSTP).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Please check doVlanMemUpdateTask()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Please check doVlanMemUpdateTask()

Thanks @divyachandralekha. The logic looks good. I have some minor code comments for this.

Comment on lines +1123 to +1134
// Key is just the instance ID (e.g., "0", "1", "2")
uint16_t instance_id;
try
{
instance_id = static_cast<uint16_t>(stoi(key.c_str()));
}
catch (const std::exception& e)
{
SWSS_LOG_ERROR("Invalid instance ID in key '%s': %s", key.c_str(), e.what());
it = consumer.m_toSync.erase(it);
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may need changes in the sonic-utilities/config/stp.py to avoid using "MST_INSTANCE" in all places where STP_MST_INST table entries are accessed (add/remove/update or retrieve). For example adjustment is needed where, when mstp is enabled, config db STP_MST_INST is populated for instace 0 with MST_INSTANCE in the key string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

taken care in stp.py

}

sendMsgStpd(STP_MST_INST_CONFIG, len, (void *)msg);
sendMsgStpd(STP_MST_INST_CONFIG, len, (void *)msg, l2ProtoEnabled);
Copy link
Contributor

@vganesan-nokia vganesan-nokia Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message construction here does not appear to be conforming to the expectation of the processing in sonic-stp/mstp/mstp_mgr.c: mstpmgr_process_inst_vlan_config_msg()).

As per the expectation of the processing of this STP_MST_INST_CONFIG messge (in sonic-stp/mstp/mstp_mgr.c: mstpmgr_process_inst_vlan_config_msg()), the STP_MST_INST_CONFIG message is supposed to carry a list of instances. The instance information is to be filled as an element (of type MST_INST_CONFIG_MSG) of array mst_list[] - a filed in STP_MST_INSTANCE_CONFIG_MSG and msg->mst_count should be set accordingly. (This STP_MST_INSTANCE_CONFIG_MSG used (from stp_ipc.h) in the above mentioned function (from sonic-stp) is different, which uses the mst_count and mst_list).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now I have verified CIST instance alone. I will do the MST verification and related fixes in the next PR.

}

sendMsgStpd(STP_MST_GLOBAL_CONFIG, sizeof(msg), (void *)&msg);
sendMsgStpd(STP_MST_GLOBAL_CONFIG, sizeof(msg), (void *)&msg, l2ProtoEnabled);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check. In the message construction above, there may be type casting problem for msg.revision_number. In STP_MST_GLOBAL_CONFIG_MSG, the field revision_number is declared as uint16_t. But we are type casting as uint32_t. Also, casting stoi() output which is int to uint8_t may give unexpected results (reducing from 4 bytes to 1 byte).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed

Copy link
Contributor

@vganesan-nokia vganesan-nokia Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see the changes for the type casting. Hope this will be taken care in the final merge version. Thanks for addressing the comment

@prsunny
Copy link
Collaborator

prsunny commented Nov 24, 2025

@divyachandralekha , please follow template to provide title description etc. The PR is lacking all details

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

vlan_msg->add = (op == SET_COMMAND) ? true : false;

sendMsgStpd(STP_MST_VLAN_PORT_LIST_CONFIG, len, (void *)vlan_msg, (L2_PROTO_MODE)vlan_msg->stp_mode);
if (vlan_msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we reset vlan_msg to null after freeing memory since this is used in a loop and we may get here without any memory allocation if isLagEmpty() is true?
OR
If we put the entire processing under the if(!isLagEmpty()) block, we may not need this check at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

{
STP_MST_VLAN_PORT_MAP *vlan_msg;
int len;
PORT_LIST *attr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This local variable may be unnecessary. We canc access the port_list directly like vlan_msg->port_list[0].tagging_mode.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return;
}

attr = vlan_msg->port_list;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If isLagEmpty() is true, there is no memory allocation and vlan_msg is un-initialized. So we may access uninitialized vlan_msg. Reccommend to put the vlan_msg encoding and sendMsgStpd also (not just memory allocation) inside if(!isLagEmpty(intfName)) block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Contributor

@vganesan-nokia vganesan-nokia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for responding to the comments. Looking forward to the next PR

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Copy link
Collaborator

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants