@ejbrever ejbrever commented Jan 3, 2025

Related to #1241, this is a proposal to add upgrade state information for components.

Example output:

module: openconfig-platform
  +--rw components
     +--rw component* [name]
        +--rw name     -> ../config/name
        +--ro state
        |  +--ro name?   string
        <......>
        |  +--ro upgrade
        |     +--ro new-version?                     string
        |     +--ro new-version-service-impacting?   boolean
        |     +--ro status?                          identityref
        |     +--ro step?                            string
        |     +--ro step-percent-complete?           oc-types:percentage
        |     +--ro total-percent-complete?          oc-types:percentage
        |     +--ro start-timestamp?                 oc-types:timeticks64
        |     +--ro duration?                        yang:counter64
        |     +--ro stop-timestamp?                  oc-types:timeticks64
        |     +--ro last-known-failure?              string

@robshakir @ahsaanyousaf could you please take a look?

@ejbrever ejbrever requested a review from a team as a code owner January 3, 2025 22:16
@ejbrever ejbrever requested a review from robshakir January 7, 2025 01:15
LimeHat commented Jan 13, 2025

Are there other use cases (in addition to transceiver FW upgrades)?
If not, shall we consider placement of this grouping in /components/component/transceiver (or introduce a new when condition based on the component type)?

@ejbrever
Contributor Author

@LimeHat I was thinking this could also apply well to generic OS installs, where we don't have this information elsewhere either (our workflows still retrieve it via CLI). My main need is firmware upgrades on transceivers, though, so if folks would prefer to scope this down, that would be okay with me.

gprasat commented Jan 16, 2025

step-percent-complete vs total-percent-complete: the number of steps involved in a component upgrade is not standardized across components or vendors. Is there any advantage in representing this as two separate parameters?

@ejbrever
Contributor Author

The number of steps utilized by a given implementation should not matter. "step-percent-complete" just represents the percent complete of the in-progress step as defined by the "step" leaf. When a new "step" starts this just gets reset to 0, so it wouldn't matter if there were 2 or 10 steps.
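
A minimal sketch of that reset behavior, for illustration only. The `UpgradeProgress` class, its method names, and the step names "transfer" and "install" are hypothetical and not part of the proposed model; only the three leaf names come from the tree above.

```python
from dataclasses import dataclass


@dataclass
class UpgradeProgress:
    """Hypothetical holder for the proposed upgrade state leaves."""

    step: str = ""
    step_percent_complete: int = 0
    total_percent_complete: int = 0

    def start_step(self, name: str) -> None:
        # A new step always restarts the per-step counter at 0;
        # the overall counter is left alone.
        self.step = name
        self.step_percent_complete = 0

    def report(self, step_percent: int, total_percent: int) -> None:
        # Both leaves are percentages, so clamp them to 0..100.
        self.step_percent_complete = max(0, min(100, step_percent))
        self.total_percent_complete = max(0, min(100, total_percent))


progress = UpgradeProgress()
progress.start_step("transfer")
progress.report(step_percent=100, total_percent=50)
progress.start_step("install")  # step percent resets, total is untouched
print(progress)
```

Nothing in the sketch depends on how many steps an implementation defines, which is the point: a client only ever reads the current step name and its percentage.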

dplore commented Feb 19, 2025

/gcbrun

@OpenConfigBot

No major YANG version changes in commit 8b509e9

jsterne commented Sep 11, 2025

About the 'step' leaf: The TransceiverFirmwareInstall service proposed in #293 has an RPC called Transfer and an RPC called Install. Are those what you mean by a 'step'? Would a router report a step called 'Transfer' and then a step called 'Install'?
Would you expect that some implementations might break down Transfer into multiple steps instead of just using one step called Transfer? e.g. they might report 'transferring-raw-data' and then 'ingesting-transferred-data' and then 'validating-transferred-data' etc?

jsterne commented Sep 11, 2025

How does 'step' interact with the different identities of 'status'? For example we have a status called INSTALL_FILE_LOADING and another status called INSTALL_IN_PROGRESS as two separate status values. Those almost feel like they could be 2 different steps during a firmware upgrade.

"The component reboots due to critical errors.";
}

identity OPENCONFIG_INSTALL_STATUS {

jsterne commented Nov 4, 2025

What is the expected status when a device (e.g. router) first boots up and there hasn't been any firmware install on a transceiver? Suppress the install status leaf (i.e. don't return it)? Or do we need some sort of INSTALL_IDLE status?

"The file is being loaded on the component.";
}

identity INSTALL_IN_PROGRESS {

There can be 3 separate steps in an overall install:

1. transfer
2. activate (with the option to not commit)
3. commit

Do we consider the state to be INSTALL_IN_PROGRESS from when step 1 starts until the new firmware is committed? In other words, if an operator activates with the "no commit" option and then operates the optic for several hours or days without committing the newly active bank, does the install status stay IN_PROGRESS for that whole time, until the bank containing the new firmware is finally committed? Note that the user could switch banks multiple times before finally deciding to commit the bank containing the new firmware, and they might never commit it at all (they may instead start a new install on top of it without ever committing it).
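
To make the question concrete, here is a sketch of one possible reading, in which an install counts as in progress from transfer until commit. The event names and the `still_in_progress` helper are hypothetical and only illustrate that interpretation; they are not taken from the PR.

```python
def still_in_progress(events: list[str]) -> bool:
    """Return True if an install has started but has not been committed.

    Assumed reading: an install starts at "transfer" and ends only at
    "commit". A later "transfer" starts a new install.
    """
    in_progress = False
    for event in events:
        if event == "transfer":
            in_progress = True
        elif event == "commit":
            in_progress = False
        # "activate" and "switch-bank" do not end the install here.
    return in_progress


# Activated with "no commit", then banks switched twice.
uncommitted = still_in_progress(
    ["transfer", "activate", "switch-bank", "switch-bank"]
)
committed = still_in_progress(["transfer", "activate", "commit"])
print(uncommitted, committed)
```

Under this reading the first sequence stays in progress indefinitely, which is exactly the long-lived IN_PROGRESS state asked about above.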

"The timestamp when the install started.";
}

leaf duration {

It feels like there is some overlap/duplication in having a start-timestamp, a stop-timestamp, and a duration. And would duration constantly update every second (i.e. be very noisy for ON_CHANGE subscribers)? I'd propose we drop the duration leaf here.
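
The overlap can be shown with a small sketch in which a collector derives the duration itself. It assumes both timestamps are nanoseconds since the Unix epoch (as oc-types:timeticks64 is generally defined) and that an unset stop-timestamp arrives as `None`; the function name and the seconds unit are hypothetical choices.

```python
from typing import Optional

NANOS_PER_SECOND = 1_000_000_000


def install_duration_seconds(
    start_ns: int, stop_ns: Optional[int], now_ns: int
) -> float:
    """Derive the install duration from the two timestamps alone.

    While the install is still running (no stop timestamp yet), measure
    against the collector's own clock instead of a streamed leaf.
    """
    end_ns = stop_ns if stop_ns is not None else now_ns
    return (end_ns - start_ns) / NANOS_PER_SECOND


start = 1_736_000_000 * NANOS_PER_SECOND
finished = install_duration_seconds(
    start, start + 90 * NANOS_PER_SECOND, now_ns=0
)
running = install_duration_seconds(
    start, None, now_ns=start + 5 * NANOS_PER_SECOND
)
print(finished, running)
```

With this approach the device only needs to send two updates (start and stop), rather than a duration value that changes continuously.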

description
"The percent complete for the in-progress step.";
}

The % complete, start-timestamp and stop-timestamp leaves are most valuable for the transfer step. It isn't clear that it is realistic (or important) to know the % complete for activate, commit or abort. I'd suggest we rework this to have transfer-start-time, transfer-end-time, and transfer-percent-complete.

@dplore dplore moved this to Ready to discuss in OC Operator Review Nov 18, 2025