@shajmakh
Member

@shajmakh shajmakh commented Nov 5, 2025

So far we had only base conditions for the NRO object, but we want to
have a more flexible interface to interact with while supporting
non-base conditions, just like we have for NRS controller.
In this commit:

  1. preserve the consistency of updating status conditions for
    numaresources CRs (operator and scheduler).
  2. minimize Status.Update calls on degraded condition updates
  3. keep using conditioninfo to maintain related commit modifications
    as much as possible, refactor later (switch to metav1 conditions)
  4. update conditions and return them instead of mutating them in place.
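
A minimal sketch of the idea behind item 2, assuming the write to the API server can be skipped whenever the computed conditions equal the current ones. The `Condition` type, `statusWriter`, `sameConditions`, `syncStatus` and the reason strings are hypothetical stand-ins for illustration, not the operator's actual code:

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for metav1.Condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// statusWriter is a hypothetical stand-in for the API client; it only counts
// how many times a status update would have been sent.
type statusWriter struct {
	calls int
}

func (w *statusWriter) UpdateStatus(conds []Condition) error {
	w.calls++
	return nil
}

// sameConditions reports whether both slices hold equal conditions in the same order.
func sameConditions(a, b []Condition) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// syncStatus writes the desired conditions only when they differ from the current ones.
func syncStatus(w *statusWriter, current, desired []Condition) error {
	if sameConditions(current, desired) {
		return nil // nothing changed: skip the API round trip
	}
	return w.UpdateStatus(desired)
}

func main() {
	w := &statusWriter{}
	current := []Condition{{Type: "Degraded", Status: "False", Reason: "AsExpected"}}
	degraded := []Condition{{Type: "Degraded", Status: "True", Reason: "SyncFailed"}}

	_ = syncStatus(w, current, current)  // unchanged: no write
	_ = syncStatus(w, current, degraded) // changed: one write
	fmt.Println(w.calls)
}
```

In a real controller the comparison would more likely come from a "changed" flag returned by the function that computes the conditions, rather than from a second pass over the slices.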

@openshift-ci openshift-ci bot requested review from Tal-or and mrniranjan November 5, 2025 07:33
@openshift-ci
Contributor

openshift-ci bot commented Nov 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shajmakh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 5, 2025
@shajmakh shajmakh changed the title from "Manage conditions update" to "controllers: cleanup status conditions update" Nov 5, 2025
@shajmakh shajmakh changed the title from "controllers: cleanup status conditions update" to "OCPBUGS-63029: controllers: cleanup status conditions update" Nov 5, 2025
@openshift-ci-robot

@shajmakh: This pull request references Jira Issue OCPBUGS-63029, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

So far we had only base conditions for the NRO object, but we want to
have a more flexible interface to interact with while supporting
non-base conditions, just like we have for NRS controller.
In this commit:

  1. preserve the consistency of updating status conditions for
    numaresources CRs (operator and scheduler).
  2. minimize Status.Update calls on degraded condition updates
  3. keep using conditioninfo to maintain related commit modifications
    as much as possible, refactor later (switch to metav1 conditions)
    Also fix a common function that updates conditions in place while considering the kind of condition (base/non-base)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@shajmakh
Member Author

shajmakh commented Nov 5, 2025

/jira refresh

@openshift-ci-robot

@shajmakh: This pull request references Jira Issue OCPBUGS-63029, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @mrniranjan

@shajmakh
Member Author

shajmakh commented Nov 5, 2025

relies on #2324

@shajmakh
Member Author

shajmakh commented Nov 5, 2025

/retest

Collaborator

@Tal-or Tal-or left a comment

Code-wise LGTM, but I need another cycle to understand the logic more clearly

Member

@ffromani ffromani left a comment

Unifying the status/conditions management is a good idea. I actually dislike the InPlace approach and I'd like to see THAT gone, not the other way around

@shajmakh
Member Author

shajmakh commented Nov 6, 2025

> Unifying the status/conditions management is a good idea. I actually dislike the InPlace approach and I'd like to see THAT gone, not the other way around

If I understand you correctly, you want the code to explicitly return an updated condition set for safer, guaranteed condition updates, is that right?

@ffromani
Member

> > Unifying the status/conditions management is a good idea. I actually dislike the InPlace approach and I'd like to see THAT gone, not the other way around
>
> If I understand you correctly, you want the code to explicitly return an updated condition set for safer, guaranteed condition updates, is that right?

Unless there are obvious and major performance constraints, which we don't have evidence is the case, I strongly prefer to have functions which construct a full set of new conditions, likely based on the existing ones, rather than mutate the current set.

IOW, the InPlace variant should be gone, and we should just construct a new set and replace the full slice.
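
To make the contrast concrete, here is a minimal sketch of the non-mutating style: a function that builds a new slice from the existing conditions and leaves the caller's slice alone. `Condition` is a simplified, hypothetical stand-in for `metav1.Condition`, and `withCondition` and the reason strings are illustrative names, not functions from this PR:

```go
package main

import "fmt"

// Condition is a simplified, hypothetical stand-in for metav1.Condition.
type Condition struct {
	Type   string
	Status string
	Reason string
}

// withCondition returns a new slice in which the entry matching cond.Type is
// replaced by cond, or in which cond is appended if no entry matches.
// The input slice is never modified.
func withCondition(current []Condition, cond Condition) []Condition {
	next := make([]Condition, 0, len(current)+1)
	replaced := false
	for _, c := range current {
		if c.Type == cond.Type {
			next = append(next, cond)
			replaced = true
			continue
		}
		next = append(next, c)
	}
	if !replaced {
		next = append(next, cond)
	}
	return next
}

func main() {
	current := []Condition{
		{Type: "Available", Status: "True", Reason: "AsExpected"},
		{Type: "Degraded", Status: "False", Reason: "AsExpected"},
	}
	next := withCondition(current, Condition{Type: "Degraded", Status: "True", Reason: "SyncFailed"})

	fmt.Println(current[1].Status) // the caller's slice is untouched
	fmt.Println(next[1].Status)    // the new slice carries the change
}
```

The caller then assigns the returned slice to the status in one step, so a half-applied update is never visible through the original slice.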

@shajmakh shajmakh force-pushed the manage-conditions-update branch from 804df43 to 88e7afc Compare November 19, 2025 10:24
@shajmakh
Member Author

/retest

@shajmakh shajmakh force-pushed the manage-conditions-update branch from 88e7afc to 0058ce8 Compare November 27, 2025 16:50
@openshift-ci-robot

@shajmakh: This pull request references Jira Issue OCPBUGS-63029, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @mrniranjan

The old version of the function updated the condition in place
without considering whether it is a base condition or not.
The difference between base and non-base conditions is that the status of one
base condition affects all the others, so the rest need an update too.

The new version splits the conditions into base and
non-base ones and updates all conditions accordingly using the upstream SetStatusCondition,
which also takes care to avoid noisy updates.
ref:
https://github.com/kubernetes/apimachinery/blob/master/pkg/api/meta/conditions.go
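
As a rough illustration of the split described above, the following Go sketch treats base and non-base conditions differently. It rests on several assumptions that are not confirmed by this PR: the set of base types, the rule that setting one base condition resets the other base conditions to "False", and the names `Condition`, `baseTypes`, `setCondition` and `computeConditions`. `setCondition` is a simplified stand-in loosely modeled on the upstream `SetStatusCondition`, not the real function:

```go
package main

import (
	"fmt"
	"time"
)

// Condition is a simplified, hypothetical stand-in for metav1.Condition.
type Condition struct {
	Type               string
	Status             string
	Reason             string
	LastTransitionTime time.Time
}

// baseTypes is an assumed set of base condition types.
var baseTypes = []string{"Available", "Upgradeable", "Progressing", "Degraded"}

func isBase(t string) bool {
	for _, b := range baseTypes {
		if b == t {
			return true
		}
	}
	return false
}

// setCondition adds or updates one condition and reports whether anything
// changed. The transition time only moves when the status changes.
func setCondition(conds *[]Condition, nc Condition) bool {
	for i := range *conds {
		c := &(*conds)[i]
		if c.Type != nc.Type {
			continue
		}
		changed := false
		if c.Status != nc.Status {
			c.Status = nc.Status
			c.LastTransitionTime = nc.LastTransitionTime
			changed = true
		}
		if c.Reason != nc.Reason {
			c.Reason = nc.Reason
			changed = true
		}
		return changed
	}
	*conds = append(*conds, nc)
	return true
}

// computeConditions returns a new condition set and whether it differs from
// the current one. A base condition refreshes every base condition; a
// non-base condition only touches its own entry.
func computeConditions(current []Condition, cond Condition, now time.Time) ([]Condition, bool) {
	next := append([]Condition(nil), current...) // work on a copy
	cond.LastTransitionTime = now
	if !isBase(cond.Type) {
		changed := setCondition(&next, cond)
		return next, changed
	}
	changed := false
	for _, t := range baseTypes {
		c := Condition{Type: t, Status: "False", Reason: t, LastTransitionTime: now}
		if t == cond.Type {
			c = cond
		}
		if setCondition(&next, c) {
			changed = true
		}
	}
	return next, changed
}

func main() {
	t0 := time.Date(2025, 11, 5, 0, 0, 0, 0, time.UTC)
	available := Condition{Type: "Available", Status: "True", Reason: "AsExpected"}

	conds, changed := computeConditions(nil, available, t0)
	fmt.Println(len(conds), changed)

	// Repeating the same update later changes nothing, so no noisy write is needed.
	_, changed = computeConditions(conds, available, t0.Add(time.Minute))
	fmt.Println(changed)
}
```

The boolean result is what lets a caller skip a status write when nothing actually changed.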

Note that this piece is part of a larger enhancement that will soon
be used in the NRO controller to update the conditions there.

Signed-off-by: Shereen Haj <[email protected]>

Since we are relying on the `"k8s.io/apimachinery/pkg/api/meta"` function to update the conditions, also use the upstream
function to find the condition.

Signed-off-by: Shereen Haj <[email protected]>

So far we had only base conditions for the NRO object, but we want to
have a more flexible interface to interact with while supporting
non-base conditions, just like we have for NRS controller.
In this commit:
1. preserve the consistency of updating status conditions for
   numaresources CRs (operator and scheduler).
2. minimize Status.Update calls on degraded condition updates.
3. keep using `conditioninfo` to maintain related commit modifications
   as much as possible, refactor later (switch to metav1 conditions).
4. update conditions and return them instead of mutating them in place.

Signed-off-by: Shereen Haj <[email protected]>
@shajmakh
Member Author

/retest

Member

@ffromani ffromani left a comment

Thanks for this work. I'm generally in favor, but I need another pass to deeply understand the changes and their implications

// update - so if they are updated since the current conditions.
func ComputeConditions(currentConditions []metav1.Condition, cond metav1.Condition, now time.Time) ([]metav1.Condition, bool) {
-conditions := NewConditions(cond, now)
+conditions := NewBaseConditionsWith(cond, now)
Member

why With? Let's clarify further (WithTime?) or drop it

Comment on lines 121 to 127
newBase := NewBaseConditionsWith(condition, ts)
updated := false
for idx := range newBase {
changed := metahelper.SetStatusCondition(&conds, newBase[idx])
updated = updated || changed
}
-return true
+return updated
Member

do we have testcase(s) which highlight the issues in the old code?

// UpdateConditionsInPlace mutates the given conditions, setting the value of the one pointed out to `condition` to the given values.
// Differently from `ComputeConditions`, it doesn't allocate new data. Returns true if successfully mutated conditions, false otherwise
func UpdateConditionsInPlace(conds []metav1.Condition, condition metav1.Condition, ts time.Time) bool {
cond := FindCondition(conds, condition.Type)
Member

good idea. We can remove all the instances of this now-redundant function, can't we?

Comment on lines +90 to +91
// UpdateConditions returns the given conditions updated with the given condition.
// Returns true if the condition was updated, false otherwise.
Member

The thing is, Update usually means in place. A non in-place operation is Make, Compute, Create, whatever, but not Update. We need to fix the naming here, because we almost certainly don't want to mutate in place (Update) if we can help it.
