Conversation

@enoodle
Collaborator

@enoodle enoodle commented Nov 10, 2025

Description

Refactors the kai-operator and the Prometheus operand so that status is reported more accurately on the KAI config CRD. This fixes the issue where the status was always reported as unavailable, and it writes a meaningful message when the Prometheus operator is not installed.
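To illustrate the kind of status reporting described above, here is a minimal sketch in Go. It is not the code from this PR: the `Operand` interface, the `Condition` struct, the `ErrCRDMissing` sentinel, and the reason strings are all hypothetical names chosen for the example, and the real operator types may look quite different.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCRDMissing is a hypothetical sentinel meaning the Prometheus operator CRDs are absent.
var ErrCRDMissing = errors.New("prometheus operator CRDs not found")

// Condition is a simplified stand-in for a Kubernetes-style status condition.
type Condition struct {
	Type    string
	Status  string // "True" or "False"
	Reason  string
	Message string
}

// Operand is a hypothetical interface; the repository's real interface may differ.
type Operand interface {
	Name() string
	// IsAvailable reports readiness; a non-nil error explains why it is unknown.
	IsAvailable() (bool, error)
}

// fakeOperand lets the example run without a cluster.
type fakeOperand struct {
	name  string
	ready bool
	err   error
}

func (f fakeOperand) Name() string               { return f.name }
func (f fakeOperand) IsAvailable() (bool, error) { return f.ready, f.err }

// availableCondition folds all operand results into a single "Available" condition.
// The first operand that is not available determines the reason and message.
func availableCondition(operands []Operand) Condition {
	for _, op := range operands {
		ready, err := op.IsAvailable()
		switch {
		case errors.Is(err, ErrCRDMissing):
			return Condition{
				Type:    "Available",
				Status:  "False",
				Reason:  "PrometheusOperatorNotInstalled",
				Message: fmt.Sprintf("%s: %v", op.Name(), err),
			}
		case err != nil:
			return Condition{
				Type:    "Available",
				Status:  "False",
				Reason:  "OperandError",
				Message: fmt.Sprintf("%s: %v", op.Name(), err),
			}
		case !ready:
			return Condition{
				Type:    "Available",
				Status:  "False",
				Reason:  "OperandNotReady",
				Message: fmt.Sprintf("%s is not ready", op.Name()),
			}
		}
	}
	return Condition{
		Type:    "Available",
		Status:  "True",
		Reason:  "AllOperandsReady",
		Message: "all operands are available",
	}
}

func main() {
	ops := []Operand{
		fakeOperand{name: "scheduler", ready: true},
		fakeOperand{name: "prometheus", err: ErrCRDMissing},
	}
	fmt.Printf("%+v\n", availableCondition(ops))
}
```

The design choice in this sketch is to surface a distinct reason for the missing-operator case rather than a generic "not ready", so that someone inspecting the config resource can tell a missing dependency apart from a slow rollout.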

Related Issues

Fixes #

Checklist

Note: Ensure your PR title follows the Conventional Commits format (e.g., feat(scheduler): add new feature)

  • Self-reviewed
  • Added/updated tests (if needed)
  • Updated CHANGELOG.md (if needed)
  • Updated documentation (if needed)

Breaking Changes

Additional Notes

@github-actions

Merging this branch will decrease overall coverage

| Impacted Packages | Coverage Δ | 🤖 |
|---|---|---|
| github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1 | 0.00% (ø) | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler | 78.85% (-2.00%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands | 0.00% (ø) | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/admission | 86.55% (-0.73%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/binder | 68.93% (-0.68%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable | 72.18% (-7.16%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/node_scale_adjuster | 79.17% (-1.68%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_group_controller | 78.41% (-0.90%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_grouper | 75.44% (-1.35%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus | 50.86% (-0.25%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/queue_controller | 76.64% (-0.72%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/scheduler | 77.70% (-1.13%) | 👎 |

Coverage by file

Changed files (no unit tests)

| Changed File | Coverage Δ | Total | Covered | Missed | 🤖 |
|---|---|---|---|---|---|
| github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/config_types.go | 0.00% (ø) | 21 | 0 | 21 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler/status_reconciler.go | 75.00% (-1.92%) | 44 (+5) | 33 (+3) | 11 (+2) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/admission/admission.go | 72.00% (-3.00%) | 25 (+1) | 18 | 7 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/binder/binder.go | 68.42% (-3.80%) | 19 (+1) | 13 | 6 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/deployable.go | 72.18% (-7.16%) | 133 (+12) | 96 | 37 (+12) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/interface.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/interface.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/node_scale_adjuster/node_scale_adjuster.go | 64.71% (-4.04%) | 17 (+1) | 11 | 6 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_group_controller/pod_group_controller.go | 61.54% (-2.46%) | 26 (+1) | 16 | 10 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_grouper/pod_grouper.go | 57.89% (-3.22%) | 19 (+1) | 11 | 8 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/prometheus.go | 22.76% (+4.01%) | 123 (+11) | 28 (+7) | 95 (+4) | 👍 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/resources.go | 82.57% (-0.62%) | 109 (-4) | 90 (-4) | 19 | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/queue_controller/queue_controller.go | 61.54% (-2.46%) | 26 (+1) | 16 | 10 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/scheduler/scheduler.go | 50.00% (-2.94%) | 36 (+2) | 18 | 18 (+2) | 👎 |

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler/status_reconciler_test.go
  • github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/deployable_test.go
  • github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/prometheus_test.go

@github-actions

Merging this branch changes the coverage (9 decrease, 1 increase)

| Impacted Packages | Coverage Δ | 🤖 |
|---|---|---|
| github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1 | 0.00% (ø) | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler | 78.85% (-2.00%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands | 0.00% (ø) | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/admission | 86.55% (-0.73%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/binder | 68.93% (-0.68%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable | 89.47% (+10.13%) | 🎉 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/node_scale_adjuster | 79.17% (-1.68%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_group_controller | 78.41% (-0.90%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_grouper | 75.44% (-1.35%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus | 50.86% (-0.25%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/queue_controller | 76.64% (-0.72%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/scheduler | 77.70% (-1.13%) | 👎 |

Coverage by file

Changed files (no unit tests)

| Changed File | Coverage Δ | Total | Covered | Missed | 🤖 |
|---|---|---|---|---|---|
| github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/config_types.go | 0.00% (ø) | 21 | 0 | 21 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler/status_reconciler.go | 75.00% (-1.92%) | 44 (+5) | 33 (+3) | 11 (+2) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/admission/admission.go | 72.00% (-3.00%) | 25 (+1) | 18 | 7 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/binder/binder.go | 68.42% (-3.80%) | 19 (+1) | 13 | 6 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/deployable.go | 89.47% (+10.13%) | 133 (+12) | 119 (+23) | 14 (-11) | 🎉 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/interface.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/interface.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/node_scale_adjuster/node_scale_adjuster.go | 64.71% (-4.04%) | 17 (+1) | 11 | 6 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_group_controller/pod_group_controller.go | 61.54% (-2.46%) | 26 (+1) | 16 | 10 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_grouper/pod_grouper.go | 57.89% (-3.22%) | 19 (+1) | 11 | 8 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/prometheus.go | 22.76% (+4.01%) | 123 (+11) | 28 (+7) | 95 (+4) | 👍 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/resources.go | 82.57% (-0.62%) | 109 (-4) | 90 (-4) | 19 | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/queue_controller/queue_controller.go | 61.54% (-2.46%) | 26 (+1) | 16 | 10 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/scheduler/scheduler.go | 50.00% (-2.94%) | 36 (+2) | 18 | 18 (+2) | 👎 |

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler/status_reconciler_test.go
  • github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/deployable_test.go
  • github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/prometheus_test.go

davidLif
davidLif previously approved these changes Nov 11, 2025
@enoodle enoodle enabled auto-merge (squash) November 11, 2025 15:35
@github-actions

Merging this branch changes the coverage (9 decrease, 1 increase)

| Impacted Packages | Coverage Δ | 🤖 |
|---|---|---|
| github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1 | 0.00% (ø) | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler | 78.85% (-2.00%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands | 0.00% (ø) | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/admission | 86.55% (-0.73%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/binder | 68.93% (-0.68%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable | 89.47% (+10.13%) | 🎉 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/node_scale_adjuster | 79.17% (-1.68%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_group_controller | 78.41% (-0.90%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_grouper | 75.44% (-1.35%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus | 51.49% (-0.27%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/queue_controller | 76.64% (-0.72%) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/scheduler | 77.70% (-1.13%) | 👎 |

Coverage by file

Changed files (no unit tests)

| Changed File | Coverage Δ | Total | Covered | Missed | 🤖 |
|---|---|---|---|---|---|
| github.com/NVIDIA/KAI-scheduler/pkg/apis/kai/v1/config_types.go | 0.00% (ø) | 21 | 0 | 21 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler/status_reconciler.go | 75.00% (-1.92%) | 44 (+5) | 33 (+3) | 11 (+2) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/admission/admission.go | 72.00% (-3.00%) | 25 (+1) | 18 | 7 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/binder/binder.go | 68.42% (-3.80%) | 19 (+1) | 13 | 6 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/deployable.go | 89.47% (+10.13%) | 133 (+12) | 119 (+23) | 14 (-11) | 🎉 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/interface.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/interface.go | 0.00% (ø) | 0 | 0 | 0 | |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/node_scale_adjuster/node_scale_adjuster.go | 64.71% (-4.04%) | 17 (+1) | 11 | 6 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_group_controller/pod_group_controller.go | 61.54% (-2.46%) | 26 (+1) | 16 | 10 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/pod_grouper/pod_grouper.go | 57.89% (-3.22%) | 19 (+1) | 11 | 8 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/prometheus.go | 22.76% (+4.01%) | 123 (+11) | 28 (+7) | 95 (+4) | 👍 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/resources.go | 83.04% (-0.58%) | 112 (-4) | 93 (-4) | 19 | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/queue_controller/queue_controller.go | 61.54% (-2.46%) | 26 (+1) | 16 | 10 (+1) | 👎 |
| github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/scheduler/scheduler.go | 50.00% (-2.94%) | 36 (+2) | 18 | 18 (+2) | 👎 |

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Changed unit test files

  • github.com/NVIDIA/KAI-scheduler/pkg/operator/controller/status_reconciler/status_reconciler_test.go
  • github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/deployable/deployable_test.go
  • github.com/NVIDIA/KAI-scheduler/pkg/operator/operands/prometheus/prometheus_test.go

@enoodle enoodle merged commit 960dc94 into main Nov 11, 2025
4 checks passed
@enoodle enoodle deleted the erez/refactor-operator-with-prometheus branch November 11, 2025 22:46