v0.10.0

Latest

itsomri released this 18 Nov 11:26

ad3ac0e

What's Changed

Added

Added a parent reference to the SubGroup struct in the PodGroup CRD to allow a hierarchical SubGroup structure
Added time-aware scheduling capabilities
Added a tool to run time-aware fairness simulations over multiple cycles (see Time-Aware Fairness Simulator)
Added the option to configure the names of the webhook configuration resources
Added an option to configure the runtime class of reservation pods
Added enforcement of the nvidia runtime class for GPU pods, with the option to enforce a custom runtime class, or disable enforcement entirely
Added a preferred podAntiAffinity term by default for all KAI system services; it can be made required instead by setting global.requireDefaultPodAffinityTerm
Added support for service-level affinities
Added an option to specify the container name and type for fraction containers

Fixed

(OpenShift only) Fixed high CPU usage in the operator pod caused by continuous reconciles
Fixed a bug where the scheduler would not retry updating the PodGroup status after a failure
Fixed a bug where gang scheduling of Ray workloads would ignore minReplicas if autoscaling was not set
Fixed an incorrect status reported when the Prometheus operand is enabled in the KAI Config
GPU-Operator v25.10.0 support for CDI-enabled environments
