v0.10.0

@itsomri itsomri released this 18 Nov 11:26
· 1 commit to main since this release
ad3ac0e

What's Changed

Added

  • Added parent reference to SubGroup struct in PodGroup CRD to allow a hierarchical SubGroup structure
  • Added time-aware scheduling capabilities
  • Added a tool to run time-aware fairness simulations over multiple cycles (see Time-Aware Fairness Simulator)
  • Added the option to configure the names of the webhook configuration resources
  • Added an option to configure reservation pods runtime class
  • Added enforcement of the nvidia runtime class for GPU pods, with the option to enforce a custom runtime class, or disable enforcement entirely
  • Added a preferred podAntiAffinity term by default for all KAI system services; it can be made required instead by setting global.requireDefaultPodAffinityTerm
  • Added support for service-level affinities
  • Added an option to specify the container name and type for fraction containers
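
As an illustration of the hierarchical SubGroup structure mentioned above, the sketch below shows how a flat list of subgroups, each optionally pointing at a parent, can be resolved into a tree. It is a minimal sketch and not the scheduler's implementation: the field names `name` and `parent`, the function `build_subgroup_tree`, and the example subgroup names are all hypothetical, and the actual CRD schema may differ.

```python
def build_subgroup_tree(subgroups):
    """Resolve a flat list of subgroups into (roots, children).

    Each subgroup is a dict with a hypothetical 'name' key and an
    optional hypothetical 'parent' key naming another subgroup.
    """
    names = {sg["name"] for sg in subgroups}
    children = {}
    roots = []
    for sg in subgroups:
        parent = sg.get("parent")
        if parent is None:
            # No parent reference: this subgroup is a top-level node.
            roots.append(sg["name"])
        elif parent not in names:
            raise ValueError(f"unknown parent {parent!r} for {sg['name']!r}")
        else:
            children.setdefault(parent, []).append(sg["name"])

    # Every subgroup must be reachable from a root; anything left over
    # sits on a parent cycle (for example a -> b -> a).
    seen = set()
    stack = list(roots)
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children.get(node, []))
    if seen != names:
        raise ValueError(f"parent cycle among {sorted(names - seen)}")
    return roots, children


# Hypothetical example: two leaf subgroups nested under one parent.
example = [
    {"name": "workers"},
    {"name": "decode", "parent": "workers"},
    {"name": "prefill", "parent": "workers"},
]
roots, children = build_subgroup_tree(example)
```

A flat list with parent references keeps each subgroup entry self-describing, at the cost of needing a validation pass like the one above to reject dangling or cyclic references.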

Fixed

  • (OpenShift only) Fixed high CPU usage in the operator pod caused by continuous reconciles
  • Fixed a bug where the scheduler would not retry updating the PodGroup status after a failure
  • Fixed a bug where gang scheduling of Ray workloads would ignore minReplicas if autoscaling was not set
  • Fixed wrong status when the Prometheus operand is enabled in KAI Config
  • GPU-Operator v25.10.0 support for CDI-enabled environments