|
| 1 | +# Topology Aware Scheduling |
| 2 | +KAI Scheduler is topology-aware: it takes the physical placement of nodes into account when scheduling workloads. |
| 3 | + |
| 4 | +## Topological Information |
| 5 | +Topology information is derived from both Kubernetes node labels and the Kueue Topology CRD: |
| 6 | +```yaml |
| 7 | +apiVersion: kueue.x-k8s.io/v1alpha1 |
| 8 | +kind: Topology |
| 9 | +metadata: |
| 10 | + name: "cluster-topology" |
| 11 | +spec: |
| 12 | + levels: |
| 13 | + - nodeLabel: "cloud.provider.com/topology-block" |
| 14 | + - nodeLabel: "cloud.provider.com/topology-rack" |
| 15 | + - nodeLabel: "kubernetes.io/hostname" |
| 16 | +``` |
| 17 | +A topology definition must include at least one level, and each level is identified by a node label (`nodeLabel`). Based on this configuration, KAI Scheduler groups nodes into hierarchical domains that follow the listed labels, from the broadest level (block) down to the narrowest (hostname). |
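To make the grouping concrete, here is a minimal sketch of two nodes carrying the labels referenced by the topology above. The node names and label values (`node-a`, `block-1`, `rack-1`, and so on) are hypothetical; in a real cluster these labels are typically set by the cloud provider or by an administrator.

```yaml
# Hypothetical nodes; names and label values are illustrative only.
apiVersion: v1
kind: Node
metadata:
  name: node-a
  labels:
    cloud.provider.com/topology-block: "block-1"
    cloud.provider.com/topology-rack: "rack-1"
    kubernetes.io/hostname: "node-a"
---
apiVersion: v1
kind: Node
metadata:
  name: node-b
  labels:
    cloud.provider.com/topology-block: "block-1"
    cloud.provider.com/topology-rack: "rack-2"   # same block, different rack
    kubernetes.io/hostname: "node-b"
```

Under these assumed labels, both nodes fall into the same block-level domain (`block-1`) but into different rack-level domains.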
| 18 | + |
| 19 | +## Topology Constraints |
| 20 | +Workloads that require location-aware scheduling can declare topology constraints via annotations, using either required or preferred placement. |
| 21 | + |
| 22 | +To enforce strict rack-level locality, a workload can be annotated as follows: |
| 23 | +```yaml |
| 24 | +kai.scheduler/topology: "cluster-topology" |
| 25 | +kai.scheduler/topology-required-placement: "cloud.provider.com/topology-rack" |
| 26 | +``` |
| 27 | +With a required placement, the scheduler will not place the workload on nodes lacking the specified label. |
| 28 | + |
| 29 | +For soft affinity, the annotation can specify a preferred placement: |
| 30 | +```yaml |
| 31 | +kai.scheduler/topology: "cluster-topology" |
| 32 | +kai.scheduler/topology-preferred-placement: "cloud.provider.com/topology-rack" |
| 33 | +``` |
| 34 | +In this case, the scheduler will first target nodes matching the label. If no suitable nodes are available, it will fall back to a higher-level topology domain. |
| 35 | + |
| 36 | +You can also combine required and preferred placement in a single workload. In this model, the required constraint sets the upper boundary for eligible domains, while the preferred constraint guides the scheduler toward a more granular placement within that boundary. |
| 37 | +For example, a workload can require block-level placement while preferring rack-level locality, allowing tighter affinity without expanding the scheduling scope to the entire cluster. |
| 38 | +Such a constraint can be expressed as: |
| 39 | +```yaml |
| 40 | +kai.scheduler/topology: "cluster-topology" |
| 41 | +kai.scheduler/topology-required-placement: "cloud.provider.com/topology-block" |
| 42 | +kai.scheduler/topology-preferred-placement: "cloud.provider.com/topology-rack" |
| 43 | +``` |
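For context, the combined annotations might sit in a complete workload manifest as sketched below. This is an illustration under several assumptions, not a confirmed layout: the text above does not say which object carries the annotations, so placing them on the Job's own metadata is a guess, and the Job name, queue name, image, and the `kai.scheduler/queue` label and `schedulerName` value should be checked against your KAI Scheduler installation.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: topology-demo                       # hypothetical name
  annotations:                              # assumption: annotations live on the Job itself
    kai.scheduler/topology: "cluster-topology"
    kai.scheduler/topology-required-placement: "cloud.provider.com/topology-block"
    kai.scheduler/topology-preferred-placement: "cloud.provider.com/topology-rack"
spec:
  parallelism: 4
  completions: 4
  template:
    metadata:
      labels:
        kai.scheduler/queue: "default-queue"  # assumption: queue label and queue name
    spec:
      schedulerName: kai-scheduler            # assumption: scheduler name in your cluster
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox:1.36                 # placeholder image
          command: ["sleep", "60"]
```

With this sketch, the block label is the required boundary, and the rack label expresses the tighter placement the scheduler should attempt first within that boundary.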