Merged
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -20,6 +20,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
- Fixed GPU memory pods Fair Share and Queue Order calculations
- Interpret negative or zero half-life value as disabled [#818](https://github.com/NVIDIA/KAI-Scheduler/pull/818) [itsomri](https://github.com/itsomri)
- Handle invalid CSI StorageCapacities gracefully [#817](https://github.com/NVIDIA/KAI-Scheduler/pull/817) [rich7420](https://github.com/rich7420)
- Embed CRD definitions in binary for env-test and time-aware-simulations to allow binary portability [#818](https://github.com/NVIDIA/KAI-Scheduler/pull/818) [itsomri](https://github.com/itsomri)

### Changed
- Removed the constraint that prohibited direct nesting of subgroups alongside podsets within the same subgroupset.
2 changes: 2 additions & 0 deletions deployments/kai-scheduler/.helmignore
@@ -36,3 +36,5 @@ stable-index/*
.github/
tests/*

# Go source files (used for embedding CRDs in Go binaries)
*.go
45 changes: 45 additions & 0 deletions deployments/kai-scheduler/crds/embed.go
@@ -0,0 +1,45 @@
// Copyright 2025 NVIDIA CORPORATION
// SPDX-License-Identifier: Apache-2.0

package crds

import (
	"embed"
	"fmt"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	"k8s.io/apimachinery/pkg/util/yaml"
)

//go:embed *.yaml
var embeddedCRDs embed.FS

// LoadEmbeddedCRDs parses all embedded CRD YAML files and returns them as CRD objects.
// This allows the CRDs to be bundled into binaries without depending on file paths.
func LoadEmbeddedCRDs() ([]*apiextensionsv1.CustomResourceDefinition, error) {
	entries, err := embeddedCRDs.ReadDir(".")
	if err != nil {
		return nil, fmt.Errorf("failed to read embedded crds directory: %w", err)
	}

	var crds []*apiextensionsv1.CustomResourceDefinition
	for _, entry := range entries {
		if entry.IsDir() || entry.Name() == "embed.go" {
			continue
		}

		content, err := embeddedCRDs.ReadFile(entry.Name())
		if err != nil {
			return nil, fmt.Errorf("failed to read embedded CRD file %s: %w", entry.Name(), err)
		}

		crd := &apiextensionsv1.CustomResourceDefinition{}
		if err := yaml.Unmarshal(content, crd); err != nil {
			return nil, fmt.Errorf("failed to unmarshal CRD %s: %w", entry.Name(), err)
		}

		crds = append(crds, crd)
	}

	return crds, nil
}
9 changes: 9 additions & 0 deletions examples/README.md
@@ -0,0 +1,9 @@
# KAI Scheduler Examples

This directory contains example configurations and YAML files to help you get started with KAI Scheduler.

## Quick Links

- [Quickstart Examples](quickstart/README.md) - Get started with basic queue and pod setup
- [Time-Aware Fairness](time-aware-fairness/README.md) - Configure historical usage-based fair scheduling

71 changes: 71 additions & 0 deletions examples/quickstart/README.md
@@ -0,0 +1,71 @@
# Quick Start Examples

This directory contains basic examples to get you started with KAI Scheduler.

## Scheduling Queues

A queue represents a job queue in the cluster. Queues are an essential scheduling primitive and can reflect different scheduling guarantees, such as resource quota and priority. Queues are typically assigned to different consumers in the cluster (users, groups, or initiatives). A workload must belong to a queue in order to be scheduled.

KAI Scheduler operates with a two-level hierarchical scheduling queue system.

### Default Queues

After installing KAI Scheduler, a two-level queue hierarchy is automatically created:
- `default-parent-queue` – Top-level (parent) queue. By default, this queue has no reserved resource quotas, allowing governance of resource distribution for its leaf queues.
- `default-queue` – Leaf (child) queue under the `default-parent-queue` top-level queue. Workloads should reference this queue.

The default queues are defined in [default-queues.yaml](default-queues.yaml).

No manual queue setup is required. Both queues will exist immediately after installation, allowing you to start submitting workloads right away.

### Creating Additional Queues

To add custom queues, apply your queue configuration:

```bash
kubectl apply -f queues.yaml
```

For detailed configuration options, refer to the [Scheduling Queues documentation](../../docs/queues/README.md).
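
For illustration, a leaf queue for a team under the default parent might look like this (the queue name and quota values are hypothetical; the field layout follows [default-queues.yaml](default-queues.yaml)):

```yaml
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: team-a
spec:
  parentQueue: default-parent-queue
  resources:
    cpu:
      quota: 8           # Guaranteed CPUs (illustrative value)
      limit: -1          # No upper limit
      overQuotaWeight: 1
    gpu:
      quota: 2
      limit: -1
      overQuotaWeight: 1
    memory:
      quota: 0
      limit: -1
      overQuotaWeight: 1
```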

## Assigning Pods to Queues

To schedule a pod using KAI Scheduler, ensure the following:

1. Specify the queue name using the `kai.scheduler/queue: default-queue` label on the pod/workload.
2. Set the scheduler name in the pod specification as `kai-scheduler`.

This ensures the pod is placed in the correct scheduling queue and managed by KAI Scheduler.
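
A minimal manifest combining both requirements (the pod name is arbitrary):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  labels:
    kai.scheduler/queue: default-queue # 1. queue assignment
spec:
  schedulerName: kai-scheduler         # 2. scheduler selection
  containers:
    - name: main
      image: ubuntu
      args: ["sleep", "infinity"]
```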

> **⚠️ Workload Namespaces**
>
> When submitting workloads, make sure to use a dedicated namespace. Do not use the `kai-scheduler` namespace for workload submission.

## Submitting Example Pods

### CPU-Only Pods

To submit a simple pod that requests CPU and memory resources:

```bash
kubectl apply -f pods/cpu-only-pod.yaml
```

### GPU Pods

Before running GPU workloads, ensure the [NVIDIA GPU-Operator](https://github.com/NVIDIA/gpu-operator) is installed in the cluster.

To submit a pod that requests a GPU resource:

```bash
kubectl apply -f pods/gpu-pod.yaml
```

## Files

| File | Description |
|------|-------------|
| [default-queues.yaml](default-queues.yaml) | Default parent and leaf queue configuration |
| [pods/cpu-only-pod.yaml](pods/cpu-only-pod.yaml) | Example CPU-only pod |
| [pods/gpu-pod.yaml](pods/gpu-pod.yaml) | Example GPU pod |

45 changes: 45 additions & 0 deletions examples/quickstart/default-queues.yaml
@@ -0,0 +1,45 @@
# Copyright 2025 NVIDIA CORPORATION
# SPDX-License-Identifier: Apache-2.0

# Default queue hierarchy created by KAI Scheduler on installation
# Top-level parent queue - manages resource distribution for its children
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: default-parent-queue
spec:
  resources:
    cpu:
      limit: -1          # No limit
      overQuotaWeight: 1 # Equal weight for over-quota resources
      quota: 0           # No guaranteed quota
    gpu:
      limit: -1
      overQuotaWeight: 1
      quota: 0
    memory:
      limit: -1
      overQuotaWeight: 1
      quota: 0
---
# Leaf queue - workloads should reference this queue
apiVersion: scheduling.run.ai/v2
kind: Queue
metadata:
  name: default-queue
spec:
  parentQueue: default-parent-queue
  resources:
    cpu:
      limit: -1
      overQuotaWeight: 1
      quota: 0
    gpu:
      limit: -1
      overQuotaWeight: 1
      quota: 0
    memory:
      limit: -1
      overQuotaWeight: 1
      quota: 0

21 changes: 21 additions & 0 deletions examples/quickstart/pods/cpu-only-pod.yaml
@@ -0,0 +1,21 @@
# Copyright 2025 NVIDIA CORPORATION
# SPDX-License-Identifier: Apache-2.0

# Example: Simple CPU-only pod scheduled by KAI Scheduler
apiVersion: v1
kind: Pod
metadata:
  name: cpu-only-pod
  labels:
    kai.scheduler/queue: default-queue # Required: assigns pod to a queue
spec:
  schedulerName: kai-scheduler # Required: use KAI Scheduler
  containers:
    - name: main
      image: ubuntu
      args: ["sleep", "infinity"]
      resources:
        requests:
          cpu: 100m
          memory: 250M

22 changes: 22 additions & 0 deletions examples/quickstart/pods/gpu-pod.yaml
@@ -0,0 +1,22 @@
# Copyright 2025 NVIDIA CORPORATION
# SPDX-License-Identifier: Apache-2.0

# Example: GPU pod scheduled by KAI Scheduler
# Prerequisites: NVIDIA GPU-Operator must be installed in the cluster
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
  labels:
    kai.scheduler/queue: default-queue # Required: assigns pod to a queue
spec:
  schedulerName: kai-scheduler # Required: use KAI Scheduler
  containers:
    - name: main
      image: ubuntu
      command: ["bash", "-c"]
      args: ["nvidia-smi; trap 'exit 0' TERM; sleep infinity & wait"]
      resources:
        limits:
          nvidia.com/gpu: "1" # Request 1 GPU

156 changes: 156 additions & 0 deletions examples/time-aware-fairness/README.md
@@ -0,0 +1,156 @@
# Time-Aware Fairness Examples

Time-aware fairness is a feature in KAI Scheduler that factors queues' historical resource usage into allocation and reclaim decisions.

## Key Features

1. **Historical Usage Consideration**: All else being equal, queues with higher past usage will get to run jobs after queues with lower usage.
2. **Usage-Based Reclaim**: Queues that are starved over time will reclaim resources from queues that used a lot of resources.
> Note: This does not affect in-quota allocation; deserved quota still takes precedence over time-aware fairness.

## How It Works

Resource usage data is collected and persisted in Prometheus. The scheduler uses this data in its fairness calculations: the more resources a queue has consumed, the fewer over-quota resources it will receive compared to other queues.

### Time Decay (Optional)

If configured, the scheduler applies an [exponential time decay](https://en.wikipedia.org/wiki/Exponential_decay) formula controlled by a half-life period. For example, with a half-life of one hour, a GPU-second consumed an hour ago will be considered half as significant as a GPU-second consumed just now.

## Examples in This Directory

| File | Description |
|------|-------------|
| [scheduling-shard-minimal.yaml](scheduling-shard-minimal.yaml) | Minimal configuration to enable time-aware fairness |
| [scheduling-shard-managed-prometheus.yaml](scheduling-shard-managed-prometheus.yaml) | Full configuration using KAI-managed Prometheus |
| [scheduling-shard-external-prometheus.yaml](scheduling-shard-external-prometheus.yaml) | Configuration for using an external Prometheus instance |
| [two-queue-oscillation/](two-queue-oscillation/) | Complete example demonstrating fair resource oscillation between two queues |

## Quick Setup

### Step 0: Install Prometheus (Optional)

> **Note**: If you already have Prometheus and kube-state-metrics installed, skip to Step 1.

If you don't already have Prometheus installed in your cluster, you can install it using the [kube-prometheus-stack](https://artifacthub.io/packages/helm/prometheus-community/kube-prometheus-stack) Helm chart. This chart includes the Prometheus Operator and kube-state-metrics.

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace
```

Wait for the pods to be ready:

```bash
kubectl wait --for=condition=Ready pods --all -n monitoring --timeout=300s
```

### Step 1: Enable Prometheus

First, enable Prometheus via the KAI operator:

```bash
kubectl patch config kai-config --type merge -p '{"spec":{"prometheus":{"enabled":true}}}'
```

Wait for the Prometheus pod to be ready:

```bash
watch kubectl get pod -n kai-scheduler prometheus-prometheus-0
```

### Step 2: Configure the Scheduler

Apply the minimal scheduling shard configuration:

```bash
kubectl apply -f scheduling-shard-minimal.yaml
```

Or patch the existing shard:

```bash
kubectl patch schedulingshard default --type merge -p '{"spec":{"usageDBConfig":{"clientType":"prometheus"}}}'
```

The scheduler will restart and connect to Prometheus.

## Configuration Options

### Usage Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `windowSize` | `1w` (1 week) | Time period considered for fairness calculations |
| `windowType` | `sliding` | Window type: `sliding`, `tumbling`, or `cron` |
| `halfLifePeriod` | disabled | Half-life for exponential decay (e.g., `10m`, `1h`) |
| `fetchInterval` | `1m` | How often to fetch usage data from Prometheus |
| `stalenessPeriod` | `5m` | Maximum age of usage data before it is considered stale |
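
As a hedged sketch, these parameters would sit in the SchedulingShard spec roughly as follows. Only `usageDBConfig.clientType` appears in the patch commands in this document; the `apiVersion` and the nesting of the remaining fields are assumptions, so verify them against the SchedulingShard CRD and the example YAML files in this directory:

```yaml
apiVersion: kai.scheduler/v1   # assumed; check your installed CRD
kind: SchedulingShard
metadata:
  name: default
spec:
  usageDBConfig:
    clientType: prometheus
    # Field placement below is illustrative; verify against the CRD schema
    usageParams:
      windowSize: 1w
      windowType: sliding
      halfLifePeriod: 1h
      fetchInterval: 1m
      stalenessPeriod: 5m
```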

### kValue

The `kValue` parameter controls the impact of historical usage on fairness calculations:
- Higher values = more aggressive correction based on historical usage
- Lower values = more weight on over-quota weights, less on history
- Default: `1.0`

### Window Types

- **Sliding**: Considers usage from the last `windowSize` duration (rolling window)
- **Tumbling**: Non-overlapping fixed windows that reset at `tumblingWindowStartTime`
- **Cron**: Windows defined by a cron expression

## Using External Prometheus

If you have an existing Prometheus instance, configure it in the KAI config:

```bash
kubectl patch config kai-config --type merge -p '{
"spec": {
"prometheus": {
"enabled": true,
"externalPrometheusUrl": "http://prometheus.monitoring.svc.cluster.local:9090"
}
}
}'
```

See [scheduling-shard-external-prometheus.yaml](scheduling-shard-external-prometheus.yaml) for a complete example.

## Troubleshooting

### Prerequisites

Ensure the [Prometheus Operator](https://prometheus-operator.dev/docs/getting-started/installation/) is installed:

```bash
kubectl get crd prometheuses.monitoring.coreos.com
```

For cluster capacity metrics, [kube-state-metrics](https://artifacthub.io/packages/helm/prometheus-community/kube-state-metrics/) must also be installed.

### Check Scheduler Logs

If the scheduler cannot fetch usage metrics:

```bash
kubectl logs -n kai-scheduler deployment/kai-scheduler-default | grep -i usage
```

### Verify Prometheus Connection

Check if the scheduler can reach Prometheus:

```bash
kubectl exec -n kai-scheduler deployment/kai-scheduler-default -- wget -q -O- http://prometheus-operated.kai-scheduler.svc.cluster.local:9090/api/v1/status/config
```

## Further Reading

- [Time-Aware Fairness Documentation](../../docs/timeaware/README.md)
- [Fairness Concepts](../../docs/fairness/README.md)
- [Time-Aware Design Document](../../docs/developer/designs/time-aware-fairness/time-aware-fairness.md)
- [Time-Aware Simulator](../../cmd/time-aware-simulator/README.md)
