| 1 | +# Resource Management in Skyhook |
| 2 | + |
| 3 | +Skyhook lets you control CPU and memory usage for the pods it creates, at both the namespace level and the per-package level. This document explains how resource defaults and overrides work, and which validation rules are enforced. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## 1. Namespace Defaults with LimitRange |
| 8 | + |
| 9 | +By default, Skyhook uses a [Kubernetes LimitRange](https://kubernetes.io/docs/concepts/policy/limit-range/) to set default CPU and memory requests/limits for all containers in the namespace where Skyhook operates. |
| 10 | + |
| 11 | +**Example LimitRange:** |
| 12 | +```yaml |
| 13 | +apiVersion: v1 |
| 14 | +kind: LimitRange |
| 15 | +metadata: |
| 16 | +  name: skyhook-default-limits |
| 17 | +  namespace: <your-namespace> |
| 18 | +spec: |
| 19 | +  limits: |
| 20 | +  - type: Container |
| 21 | +    default: |
| 22 | +      cpu: 500m |
| 23 | +      memory: 512Mi |
| 24 | +    defaultRequest: |
| 25 | +      cpu: 250m |
| 26 | +      memory: 256Mi |
| 27 | +``` |
| 28 | +- If a pod/container does **not** specify its own resources, these defaults are applied. |
| 29 | +- You can configure these values via the Helm chart or Kustomize overlays. |
| 30 | + |
| 31 | +--- |
| 32 | + |
| 33 | +## 2. Per-Package Resource Overrides |
| 34 | + |
| 35 | +You can override the default resource requests/limits for each package in your Skyhook Custom Resource (CR). This is done in the `resources` field for each package: |
| 36 | + |
| 37 | +**Example:** |
| 38 | +```yaml |
| 39 | +spec: |
| 40 | +  packages: |
| 41 | +    mypackage: |
| 42 | +      version: 1.0.0 |
| 43 | +      image: ghcr.io/nvidia/skyhook-packages/shellscript |
| 44 | +      resources: |
| 45 | +        cpuRequest: "200m" |
| 46 | +        cpuLimit: "400m" |
| 47 | +        memoryRequest: "128Mi" |
| 48 | +        memoryLimit: "256Mi" |
| 49 | +``` |
| 50 | +- If **any** of the four fields (`cpuRequest`, `cpuLimit`, `memoryRequest`, `memoryLimit`) is set, **all four must be set**, and each must be a positive value. |
| 51 | +- If no override is set, the namespace's LimitRange applies. |
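The precedence described above can be sketched in a few lines of Python. This is an illustration only, not Skyhook's code: the function name `effective_resources` and the shape of its arguments are hypothetical, and the sketch assumes the LimitRange defines both `default` and `defaultRequest`, as in the example in section 1.

```python
def effective_resources(override, limit_range):
    """Resolve the requests/limits for one package container (illustrative only).

    override: the package's `resources` mapping from the Skyhook CR, or None.
    limit_range: {"default": {...}, "defaultRequest": {...}}, or None if the
                 namespace has no LimitRange.
    """
    if override:
        # A per-package override wins; all four fields are required together.
        return {
            "requests": {"cpu": override["cpuRequest"], "memory": override["memoryRequest"]},
            "limits": {"cpu": override["cpuLimit"], "memory": override["memoryLimit"]},
        }
    if limit_range:
        # No override: the namespace LimitRange defaults apply.
        return {
            "requests": dict(limit_range["defaultRequest"]),
            "limits": dict(limit_range["default"]),
        }
    # Neither an override nor a LimitRange: no requests or limits are set.
    return {"requests": {}, "limits": {}}
```

For example, calling it with `override=None` and the LimitRange values from section 1 yields requests of `250m`/`256Mi` and limits of `500m`/`512Mi`.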
| 52 | + |
| 53 | +--- |
| 54 | + |
| 55 | +## 3. Validation Rules |
| 56 | + |
| 57 | +Skyhook enforces the following validation rules (via webhook) for resource overrides: |
| 58 | + |
| 59 | +- If any of the four resource fields is set, **all four must be set**. |
| 60 | +- All values must be **positive**. |
| 61 | +- `cpuLimit` must be **greater than or equal to** `cpuRequest`. |
| 62 | +- `memoryLimit` must be **greater than or equal to** `memoryRequest`. |
| 63 | + |
| 64 | +**Examples:** |
| 65 | + |
| 66 | +| Valid? | cpuRequest | cpuLimit | memoryRequest | memoryLimit | Reason | |
| 67 | +|--------|-----------|----------|--------------|-------------|--------| |
| 68 | +| ✅ | 200m | 400m | 128Mi | 256Mi | All set, valid | |
| 69 | +| ❌ | 200m | | 128Mi | 256Mi | Not all fields set | |
| 70 | +| ❌ | 200m | 100m | 128Mi | 256Mi | cpuLimit < cpuRequest | |
| 71 | +| ❌ | 200m | 400m | 128Mi | 64Mi | memoryLimit < memoryRequest | |
| 72 | +| ❌ | 0 | 400m | 128Mi | 256Mi | Zero value | |
| 73 | + |
| 74 | +If a resource override is invalid, the Skyhook CR will be **rejected** by the webhook. |
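To make the rules concrete, here is a minimal Python sketch of the same checks. It is not the webhook's actual implementation: `parse_quantity` and `validate_override` are hypothetical names, the error strings are made up for illustration, and the quantity parser only handles a simplified subset of Kubernetes quantity syntax (plain numbers plus the `m`, `k`, `M`, `G`, `T`, `Ki`, `Mi`, `Gi`, `Ti` suffixes, no exponent notation).

```python
import re
from fractions import Fraction

# Assumption: a simplified subset of Kubernetes quantity suffixes.
_SUFFIXES = {
    "": Fraction(1),
    "m": Fraction(1, 1000),
    "k": Fraction(10**3), "M": Fraction(10**6),
    "G": Fraction(10**9), "T": Fraction(10**12),
    "Ki": Fraction(2**10), "Mi": Fraction(2**20),
    "Gi": Fraction(2**30), "Ti": Fraction(2**40),
}
_QUANTITY = re.compile(r"^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|m|k|M|G|T)?$")

FIELDS = ("cpuRequest", "cpuLimit", "memoryRequest", "memoryLimit")


def parse_quantity(text):
    """Convert a quantity string such as '200m' or '128Mi' to an exact number."""
    match = _QUANTITY.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported quantity: {text!r}")
    number, suffix = match.groups()
    return Fraction(number) * _SUFFIXES[suffix or ""]


def validate_override(resources):
    """Return a list of rule violations; an empty list means the override is valid."""
    present = [f for f in FIELDS if resources.get(f) not in (None, "")]
    if not present:
        return []  # no override at all: the namespace LimitRange applies
    if len(present) != len(FIELDS):
        missing = [f for f in FIELDS if f not in present]
        return [f"not all fields set, missing: {', '.join(missing)}"]
    values = {f: parse_quantity(str(resources[f])) for f in FIELDS}
    errors = [f"{f} must be positive" for f in FIELDS if values[f] <= 0]
    if values["cpuLimit"] < values["cpuRequest"]:
        errors.append("cpuLimit < cpuRequest")
    if values["memoryLimit"] < values["memoryRequest"]:
        errors.append("memoryLimit < memoryRequest")
    return errors
```

Running the rows of the table above through `validate_override` returns an empty list for the first row and one violation for each of the others. Exact fractions are used instead of floats so that comparisons such as `400m >= 200m` are not affected by rounding.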
| 75 | + |
| 76 | +--- |
| 77 | + |
| 78 | +## 4. Best Practices |
| 79 | + |
| 80 | +- Use LimitRange to set sensible defaults for your namespace. |
| 81 | +- Only set per-package overrides if you need different resource requirements for a specific package. |
| 82 | +- Review your resource settings to avoid overcommitting or underutilizing cluster resources. |
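| 83 | +- If you change LimitRange defaults, only pods created afterward pick up the new values. Existing pods keep the resources they were admitted with, and per-package overrides still take precedence. |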
| 84 | + |
| 85 | +--- |
| 86 | + |
| 87 | +## 5. Troubleshooting |
| 88 | + |
| 89 | +- If your Skyhook CR is rejected and you are using overrides, check that all four resource fields are set and valid. |
| 90 | +- Use `kubectl describe limitrange -n <namespace>` to see the current defaults. |
| 91 | +- Use `kubectl describe skyhook <name>` to see the status and any error messages. |
| 92 | + |
| 93 | +--- |
| 94 | + |
| 95 | +## 6. Disabling Resource Defaults (No Limits) |
| 96 | + |
| 97 | +If you do **not** want any default resource requests or limits applied to your Skyhook-managed pods/containers, you can simply **omit the LimitRange** from your namespace: |
| 98 | + |
| 99 | +- **Helm:** Set `limitRange: {}` or remove the `limitRange` section from your `values.yaml`. |
| 100 | + |
| 101 | +If there is **no LimitRange** and you do **not** set resource requests/limits in your package overrides, then: |
| 102 | +- Your pods/containers will run with **no resource requests or limits**. |
| 103 | +- Kubernetes assigns such pods the `BestEffort` QoS class: they are the first candidates for eviction under node resource pressure and have no guaranteed CPU or memory. |
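As a rough illustration of how Kubernetes derives the QoS class from requests and limits, the sketch below classifies a single container. The function name `qos_class` is hypothetical, the sketch ignores multi-container pods, and it compares quantities as written strings (so `500m` and `0.5` would be treated as different), which the real implementation does not do.

```python
def qos_class(requests, limits):
    """Classify a single-container pod's QoS class (simplified sketch)."""
    if not requests and not limits:
        # Nothing set at all, e.g. no LimitRange and no per-package override.
        return "BestEffort"
    # Guaranteed needs CPU and memory limits, with requests equal to them.
    # A missing request is treated as equal to the limit, as Kubernetes does.
    guaranteed = all(
        r in limits and requests.get(r, limits[r]) == limits[r]
        for r in ("cpu", "memory")
    )
    return "Guaranteed" if guaranteed else "Burstable"
```

With the example LimitRange from section 1 (requests lower than limits), this returns `Burstable`. With no requests or limits at all, it returns `BestEffort`.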
| 104 | + |
| 105 | +**Note:** |
| 106 | +- Disabling resource limits is not recommended for production clusters, as it can lead to resource contention and unpredictable scheduling. |
| 107 | +- Only do this if you have a specific reason and understand the implications. |
| 108 | + |
| 109 | +--- |
| 110 | + |
| 111 | +For more information, see the [Kubernetes documentation on resource management](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) and [LimitRange](https://kubernetes.io/docs/concepts/policy/limit-range/). |