Commit a6e7903

feat: change how limits are managed to use a LimitRange via Helm
includes updates to documentation organization
1 parent 4d23724 commit a6e7903

File tree

23 files changed

+313
-54
lines changed

README.md

Lines changed: 5 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -42,6 +42,7 @@ Skyhook works in any Kubernetes environment (self-managed, on-prem, cloud) and s
4242
- **Package Interrupt:** service (containerd, cron, anything systemd), or reboot
4343
- **Additional Tolerations:** are tolerations added to the packages
4444
- [**Runtime Required**](docs/runtime_required.md): requires node to come into the cluster with a taint, and will do work prior to removing custom taint.
45+
- **Resource Management:** Skyhook uses Kubernetes [LimitRange](https://kubernetes.io/docs/concepts/policy/limit-range/) to set default CPU and memory requests/limits for all containers in its namespace. You can override these defaults per-package in your Skyhook CR. Strict validation is enforced: if you set any resource override, you must set all four fields (cpuRequest, cpuLimit, memoryRequest, memoryLimit), and limits must be >= requests. See [docs/resource_management.md](docs/resource_management.md) for details and examples.
4546

4647
## Pre-built Packages
4748

@@ -137,29 +138,13 @@ Part of how the operator works is the [skyhook-agent](agent/README.md). Packages
137138
└── config.json
138139
```
139140

140-
## Example Kyverno Policy
141+
## Examples
141142

142-
This repository includes an example Kyverno policy that demonstrates how to restrict the images that can be used in Skyhook packages. While this is not a complete policy, it serves as a template that end users can modify to fit their security needs.
143+
See the [examples/](examples/) directory for sample manifests, usage patterns, and demo configurations to help you get started with Skyhook.
143144

144-
The policy prevents the creation of Skyhook resources that contain packages with restricted image patterns. Specifically, it blocks:
145-
- Images containing 'shellscript:' anywhere in the image name
146-
- Images from Docker Hub (matching 'docker.io/*')
145+
## Kyverno Policy Examples
147146

148-
If you are going to use kyverno make sure to turn on the creation of the skyhook-viewer-role in the values file for the operator. (rbac.createSkyhookViewerRole: true) and then bind kyverno to that role. Example policy:
149-
```
150-
apiVersion: rbac.authorization.k8s.io/v1
151-
kind: ClusterRoleBinding
152-
metadata:
153-
name: kyverno-skyhook-binding
154-
roleRef:
155-
apiGroup: rbac.authorization.k8s.io
156-
kind: ClusterRole
157-
name: skyhook-viewer-role
158-
subjects:
159-
- kind: ServiceAccount
160-
name: kyverno-reports-controller
161-
namespace: kyverno
162-
```
147+
See [docs/kyverno/README.md](docs/kyverno/README.md) for example Kyverno policies and guidance on restricting images or packages in Skyhook resources.
163148

164149
## [Skyhook-Operator](operator/README.md)
165150
The operator is a Kubernetes operator that monitors cluster events and coordinates the installation and lifecycle of Skyhook packages.

chart/README.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -35,4 +35,7 @@ Settings | Description | Default |
3535
### NOTES
3636
- **estimatedPackageCount** and **estimatedNodeCount** are used to size the resource requirements. The default settings should be good for nodes > 1000 and packages 1-2 or nodes > 500 and packages >= 4. If you're approaching a deployment of this size, it makes sense to set these. You can also override them explicitly with `controllerManager.manager.resources`; the values file has an example.
3737
- **runtimeRequired**: If your system's nodes have this taint, make sure to add the toleration to `controllerManager.tolerations`.
38-
- **CRD**: This project currently has one CRD, and it's not managed the ["recommended" way](https://helm.sh/docs/chart_best_practices/custom_resource_definitions/). It's part of the templates, meaning it will be updated by `helm upgrade`. We decided it was better to do it this way for this project. Doing it either way has consequences, and this route has worked well for upgrades so far in our deployments.
38+
- **CRD**: This project currently has one CRD, and it's not managed the ["recommended" way](https://helm.sh/docs/chart_best_practices/custom_resource_definitions/). It's part of the templates, meaning it will be updated by `helm upgrade`. We decided it was better to do it this way for this project. Doing it either way has consequences, and this route has worked well for upgrades so far in our deployments.
39+
40+
### Resource Management
41+
Skyhook uses Kubernetes LimitRange to set default CPU/memory requests/limits for all containers in the namespace. You can override these per-package in your Skyhook CR. Strict validation is enforced. See [../docs/resource_management.md](../docs/resource_management.md) for details and examples.

chart/templates/limitrange.yaml

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
{{- if .Values.limitRange }}
2+
apiVersion: v1
3+
kind: LimitRange
4+
metadata:
5+
name: {{ include "skyhook.fullname" . }}-default-limits
6+
namespace: {{ .Release.Namespace }}
7+
spec:
8+
limits:
9+
- type: Container
10+
default:
11+
cpu: {{ .Values.limitRange.default.cpu | default "500m" }}
12+
memory: {{ .Values.limitRange.default.memory | default "512Mi" }}
13+
defaultRequest:
14+
cpu: {{ .Values.limitRange.defaultRequest.cpu | default "250m" }}
15+
memory: {{ .Values.limitRange.defaultRequest.memory | default "256Mi" }}
16+
{{- end }}

chart/values.yaml

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -122,4 +122,17 @@ webhook:
122122

123123
## uninstall image for cleaning up webhook resources
124124
removalImage: bitnami/kubectl
125-
removalTag: latest
125+
removalTag: latest
126+
127+
## limitRange: default resource limits and requests, created as a LimitRange in the operator's namespace.
128+
## It applies to all containers in the namespace,
129+
## so if your package does not override the limits, these are what will be used.
130+
## If you omit this, we will not create a LimitRange.
131+
## Best practice is to keep the limits at no more than 2x the requests.
132+
limitRange:
133+
default:
134+
cpu: "500m"
135+
memory: "512Mi"
136+
defaultRequest:
137+
cpu: "250m"
138+
memory: "256Mi"

docs/README.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
# Skyhook Documentation
2+
3+
This directory contains user and operator documentation for Skyhook. Here you'll find guides, examples, and reference material to help you deploy, configure, and secure Skyhook in your Kubernetes cluster.
4+
5+
## Available Documentation
6+
7+
- [Resource Management](resource_management.md):
8+
How Skyhook manages CPU/memory resources using LimitRange, per-package overrides, and validation rules.
9+
10+
- [Kyverno Policy Examples](kyverno/README.md):
11+
Example Kyverno policies for restricting images or packages in Skyhook resources.
12+
13+
- [Providing Secrets to Packages](providing_secrets_to_packages.md):
14+
How to securely provide secrets to Skyhook-managed packages.
15+
16+
- [Releases](releases.md):
17+
Release notes and upgrade information for Skyhook.
18+
19+
- [Runtime Required](runtime_required.md):
20+
How to use the runtime required taint and feature in Skyhook.

docs/kyverno/README.md

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
# Kyverno Policy Examples for Skyhook
2+
3+
This directory contains example [Kyverno](https://kyverno.io/) policies for use with Skyhook. These are **not installed by default** and are provided as templates for users to adapt to their own security needs.
4+
5+
- `disable_packages.yaml`: Example policy to restrict or disable certain Skyhook packages/images.
6+
- `skyhook-viewer-binding.yaml`: Example RBAC binding for Kyverno to view Skyhook resources.
7+
8+
**Note:**
9+
- This directory was previously at the repo root and has been moved to `docs/kyverno/` for clarity.
10+
- If you use these policies, ensure you enable the `skyhook-viewer-role` in your Helm values and bind Kyverno to that role.
11+
12+
See the main [README](../README.md) for more information about Skyhook.
File renamed without changes.
File renamed without changes.

docs/resource_management.md

Lines changed: 111 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,111 @@
1+
# Resource Management in Skyhook
2+
3+
Skyhook provides flexible and robust resource management for the pods it creates, allowing you to control CPU and memory usage at both the namespace and per-package level. This document explains how resource defaults and overrides work, and what validation rules are enforced.
4+
5+
---
6+
7+
## 1. Namespace Defaults with LimitRange
8+
9+
By default, Skyhook uses a [Kubernetes LimitRange](https://kubernetes.io/docs/concepts/policy/limit-range/) to set default CPU and memory requests/limits for all containers in the namespace where Skyhook operates.
10+
11+
**Example LimitRange:**
12+
```yaml
13+
apiVersion: v1
14+
kind: LimitRange
15+
metadata:
16+
name: skyhook-default-limits
17+
namespace: <your-namespace>
18+
spec:
19+
limits:
20+
- type: Container
21+
default:
22+
cpu: 500m
23+
memory: 512Mi
24+
defaultRequest:
25+
cpu: 250m
26+
memory: 256Mi
27+
```
28+
- If a pod/container does **not** specify its own resources, these defaults are applied.
29+
- You can configure these values via the Helm chart or Kustomize overlays.
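
As a rough sketch of the Helm route, an override file could raise the namespace defaults while keeping limits at twice the requests. The keys mirror the `limitRange` block in `chart/values.yaml`; the file name and the specific values are assumptions chosen for illustration.

```yaml
# my-values.yaml (hypothetical file name), passed to Helm with -f
limitRange:
  default:            # limits applied to containers that set none
    cpu: "1000m"
    memory: "1Gi"
  defaultRequest:     # requests applied to containers that set none
    cpu: "500m"
    memory: "512Mi"
```

Any key left out of an override file like this should fall back to the chart's own defaults (`500m`/`512Mi` limits and `250m`/`256Mi` requests).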
30+
31+
---
32+
33+
## 2. Per-Package Resource Overrides
34+
35+
You can override the default resource requests/limits for each package in your Skyhook Custom Resource (CR). This is done in the `resources` field for each package:
36+
37+
**Example:**
38+
```yaml
39+
spec:
40+
packages:
41+
mypackage:
42+
version: 1.0.0
43+
image: ghcr.io/nvidia/skyhook-packages/shellscript
44+
resources:
45+
cpuRequest: "200m"
46+
cpuLimit: "400m"
47+
memoryRequest: "128Mi"
48+
memoryLimit: "256Mi"
49+
```
50+
- If **any** of the four fields (`cpuRequest`, `cpuLimit`, `memoryRequest`, `memoryLimit`) are set, **all four must be set** and must be positive values.
51+
- If no override is set, the namespace's LimitRange applies.
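
To illustrate the fallback, the fragment below sketches two packages side by side. The package names `tuned` and `baseline` are hypothetical; the field names follow the example above.

```yaml
spec:
  packages:
    tuned:                     # hypothetical package with explicit overrides
      version: 1.0.0
      image: ghcr.io/nvidia/skyhook-packages/shellscript
      resources:               # all four fields set, limits >= requests
        cpuRequest: "200m"
        cpuLimit: "400m"
        memoryRequest: "128Mi"
        memoryLimit: "256Mi"
    baseline:                  # hypothetical package without a resources block
      version: 1.0.0
      image: ghcr.io/nvidia/skyhook-packages/shellscript
      # No override here, so the namespace LimitRange supplies the values.
      # With the chart defaults that means requests of 250m / 256Mi
      # and limits of 500m / 512Mi.
```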
52+
53+
---
54+
55+
## 3. Validation Rules
56+
57+
Skyhook enforces the following validation rules (via webhook) for resource overrides:
58+
59+
- If any of the four resource fields are set, **all four must be set**.
60+
- All values must be **positive**.
61+
- `cpuLimit` must be **greater than or equal to** `cpuRequest`.
62+
- `memoryLimit` must be **greater than or equal to** `memoryRequest`.
63+
64+
**Examples:**
65+
66+
| Valid? | cpuRequest | cpuLimit | memoryRequest | memoryLimit | Reason |
67+
|--------|-----------|----------|--------------|-------------|--------|
68+
| ✅ | 200m | 400m | 128Mi | 256Mi | All set, valid |
69+
| ❌ | 200m | | 128Mi | 256Mi | Not all fields set |
70+
| ❌ | 200m | 100m | 128Mi | 256Mi | cpuLimit < cpuRequest |
71+
| ❌ | 200m | 400m | 128Mi | 64Mi | memoryLimit < memoryRequest |
72+
| ❌ | 0 | 400m | 128Mi | 256Mi | Zero value |
73+
74+
If a resource override is invalid, the Skyhook CR will be **rejected** by the webhook.
75+
76+
---
77+
78+
## 4. Best Practices
79+
80+
- Use LimitRange to set sensible defaults for your namespace.
81+
- Only set per-package overrides if you need different resource requirements for a specific package.
82+
- Review your resource settings to avoid overcommitting or underutilizing cluster resources.
83+
- If you change LimitRange defaults, new pods will use the new defaults unless overridden.
84+
85+
---
86+
87+
## 5. Troubleshooting
88+
89+
- If your Skyhook CR is rejected, check that all four resource fields are set and valid if you are using overrides.
90+
- Use `kubectl describe limitrange -n <namespace>` to see the current defaults.
91+
- Use `kubectl describe skyhook <name>` to see the status and any error messages.
92+
93+
---
94+
95+
## 6. Disabling Resource Defaults (No Limits)
96+
97+
If you do **not** want any default resource requests or limits applied to your Skyhook-managed pods/containers, you can simply **omit the LimitRange** from your namespace:
98+
99+
- **Helm:** Set `limitRange: {}` or remove the `limitRange` section from your `values.yaml`.
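
One caveat worth checking in your own setup: Helm merges map values from override files with the chart's defaults, so an empty map supplied through `-f` may leave the default `limitRange` values in place. Helm's documented way to drop a default key is to set it to `null`. A minimal sketch, assuming a separate override file (the file name is hypothetical):

```yaml
# no-limits-values.yaml (hypothetical file name), passed to Helm with -f
# Setting the key to null removes the chart default, so the
# `if .Values.limitRange` guard in chart/templates/limitrange.yaml
# evaluates to false and no LimitRange is rendered.
limitRange: null
```

Rendering the chart with `helm template` and confirming that no `LimitRange` object appears in the output is a quick way to verify the result before upgrading.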
100+
101+
If there is **no LimitRange** and you do **not** set resource requests/limits in your package overrides, then:
102+
- Your pods/containers will run with **no resource requests or limits**.
103+
- This means they are assigned the "BestEffort" QoS class, so they are among the first to be evicted under node resource pressure and have no guaranteed CPU or memory.
104+
105+
**Note:**
106+
- Disabling resource limits is not recommended for production clusters, as it can lead to resource contention and unpredictable scheduling.
107+
- Only do this if you have a specific reason and understand the implications.
108+
109+
---
110+
111+
For more information, see the [Kubernetes documentation on resource management](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) and [LimitRange](https://kubernetes.io/docs/concepts/policy/limit-range/).
