Commit 4d3a283

feat(chart): add node affinity for operator pod configuration
1 parent 0541aca commit 4d3a283

9 files changed: +439 -1 lines changed

chart/templates/deployment.yaml

Lines changed: 13 additions & 0 deletions
@@ -24,6 +24,9 @@ spec:
       annotations:
         kubectl.kubernetes.io/default-container: manager
     spec:
+      {{- if and .Values.controllerManager.selectors .Values.controllerManager.nodeAffinity.matchExpressions }}
+      {{- fail "Error: Cannot specify both controllerManager.selectors and controllerManager.nodeAffinity.matchExpressions. Use nodeAffinity.matchExpressions for complex node selection or selectors for simple key-value matching." }}
+      {{- end }}
       affinity:
         nodeAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
@@ -38,6 +41,16 @@ spec:
                 operator: In
                 values:
                 - linux
+              {{- range .Values.controllerManager.nodeAffinity.matchExpressions }}
+              - key: {{ .key }}
+                operator: {{ .operator }}
+                {{- if .values }}
+                values:
+                {{- range .values }}
+                - {{ . }}
+                {{- end }}
+                {{- end }}
+              {{- end }}
       {{- with .Values.controllerManager.tolerations }}
       tolerations:
         {{- toYaml . | nindent 6 }}
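
For orientation, here is a sketch of what the second hunk would be expected to render when `controllerManager.nodeAffinity.matchExpressions` holds the two example expressions from the commented block in `chart/values.yaml`. The `kubernetes.io/arch` and `kubernetes.io/os` entries are the chart's existing defaults as they appear in this commit's test assertions; the exact indentation is an assumption.

```yaml
# Sketch of the rendered pod-spec fragment (indentation assumed)
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        # existing chart defaults
        - key: kubernetes.io/arch
          operator: In
          values:
          - amd64
          - arm64
        - key: kubernetes.io/os
          operator: In
          values:
          - linux
        # appended by the new range loop
        - key: node-role.kubernetes.io/control-plane
          operator: DoesNotExist   # no values key is emitted when .values is unset
        - key: dedicated
          operator: In
          values:
          - system-workload
          - gpu-workload
```

Because the user-supplied expressions are appended to the same `nodeSelectorTerms` entry as the defaults, Kubernetes ANDs them together: a node has to satisfy every expression in the list to be eligible.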

chart/values.yaml

Lines changed: 17 additions & 1 deletion
@@ -7,11 +7,27 @@ controllerManager:
   # value: system-cpu
   # effect: NoSchedule
   tolerations: []
-  ## selectors: add node selectors to the controller manager pod
+  ## selectors: add simple node selectors to the controller manager pod
   ## Example below is for a system-workload node selector
+  ## NOTE: Cannot be used together with nodeAffinity.matchExpressions
   # selectors:
   #   dedicated: system-workload
   selectors: {}
+  ## nodeAffinity: add advanced node affinity expressions to the controller manager pod
+  ## This allows for more complex node selection than simple selectors
+  ## NOTE: Cannot be used together with selectors - choose one approach
+  ## Example below shows how to select nodes with specific labels using expressions
+  # nodeAffinity:
+  #   matchExpressions:
+  #     - key: node-role.kubernetes.io/control-plane
+  #       operator: DoesNotExist
+  #     - key: dedicated
+  #       operator: In
+  #       values:
+  #         - system-workload
+  #         - gpu-workload
+  nodeAffinity:
+    matchExpressions: []
   ## config for kube-rbac-proxy used for webhooks
   kubeRbacProxy:
     args:
Lines changed: 104 additions & 0 deletions
@@ -0,0 +1,104 @@
+# Node Affinity Test
+
+This test validates the node affinity configuration feature for the Skyhook Operator Helm chart.
+
+## Test Overview
+
+The test demonstrates that the `controllerManager.nodeAffinity.matchExpressions` configuration in the Helm chart works correctly by:
+
+1. **Phase 1**: Installing the operator with node affinity expressions that target non-existent labels
+   - Uses `values-no-match.yaml` with expressions targeting `skyhook.nvidia.com/test-node=fooboar`
+   - Verifies that pods remain in `Pending` state and cannot be scheduled
+   - Validates the affinity expressions are correctly applied to the pod spec
+
+2. **Phase 2**: Adding the required labels to nodes and updating the configuration
+   - Adds `skyhook.nvidia.com/test-node=skyhooke2e` label to all nodes
+   - Updates the deployment to use `values-match.yaml` with expressions targeting the existing label
+   - Verifies that pods are now scheduled and running successfully
+
+## Files
+
+- `chainsaw-test.yaml` - The main test configuration
+- `values-no-match.yaml` - Helm values with node affinity targeting non-existent labels
+- `values-match.yaml` - Helm values with node affinity targeting existing labels
+- `assert-no-schedule.yaml` - Assertion to verify pods are not scheduled
+- `assert-scheduled.yaml` - Assertion to verify pods are scheduled and running
+
+## Node Affinity Configuration
+
+The test uses the following node affinity expressions:
+
+```yaml
+controllerManager:
+  nodeAffinity:
+    matchExpressions:
+      - key: node-role.kubernetes.io/control-plane
+        operator: DoesNotExist
+      - key: skyhook.nvidia.com/test-node
+        operator: In
+        values:
+          - skyhooke2e
+```
+
+## Running the Test
+
+This test is designed to be run with Chainsaw in a Kind cluster. It will:
+
+1. Create a Kind cluster (if not already present)
+2. Run the test scenarios
+3. Clean up labels and resources
+
+The test validates that the Helm chart correctly translates the `nodeAffinity.matchExpressions` configuration into proper Kubernetes node affinity rules in the deployment template.
+
+## Handling Selectors vs NodeAffinity
+
+The Helm chart enforces a clear separation between simple selectors and advanced node affinity:
+
+### Validation Behavior
+- **Cannot use both** `selectors` and `nodeAffinity.matchExpressions` together
+- The chart will fail with an error if both are defined
+- This prevents conflicting or confusing node selection rules
+
+### Error Message
+```
+Error: Cannot specify both controllerManager.selectors and controllerManager.nodeAffinity.matchExpressions.
+Use nodeAffinity.matchExpressions for complex node selection or selectors for simple key-value matching.
+```
+
+### Examples
+
+**Simple selector (uses nodeSelector):**
+```yaml
+controllerManager:
+  selectors:
+    dedicated: system-workload
+```
+
+**Advanced node affinity (uses nodeAffinity):**
+```yaml
+controllerManager:
+  nodeAffinity:
+    matchExpressions:
+      - key: node-role.kubernetes.io/control-plane
+        operator: DoesNotExist
+      - key: skyhook.nvidia.com/test-node
+        operator: In
+        values:
+          - skyhooke2e
+```
+
+**Invalid (will cause error):**
+```yaml
+controllerManager:
+  selectors:
+    dedicated: system-workload
+  nodeAffinity:
+    matchExpressions:
+      - key: node-role.kubernetes.io/control-plane
+        operator: DoesNotExist
+```
+
+### Usage Recommendations
+- Use `selectors` for simple key-value node selection
+- Use `nodeAffinity.matchExpressions` for complex node selection with operators like `In`, `NotIn`, `Exists`, `DoesNotExist`
+- Choose one approach - they cannot be mixed
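
The usage recommendations name `NotIn` and `Exists`, which none of the README examples exercise. A minimal values sketch follows; the label key `example.com/dedicated` and the zone name `zone-a` are hypothetical and chosen only for illustration.

```yaml
controllerManager:
  nodeAffinity:
    matchExpressions:
      # Exists / DoesNotExist take no values; the template skips the
      # values key entirely when none are given.
      - key: example.com/dedicated        # hypothetical label key
        operator: Exists
      # NotIn excludes nodes whose label carries one of the listed values.
      - key: topology.kubernetes.io/zone
        operator: NotIn
        values:
          - zone-a                        # hypothetical zone name
```

One thing worth checking with `helm template`: the deployment template emits each value unquoted (`- {{ . }}`), so a value that YAML would read as a boolean or number, such as `true` or `1`, may not reach the API server as a string.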
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+#
+# LICENSE START
+#
+# Copyright (c) NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# LICENSE END
+#
+
+apiVersion: v1
+kind: Pod
+metadata:
+  annotations:
+    kubectl.kubernetes.io/default-container: manager
+  labels:
+    app: node-affinity-test-skyhook-operator-controller-manager
+    app.kubernetes.io/instance: node-affinity-test
+    app.kubernetes.io/name: skyhook-operator
+    control-plane: controller-manager
+  namespace: skyhook
+  ownerReferences:
+  - apiVersion: apps/v1
+    blockOwnerDeletion: true
+    controller: true
+    kind: ReplicaSet
+spec:
+  affinity:
+    nodeAffinity:
+      requiredDuringSchedulingIgnoredDuringExecution:
+        nodeSelectorTerms:
+        - matchExpressions:
+          - key: kubernetes.io/arch
+            operator: In
+            values:
+            - amd64
+            - arm64
+          - key: kubernetes.io/os
+            operator: In
+            values:
+            - linux
+          - key: skyhook.nvidia.com/test-node
+            operator: In
+            values:
+            - fooboar
+          - key: node-role.kubernetes.io/control-plane
+            operator: DoesNotExist
+status:
+  conditions:
+  - lastProbeTime: null
+    reason: Unschedulable
+    status: "False"
+    type: PodScheduled
+  phase: Pending
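
The `values-no-match.yaml` file that drives this assertion is not shown in this view. Working backwards from the expressions asserted above, and from the order in which they appear, its affinity section presumably resembles the sketch below. This is a reconstruction, not the file's actual contents, and any other keys it sets (image, webhook, and so on) are omitted.

```yaml
# Reconstructed sketch of the affinity part of values-no-match.yaml
controllerManager:
  nodeAffinity:
    matchExpressions:
      # No node carries this label value, so the pod stays Pending
      - key: skyhook.nvidia.com/test-node
        operator: In
        values:
          - fooboar
      - key: node-role.kubernetes.io/control-plane
        operator: DoesNotExist
```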
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+#
+# LICENSE START
+#
+# Copyright (c) NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# LICENSE END
+#
+
+apiVersion: v1
+kind: Pod
+metadata:
+  annotations:
+    kubectl.kubernetes.io/default-container: manager
+  labels:
+    app: node-affinity-test-skyhook-operator-controller-manager
+    app.kubernetes.io/instance: node-affinity-test
+    app.kubernetes.io/name: skyhook-operator
+    control-plane: controller-manager
+  namespace: skyhook
+  ownerReferences:
+  - apiVersion: apps/v1
+    blockOwnerDeletion: true
+    controller: true
+    kind: ReplicaSet
+spec:
+  affinity:
+    nodeAffinity:
+      requiredDuringSchedulingIgnoredDuringExecution:
+        nodeSelectorTerms:
+        - matchExpressions:
+          - key: kubernetes.io/arch
+            operator: In
+            values:
+            - amd64
+            - arm64
+          - key: kubernetes.io/os
+            operator: In
+            values:
+            - linux
+          - key: node-role.kubernetes.io/control-plane
+            operator: DoesNotExist
+          - key: skyhook.nvidia.com/test-node
+            operator: In
+            values:
+            - skyhooke2e
+status:
+  (conditions[?type == 'Ready']):
+  - status: 'True'
+  (conditions[?type == 'PodScheduled']):
+  - status: 'True'
+  (conditions[?type == 'ContainersReady']):
+  - status: 'True'
+  phase: Running
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+#
+# LICENSE START
+#
+# Copyright (c) NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# LICENSE END
+#
+
+# yaml-language-server: $schema=https://raw.githubusercontent.com/kyverno/chainsaw/main/.schemas/json/test-chainsaw-v1alpha1.json
+apiVersion: chainsaw.kyverno.io/v1alpha1
+kind: Test
+metadata:
+  name: helm-node-affinity
+spec:
+  description: This test asserts that the helm chart correctly applies node affinity expressions
+    from the values file. It validates that the controller manager pods are scheduled only on
+    nodes matching the configured node affinity expressions.
+  concurrent: false
+  timeouts:
+    assert: 240s
+    exec: 240s
+  steps:
+  - try:
+    - script:
+        content: |
+          ## Install helm chart with node affinity configuration targeting non-existent labels
+          ## This should cause pods to NOT schedule
+          ../install-helm-chart.sh node-affinity-test values-no-match.yaml
+    - assert:
+        file: assert-no-schedule.yaml
+  - try:
+    - script:
+        content: |
+          ## Upgrade helm chart to use values that match the node labels
+          ../install-helm-chart.sh node-affinity-test values-match.yaml
+    - assert:
+        file: assert-scheduled.yaml
+  - try:
+    - script:
+        content: |
+          ## Install with both selectors and nodeAffinity set; the chart's validation should reject this
+          ../install-helm-chart.sh node-affinity-test values-conflict-test.yaml
+          error=$?
+          if [ $error -eq 0 ]; then
+            echo "✗ Helm chart installed successfully - this should not happen!"
+            exit 1
+          fi
+    finally:
+    - script:
+        content: |
+          ## Remove helm chart
+          ../uninstall-helm-chart.sh node-affinity-test
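
The third step checks the exit code by hand. Chainsaw also lets a `script` operation carry a `check` block that is evaluated against the `$error` binding, so the expected failure can be declared instead. A sketch of the same step in that style, assuming (as the test itself does) that `install-helm-chart.sh` propagates Helm's non-zero exit code:

```yaml
# Sketch: the conflict step written with a Chainsaw operation check
- try:
  - script:
      content: |
        ## Install with both selectors and nodeAffinity set; expected to fail
        ../install-helm-chart.sh node-affinity-test values-conflict-test.yaml
      check:
        # The operation passes only if the script returned an error
        ($error != null): true
```

This keeps the shell body to a single command and makes the intent (failure is the success condition) visible in the test definition.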
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+#
+# LICENSE START
+#
+# Copyright (c) NVIDIA CORPORATION. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# LICENSE END
+#
+
+controllerManager:
+  # This should trigger the validation error since both are defined
+  selectors:
+    dedicated: system-workload
+  nodeAffinity:
+    matchExpressions:
+      - key: skyhook.nvidia.com/test-node
+        operator: In
+        values:
+          - skyhooke2e
+  manager:
+    image:
+      repository: ghcr.io/nvidia/skyhook/operator
+      tag: latest
+webhook:
+  enable: false
