Commit b37b91d

Merge pull request #547 from ykulazhenkov/nvidia-ipam-docs
Add examples for NVIDIA IPAM to README.md
2 parents fd03165 + 4b0d761 commit b37b91d

4 files changed: +287 -22 lines changed

README.md

Lines changed: 122 additions & 22 deletions
@@ -90,14 +90,18 @@ and related configurations.
 to be deployed on RDMA & GPU supporting nodes (required for GPUDirect workloads).
 For NVIDIA GPU driver version < 465. Check [compatibility notes](#compatibility-notes) for details
 - `ibKubernetes`: [InfiniBand Kubernetes](https://github.com/Mellanox/ib-kubernetes/) and related configurations.
-- `SecondaryNetwork`: Specifies components to deploy in order to facilitate a secondary network in Kubernetes. It consists of the following optionally deployed components:
+- `secondaryNetwork`: Specifies components to deploy in order to facilitate a secondary network in Kubernetes. It consists of the following optionally deployed components:
     - [Multus-CNI](https://github.com/intel/multus-cni): Delegate CNI plugin to support secondary networks in Kubernetes
     - CNI plugins: Currently only [containernetworking-plugins](https://github.com/containernetworking/plugins) is supported
     - [IP Over Infiniband (IPoIB) CNI Plugin](https://github.com/Mellanox/ipoib-cni): Allow users to create an IPoIB child link and move it to the pod.
-    - IPAM CNI: Currently only [Whereabout IPAM CNI](https://github.com/k8snetworkplumbingwg/whereabouts) is supported
+    - IPAM CNI: [Whereabouts IPAM CNI](https://github.com/k8snetworkplumbingwg/whereabouts) and related configurations
+- `nvIpam`: [NVIDIA Kubernetes IPAM](https://github.com/Mellanox/nvidia-k8s-ipam) and related configurations.

 >__NOTE__: Any sub-state may be omitted if it is not required for the cluster.

+>__NOTE__: NVIDIA IPAM and Whereabouts IPAM plugin can be deployed simultaneously in the same cluster
+
+
 ##### Example for NICClusterPolicy resource:
 In the example below we request OFED driver to be deployed together with RDMA shared device plugin
 but without NV Peer Memory driver.
@@ -107,12 +111,11 @@ apiVersion: mellanox.com/v1alpha1
 kind: NicClusterPolicy
 metadata:
   name: nic-cluster-policy
-  namespace: nvidia-network-operator
 spec:
   ofedDriver:
     image: mofed
     repository: nvcr.io/nvidia/mellanox
-    version: 5.9-0.5.6.0
+    version: 23.04-0.5.3.3.1
     startupProbe:
       initialDelaySeconds: 10
       periodSeconds: 10
@@ -146,21 +149,84 @@ spec:
     cniPlugins:
       image: plugins
       repository: ghcr.io/k8snetworkplumbingwg
-      version: v0.8.7-amd64
+      version: v1.2.0-amd64
     multus:
       image: multus-cni
       repository: ghcr.io/k8snetworkplumbingwg
-      version: v3.8
+      version: v3.9.3
       # if config is missing or empty then multus config will be automatically generated from the CNI configuration file of the master plugin (the first file in lexicographical order in cni-conf-dir)
       config: ''
     ipamPlugin:
       image: whereabouts
       repository: ghcr.io/k8snetworkplumbingwg
-      version: v0.5.4-amd64
+      version: v0.6.1-amd64
 ```

 Can be found at: `example/crs/mellanox.com_v1alpha1_nicclusterpolicy_cr.yaml`

+NicClusterPolicy with [NVIDIA Kubernetes IPAM](https://github.com/Mellanox/nvidia-k8s-ipam) configuration
+
+```
+apiVersion: mellanox.com/v1alpha1
+kind: NicClusterPolicy
+metadata:
+  name: nic-cluster-policy
+spec:
+  ofedDriver:
+    image: mofed
+    repository: nvcr.io/nvidia/mellanox
+    version: 23.04-0.5.3.3.1
+    startupProbe:
+      initialDelaySeconds: 10
+      periodSeconds: 10
+    livenessProbe:
+      initialDelaySeconds: 30
+      periodSeconds: 30
+    readinessProbe:
+      initialDelaySeconds: 10
+      periodSeconds: 30
+  rdmaSharedDevicePlugin:
+    image: k8s-rdma-shared-dev-plugin
+    repository: nvcr.io/nvidia/cloud-native
+    version: v1.3.2
+    # The config below directly propagates to k8s-rdma-shared-device-plugin configuration.
+    # Replace 'devices' with your (RDMA capable) netdevice name.
+    config: |
+      {
+        "configList": [
+          {
+            "resourceName": "rdma_shared_device_a",
+            "rdmaHcaMax": 1000,
+            "selectors": {
+              "vendors": ["15b3"],
+              "deviceIDs": ["101b"]
+            }
+          }
+        ]
+      }
+  secondaryNetwork:
+    cniPlugins:
+      image: plugins
+      repository: ghcr.io/k8snetworkplumbingwg
+      version: v1.2.0-amd64
+    multus:
+      image: multus-cni
+      repository: ghcr.io/k8snetworkplumbingwg
+      version: v3.9.3
+      config: ''
+    nvIpam:
+      image: nvidia-k8s-ipam
+      repository: ghcr.io/mellanox
+      version: v0.0.2
+      config: '{
+        "pools": {
+          "my-pool": {"subnet": "192.168.0.0/24", "perNodeBlockSize": 100, "gateway": "192.168.0.1"}
+        }
+      }'
+```
+
+Can be found at: `example/crs/mellanox.com_v1alpha1_nicclusterpolicy_cr-nvidia-ipam.yaml`
+
 #### NICClusterPolicy status
 NICClusterPolicy `status` field reflects the current state of the system.
 It contains a per sub-state and a global state `status`.
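
Reviewer aside on the `nvIpam` pool in the hunk above: the example pairs a /24 subnet with `perNodeBlockSize: 100`, so it may help to see roughly how many per-node blocks such a pool can hold. The sketch below is only a back-of-the-envelope estimate. It assumes that the network, broadcast, and gateway addresses are never handed out and that each node receives one whole block; neither assumption is stated in this commit, and `max_node_blocks` is a hypothetical helper, not part of nvidia-k8s-ipam.

```python
import ipaddress
import json

# Pool definition copied from the nvIpam `config` value in the example above.
NV_IPAM_CONFIG = """{
  "pools": {
    "my-pool": {"subnet": "192.168.0.0/24", "perNodeBlockSize": 100, "gateway": "192.168.0.1"}
  }
}"""


def max_node_blocks(pool):
    """Estimate how many full per-node blocks fit in a pool's subnet.

    ASSUMPTION (not confirmed by the commit): the network, broadcast and
    gateway addresses are excluded, and every node gets one whole block.
    """
    subnet = ipaddress.ip_network(pool["subnet"])
    gateway = ipaddress.ip_address(pool["gateway"])
    usable = subnet.num_addresses - 2  # drop network and broadcast addresses
    if gateway in subnet:
        usable -= 1  # the gateway is not available for pods
    return usable // pool["perNodeBlockSize"]


pools = json.loads(NV_IPAM_CONFIG)["pools"]
for name, pool in pools.items():
    print(name, max_node_blocks(pool))
```

Under these assumptions the example pool covers two nodes, so a larger cluster would likely need a wider subnet or a smaller block size.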
@@ -173,21 +239,31 @@ The global state reflects the logical _AND_ of each individual sub-state.

 ##### Example Status field of a NICClusterPolicy instance
 ```
-Status:
-  Applied States:
-    Name: state-OFED
-    State: ready
-    Name: state-RDMA-device-plugin
-    State: ready
-    Name: state-NV-Peer
-    State: ignore
-    Name: state-cni-plugins
-    State: ignore
-    Name: state-Multus
-    State: ready
-    Name: state-whereabouts
-    State: ready
-  State: ready
+status:
+  appliedStates:
+  - name: state-pod-security-policy
+    state: ignore
+  - name: state-multus-cni
+    state: ready
+  - name: state-container-networking-plugins
+    state: ignore
+  - name: state-ipoib-cni
+    state: ignore
+  - name: state-whereabouts-cni
+    state: ready
+  - name: state-OFED
+    state: ready
+  - name: state-SRIOV-device-plugin
+    state: ignore
+  - name: state-RDMA-device-plugin
+    state: ready
+  - name: state-NV-Peer
+    state: ignore
+  - name: state-ib-kubernetes
+    state: ignore
+  - name: state-nv-ipam-cni
+    state: ready
+  state: ready
 ```

 >__NOTE__: An `ignore` State indicates that the sub-state was not defined in the custom resource
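
Reviewer aside on the status example above: the README describes the global state as the logical AND of the sub-states, and the example reports `ready` even though several sub-states are `ignore`. A minimal sketch of that aggregation follows. It assumes `ignore` entries are simply skipped, which is consistent with the example but not spelled out in the text. The function name `global_state` and the fallback label `notReady` are illustrative assumptions, not the operator's code.

```python
# Sub-states taken from the updated status example above.
APPLIED_STATES = [
    {"name": "state-pod-security-policy", "state": "ignore"},
    {"name": "state-multus-cni", "state": "ready"},
    {"name": "state-container-networking-plugins", "state": "ignore"},
    {"name": "state-ipoib-cni", "state": "ignore"},
    {"name": "state-whereabouts-cni", "state": "ready"},
    {"name": "state-OFED", "state": "ready"},
    {"name": "state-SRIOV-device-plugin", "state": "ignore"},
    {"name": "state-RDMA-device-plugin", "state": "ready"},
    {"name": "state-NV-Peer", "state": "ignore"},
    {"name": "state-ib-kubernetes", "state": "ignore"},
    {"name": "state-nv-ipam-cni", "state": "ready"},
]


def global_state(applied_states):
    """Logical AND over sub-states, skipping those marked `ignore`.

    ASSUMPTION: `ignore` is neutral, and any other non-ready value makes the
    global state not ready. "notReady" is a placeholder label.
    """
    active = [s["state"] for s in applied_states if s["state"] != "ignore"]
    return "ready" if all(state == "ready" for state in active) else "notReady"


print(global_state(APPLIED_STATES))
```

Note that with this reading a policy whose sub-states are all `ignore` would also evaluate to `ready`; whether the operator behaves that way is not covered by this diff.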
@@ -208,6 +284,9 @@ MacvlanNetwork CRD Spec includes the following fields:
 In the example below we deploy MacvlanNetwork CRD instance with mode as bridge, MTU 1500, default route interface as master,
 with resource "rdma/rdma_shared_device_a", that will be used to deploy NetworkAttachmentDefinition for macvlan to default namespace.

+
+With [Whereabouts IPAM CNI](https://github.com/k8snetworkplumbingwg/whereabouts)
+
 ```
 apiVersion: mellanox.com/v1alpha1
 kind: MacvlanNetwork
@@ -238,6 +317,27 @@ spec:

 Can be found at: `example/crs/mellanox.com_v1alpha1_macvlannetwork_cr.yaml`

+With [NVIDIA Kubernetes IPAM](https://github.com/Mellanox/nvidia-k8s-ipam)
+
+```
+apiVersion: mellanox.com/v1alpha1
+kind: MacvlanNetwork
+metadata:
+  name: example-macvlannetwork
+spec:
+  networkNamespace: "default"
+  master: "ens2f0"
+  mode: "bridge"
+  mtu: 1500
+  ipam: |
+    {
+      "type": "nv-ipam",
+      "poolName": "my-pool"
+    }
+```
+
+Can be found at: `example/crs/mellanox.com_v1alpha1_macvlannetwork_cr-nvidia-ipam.yaml`
+
 ### HostDeviceNetwork CRD
 This CRD defines a HostDevice secondary network. It is translated by the Operator to a `NetworkAttachmentDefinition` instance as defined in [k8snetworkplumbingwg/multi-net-spec](https://github.com/k8snetworkplumbingwg/multi-net-spec).

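
Reviewer aside on the MacvlanNetwork example above: its `ipam` block names `my-pool`, the same pool name that appears in the `nvIpam` config of the NicClusterPolicy example, so the two resources appear to rely on the names matching. The sketch below shows one way to catch a mismatch before applying both manifests. The helper `missing_pool` is hypothetical, and the matching requirement is inferred from the examples rather than stated in this commit.

```python
import json

# JSON payloads copied from the two examples in this README diff.
NV_IPAM_CONFIG = """{
  "pools": {
    "my-pool": {"subnet": "192.168.0.0/24", "perNodeBlockSize": 100, "gateway": "192.168.0.1"}
  }
}"""
MACVLAN_IPAM = """{
  "type": "nv-ipam",
  "poolName": "my-pool"
}"""


def missing_pool(ipam_json, nv_ipam_config_json):
    """Return the referenced pool name if it is not defined, else None.

    ASSUMPTION: a network using "type": "nv-ipam" must reference a pool
    declared under "pools" in the NicClusterPolicy nvIpam config.
    """
    ipam = json.loads(ipam_json)
    if ipam.get("type") != "nv-ipam":
        return None  # other IPAM plugins are out of scope for this check
    pools = json.loads(nv_ipam_config_json).get("pools", {})
    name = ipam.get("poolName")
    return None if name in pools else name


print(missing_pool(MACVLAN_IPAM, NV_IPAM_CONFIG))
```

A check like this could run in CI against the example files so that the README snippets and the manifests under `example/crs/` do not drift apart.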
Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# Copyright 2021 NVIDIA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+apiVersion: mellanox.com/v1alpha1
+kind: MacvlanNetwork
+metadata:
+  name: example-macvlannetwork
+spec:
+  networkNamespace: "default"
+  master: "ens2f0"
+  mode: "bridge"
+  mtu: 1500
+  ipam: |
+    {
+      "type": "nv-ipam",
+      "poolName": "my-pool"
+    }
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+# Copyright 2020 NVIDIA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+apiVersion: mellanox.com/v1alpha1
+kind: NicClusterPolicy
+metadata:
+  name: nic-cluster-policy
+spec:
+  ofedDriver:
+    image: mofed
+    repository: nvcr.io/nvidia/mellanox
+    version: 23.04-0.5.3.3.1
+    startupProbe:
+      initialDelaySeconds: 10
+      periodSeconds: 10
+    livenessProbe:
+      initialDelaySeconds: 30
+      periodSeconds: 30
+    readinessProbe:
+      initialDelaySeconds: 10
+      periodSeconds: 30
+  rdmaSharedDevicePlugin:
+    image: k8s-rdma-shared-dev-plugin
+    repository: nvcr.io/nvidia/cloud-native
+    version: v1.3.2
+    # The config below directly propagates to k8s-rdma-shared-device-plugin configuration.
+    # Replace 'devices' with your (RDMA capable) netdevice name.
+    config: |
+      {
+        "configList": [
+          {
+            "resourceName": "rdma_shared_device_a",
+            "rdmaHcaMax": 1000,
+            "selectors": {
+              "vendors": ["15b3"],
+              "deviceIDs": ["101b"]
+            }
+          }
+        ]
+      }
+  secondaryNetwork:
+    cniPlugins:
+      image: plugins
+      repository: ghcr.io/k8snetworkplumbingwg
+      version: v1.2.0-amd64
+    multus:
+      image: multus-cni
+      repository: ghcr.io/k8snetworkplumbingwg
+      version: v3.9.3
+      config: ''
+    nvIpam:
+      image: nvidia-k8s-ipam
+      repository: ghcr.io/mellanox
+      version: v0.0.2
+      config: '{
+        "pools": {
+          "my-pool": {"subnet": "192.168.0.0/24", "perNodeBlockSize": 100, "gateway": "192.168.0.1"}
+        }
+      }'
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+# Copyright 2020 NVIDIA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+apiVersion: mellanox.com/v1alpha1
+kind: NicClusterPolicy
+metadata:
+  name: nic-cluster-policy
+spec:
+  ofedDriver:
+    image: {{ .Mofed.Image }}
+    repository: {{ .Mofed.Repository }}
+    version: {{ .Mofed.Version }}
+    startupProbe:
+      initialDelaySeconds: 10
+      periodSeconds: 10
+    livenessProbe:
+      initialDelaySeconds: 30
+      periodSeconds: 30
+    readinessProbe:
+      initialDelaySeconds: 10
+      periodSeconds: 30
+  rdmaSharedDevicePlugin:
+    image: {{ .RdmaSharedDevicePlugin.Image }}
+    repository: {{ .RdmaSharedDevicePlugin.Repository }}
+    version: {{ .RdmaSharedDevicePlugin.Version }}
+    # The config below directly propagates to k8s-rdma-shared-device-plugin configuration.
+    # Replace 'devices' with your (RDMA capable) netdevice name.
+    config: |
+      {
+        "configList": [
+          {
+            "resourceName": "rdma_shared_device_a",
+            "rdmaHcaMax": 1000,
+            "selectors": {
+              "vendors": ["15b3"],
+              "deviceIDs": ["101b"]
+            }
+          }
+        ]
+      }
+  secondaryNetwork:
+    cniPlugins:
+      image: {{ .CniPlugins.Image }}
+      repository: {{ .CniPlugins.Repository }}
+      version: {{ .CniPlugins.Version }}
+    multus:
+      image: {{ .Multus.Image }}
+      repository: {{ .Multus.Repository }}
+      version: {{ .Multus.Version }}
+      config: ''
+    nvIpam:
+      image: {{ .NvIPAM.Image }}
+      repository: {{ .NvIPAM.Repository }}
+      version: {{ .NvIPAM.Version }}
+      config: '{
+        "pools": {
+          "my-pool": {"subnet": "192.168.0.0/24", "perNodeBlockSize": 100, "gateway": "192.168.0.1"}
+        }
+      }'
