Commit 13f76a5

lxie123 and a-mccarthy authored
Split k0RDENT/k0S to it's own page as it's a separate product from MKE (#170)
* Create k0rdent.rst

  Adding k0RDENT to partner validated configurations

  Signed-off-by: Lee Xie <[email protected]>
  Signed-off-by: Lee Xie <[email protected]>

* Update index.rst

  Signed-off-by: Lee Xie <[email protected]>
  Signed-off-by: Lee Xie <[email protected]>

* Update mirantis-mke.rst

  remove k0RDENT from matrix since it will have it's own page.

  Signed-off-by: Lee Xie <[email protected]>
  Signed-off-by: Lee Xie <[email protected]>

* Update k0rdent.rst

  Signed-off-by: Lee Xie <[email protected]>
  Signed-off-by: Lee Xie <[email protected]>

* Update partner-validated/k0rdent.rst

  Co-authored-by: Abigail McCarthy <[email protected]>
  Signed-off-by: Lee Xie <[email protected]>

* Update partner-validated/k0rdent.rst

  Co-authored-by: Abigail McCarthy <[email protected]>
  Signed-off-by: Lee Xie <[email protected]>

* Update partner-validated/k0rdent.rst

  Co-authored-by: Abigail McCarthy <[email protected]>
  Signed-off-by: Lee Xie <[email protected]>

---------

Signed-off-by: Lee Xie <[email protected]>
Signed-off-by: Lee Xie <[email protected]>
Co-authored-by: Abigail McCarthy <[email protected]>
1 parent 7afaf4e commit 13f76a5

3 files changed: +163 -22 lines changed

partner-validated/index.rst

Lines changed: 2 additions & 1 deletion
@@ -27,6 +27,7 @@ About Partner-Validated Configurations
    :hidden:
 
    self
+   k0rdent.rst
    mirantis-mke.rst
 
 Partner-validated configurations help end users who want to use
@@ -153,4 +154,4 @@ What happens if the partner requires changes to the NVIDIA GPU Operator that are
 
 How are CVE fixes managed for partner software that is used by the NVIDIA GPU Operator?
 The partner is responsible for managing security issues and is advised to proactively notify users of issues and fixes.
-When the partner provides users with software, such as a containerized GPU driver, the partner is responsible for notifying and resolving issues with the container image.
+When the partner provides users with software, such as a containerized GPU driver, the partner is responsible for notifying and resolving issues with the container image.

partner-validated/k0rdent.rst

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
+.. headings # #, * *, =, -, ^, "
+
+.. |prod-name-long| replace:: Mirantis k0RDENT
+.. |prod-name-short| replace:: k0RDENT
+
+#############################################
+|prod-name-long| with the NVIDIA GPU Operator
+#############################################
+
+
+*********************************************
+About |prod-name-short| with the GPU Operator
+*********************************************
+
+|prod-name-short| is a "super control plane" designed to ensure the consistent provisioning and lifecycle
+management of Kubernetes clusters and the services that make them useful. The goal of the k0rdent project is
+to provide platform engineers with the means to deliver a distributed container management environment (DCME)
+and enable them to compose unique internal developer platforms (IDP) to support a diverse range of complex
+modern application workloads.
+
+The NVIDIA GPU Operator uses the operator framework within Kubernetes to automate
+both the deployment and management of all NVIDIA software components needed to provision NVIDIA GPUs.
+These components include the NVIDIA GPU drivers to enable CUDA, Kubernetes device plugin for GPUs,
+the NVIDIA Container Toolkit, automatic node labeling using GPU Feature Discovery (GFD), DCGM-based monitoring, and others.
+
+
+******************************
+Validated Configuration Matrix
+******************************
+
+|prod-name-long| has self-validated with the following components and versions:
+
+.. list-table::
+   :header-rows: 1
+
+   * - Version
+     - | NVIDIA
+       | GPU
+       | Operator
+     - | Operating
+       | System
+     - | Container
+       | Runtime
+     - Kubernetes
+     - Helm
+     - NVIDIA GPU
+     - Hardware Model
+
+   * - k0rdent 0.2.0 / k0s v1.31.5+k0s
+     - v24.9.2
+     - | Ubuntu 22.04
+     - containerd v1.7.24 with the NVIDIA Container Toolkit v1.17.4
+     - 1.31.5
+     - Helm v3
+     - | 2x NVIDIA RTX 4000 SFF Ada 20GB GDDR6 (ECC)
+     - | Supermicro SuperServer 6028U-E1CNR4T+
+
+       | 1000W Supermicro PWS-1K02A-1R
+
+       | 2x Intel Xeon E5-2630v4, 10C/20T 2.2/3.1 GHz LGA 2011-3 25MB 85W
+
+       | 32GB DDR4-2666 RDIMM, M393A4K40BB2-CTD6Q
+
+       | NVMe 960GB PM983 NVMe M.2, MZ1LB960HAJQ-00007
+
+       | 2 x NVIDIA RTX 4000 SFF Ada 20GB GDDR6 (ECC), 70W, PCIe 4.0x16, 4x
+
+       | 4x Mini DisplayPort 1.4a
+
+
+*************
+Prerequisites
+*************
+
+* A running |prod-name-short| managed cluster with at least one control plane node and two worker nodes.
+  The recommended configuration is at least three control plane nodes and at least two worker nodes.
+
+* At least one worker node with an NVIDIA GPU physically installed.
+  The GPU Operator can locate the GPU and label the node accordingly.
+
+* The kubeconfig file for the |prod-name-short| managed cluster on the seed node.
+  You can get the file from the |prod-name-short| control plane.
+
+* You have access to the |prod-name-short| cluster.
+
+
+*********
+Procedure
+*********
+
+Perform the following steps to prepare the |prod-name-short| cluster:
+
+#. Install the GPU Operator service template into k0rdent:
+
+   .. code-block:: console
+
+      $ helm install gpu-operator oci://ghcr.io/k0rdent/catalog/charts/gpu-operator-service-template \
+        --version 24.9.2 -n kcm-system
+
+#. Verify the service template:
+
+   .. code-block:: console
+
+      $ kubectl get servicetemplates -A
+
+   *Example Output*
+
+   .. code-block:: output
+
+      NAMESPACE    NAME                  VALID
+      kcm-system   gpu-operator-24-9-2   true
+
+#. Deploy the service template to the child cluster:
+
+   .. code-block:: yaml
+
+      apiVersion: k0rdent.mirantis.com/v1alpha1
+      kind: MultiClusterService
+      metadata:
+        name: gpu-operator
+      spec:
+        clusterSelector:
+          matchLabels:
+            group: demo
+        serviceSpec:
+          services:
+          - template: gpu-operator-24-9-2
+            name: gpu-operator
+            namespace: gpu-operator
+            values: |
+              operator:
+                defaultRuntime: containerd
+              toolkit:
+                env:
+                - name: CONTAINERD_CONFIG
+                  value: /etc/k0s/containerd.d/nvidia.toml
+                - name: CONTAINERD_SOCKET
+                  value: /run/k0s/containerd.sock
+                - name: CONTAINERD_RUNTIME_CLASS
+                  value: nvidia
+
+
+The |prod-name-short| managed clusters now have the NVIDIA GPU Operator installed.
+
+*************************************************
+Verifying |prod-name-short| with the GPU Operator
+*************************************************
+
+Refer to :external+gpuop:ref:`running sample gpu applications` to verify the installation.
+
+***************
+Getting Support
+***************
+
+Refer to the k0RDENT product documentation for information about working with k0RDENT.
+
+*******************
+Related information
+*******************
+
+* https://docs.k0rdent.io/v0.2.0/
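
The deployment step in the k0rdent.rst procedure above gives the MultiClusterService manifest but not how it reaches the cluster. As a minimal sketch that is not part of this commit, the following Python builds the same manifest and prints it as JSON, which ``kubectl apply -f -`` accepts in place of YAML. The helper name ``build_multicluster_service``, its parameters, and the assumption that kubectl is pointed at the k0rdent management cluster are illustrative and do not come from the source.

```python
import json

# Helm values handed to the GPU Operator chart, copied from the manifest in
# the procedure. The containerd paths are the k0s-specific ones shown there.
GPU_OPERATOR_VALUES = """\
operator:
  defaultRuntime: containerd
toolkit:
  env:
  - name: CONTAINERD_CONFIG
    value: /etc/k0s/containerd.d/nvidia.toml
  - name: CONTAINERD_SOCKET
    value: /run/k0s/containerd.sock
  - name: CONTAINERD_RUNTIME_CLASS
    value: nvidia
"""


def build_multicluster_service(template="gpu-operator-24-9-2", group="demo"):
    """Return the MultiClusterService manifest as a dict (hypothetical helper)."""
    return {
        "apiVersion": "k0rdent.mirantis.com/v1alpha1",
        "kind": "MultiClusterService",
        "metadata": {"name": "gpu-operator"},
        "spec": {
            # Only clusters labeled group=<group> are selected.
            "clusterSelector": {"matchLabels": {"group": group}},
            "serviceSpec": {
                "services": [
                    {
                        "template": template,
                        "name": "gpu-operator",
                        "namespace": "gpu-operator",
                        # The values field is a YAML string, as in the manifest.
                        "values": GPU_OPERATOR_VALUES,
                    }
                ]
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_multicluster_service(), indent=2))
```

Saved under a hypothetical name such as ``make_mcs.py``, the output could be piped straight to the cluster with ``python3 make_mcs.py | kubectl apply -f -``. Generating the manifest this way keeps the template name and the selector label in one place if several cluster groups need the same service.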

partner-validated/mirantis-mke.rst

Lines changed: 0 additions & 21 deletions
@@ -44,27 +44,6 @@ Validated Configuration Matrix
      - NVIDIA GPU
      - Hardware Model
 
-   * - k0s v1.31.5+k0s / k0rdent 0.1.0
-     - v24.9.2
-     - | Ubuntu 22.04
-     - containerd v1.7.24 with the NVIDIA Container Toolkit v1.17.4
-     - 1.31.5
-     - Helm v3
-     - | 2x NVIDIA RTX 4000 SFF Ada 20GB GDDR6 (ECC)
-     - | Supermicro SuperServer 6028U-E1CNR4T+
-
-       | 1000W Supermicro PWS-1K02A-1R
-
-       | 2x Intel Xeon E5-2630v4, 10C/20T 2.2/3.1 GHz LGA 2011-3 25MB 85W
-
-       | 32GB DDR4-2666 RDIMM, M393A4K40BB2-CTD6Q
-
-       | NVMe 960GB PM983 NVMe M.2, MZ1LB960HAJQ-00007
-
-       | 2 x NVIDIA RTX 4000 SFF Ada 20GB GDDR6 (ECC), 70W, PCIe 4.0x16, 4x
-
-       | 4x Mini DisplayPort 1.4a
-
    * - MKE 3.8
      - v24.9.2
      - | Ubuntu 22.04
