-
Notifications
You must be signed in to change notification settings - Fork 99
README: refer to external install instructions #699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
jgehrcke
merged 1 commit into
NVIDIA:main
from
jgehrcke:jp/readme-installation-instruction
Oct 22, 2025
Merged
README: refer to external install instructions #699
jgehrcke
merged 1 commit into
NVIDIA:main
from
jgehrcke:jp/readme-installation-instruction
Oct 22, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]>
Collaborator
Author
|
That one is uncontroversial I think -- a small improvement to the README that we will hopefully iterate on more in the future. |
Collaborator
Author
klueska
approved these changes
Oct 22, 2025
Collaborator
klueska
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you include the webhook instructions in the wiki.
asymingt
added a commit
to asymingt/k8s-dra-driver-gpu
that referenced
this pull request
Nov 14, 2025
commit 55fc7b0 Merge: 5443e0f ef23484 Author: Shiva Krishna Merla <[email protected]> Date: Thu Nov 6 16:04:43 2025 -0800 Merge pull request NVIDIA#668 from varunrsekar/vfio-support-1.33 Support VFIO passthrough commit ef23484 Author: Varun Ramachandra Sekar <[email protected]> Date: Tue Oct 14 17:29:18 2025 -0700 vfio passthrough support Signed-off-by: Varun Ramachandra Sekar <[email protected]> use chroot to run modprobe Signed-off-by: Varun Ramachandra Sekar <[email protected]> deadvertise sibling devices on preparation Signed-off-by: Varun Ramachandra Sekar <[email protected]> soft check for VFs before attempting unbind Signed-off-by: Varun Ramachandra Sekar <[email protected]> address review comments Signed-off-by: Varun Ramachandra Sekar <[email protected]> address comments (2) Signed-off-by: Varun Ramachandra Sekar <[email protected]> use fuser to check if gpu is free Signed-off-by: Varun Ramachandra Sekar <[email protected]> remove unnecessary securityContext Signed-off-by: Varun Ramachandra Sekar <[email protected]> don't mix vfio and mig devices Signed-off-by: Varun Ramachandra Sekar <[email protected]> commit 5443e0f Merge: 59d775b 3babfe5 Author: Shiva Krishna Merla <[email protected]> Date: Tue Nov 4 12:48:00 2025 -0800 Merge pull request NVIDIA#711 from shivamerla/add_gpu_stress_tests tests: Add separate targets for GPU plugin tests + add stress tests commit 3babfe5 Author: Shiva Krishna, Merla <[email protected]> Date: Tue Nov 4 11:47:01 2025 -0800 tests: Use BATS_TEST_TMPDIR and failfast on errors during cleanup Signed-off-by: Shiva Krishna, Merla <[email protected]> commit 2b3e70b Author: Shiva Krishna, Merla <[email protected]> Date: Tue Nov 4 11:07:19 2025 -0800 tests: Add separate targets for GPU plugin tests + add stress tests * Add separate make targets to run GPU and CD specific tests * Add a stress test for GPU allocation * Refactor Makefile to share common docker setup between targets Signed-off-by: Shiva Krishna, Merla <[email protected]> commit 59d775b Merge: 852b56f 1e79179 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Mon Nov 3 19:38:02 2025 +0100 Merge pull request NVIDIA#709 from jgehrcke/jp/basic-gpu-tests tests: cover basic GPU allocation, misc improvements commit 852b56f Merge: 1ee1b4a e8fa8e6 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Mon Nov 3 19:21:53 2025 +0100 Merge pull request NVIDIA#706 from Gacko/vkptt kubelet plugins: add /opt/bin to binary search paths commit 1ee1b4a Merge: f4d11e3 068bb76 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Mon Nov 3 19:10:16 2025 +0100 Merge pull request NVIDIA#710 from NVIDIA/dependabot/docker/deployments/container/main/nvidia/distroless/cc-v3.2.1-dev build(deps): bump nvidia/distroless/cc from v3.2.0-dev to v3.2.1-dev in /deployments/container commit 1e79179 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Nov 1 11:44:03 2025 -0700 tests: cover basic GPU allocation Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> misc fixes Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> remove cdi spec removal again Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 068bb76 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon Nov 3 17:59:31 2025 +0000 build(deps): bump nvidia/distroless/cc in /deployments/container Bumps nvidia/distroless/cc from v3.2.0-dev to v3.2.1-dev. --- updated-dependencies: - dependency-name: nvidia/distroless/cc dependency-version: v3.2.1-dev dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> commit fcd74d1 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Nov 1 11:42:35 2025 -0700 tests: add nvmm helper Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 977f421 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Nov 1 11:42:10 2025 -0700 tests: per-user tmp dir (relevant on shared machines) Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 1c2da2c Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Nov 1 11:41:09 2025 -0700 tests: parallelize per-node state dir cleanup Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit e8fa8e6 Author: Marco Ebert <[email protected]> Date: Wed Oct 29 09:52:34 2025 +0100 kubelet plugins: add /opt/bin to binary search paths Signed-off-by: Marco Ebert <[email protected]> commit f4d11e3 Merge: 89c8258 9b20929 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 29 13:09:19 2025 +0100 Merge pull request NVIDIA#707 from jgehrcke/jp/version25120 Increment version to 25.12.0-dev commit 89c8258 Merge: a772441 de830d3 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 29 13:07:49 2025 +0100 Merge pull request NVIDIA#703 from NVIDIA/dependabot/docker/deployments/container/main/nvidia/distroless/cc-v3.2.0-dev build(deps): bump nvidia/distroless/cc from v3.1.13-dev to v3.2.0-dev in /deployments/container commit a772441 Merge: 7f591c2 2a2eeec Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 29 13:07:04 2025 +0100 Merge pull request NVIDIA#705 from NVIDIA/dependabot/go_modules/main/github.com/NVIDIA/nvidia-container-toolkit-1.18.0 build(deps): bump github.com/NVIDIA/nvidia-container-toolkit from 1.18.0-rc.6 to 1.18.0 commit 9b20929 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 29 12:47:26 2025 +0100 Increment version to 25.12.0-dev Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 2a2eeec Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Sun Oct 26 17:02:01 2025 +0000 build(deps): bump github.com/NVIDIA/nvidia-container-toolkit Bumps [github.com/NVIDIA/nvidia-container-toolkit](https://github.com/NVIDIA/nvidia-container-toolkit) from 1.18.0-rc.6 to 1.18.0. - [Release notes](https://github.com/NVIDIA/nvidia-container-toolkit/releases) - [Changelog](https://github.com/NVIDIA/nvidia-container-toolkit/blob/main/CHANGELOG.md) - [Commits](NVIDIA/nvidia-container-toolkit@v1.18.0-rc.6...v1.18.0) --- updated-dependencies: - dependency-name: github.com/NVIDIA/nvidia-container-toolkit dependency-version: 1.18.0 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> commit de830d3 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Fri Oct 24 17:13:23 2025 +0000 build(deps): bump nvidia/distroless/cc in /deployments/container Bumps nvidia/distroless/cc from v3.1.13-dev to v3.2.0-dev. --- updated-dependencies: - dependency-name: nvidia/distroless/cc dependency-version: v3.2.0-dev dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> commit 7f591c2 Merge: cfe35ff 70fbda6 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 22 16:56:21 2025 +0200 Merge pull request NVIDIA#699 from jgehrcke/jp/readme-installation-instruction README: refer to external install instructions commit 70fbda6 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 21 14:26:18 2025 +0200 README: refer to external install instructions Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit cfe35ff Merge: 2762688 151c766 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 17 14:56:42 2025 +0200 Merge pull request NVIDIA#687 from jgehrcke/jp/unbreak-ci ci: fix downstream pipeline issues commit 151c766 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 17 14:26:43 2025 +0200 ci: bump regctl conservatively Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 7238e5d Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 17 14:26:24 2025 +0200 ci: rename gl pipeline stages Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 87b7915 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Sep 3 17:04:33 2025 +0200 ci: push image w/o version prefix Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 24e765d Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 17 12:27:52 2025 +0200 ci: remove scan-images step Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 2762688 Merge: 1516ec7 784ba18 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 16 20:12:17 2025 +0200 Merge pull request NVIDIA#685 from jgehrcke/jp/tests-v1-exactly tests: construct ResourceClaim differently on v1 commit 784ba18 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 16 17:30:44 2025 +0000 tests: construct ResourceClaim differently on v1 Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 1516ec7 Merge: 38b42bb e14beed Author: Shiva Krishna Merla <[email protected]> Date: Thu Oct 16 10:01:55 2025 -0700 Merge pull request NVIDIA#682 from shivamerla/fix_attestations Ensure attestation parameters are passed only for multi-arch builds using buildx. commit 38b42bb Merge: 0d83254 6cef363 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 16 11:27:20 2025 +0200 Merge pull request NVIDIA#679 from jgehrcke/jp/tests-split-into-modules-add-failover tests: split into modules, add CD failover coverage commit 6cef363 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 16 08:53:23 2025 +0000 tests: explicit log on launcher container start, misc Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit db70cd7 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 15:27:53 2025 +0000 tests: add test_cd_failover.bats and support Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 38036ac Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 15:16:21 2025 +0000 tests: split tests.bats into modules Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit e14beed Author: Shiva Krishna, Merla <[email protected]> Date: Wed Oct 15 11:52:42 2025 -0700 Ensure attestation parameters are passed only for multi-arch builds using buildx. Signed-off-by: Shiva Krishna, Merla <[email protected]> commit 0d83254 Merge: 65cd2c5 f8ace2e Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 18:06:27 2025 +0200 Merge pull request NVIDIA#676 from jgehrcke/jp/curl-retry-tcp-rst build: retry TCP RST when curling bash source commit 65cd2c5 Merge: b3f4e07 c40b44b Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 16:59:35 2025 +0200 Merge pull request NVIDIA#677 from jgehrcke/jp/test-abort-on-failure tests: abort suite on first failure, misc commit c40b44b Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 11:24:06 2025 +0000 tests: adjust readme Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 6e783bf Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 10:57:25 2025 +0000 tests: rundir in /tmp (too much cruft in home dir) Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit dafa4f5 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 11:15:32 2025 +0000 tests: merge two simple tests into one Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit c14c2ef Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 11:14:09 2025 +0000 tests: add on_failure hook to emit debug info Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 89bb88a Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 11:12:22 2025 +0000 tests: use new --abort flag for bats (fail suite fast) Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit f8ace2e Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 10:42:05 2025 +0000 build: retry TCP RST when curling bash source Error seen: curl: (7) Failed to connect to mirror.cs.odu.edu port 443 after 306 ms: Connection refused By default, a TCP connection rejection (RST) is not treated by curl as a transient error, see https://curl.se/docs/manpage.html#--retry-connrefused It's a transient error in the sense that it's often a way to implement backpressure. We retry at slow rate. `--retry-all-errors` is what we want here, it includes `--retry-connrefused`. Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit b3f4e07 Merge: ab5a2b3 4e5cdf2 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 15 12:31:08 2025 +0200 Merge pull request NVIDIA#669 from NVIDIA/dependabot/go_modules/main/google.golang.org/grpc-1.76.0 build(deps): bump google.golang.org/grpc from 1.75.1 to 1.76.0 commit ab5a2b3 Merge: 23ccbd2 803a35a Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 14 19:56:45 2025 +0200 Merge pull request NVIDIA#675 from NVIDIA/dependabot/docker/deployments/devel/main/golang-1.25.3 build(deps): bump golang from 1.25.2 to 1.25.3 in /deployments/devel commit 803a35a Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue Oct 14 17:16:27 2025 +0000 build(deps): bump golang from 1.25.2 to 1.25.3 in /deployments/devel Bumps golang from 1.25.2 to 1.25.3. --- updated-dependencies: - dependency-name: golang dependency-version: 1.25.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> commit 23ccbd2 Merge: 83b8249 9d02cea Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 14 17:30:58 2025 +0200 Merge pull request NVIDIA#672 from jgehrcke/jp/periodic-cleanup-partially-prepared-rcs CD kubelet plugin: add state reconciliation for partially prepared claims commit 9d02cea Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Mon Oct 13 13:03:11 2025 +0000 tests: cover cleanup for stale partially prepared claims Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit f7a3310 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sun Oct 12 22:06:38 2025 +0000 CD plugin: handle stale partially prepared claims Add a fundamentally required state reconciliation: Periodically, perform a self-initiated Unprepare() of previously partially prepared claims. Perform periodically: - Read checkpoint - Iterate through RCs in PrepareStarted state - For each: RC still known in API server? If not: 1) initiate an Unprepare 2) Remove from checkpoint file if unprepr was successful Relevance: Unpreparing any partially performed claim preparation might revert a state mutation that would otherwise be permanently inconsistent with API server state (e.g., this could remove a node label). Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 83b8249 Merge: 5235bed e22cdba Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 14 15:11:35 2025 +0200 Merge pull request NVIDIA#674 from jgehrcke/jp/use-custom-config-dir-for-daemon CD daemon: /imexd instead of /etc/nvidia-imex commit e22cdba Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 14 07:15:09 2025 +0000 CD daemon: /imexd instead of /etc/nvidia-imex Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 5235bed Merge: 7b5e2cd aa15924 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 14 12:50:05 2025 +0200 Merge pull request NVIDIA#658 from jgehrcke/jp/log-full-component-config-on-startup Log full startup config in all CLIs in `Before` hook commit aa15924 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Oct 11 21:09:36 2025 +0000 tests: confirm startup config logged on lvl 0 Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit e2ea590 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Mon Sep 29 13:24:00 2025 +0000 Introduce LogStartupConfig(), use in all CLIs in Before() hook Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 4e5cdf2 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Mon Oct 13 08:53:56 2025 +0000 build(deps): bump google.golang.org/grpc from 1.75.1 to 1.76.0 Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.75.1 to 1.76.0. - [Release notes](https://github.com/grpc/grpc-go/releases) - [Commits](grpc/grpc-go@v1.75.1...v1.76.0) --- updated-dependencies: - dependency-name: google.golang.org/grpc dependency-version: 1.76.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> commit 7b5e2cd Merge: a1d2fd7 11f6c02 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Mon Oct 13 10:37:08 2025 +0200 Merge pull request NVIDIA#670 from NVIDIA/dependabot/go_modules/main/golang.org/x/time-0.14.0 build(deps): bump golang.org/x/time from 0.9.0 to 0.14.0 commit a1d2fd7 Merge: c614e61 6b2af09 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Mon Oct 13 10:32:50 2025 +0200 Merge pull request NVIDIA#671 from NVIDIA/dependabot/go_modules/main/github.com/NVIDIA/nvidia-container-toolkit-1.18.0-rc.6 build(deps): bump github.com/NVIDIA/nvidia-container-toolkit from 1.18.0-rc.5 to 1.18.0-rc.6 commit 6b2af09 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Sun Oct 12 17:02:23 2025 +0000 build(deps): bump github.com/NVIDIA/nvidia-container-toolkit Bumps [github.com/NVIDIA/nvidia-container-toolkit](https://github.com/NVIDIA/nvidia-container-toolkit) from 1.18.0-rc.5 to 1.18.0-rc.6. - [Release notes](https://github.com/NVIDIA/nvidia-container-toolkit/releases) - [Changelog](https://github.com/NVIDIA/nvidia-container-toolkit/blob/main/CHANGELOG.md) - [Commits](NVIDIA/nvidia-container-toolkit@v1.18.0-rc.5...v1.18.0-rc.6) --- updated-dependencies: - dependency-name: github.com/NVIDIA/nvidia-container-toolkit dependency-version: 1.18.0-rc.6 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> commit 11f6c02 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Sun Oct 12 17:02:18 2025 +0000 build(deps): bump golang.org/x/time from 0.9.0 to 0.14.0 Bumps [golang.org/x/time](https://github.com/golang/time) from 0.9.0 to 0.14.0. - [Commits](golang/time@v0.9.0...v0.14.0) --- updated-dependencies: - dependency-name: golang.org/x/time dependency-version: 0.14.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <[email protected]> commit c614e61 Merge: a79a9fd 4ced422 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Oct 11 16:54:20 2025 +0200 Merge pull request NVIDIA#633 from jgehrcke/jp/verbosity-vs-debuggability-improvements Add `logVerbosity` Helm chart parameter, reduce default log verbosity commit 4ced422 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Oct 11 14:45:52 2025 +0000 Remove newline, document env-based log verb flip Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 4cf3d9b Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 10 17:57:00 2025 +0000 Fix a typo in an error message Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 2c943f7 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Oct 11 13:48:35 2025 +0000 tests: remove sinful duplicate env strategy This also had a side effect on subsequent tests, with the controller starting with _no_ LOG_VERBOSITY environment variable set. I don't understand that, but that must be a funky Helm-ism. Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 3d5c51f Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Oct 11 13:01:21 2025 +0000 tests: fix: wait for controller flip Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit b172342 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Oct 11 12:21:07 2025 +0000 tests: replace hard-coded sleep with dynamic wait Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 9748095 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 10 19:36:24 2025 +0000 tests: cover CD daemon log levels Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 4767092 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 11:38:40 2025 +0000 Helm logVerbosity param: add docs, start building tests Helm values.yaml: defaultLogVerbosity incl. docs Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> values.yaml: tweak, based on in log level insights Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> improve helm chart artifact commentary Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> squash: tweak docs Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> Rename chart var, start building tests Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> tests: cover log verbosity set per-component via env Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> helm: rename defaultLogVerbosity to logVerbosity Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 3828da9 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 10 18:28:50 2025 +0000 CD daemon: change verbosity of "wait for nodes update" message Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 6d35ac1 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 10 17:56:31 2025 +0000 CD controller: make CD daemon verbosity a required arg Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 84530ab Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 14:07:33 2025 +0000 CD controller: log manager config on startup Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit bb16c33 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 11:31:36 2025 +0000 CD controller/plugins/daemon: introduce LOG_VERBOSITY Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 7e89b22 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 11:29:46 2025 +0000 CD controller: introduce LOG_VERBOSITY_CD_DAEMON Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit c5b147b Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 14:10:33 2025 +0000 tests: add note about instability around chart flip Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 4cc705a Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 11:46:10 2025 +0000 Helm: expose kubelet plugin env via chart variables Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 5f143b2 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 15:53:07 2025 +0000 Upper-case log msg, no explicit verb 0 Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 8321983 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Sep 30 12:16:17 2025 +0000 Change log message levels according to new system Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit a36e214 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Sep 30 12:12:38 2025 +0000 Add logVerbosity Helm chart parameter Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit a79a9fd Merge: 3903df7 6e56823 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Sat Oct 11 13:43:17 2025 +0200 Merge pull request NVIDIA#646 from jgehrcke/jp/no-clique-update-cd-node-status Release workload on a non-MNNVL node in a CD commit 6e56823 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 10 19:47:48 2025 +0000 CD plugin: move CDI edit gen into computeDomainDaemonSettings Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> make diff smaller, rename func Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit f7e4a45 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 10 16:30:27 2025 +0000 CD daemon: always mount in IMEX daemon config files Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> CD plugin: always prepare IMEX config on the host and mount it in Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit c040429 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 10 15:59:07 2025 +0000 Fix typos in comments and log message Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit deccb4d Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 11:39:33 2025 +0000 CD plugin: always inject CD details via CDI Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> Rename 'domain' to 'domainID' Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> squash: review feedback Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> shorten comment Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 023e7f9 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 11:38:23 2025 +0000 Enrich error message with CD detail when CD not found Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 32180ad Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 11:37:43 2025 +0000 CD daemon: unconditionally write IMEX daemon config Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> Break out of select/case, MkdirAll() before writing file Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 13df4da Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 09:58:50 2025 +0000 CD daemon: init node status as NotReady, misc log msg & comment tweaks Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 3cbd5a4 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 09:55:47 2025 +0000 CD daemon: keep business logic in no-IMEX-daemon noop mode Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit fffcea2 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 09:50:30 2025 +0000 Introduce maxNodesPerIMEXDomain special case for empty cliqueID Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit e0b8990 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 09:49:07 2025 +0000 Update code comments Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 3903df7 Merge: 14dc9fe 72e39e9 Author: Kevin Klues <[email protected]> Date: Fri Oct 10 13:22:31 2025 +0200 Merge pull request NVIDIA#661 from jgehrcke/jp/flush-logs-on-shutdown Flush logs in CLI app `After` hook commit 14dc9fe Merge: 8788dd1 d34a12f Author: Kevin Klues <[email protected]> Date: Fri Oct 10 13:16:53 2025 +0200 Merge pull request NVIDIA#656 from jgehrcke/jp/custom-rate-limiting Introduce DefaultPrepUnprepRateLimiter (less aggressive) commit 8788dd1 Merge: 23d205f 0770c0a Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Fri Oct 10 12:43:33 2025 +0200 Merge pull request NVIDIA#666 from klueska/rbac-update Separate controller and kubeletplugin into separate RBAC permissions commit 0770c0a Author: Kevin Klues <[email protected]> Date: Thu Oct 9 13:41:03 2025 +0000 Separate controller and kubeletplugin into separate RBAC permissions Signed-off-by: Kevin Klues <[email protected]> commit 23d205f Merge: fca1c08 816c7a1 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 10:01:28 2025 +0200 Merge pull request NVIDIA#664 from NVIDIA/dependabot/docker/deployments/container/main/nvidia/distroless/cc-v3.1.13-dev build(deps): bump nvidia/distroless/cc from v3.1.12-dev to v3.1.13-dev in /deployments/container commit fca1c08 Merge: e089759 b15d633 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Thu Oct 9 09:56:21 2025 +0200 Merge pull request NVIDIA#665 from NVIDIA/dependabot/docker/deployments/devel/main/golang-1.25.2 build(deps): bump golang from 1.25.1 to 1.25.2 in /deployments/devel commit b15d633 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed Oct 8 17:15:07 2025 +0000 build(deps): bump golang from 1.25.1 to 1.25.2 in /deployments/devel Bumps golang from 1.25.1 to 1.25.2. --- updated-dependencies: - dependency-name: golang dependency-version: 1.25.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]> commit 816c7a1 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Wed Oct 8 17:15:03 2025 +0000 build(deps): bump nvidia/distroless/cc in /deployments/container Bumps nvidia/distroless/cc from v3.1.12-dev to v3.1.13-dev. --- updated-dependencies: - dependency-name: nvidia/distroless/cc dependency-version: v3.1.13-dev dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> commit 72e39e9 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Sep 30 09:42:57 2025 +0000 Flush logs in CLI app `After` hook Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit d34a12f Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 8 15:18:53 2025 +0200 Adjust go.mod to recent changes Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 7e18c33 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Sep 30 12:18:41 2025 +0000 Introduce DefaultPrepUnprepRateLimiter (less aggressive) Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit e089759 Merge: 765892d e9f647e Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Wed Oct 8 13:09:32 2025 +0200 Merge pull request NVIDIA#651 from jgehrcke/jp/issue-694 CD daemon: coordinate CD updates on shutdown via mutation cache commit e9f647e Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 17:55:26 2025 +0000 tests: cover CD daemon cleanup-on-shutdown Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 980a6a1 Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 17:06:42 2025 +0000 CD daemon: pod mngr: store UpdateStatus return value in mutation cache This makes sure that fast incremental mutations on the same CD object performed during shutdown are done conflict-free (i.e., in actual, incremental fashion using intermediate state returned by the API server). Without this patch: I1007 16:49:01.678050 1 podmanager.go:196] Successfully updated node gb-nvl-043-compute06 status to NotReady E1007 16:49:01.681345 1 computedomain.go:161] Failed to remove node from ComputeDomain during shutdown: [...] \ "the object has been modified" [...] With this patch: I1007 16:59:55.350436 1 podmanager.go:200] Successfully updated node gb-nvl-043-compute07 status to NotReady I1007 16:59:55.353551 1 computedomain.go:402] Successfully removed node with IP 192.168.34.153 from ComputeDomain default/imex-channel-injection Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 4b91fce Author: Dr. Jan-Philip Gehrcke <[email protected]> Date: Tue Oct 7 15:50:06 2025 +0000 CD daemon: coordinate CD updates on shutdown via mutationcache Signed-off-by: Dr. Jan-Philip Gehrcke <[email protected]> commit 765892d Merge: 2b7e899 754a758 Author: Kevin Klues <[email protected]> Date: Wed Oct 8 09:52:51 2025 +0200 Merge pull request NVIDIA#650 from NVIDIA/dependabot/github_actions/github/codeql-action-4 build(deps): bump github/codeql-action from 3 to 4 commit 754a758 Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Date: Tue Oct 7 17:08:56 2025 +0000 build(deps): bump github/codeql-action from 3 to 4 Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3 to 4. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@v3...v4) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: '4' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Refer to installation instructions from the README.
As a middle-ground, as things are rather heavily in flux, point to the Wiki for now. That's also where we will for now point to from within the CNT docs until things stabilize (upon GPU Operator integration, maybe).