[Ubuntu24.04] Install the driver in a single step #156
ubuntu24.04/nvidia-driver
    --x-module-path=/tmp/null \
    --x-library-path=/tmp/null \
    --x-sysconfig-path=/tmp/null \
    -m=${KERNEL_TYPE} \
Out of scope for this PR: looking at this again, it is probably best not to set this option explicitly unless the user sets KERNEL_TYPE (or whatever API is exposed by the operator). Now that we are leveraging nvidia-installer to perform the compilation and installation, we can depend on the defaults it applies for the kernel module type. For example, on newer driver versions it will automatically choose to install the open modules on compatible systems.
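A minimal sketch of what that could look like, assuming the script keeps building its installer flags in shell. The function name `build_installer_args` is hypothetical and not part of the driver script; the flags are the ones from the snippet under review, and `-m` is appended only when `KERNEL_TYPE` is non-empty.

```shell
# Sketch only: build_installer_args is a hypothetical helper, not code from
# this PR. It prints one installer flag per line.
build_installer_args() {
    # Flags that are always passed (taken from the snippet under review).
    set -- \
        --x-module-path=/tmp/null \
        --x-library-path=/tmp/null \
        --x-sysconfig-path=/tmp/null

    # Append -m only when the user explicitly set KERNEL_TYPE; otherwise
    # leave the choice to the defaults applied by nvidia-installer.
    if [ -n "${KERNEL_TYPE:-}" ]; then
        set -- "$@" "-m=${KERNEL_TYPE}"
    fi

    printf '%s\n' "$@"
}
```

With `KERNEL_TYPE` unset or empty, the output contains only the three `--x-*` flags, so the installer falls back to its own default.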
Signed-off-by: Tariq Ibrahim <[email protected]>
This change condenses the two-step driver install into a single step.
Currently, the driver image installs the userspace components and the kernel modules separately. This split existed so that the kernel modules could be signed with a custom private key and then relinked, and so that the kernel modules could be rebuilt should the kernel on the underlying host be updated.
As none of these workflows apply today, we simplify the driver installation and allow for defining an API in gpu-operator through which users can easily pass custom runfile installation arguments.
I have tested driver upgrades and updates (with both the open and closed RM kernel modules) with these changes, and the driver container has been running with no issues.
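As a rough illustration of how custom runfile arguments could reach a single-step install, the sketch below composes one installer command line from a user-supplied string. Both `compose_install_command` and the `DRIVER_INSTALL_ARGS` variable are hypothetical names chosen for this example; they are not an existing gpu-operator API, and the actual interface is left to be defined.

```shell
# Sketch only: DRIVER_INSTALL_ARGS and compose_install_command are
# hypothetical names, not part of this PR or of gpu-operator.
compose_install_command() {
    runfile="$1"

    # The user-supplied string is intentionally left unquoted so that it is
    # split on whitespace into separate arguments. A real implementation
    # would need to handle quoting and glob characters more carefully.
    set -- sh "$runfile" --silent ${DRIVER_INSTALL_ARGS:-}

    # Print the command instead of running it, so the result can be inspected.
    printf '%s\n' "$*"
}
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; in a driver container the same argument list would be passed to the runfile in one invocation.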