ld.so.preload hardcodes /usr/local/vgpu regardless of devicePlugin.libPath, breaks on Bottlerocket EKS #1713

## Problem

When deploying HAMi on EKS with Bottlerocket nodes, `devicePlugin.libPath` must be set to a writable path such as `/var/lib/hami/vgpu`, because `/usr/local` is read-only on Bottlerocket.

`vgpu-init.sh` (run as a postStart lifecycle hook) copies all files from `/k8s-vgpu/lib/nvidia/` to `libPath` on the host. The `ld.so.preload` file shipped in the image hardcodes `/usr/local/vgpu/libvgpu.so` regardless of the configured `libPath`. As a result, every device plugin pod restart either overwrites the host `ld.so.preload` with the wrong path or, if a copy with a matching MD5 already exists from a previous run, leaves the wrong path in place. In both cases `libvgpu.so` fails to preload in every workload container.

This surfaced while investigating a Bottlerocket deployment and is related to #969 and #971.

## Reproduction

1. Deploy HAMi on EKS with Bottlerocket (`aws-k8s-1.33-nvidia` variant)
2. Set `devicePlugin.libPath: /var/lib/hami/vgpu` (required, because `/usr/local` is read-only on Bottlerocket)
3. After the device plugin pod starts, check the preload file:
   ```
   kubectl exec -n <ns> <device-plugin-pod> -c device-plugin -- \
     cat /var/lib/hami/vgpu/ld.so.preload
   # Output: /usr/local/vgpu/libvgpu.so   ← wrong path
   ```
4. Any workload pod requesting `nvidia.com/gpumem` gets:
   ```
   ERROR: ld.so: object '/usr/local/vgpu/libvgpu.so' from /etc/ld.so.preload cannot be preloaded
   ```
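
The check in step 3 can be scripted so that a mismatch is reported explicitly. The following is a minimal sketch, not part of HAMi: `check_preload` is a hypothetical helper, and the demo runs against a temporary file that mimics the broken state rather than against a real node.

```shell
#!/bin/sh
# check_preload FILE LIB_PATH
# Hypothetical helper: succeeds only if the first line of FILE is LIB_PATH/libvgpu.so.
check_preload() {
  expected="$2/libvgpu.so"
  actual=$(head -n 1 "$1")
  if [ "$actual" = "$expected" ]; then
    echo "ok: $actual"
  else
    echo "mismatch: got '$actual', expected '$expected'"
    return 1
  fi
}

# Demo: a temporary file holding the path that the image hardcodes.
tmp=$(mktemp)
echo /usr/local/vgpu/libvgpu.so > "$tmp"
check_preload "$tmp" /var/lib/hami/vgpu || echo "preload path does not match libPath"
rm -f "$tmp"
```

On a real node the first argument would be the `ld.so.preload` file under the configured `libPath`.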

## Root cause

The image's `/k8s-vgpu/lib/nvidia/ld.so.preload` contains the hardcoded path `/usr/local/vgpu/libvgpu.so`. `vgpu-init.sh` copies it as-is, using MD5-based diffing to decide whether a copy is needed. The chart renders `libPath` correctly into env vars and volume mounts but never writes the resulting path into `ld.so.preload`.
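
The behaviour can be illustrated with a simplified stand-in for the copy step. This is a sketch under assumptions, not the actual `vgpu-init.sh`: the `sync_file` function and the temporary directories are hypothetical, and `md5sum` from GNU coreutils is assumed to be available.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
image_dir="$work/image"   # stands in for /k8s-vgpu/lib/nvidia/ in the image
host_dir="$work/host"     # stands in for the configured libPath on the host
mkdir -p "$image_dir" "$host_dir"

# The image ships a preload file with the hardcoded path.
echo /usr/local/vgpu/libvgpu.so > "$image_dir/ld.so.preload"

# Copy src to dst unless dst already exists with the same MD5 checksum.
sync_file() {
  src=$1
  dst=$2
  if [ -f "$dst" ] && [ "$(md5sum < "$src")" = "$(md5sum < "$dst")" ]; then
    return 0
  fi
  cp "$src" "$dst"
}

for f in "$image_dir"/*; do
  sync_file "$f" "$host_dir/$(basename "$f")"
done

# The host copy carries the hardcoded path, whatever libPath was configured.
cat "$host_dir/ld.so.preload"
```

Nothing in this flow consults `libPath` when producing the file contents, which is why the wrong path survives every restart.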

## Proposed fix

Add `ld.so.preload` as a data key in the existing device-plugin ConfigMap, rendered from `{{ .Values.devicePlugin.libPath }}/libvgpu.so`. Mount it into the device-plugin container at `/k8s-vgpu/lib/nvidia/ld.so.preload` using `subPath` on the existing `deviceconfig` volume. `vgpu-init.sh`'s MD5-based copy logic then picks up the correct path from the ConfigMap instead of the image's hardcoded version. No new Kubernetes resources are required.
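
The intended effect can be simulated locally. In this sketch plain directories stand in for the container path, the `subPath` mount, and the host, and the inline checksum comparison is an assumed approximation of the script's copy logic (`md5sum` from GNU coreutils is assumed). It starts from a stale host file and shows the rendered value replacing it.

```shell
#!/bin/sh
set -eu

LIB_PATH=/var/lib/hami/vgpu       # example value of devicePlugin.libPath
work=$(mktemp -d)
src="$work/k8s-vgpu/lib/nvidia"   # stands in for /k8s-vgpu/lib/nvidia in the container
dst="$work/host"                  # stands in for libPath on the host
mkdir -p "$src" "$dst"

# Stale state left by an earlier run that copied the image's hardcoded file.
echo /usr/local/vgpu/libvgpu.so > "$dst/ld.so.preload"

# What the subPath mount would present at the source path: the rendered value.
printf '%s/libvgpu.so\n' "$LIB_PATH" > "$src/ld.so.preload"

# Checksum comparison: the files now differ, so the host copy is replaced.
sum_of() { md5sum "$1" | cut -d' ' -f1; }
if [ "$(sum_of "$src/ld.so.preload")" != "$(sum_of "$dst/ld.so.preload")" ]; then
  cp "$src/ld.so.preload" "$dst/ld.so.preload"
fi

cat "$dst/ld.so.preload"
```

Because the rendered file differs from the stale host copy, the checksum comparison triggers a copy, so nodes that already hold the wrong path would be corrected on the next device plugin restart.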

