@weizhoublue
Member

I am a robot. Upgrade: project gpu-operator chart from v25.3.0 to v25.10.0

@gemini-code-assist

Summary of Changes

Hello @weizhoublue, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request performs a significant version upgrade for the gpu-operator Helm chart, moving from v25.3.0 to v25.10.0. This comprehensive update primarily focuses on bringing the node-feature-discovery sub-chart to its latest version (v0.18.2) and updating various configurations within the GPU Operator components. Key changes include enabling CDI by default, standardizing probe mechanisms from gRPC to HTTPGet, introducing PodDisruptionBudgets for NFD components, and refining CRD definitions to support new features and improve security practices.

Highlights

  • GPU Operator Chart Upgrade: The core gpu-operator Helm chart has been upgraded from version v25.3.0 to v25.10.0, reflecting a significant update to the overall GPU Operator deployment.
  • Node Feature Discovery (NFD) Sub-chart Update: The node-feature-discovery (NFD) sub-chart, a key dependency, has been updated from v0.17.2 to v0.18.2, bringing in new features and bug fixes from the NFD project.
  • CDI Enabled by Default: The Container Device Interface (CDI) is now enabled by default in the gpu-operator chart, streamlining GPU access for containers.
  • NFD Probe and Port Configuration Changes: NFD components (master, GC, topologyUpdater, worker) now use httpGet probes instead of grpc for health checks, and their port definitions have been standardized to a single port instead of separate metricsPort and healthPort.
  • New NFD PodDisruptionBudgets: PodDisruptionBudgets (PDBs) have been introduced for the NFD master and garbage collector components, enhancing the availability and resilience of these critical services during voluntary disruptions.
  • CRD Enhancements and Deprecations: Custom Resource Definitions (CRDs) for ClusterPolicy and NVIDIADriver have been updated, introducing new fields like hostPID for DCGM-Exporter, secretEnv for NVIDIA Driver, and deprecating licensingConfig.configMapName in favor of a more secure secretName.
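
As a rough illustration of two of the highlights above (the switch to httpGet probes and the new PodDisruptionBudgets), the sketch below shows what such Kubernetes configuration generally looks like. The `/healthz` path, the `http` port name, the `nfd-master` resource name, the label selector, and `minAvailable: 1` are assumptions chosen for illustration; they are not copied from the upgraded chart.

```yaml
# Illustrative container probe fragment using httpGet rather than grpc.
# The path and port name are assumed, not taken from the chart.
livenessProbe:
  httpGet:
    path: /healthz
    port: http
---
# Illustrative PodDisruptionBudget limiting voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nfd-master          # hypothetical name
spec:
  minAvailable: 1           # assumed value
  selector:
    matchLabels:
      role: master          # hypothetical label
```

The actual templates in the chart should be checked for the real port names, paths, and selectors before relying on any of these values.
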
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request upgrades the gpu-operator Helm chart from v25.3.0 to v25.10.0. The changes primarily involve version bumps across various configuration files and sub-charts, along with updates to reflect new features and configurations in the upgraded dependencies, such as node-feature-discovery. While most of the changes are straightforward dependency updates, I've identified a few areas with incorrect YAML templating that could lead to deployment failures. My review includes suggestions to fix these templating issues to ensure the chart is valid and robust.

Comment on lines 107 to 109
{{- with .Values.master.extraEnvs }}
{{- toYaml . | nindent 8 }}
{{- toYaml . | nindent 10 }}
{{- end}}

critical

The extraEnvs block is incorrectly placed and indented, which will result in an invalid YAML structure. It should be placed inside the env list of the container definition. This change corrects the placement and indentation to ensure that extraEnvs are properly appended to the container's environment variables.

          env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          {{- with .Values.master.extraEnvs }}
          {{- toYaml . | nindent 10 }}
          {{- end }}
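
To make the effect of the suggested indentation concrete, here is a sketch of how the block would render. The `FOO`/`bar` entry is a hypothetical value supplied only for illustration; the rendered fragment assumes `env:` sits at ten spaces of indentation, as in the suggestion above.

```yaml
# values.yaml (hypothetical entry for illustration)
master:
  extraEnvs:
    - name: FOO
      value: bar
---
# Rendered container fragment with nindent 10: the extra entry
# lines up with the existing items of the env list.
          env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: FOO
            value: bar
```

With `nindent 8`, the extra items would sit two columns to the left of `env:`, so they would no longer parse as entries of that list.
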

imagePullPolicy: {{ .Values.operator.imagePullPolicy }}
command:
- /bin/sh
- sh

medium

Using /bin/sh is unnecessary as sh is typically in the PATH. It's a minor simplification to use sh directly.

          - sh

imagePullPolicy: {{ .Values.operator.imagePullPolicy }}
command:
- /bin/sh
- sh

medium

Using /bin/sh is not necessary as sh is usually available in the PATH. You can simplify this by using sh directly.

          - sh

@github-actions github-actions bot force-pushed the upgrade/gpu-operator/v25.10.0 branch 7 times, most recently from 4c59744 to d275249 Compare November 2, 2025 20:07
@github-actions github-actions bot force-pushed the upgrade/gpu-operator/v25.10.0 branch 7 times, most recently from 09d52dd to fe968de Compare November 9, 2025 20:07
@github-actions github-actions bot force-pushed the upgrade/gpu-operator/v25.10.0 branch 8 times, most recently from 0bef275 to 98ad774 Compare November 17, 2025 20:07
@github-actions github-actions bot force-pushed the upgrade/gpu-operator/v25.10.0 branch 2 times, most recently from 5b347b6 to c2159b2 Compare November 19, 2025 20:07
@github-actions github-actions bot force-pushed the upgrade/gpu-operator/v25.10.0 branch 8 times, most recently from 5eebeb3 to ef4c8ce Compare November 27, 2025 20:07
@github-actions github-actions bot force-pushed the upgrade/gpu-operator/v25.10.0 branch from ef4c8ce to 364f880 Compare November 28, 2025 20:07