fix(geneformer): resolve protobuf conflict with nvidia-resiliency-ext>=0.6.0#1598
Open
svc-bionemo wants to merge 1 commit into
Collaborator
/ok to test 3e37d22
pstjohn
approved these changes
Jun 3, 2026
jstjohn
approved these changes
Jun 3, 2026
…>=0.6.0

megatron-core==0.17.1 requires nvidia-resiliency-ext>=0.6.0 at runtime, but nvidia-resiliency-ext 0.6.0 pulls in grpcio-tools>=1.76.0, which requires protobuf>=6.30.0, conflicting with nemo-toolkit==2.4.0 pinning protobuf~=5.29.5.

Fix by using a .ci_build.sh script that installs nvidia-resiliency-ext with --no-deps (skipping grpcio-tools) and then installs the package normally. grpcio-tools is not needed for geneformer test execution.

Signed-off-by: svc-bionemo <267129667+svc-bionemo@users.noreply.github.com>
auto-merge was automatically disabled
June 3, 2026 15:25
Head branch was pushed to by a user without write access
3e37d22 to 9fe6df5
Problem
The geneformer nightly CI fails because megatron-core==0.17.1 asserts nvidia-resiliency-ext>=0.6.0 at import time, but that version isn't installed in the CI container.
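Simply adding nvidia-resiliency-ext>=0.6.0 to pyproject.toml creates an unresolvable pip conflict: nvidia-resiliency-ext 0.6.0 pulls in grpcio-tools>=1.76.0, which requires protobuf>=6.30.0, while nemo-toolkit==2.4.0 pins protobuf~=5.29.5.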
Fix
Add a `.ci_build.sh` that installs nvidia-resiliency-ext>=0.6.0 with `--no-deps` (skipping grpcio-tools, which is not needed for test execution), then installs the package normally.
The CI workflow already supports `.ci_build.sh` as a hook: if the file exists, it runs that instead of the default `pip install -e .`
Root Cause
nvidia-resiliency-ext 0.6.0 added grpcio/grpcio-tools as hard dependencies (for its new gRPC-based fault tolerance features), but these are incompatible with nemo-toolkit 2.4.0's protobuf pin. This will resolve when nemo-toolkit relaxes its protobuf constraint.
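
The two-step install described under Fix can be sketched as a small shell script. This is a minimal sketch under stated assumptions, not the actual contents of this PR's `.ci_build.sh` (which are not shown in this conversation): the `ci_build` function name and the `PIP_CMD` override are hypothetical, added only so the steps can be exercised without installing anything.

```shell
#!/usr/bin/env bash
set -euo pipefail

# PIP_CMD is a hypothetical override (not part of the PR); it defaults to pip.
# Setting PIP_CMD="echo pip" prints the commands instead of running them.
PIP_CMD="${PIP_CMD:-pip}"

ci_build() {
    # Step 1: install nvidia-resiliency-ext without its dependencies, so that
    # grpcio-tools (and its protobuf>=6.30.0 requirement) is never pulled in.
    # PIP_CMD is intentionally unquoted so "echo pip" splits into two words.
    ${PIP_CMD} install --no-deps "nvidia-resiliency-ext>=0.6.0"

    # Step 2: install the package itself, as the default CI step would.
    ${PIP_CMD} install -e .
}
```

To act as the hook, the file would end with a bare `ci_build` call. Because `--no-deps` skips every dependency of nvidia-resiliency-ext and not only grpcio-tools, this approach relies on the remaining dependencies already being present in the CI container or being installed by the second step.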