Add Nemotron nano v2 vl #1136
Merged
Signed-off-by: yaoyu-33 <[email protected]>
…motron-nano-v2-vl
# Conflicts: # src/megatron/bridge/training/config.py
model Signed-off-by: yaoyu-33 <[email protected]>
Nemotron Nano V2 VL bridge and provider See merge request chcui/Megatron-Bridge!1
HF export See merge request chcui/Megatron-Bridge!2
suiyoubi approved these changes on Nov 17, 2025
liding-nv approved these changes on Nov 17, 2025
pablo-garay approved these changes on Nov 17, 2025
chtruong814 pushed a commit that referenced this pull request on Nov 17, 2025
Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Li Ding <[email protected]> Signed-off-by: NeMo Bot <[email protected]>
Contributor comment: Code LGTM from CICD/tests perspective
liding-nv reviewed on Nov 18, 2025
        default=None,
        help="Path to load the model in Megatron checkpoint format. If provided, model will not start from HF checkpoint.",
    )
    parser.add_argument("--not-strict", action="store_true", help="Perform loose validation during weight export")
Looks like args.not_strict is never passed to the main function, so the --not-strict flag has no effect. @cuichenx, maybe fix this in a separate PR, since this one has already been merged.
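The wiring issue raised in this comment can be sketched as follows. This is a minimal illustration, not the actual script from the PR: the `main` signature, its `strict` keyword, the `--hf-model-path` and `--megatron-model-path` option names, and the `run` helper are all hypothetical. Only the `--not-strict` flag and its help text come from the diff above. The point is that a `store_true` flag does nothing unless the parsed value is explicitly forwarded to the function that uses it.

```python
import argparse


def main(hf_model_path, megatron_model_path=None, strict=True):
    """Hypothetical entry point; returns its settings so the wiring is visible."""
    return {
        "hf_model_path": hf_model_path,
        "megatron_model_path": megatron_model_path,
        "strict": strict,
    }


def build_parser():
    parser = argparse.ArgumentParser()
    # Option names below are assumptions, except --not-strict.
    parser.add_argument("--hf-model-path", default="dummy-model")
    parser.add_argument("--megatron-model-path", default=None)
    parser.add_argument(
        "--not-strict",
        action="store_true",
        help="Perform loose validation during weight export",
    )
    return parser


def run(argv=None):
    args = build_parser().parse_args(argv)
    # Forward the flag explicitly; argparse exposes --not-strict as args.not_strict.
    return main(
        args.hf_model_path,
        megatron_model_path=args.megatron_model_path,
        strict=not args.not_strict,
    )


if __name__ == "__main__":
    print(run())
```

Inverting the flag at the call site (`strict=not args.not_strict`) keeps the function's own parameter positive, which tends to read more clearly than passing a negated name through several layers.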
sudostock pushed a commit to sudostock/Megatron-Bridge that referenced this pull request on Nov 21, 2025
Signed-off-by: yaoyu-33 <[email protected]> Signed-off-by: Chen Cui <[email protected]> Co-authored-by: yaoyu-33 <[email protected]> Co-authored-by: Li Ding <[email protected]>
NVIDIA Nemotron Nano v2 VL is an open 12B multimodal reasoning model for document intelligence and video understanding. It enables AI assistants to extract, interpret, and act on information across text, images, tables, and videos. This makes the model valuable for agents focused on data analysis, document processing, and visual understanding, in applications such as report generation, video curation, dense captioning for media asset management, and retrieval-augmented search.
NeMo Megatron Bridge supports finetuning this model (including LoRA finetuning) on single-image, multi-image, and video datasets. The finetuned model can be converted back to the 🤗 Hugging Face format for downstream evaluation.
The model is currently available in the nvcr.io/nvidia/nemo:25.09.nemotron_nano_v2_vl container. This is the PR to the main branch.
Documentation: https://docs.nvidia.com/nemo/megatron-bridge/latest/models/vlm/nemotron-nano-v2-vl.html
Notable differences compared to the code in the nvcr.io/nvidia/nemo:25.09.nemotron_nano_v2_vl container:
--use_llava_model is removed (hard-coded into the new script)
Requires this Megatron branch: NVIDIA/Megatron-LM#2115