@cuichenx cuichenx commented Oct 29, 2025

NVIDIA Nemotron Nano v2 VL is an open 12B multimodal reasoning model for document intelligence and video understanding. It enables AI assistants to extract, interpret, and act on information across text, images, tables, and videos. This makes the model valuable for agents focused on data analysis, document processing, and visual understanding, in applications such as report generation, video curation, dense captioning for media asset management, and retrieval-augmented search.

NeMo Megatron Bridge supports finetuning this model (including LoRA finetuning) on single-image, multi-image, and video datasets. The finetuned model can be converted back to the 🤗 Hugging Face format for downstream evaluation.

The model is currently available in the nvcr.io/nvidia/nemo:25.09.nemotron_nano_v2_vl container. This PR brings that support to the main branch.

Documentation: https://docs.nvidia.com/nemo/megatron-bridge/latest/models/vlm/nemotron-nano-v2-vl.html
Notable differences compared to the code in the nvcr.io/nvidia/nemo:25.09.nemotron_nano_v2_vl container:

  1. The forward step is renamed from nemotron_nano_v2_vl_step to llava_step.
  2. VLM inference is moved to a standalone script, hf_to_megatron_generate_nemotron_vlm.py, to distinguish the two types of models. The --use_llava_model argument is removed, since its behavior is hard-coded into the new script.

Requires this Megatron-LM branch: NVIDIA/Megatron-LM#2115

cuichenx and others added 30 commits September 17, 2025 22:01
Signed-off-by: yaoyu-33 <[email protected]>
# Conflicts:
#	src/megatron/bridge/training/config.py
Nemotron Nano V2 VL bridge and provider

See merge request chcui/Megatron-Bridge!1
HF export

See merge request chcui/Megatron-Bridge!2
Signed-off-by: yaoyu-33 <[email protected]>
@cuichenx cuichenx merged commit 04c9f05 into main Nov 17, 2025
43 checks passed
@cuichenx cuichenx deleted the chcui/nemotron-nano-v2-vl branch November 17, 2025 20:26
chtruong814 pushed a commit that referenced this pull request Nov 17, 2025
Signed-off-by: yaoyu-33 <[email protected]>
Signed-off-by: Chen Cui <[email protected]>
Co-authored-by: yaoyu-33 <[email protected]>
Co-authored-by: Li Ding <[email protected]>
Signed-off-by: NeMo Bot <[email protected]>
@pablo-garay

Code LGTM from CICD/tests perspective

default=None,
help="Path to load the model in Megatron checkpoint format. If provided, model will not start from HF checkpoint.",
)
parser.add_argument("--not-strict", action="store_true", help="Perform loose validation during weight export")
Looks like args.not_strict is never passed to the main function, so the --not-strict flag currently has no effect. @cuichenx, maybe fix this in a follow-up PR, since this one has already been merged.
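
For reference, a minimal sketch of what forwarding the flag could look like. Only the --not-strict option and its help text come from the diff above. The main() signature, the strict keyword, the --megatron-path option name, and the run() helper are assumptions made for illustration and may not match the actual script.

```python
import argparse


def main(megatron_path=None, strict=True):
    """Hypothetical stand-in for the script's main().

    The real function would run the export; this one only reports the
    settings it received so the wiring can be checked.
    """
    return {"megatron_path": megatron_path, "strict": strict}


def build_parser():
    parser = argparse.ArgumentParser()
    # Option name assumed for illustration; only the help text is from the diff.
    parser.add_argument(
        "--megatron-path",
        default=None,
        help="Path to load the model in Megatron checkpoint format. "
        "If provided, model will not start from HF checkpoint.",
    )
    parser.add_argument(
        "--not-strict",
        action="store_true",
        help="Perform loose validation during weight export",
    )
    return parser


def run(argv=None):
    args = build_parser().parse_args(argv)
    # The step reported as missing: forward the parsed flag to main().
    # argparse exposes --not-strict as args.not_strict.
    return main(megatron_path=args.megatron_path, strict=not args.not_strict)
```

With this wiring, run([]) keeps strict validation on, and run(["--not-strict"]) turns it off.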

@yfw yfw restored the chcui/nemotron-nano-v2-vl branch November 19, 2025 01:52
sudostock pushed a commit to sudostock/Megatron-Bridge that referenced this pull request Nov 21, 2025
Signed-off-by: yaoyu-33 <[email protected]>
Signed-off-by: Chen Cui <[email protected]>
Co-authored-by: yaoyu-33 <[email protected]>
Co-authored-by: Li Ding <[email protected]>
Labels

r0.2.0 Cherry-pick label for r0.2.0 release branch

9 participants