Add Nemotron nano v2 vl #1136
Merged
111 commits
e63ed61 (cuichenx) add wip code
7858117 (yaoyu-33) update utils for transformers config in hydra
457bace (yaoyu-33) temp save
22233a2 (cuichenx) pipeclean conversion (forward wip)
6937da4 (yaoyu-33) Merge branch 'refs/heads/main' into qwen-25vl-training
c67f734 (cuichenx) vlm generate script updates for nemotron vl
fcca45c (cuichenx) Merge remote-tracking branch 'refs/remotes/origin/main' into chcui/ne…
790cd8d (cuichenx) fix after merging with main
3a9ab4f (cuichenx) clean up
e0fc7d1 (cuichenx) fix forward pass
44faee0 (cuichenx) add /no_think sys prompt
8a51440 (yaoyu-33) Merge branch 'refs/heads/main' into qwen-25vl-training
3bc6ba5 (yaoyu-33) lint
8061e0f (yaoyu-33) revert qwen-vl changes in gpt
df4755a (yaoyu-33) revert qwen-vl changes in gpt #2
975efd2 (yaoyu-33) Add mock dataset provider for qwen25 vl
be708c2 (yaoyu-33) add qwen25 vl dataset support from auto
6822d34 (yaoyu-33) lint
ec9c7cd (cuichenx) enable multi image and video inputs
bc8c605 (yaoyu-33) update _attn_implementation
689f491 (yaoyu-33) update comments
cf2c769 (cuichenx) Merge branch 'chcui/nemotron-nano-v2-vl' into 'dev/nemotron-nano-v2-vl'
4f0e90f (yaoyu-33) add preloaded dataset provider
4959ea5 (cuichenx) enable hf export (need to manually copy over modeling files)
98caa7a (cuichenx) expose strict
2af0c2e (yaoyu-33) update _processor to a private attr
4a3ef3b (cuichenx) Merge branch 'chcui/hf_export' into 'dev/nemotron-nano-v2-vl'
7f3818e (cuichenx) Merge branch 'refs/heads/main' into chcui/nano-v2-vl-training
ccf6abe (yaoyu-33) update qwen training utils
94c6192 (yaoyu-33) training bug fix
95d3002 (yaoyu-33) fix finalize grad
4b7ef60 (yaoyu-33) save qwen25 vl recipes
c37ffa0 (cuichenx) training WIP
03e3a7c (cuichenx) undo ckpt modification, loading works
b095aae (cuichenx) Merge branch 'chcui/nano-v2-vl-training' into 'dev/nemotron-nano-v2-vl'
608117e (yaoyu-33) add padding logic for pp
a9f0e15 (yaoyu-33) vlm step general
6ddd4b3 (yaoyu-33) default update
f30aa39 (yaoyu-33) Merge branch 'main' into qwen-25vl-training
e425113 (yaoyu-33) update to model specific visual inputs, also update mock dataset to b…
5bc1f29 (yaoyu-33) Merge branch 'main' into qwen-25vl-training
90a0ff0 (yaoyu-33) add ci tests
49759bc (yaoyu-33) lint
62ffa88 (yaoyu-33) update dependency
6af4e4c (yaoyu-33) build: add qwen-vl-utils and update lockfile
7e0ceaf (yaoyu-33) remove `start_of_response_token` use
a7e5fdc (yaoyu-33) add few more unit tests
1e44b97 (yaoyu-33) fix wandb reinit issue
18012cd (yaoyu-33) Revert "fix wandb reinit issue"
b0b910e (yaoyu-33) lint
d2031ca (yaoyu-33) update and fix tests for vlm dataset
3d8f4b3 (cuichenx) Merge remote-tracking branch 'origin/qwen-25vl-training' into chcui/n…
70aafe2 (cuichenx) training works
398a812 (cuichenx) add raven and llava-video datasets
a44d26c (cuichenx) push discussion code
cbc25d4 (cuichenx) Merge branch 'chcui/nano-v2-vl-training' into 'dev/nemotron-nano-v2-vl'
56f9ad9 (liding-nv) support video training
a8ad5fd (cuichenx) add peft merge
46cd9b9 (cuichenx) change wording
6008b3e (cuichenx) save every 200
2da5696 (cuichenx) clean up internal paths
d3dd155 (cuichenx) add merge lora script..
3a13a6c (liding-nv) fix import
b9da6cf (liding-nv) support multi subset video
0bcfcb8 (cuichenx) export with copy
e9ee70d (cuichenx) qa fixes
546c233 (cuichenx) Merge remote-tracking branch 'refs/remotes/origin/main' into chcui/ne…
e69586d (cuichenx) clean up code
85c6a44 (cuichenx) Merge remote-tracking branch 'origin/main' into chcui/nemotron-nano-v…
d31d50f (cuichenx) Merge remote-tracking branch 'origin/main' into chcui/nemotron-nano-v…
2e223e8 (cuichenx) change to supported HF architectures
1eb8fa3 (cuichenx) add tests
6f739cf (cuichenx) Merge remote-tracking branch 'refs/remotes/origin/main' into chcui/ne…
0abb526 (cuichenx) Merge remote-tracking branch 'refs/remotes/origin/main' into chcui/ne…
0567e20 (cuichenx) address comments
edc2d98 (cuichenx) copy over py and json files only
9e80f35 (cuichenx) merge causal lm and vlm so that output saves preprocessor config auto…
bd447ae (cuichenx) move nemotron vlm generation to a new script
bac193a (cuichenx) address comment
c0756ce (cuichenx) move path helper to common utils
707562a (cuichenx) Merge branch 'main' into chcui/nemotron-nano-v2-vl
f7e0d3b (cuichenx) update model name
b6a60d7 (cuichenx) Merge branch 'chcui/nemotron-nano-v2-vl' of github.com:NVIDIA-NeMo/Me…
bfda67e (cuichenx) refactor to llava_step
71b4e78 (cuichenx) clean up
8813087 (cuichenx) Merge branch 'main' into chcui/nemotron-nano-v2-vl
e67e9f1 (cuichenx) revert previous export copy code
ced4190 (cuichenx) raise error if trying to access validation split for raven and llava …
f603601 (cuichenx) Fix typo
0cd3961 (cuichenx) Merge branch 'main' into chcui/nemotron-nano-v2-vl
e490457 (cuichenx) Clean up provider initialization in nemotron_vl_bridge.py
a5dd884 (cuichenx) Merge branch 'main' into chcui/nemotron-nano-v2-vl
e4760d4 (cuichenx) lint and cleanup
04f9393 (cuichenx) lint and clean up
81dc629 (cuichenx) add back strict arg
792eaf4 (cuichenx) manual seed for cpu backend for nano v2 vl export
fc3cd79 (cuichenx) fix tests
f2c3709 (cuichenx) workaround rst render issue
58694ac (cuichenx) fix tests
c2298f0 (cuichenx) fix recipe test
b48e57f (cuichenx) fix unit test
af09bc6 (cuichenx) fix functional test
9434a71 (cuichenx) add strict flag to rountrip script
ef2a730 (cuichenx) Merge remote-tracking branch 'origin/main' into chcui/nemotron-nano-v…
88fcded (cuichenx) merge main and rework conversion test
d8f9e45 (cuichenx) use toy model config in test
fd71d17 (cuichenx) fix test
3ed33ea (cuichenx) fix tests for coverage
c205b59 (cuichenx) use correct mappings for mamba
feec14f (cuichenx) install timm
0f40f5f (cuichenx) install open_clip_torch
looks like `args.not_strict` was not passed to the main function... @cuichenx maybe fix this in another PR since this one has been merged
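For reference, the kind of fix being asked for might look like the sketch below. This is not the script from this PR: the `--hf-model-id` and `--not-strict` flag spellings, the `main` signature, and the `run` helper are all assumptions made for illustration. The point is only that a parsed `args.not_strict` has no effect unless it is forwarded to `main`.

```python
import argparse


def main(hf_model_id: str, strict: bool = True) -> dict:
    """Hypothetical stand-in for the script's entry point; echoes what it received."""
    return {"hf_model_id": hf_model_id, "strict": strict}


def parse_args(argv=None) -> argparse.Namespace:
    # Flag names here are assumptions, not taken from the actual script.
    parser = argparse.ArgumentParser(description="Hypothetical round-trip conversion script")
    parser.add_argument("--hf-model-id", required=True)
    parser.add_argument(
        "--not-strict",
        action="store_true",
        help="Relax strict checking (assumed meaning of the flag)",
    )
    return parser.parse_args(argv)


def run(argv=None) -> dict:
    args = parse_args(argv)
    # The bug pattern from the review: calling main(hf_model_id=args.hf_model_id)
    # alone would silently ignore --not-strict. Forward it explicitly instead.
    return main(hf_model_id=args.hf_model_id, strict=not args.not_strict)
```

With this wiring, `run(["--hf-model-id", "some/model", "--not-strict"])` reaches `main` with `strict=False`, while omitting the flag keeps the default `strict=True`.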