Context Parallel Squashed MR for recipes dir #1338
Merged
e1b0489 to ce56fce
9 tasks
75e4023 to e0038dd
- removed models dir - rebased 11/24/2025 Signed-off-by: Jonathan Mitchell <[email protected]>
e0038dd to cc4824b
Signed-off-by: Jonathan Mitchell <[email protected]>
Signed-off-by: Jonathan Mitchell <[email protected]>
Signed-off-by: Jonathan Mitchell <[email protected]>
pstjohn
approved these changes
Nov 24, 2025
Signed-off-by: Jonathan Mitchell <[email protected]>
Signed-off-by: Jonathan Mitchell <[email protected]>
Signed-off-by: Jonathan Mitchell <[email protected]>
Collaborator (Author): /ok to test 4f26cd8
pstjohn
approved these changes
Nov 24, 2025
Description
Add context parallelism (CP) to ESM2 through the addition of a CPAware Dataloader. This PR adds:
- test_cp_dataloader, a file that tests the dataloader outputs.
- dataset.py, which brings in the CP functionality.
- train_ddp_cp.py, which runs both DDP and CP.
Usage
    train_dataloader, dataset_or_sampler = create_cp_dataloader(
        dist_config,
        cp_world_size=torch.distributed.get_world_size(group=cp_group),
        cp_group=cp_group,
        cp_rank=cp_rank,
        **args.dataset,
    )
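To make the role of cp_world_size and cp_rank in the create_cp_dataloader call above more concrete, here is a minimal sketch of how a context-parallel dataloader might hand each CP rank its portion of one sequence. It uses a load-balanced split that is common in context-parallel implementations (the sequence is cut into 2 * cp_world_size chunks and rank r takes chunks r and 2 * cp_world_size - 1 - r). The helper name shard_for_cp_rank is hypothetical, and this PR does not state which splitting scheme its dataloader uses, so treat this as an illustration under those assumptions rather than the PR's implementation.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard_for_cp_rank(tokens: Sequence[T], cp_world_size: int, cp_rank: int) -> List[T]:
    """Return the part of one sequence that a given CP rank would hold.

    Hypothetical helper (not from the PR): splits the sequence into
    2 * cp_world_size equal chunks and gives rank r the chunks with
    index r and (2 * cp_world_size - 1 - r).
    """
    if not 0 <= cp_rank < cp_world_size:
        raise ValueError("cp_rank must be in [0, cp_world_size)")
    num_chunks = 2 * cp_world_size
    if len(tokens) % num_chunks != 0:
        # Assumption: sequences are padded to a multiple of 2 * cp_world_size first.
        raise ValueError("sequence length must be divisible by 2 * cp_world_size")
    chunk = len(tokens) // num_chunks
    front = cp_rank
    back = num_chunks - 1 - cp_rank
    front_part = list(tokens[front * chunk:(front + 1) * chunk])
    back_part = list(tokens[back * chunk:(back + 1) * chunk])
    return front_part + back_part


# Example: 8 token positions shared between 2 CP ranks.
tokens = list(range(8))
shards = [shard_for_cp_rank(tokens, cp_world_size=2, cp_rank=r) for r in range(2)]
print(shards)  # [[0, 1, 6, 7], [2, 3, 4, 5]]
```

Each rank ends up with the same number of tokens, and together the ranks cover every position exactly once. In a real run, cp_world_size and cp_rank would come from the CP process group, as in the usage snippet above.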
Type of changes
CI Pipeline Configuration
Configure CI behavior by applying the relevant labels. By default, only basic unit tests are run.
Unit tests marked as @pytest.mark.multi_gpu or @pytest.mark.distributed are not run in the PR pipeline.
For more details, see CONTRIBUTING
Note
By default, only basic unit tests are run. Add the appropriate labels to enable additional test coverage.
Authorizing CI Runs
We use copy-pr-bot to manage authorization of CI
runs on NVIDIA's compute resources.
automatically be copied to a pull-request/ prefixed branch in the source repository (e.g. pull-request/123)
/ok to test comment on the pull request to trigger CI. This will need to be done for each new commit.
Pre-submit Checklist