Commit 4b78163
Replay: [20251111] Ko3n1g/chore/main to dev (NVIDIA#2267)
Signed-off-by: dimapihtar <[email protected]>
Signed-off-by: oliver könig <[email protected]>
Signed-off-by: Asha Anoosheh <[email protected]>
Signed-off-by: Keshav Santhanam <[email protected]>
Signed-off-by: Evgeny <[email protected]>
Signed-off-by: root <Evgeny>
Co-authored-by: Teodor-Dumitru Ene <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Co-authored-by: Asha Anoosheh <[email protected]>
Co-authored-by: Keshav Santhanam <[email protected]>
Co-authored-by: Teodor-Dumitru Ene <[email protected]>
Co-authored-by: Robert Kirby <[email protected]>
Co-authored-by: Lawrence McAfee <[email protected]>
Co-authored-by: Robert Kirby <[email protected]>
Co-authored-by: Mcore Bot <[email protected]>
Co-authored-by: helen ngo <[email protected]>
Co-authored-by: Evgeny Tsykunov <[email protected]>
File tree
141 files changed (+35891, -2289 lines changed)
- .github
- workflows
- .gitlab/stages
- docker
- examples
- inference/gpt
- post_training/modelopt
- megatron
- core
- datasets
- inference
- contexts
- attention_context
- engines
- text_generation_controllers
- models
- gpt
- mamba
- ssm
- transformer
- post_training
- rl
- agent
- inference
- training
- tokenizer
- tests
- functional_tests
- python_test_utils
- test_cases
- gpt
- gpt3_mcore_te_tp1_pp1_dist_optimizer_no_mmap_bin_files
- gpt3_mcore_te_tp1_pp1_resume_torch_dist_dist_optimizer
- gpt3_mcore_te_tp1_pp1_resume_torch_dist_uniform_full_recompute
- gpt3_mcore_te_tp1_pp2_resume_torch_dist_rope_embeddings_interleaved_no_fusion
- gpt3_mcore_te_tp1_pp4_resume_torch_dist_persistent_disable_bias_linear
- gpt3_mcore_te_tp1_pp4_resume_torch_dist_untie_embeddings_and_outputs
- gpt3_mcore_te_tp1_pp4_vp1_dist_optimizer_overlap_grad_reduce_param_gather_overlap_optimizer
- gpt3_mcore_te_tp1_pp4_vp1_resume_torch_decoupled_lr
- gpt3_mcore_te_tp1_pp4_vp1_resume_torch_dist_calculate_per_token_loss
- gpt3_mcore_te_tp1_pp4_vp1_resume_torch_dist_dist_optimizer_overlap_grad_reduce_param_gather
- gpt3_mcore_te_tp1_pp4_vp1_resume_torch_dist_dist_optimizer_overlap_grad_reduce_untied
- gpt3_mcore_te_tp1_pp4_vp1_resume_torch_dist_dist_optimizer_overlap_grad_reduce
- gpt3_mcore_te_tp1_pp4_vp1_resume_torch_dist_tunable_overlap
- gpt3_mcore_te_tp1_pp4_vp1_tunable_overlap
- gpt3_mcore_te_tp1_pp4_vp1_uneven_pipeline
- gpt3_mcore_te_tp1_pp4_vp1
- gpt3_mcore_te_tp1_pp4_vp2_account_for_embedding_loss_in_pipeline_split
- gpt3_mcore_te_tp2_pp1_resume_torch_dist_multi_dist_optimizer_instances
- gpt3_mcore_te_tp2_pp2_resume_torch_dist_ddp_average_in_collective
- gpt3_mcore_te_tp2_pp2_resume_torch_dist_defer_embedding_wgrad_compute
- gpt3_mcore_te_tp2_pp2_resume_torch_dist_no_create_attention_mask_in_dataloader
- gpt3_mcore_te_tp2_pp2_resume_torch_dist
- gpt3_mcore_te_tp4_pp1_dist_optimizer_overlap_grad_reduce_param_gather
- gpt3_mcore_te_tp4_pp1_resume_torch_dist_dist_optimizer_overlap_grad_reduce_param_gather
- gpt3_mcore_te_tp4_pp1_resume_torch_dist_dist_optimizer_overlap_grad_reduce
- gpt3_mcore_te_tp4_pp1_resume_torch_dist_qk_layernorm_test_mode
- gpt3_mcore_tp1_pp1_resume_torch_dist_dist_optimizer_overlap_grad_reduce_param_gather
- gpt3_mcore_tp1_pp2_resume_torch_dist
- gpt3_mcore_tp1_pp2
- gpt3_mcore_tp1_pp4_resume_torch_dist
- gpt3_mcore_tp1_pp4
- gpt3_mcore_tp4_pp1_resume_torch_dist
- gpt3_mcore_tp4_pp1_resume_torch
- gpt_dynamic_inference_tp1_pp1_583m_cuda_graphs_fp8_logitsmatch
- gpt_dynamic_inference_tp1_pp1_583m_cuda_graphs_logitsmatch_decode_graphs_only
- gpt_dynamic_inference_tp1_pp1_583m_logitsmatch
- gpt_dynamic_inference_tp8_pp1_583m_logitsmatch
- moe
- gpt3_mcore_te_tp2_pp1_te_8experts2parallel_ddp_average_in_collective
- gpt3_mcore_te_tp2_pp1_te_8experts_etp1_ep4
- gpt3_mcore_te_tp2_pp1_te_a2a_ovlp_8experts_etp1_ep4
- gpt3_mcore_tp2_pp2_ep2_etp2_te_4experts2parallel_dp_last
- gpt3_mcore_tp2_pp2_ep2_etp2_te_4experts2parallel
- gpt3_mcore_tp2_pp2_ep2_te_4experts2parallel
- gpt_dynamic_inference_cuda_graphs_pad_tp4_pp1_ep4_16B_logitsmatch
- gpt_dynamic_inference_tp4_pp1_ep4_16B_logitsmatch
- test_utils/python_scripts
- unit_tests/inference
- contexts
- engines
- model_inference_wrappers/gpt
- text_generation_controllers
- tools