│ ├── core/ # Megatron Core (kernels, parallelism, building blocks)
│ │ ├── models/ # Transformer models
│ │ ├── transformer/ # Transformer building blocks

- **Training state-of-the-art foundation models** at scale with cutting-edge performance on the latest NVIDIA hardware
- **Research teams** exploring new architectures and training techniques
- **Learning distributed training** concepts and best practices
- **Quick experimentation** with proven model configurations

**What you get:**

- End-to-end examples from data prep to evaluation
- Research-focused tools and utilities

### Megatron Core: Composable Library

**Composable library** with GPU-optimized building blocks for custom training frameworks.
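
To make "building blocks" concrete, here is a minimal sketch in the spirit of the Megatron Core quickstart: it wires together `TransformerConfig`, a GPT layer spec, and `GPTModel` on a single process. The tiny sizes and single-process setup are illustrative assumptions, not a recommended configuration, and exact signatures may shift between releases.

```python
import os
import torch

from megatron.core import parallel_state
from megatron.core.models.gpt.gpt_layer_specs import get_gpt_layer_local_spec
from megatron.core.models.gpt.gpt_model import GPTModel
from megatron.core.transformer.transformer_config import TransformerConfig

# Single-process distributed setup; raise the parallel sizes to shard the model.
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
torch.distributed.init_process_group(world_size=1, rank=0)
parallel_state.initialize_model_parallel(
    tensor_model_parallel_size=1,
    pipeline_model_parallel_size=1,
)

# A deliberately tiny config; real runs scale hidden size, heads, and layers.
config = TransformerConfig(
    num_layers=2,
    hidden_size=128,
    num_attention_heads=4,
    use_cpu_initialization=True,
    pipeline_dtype=torch.float32,
)

# Compose a GPT model from the transformer building blocks.
model = GPTModel(
    config=config,
    transformer_layer_spec=get_gpt_layer_local_spec(),
    vocab_size=8192,
    max_sequence_length=512,
)
print(sum(p.numel() for p in model.parameters()), "parameters")
```
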
- **[Megatron Bridge](https://github.com/NVIDIA-NeMo/Megatron-Bridge)** - Training library with bidirectional Hugging Face ↔ Megatron checkpoint conversion, flexible training loops, and production-ready recipes
- **[NeMo RL](https://github.com/NVIDIA-NeMo/RL)** - Scalable toolkit for efficient reinforcement learning with RLHF, DPO, and other post-training methods
- **[NeMo Framework](https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html)** - Enterprise framework with cloud-native support and end-to-end examples
- **[TensorRT Model Optimizer (ModelOpt)](https://github.com/NVIDIA/TensorRT-Model-Optimizer)** - Model optimization toolkit for quantization, pruning, distillation, speculative decoding, and more. Check out end-to-end examples in [examples/post_training/modelopt](./examples/post_training/modelopt/).

**Compatible with:** [Hugging Face Accelerate](https://github.com/huggingface/accelerate), [Colossal-AI](https://github.com/hpcaitech/ColossalAI), [DeepSpeed](https://github.com/microsoft/DeepSpeed)

Our codebase efficiently trains models from 2B to 462B parameters across thousands of GPUs.

**Benchmark Configuration:**

- **Vocabulary size**: 131,072 tokens
- **Sequence length**: 4,096 tokens
- **Model scaling**: Varied hidden size, attention heads, and layers to achieve target parameter counts
- **Communication optimizations**: Fine-grained overlapping with DP (`--overlap-grad-reduce`, `--overlap-param-gather`), TP (`--tp-comm-overlap`), and PP (enabled by default); see the config sketch below
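
For readers driving Megatron Core directly rather than through the pretraining scripts, the flags above correspond, as far as the core configs go, to fields on `DistributedDataParallelConfig` and `TransformerConfig`. The sketch below only constructs the configs to show that mapping; actually enabling TP overlap end to end also requires Transformer Engine userbuffer setup, and the model sizes are placeholders.

```python
from megatron.core.distributed import DistributedDataParallelConfig
from megatron.core.transformer.transformer_config import TransformerConfig

# CLI flag -> Megatron Core config field (to the best of my reading):
#   --overlap-grad-reduce  -> DistributedDataParallelConfig.overlap_grad_reduce
#   --overlap-param-gather -> DistributedDataParallelConfig.overlap_param_gather
#   --tp-comm-overlap      -> TransformerConfig.tp_comm_overlap
ddp_config = DistributedDataParallelConfig(
    overlap_grad_reduce=True,        # overlap gradient reduction with the backward pass
    overlap_param_gather=True,       # overlap parameter all-gather with the forward pass
    use_distributed_optimizer=True,  # param-gather overlap assumes the distributed optimizer
)

config = TransformerConfig(
    num_layers=2,                  # placeholder sizes, not a benchmark configuration
    hidden_size=128,
    num_attention_heads=4,
    tensor_model_parallel_size=2,
    tp_comm_overlap=True,          # overlap TP collectives with GEMMs (needs TE userbuffers)
)

# A built model would then be wrapped with megatron.core.distributed.DistributedDataParallel
# (passing config, ddp_config, and the module) to activate the DP-side overlap.
```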