NeMo Framework is NVIDIA's GPU-accelerated, end-to-end training framework for large language models (LLMs), multimodal models, and speech models. It enables seamless scaling of training workloads, both pretraining and post-training, from a single GPU to thousand-node clusters for 🤗Hugging Face, Megatron, and PyTorch models.
This GitHub organization hosts repositories for NeMo's core components and integrations:
- State-of-the-art post-training techniques such as GRPO, DPO, and SFT.
- Distributed inference runtime with Ray-based orchestration.
- Seamless integration with 🤗Hugging Face for users to post-train a wide range of models.
- High-performance Megatron Core-based implementation with multiple parallelism strategies for large models and long context lengths.
- Streamlined configuration, execution, and management of machine learning experiments across multiple computing environments.
- Seamless portability via support for Local, Slurm, Docker, Lepton, RunAI, and Skypilot executors.
- Support for defining complex experiments via a DAG-based interface, as sketched below.
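
The snippet below gives a feel for how a multi-task experiment might be expressed with NeMo-Run. It is a minimal sketch, assuming the `nemo_run` package exposes `Experiment`, `Partial`, and `LocalExecutor` roughly as in its documentation; the task functions here are placeholders and exact signatures may differ from the released API.

```python
# Illustrative sketch only: assumes nemo_run provides Experiment, Partial,
# and LocalExecutor; the preprocess/train functions are hypothetical stand-ins.
import nemo_run as run


def preprocess(dataset: str) -> None:
    print(f"preprocessing {dataset}")


def train(config: str) -> None:
    print(f"training with {config}")


if __name__ == "__main__":
    with run.Experiment("demo-experiment") as exp:
        # Each task is a configured callable; the executor decides where it runs,
        # so swapping LocalExecutor for a Slurm or Docker executor leaves tasks unchanged.
        exp.add(run.Partial(preprocess, dataset="my_dataset"), executor=run.LocalExecutor())
        exp.add(run.Partial(train, config="pretrain_config"), executor=run.LocalExecutor())
        # Launch the tasks; sequential execution keeps the sketch simple.
        exp.run(sequential=True)
```

The same experiment definition can target a different environment by changing only the executor passed to `exp.add`, which is the portability point made in the list above.
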
To learn more about NVIDIA NeMo Framework and all of its component libraries, please refer to the NeMo Framework User Guide, which includes a quick-start guide, tutorials, model-specific recipes, best-practice guides, and performance benchmarks.
Apache 2.0 licensed, with third-party attributions documented in each repository.