Releases · aws-neuron/aws-neuron-sdk

21 Nov 03:15

ivashkst

v2.20.2

a574acb

Neuron SDK Release - November 20, 2024

Neuron 2.20.2 release fixes a stability issue in Neuron Scheduler Extension that previously caused crashes in Kubernetes (K8) deployments. See Neuron K8 Release Notes.

This release also includes a security patch for the Neuron Driver that fixes a kernel address leak. For details, see the Neuron Driver Release Notes and Neuron Runtime Release Notes.

Additionally, the Neuron 2.20.2 release updates the torch-neuronx and libneuronxla packages to support the torch-xla 2.1.5 package, which fixes checkpoint loading issues with Zero Redundancy Optimizer (ZeRO-1). See PyTorch Neuron (torch-neuronx) release notes and Neuron XLA pluggable device (libneuronxla) release notes.

Neuron supported DLAMIs and DLCs are updated with this release (Neuron 2.20.2 SDK). The Training DLC is also updated to address the version dependency issues in NxD Training library. See Neuron DLC Release Notes.

The NxD Training library in the Neuron 2.20.2 release is updated to use the transformers 4.36.0 package. See NxD Training Release Notes (neuronx-distributed-training).

26 Oct 04:04

ivashkst

v2.20.1

3ab9d96

Neuron SDK Release - October 25, 2024

The Neuron 2.20.1 release addresses an issue with the Neuron Persistent Cache that was introduced in the 2.20 release. In 2.20, loading a previously compiled Neuron Executable File Format (NEFF) file resulted in a cache miss when the load happened from a different path or Python environment than the one used for the initial Neuron SDK installation and NEFF compilation. This release resolves the cache-miss problem, ensuring that NEFFs can be loaded correctly regardless of the path or Python environment used to install the Neuron SDK, as long as they were compiled using the same Neuron SDK version.
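As a rough illustration of the property described above, the sketch below contrasts a cache key that mixes in an install location with one that depends only on the compiled graph and the SDK version. This is not the Neuron implementation: the function names, the key layout, and the idea that the install prefix was part of the key are all assumptions made for illustration.

```python
import hashlib


def _digest(*parts: bytes) -> str:
    """Hash the parts with length prefixes so field boundaries are unambiguous."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


def path_dependent_key(graph: bytes, sdk_version: str, install_prefix: str) -> str:
    # Hypothetical "buggy" key: the install prefix leaks into the key, so the
    # same graph compiled under a different environment looks like a new entry.
    return _digest(graph, sdk_version.encode(), install_prefix.encode())


def portable_key(graph: bytes, sdk_version: str) -> str:
    # Hypothetical "fixed" key: only the graph and the SDK version matter, so
    # a cached artifact is found from any path or Python environment.
    return _digest(graph, sdk_version.encode())


if __name__ == "__main__":
    graph = b"example-graph-bytes"  # placeholder for a serialized graph
    print(path_dependent_key(graph, "2.20.1", "/opt/venv_a")
          == path_dependent_key(graph, "2.20.1", "/opt/venv_b"))
    print(portable_key(graph, "2.20.1") == portable_key(graph, "2.20.1"))
```

With a key of the second kind, a cache hit still requires the same SDK version, which matches the condition stated in the release note.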

This release also addresses the excessive lock wait time issue during neuron_parallel_compile graph extraction for large cluster training. See PyTorch Neuron (torch-neuronx) release notes and Neuron XLA pluggable device (libneuronxla) release notes.

Additionally, Neuron 2.20.1 introduces a new Multi Framework DLAMI for Amazon Linux 2023 (AL2023) that customers can use to easily get started with the latest Neuron SDK on the multiple frameworks that Neuron supports. See Neuron DLAMI Release Notes.

The Neuron 2.20.1 Training DLC is also updated to pre-install the necessary dependencies and support the NxD Training library out of the box. See Neuron DLC Release Notes.

17 Sep 01:58

trsharm25

v2.20.0

ce3668b

Neuron SDK Release - September 16th, 2024

The Neuron 2.20 release introduces usability improvements and new capabilities across training and inference workloads. A key highlight is the introduction of the Neuron Kernel Interface (beta). NKI, pronounced ‘Nicky’, enables developers to build optimized custom compute kernels for Trainium and Inferentia. Additionally, this release introduces NxD Training (beta), a PyTorch-based library that enables efficient distributed training, with a user-friendly interface compatible with NeMo. This release also introduces support for the JAX framework (beta).

Neuron 2.20 also adds inference support for the Pixart-alpha and Pixart-sigma Diffusion-Transformer (DiT) models, and adds inference support for the Llama 3.1 8B, 70B and 405B models with up to 128K context length.

22 Jul 17:20

aws-rxgupta

v2.19.1

800d00f

Neuron SDK Release - July 19, 2024

This release (Neuron 2.19.1) addresses an issue with the Neuron Persistent Cache that was introduced in the previous release, Neuron 2.19. The issue resulted in a cache-miss scenario when attempting to load a previously compiled Neuron Executable File Format (NEFF) from a different path or Python environment than the one used for the initial Neuron SDK installation and NEFF compilation. This release resolves the cache-miss problem, ensuring that NEFFs can be loaded correctly regardless of the path or Python environment used to install the Neuron SDK, as long as they were compiled using the same Neuron SDK version.

04 Jul 01:21

aws-rxgupta

v2.19.0

215b421

Neuron SDK Release - July 3, 2024

The Neuron 2.19 release adds Llama 3 training support and introduces Flash Attention kernel support to enable LLM training and inference for large sequence lengths. Neuron 2.19 also introduces new features and performance improvements to LLM training, improves LLM inference performance for the Llama 3 model by up to 20%, and adds tools for monitoring, problem detection and recovery in Kubernetes (EKS) environments, improving efficiency and reliability.

Training highlights: LLM model training user experience using NeuronX Distributed (NxD) is improved by support for Flash Attention to enable training with longer sequence lengths >= 8K. Neuron 2.19 adds support for Llama 3 model training. This release also adds support for Interleaved pipeline parallelism to reduce idle time (bubble size) and enhance training efficiency and resource utilization for large cluster sizes.
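To make the "bubble size" claim concrete, the sketch below uses the commonly cited estimate for 1F1B pipeline schedules: with p pipeline stages and m microbatches, the idle (bubble) time is roughly (p - 1) / m of the ideal compute time, and interleaving v model chunks per device divides that by v. This estimate comes from the general pipeline-parallelism literature, not from the NxD documentation, and the function and parameter names are hypothetical.

```python
def bubble_fraction(pipeline_stages: int, microbatches: int,
                    chunks_per_device: int = 1) -> float:
    """Estimated pipeline bubble time as a fraction of ideal compute time.

    Assumes a 1F1B-style schedule with uniform stage times; this is an
    approximation for illustration, not a measurement of any real system.
    """
    if pipeline_stages < 1 or microbatches < 1 or chunks_per_device < 1:
        raise ValueError("all arguments must be positive integers")
    return (pipeline_stages - 1) / (chunks_per_device * microbatches)


if __name__ == "__main__":
    # 8 stages, 32 microbatches: non-interleaved vs. 4 chunks per device.
    print(bubble_fraction(8, 32))                        # 0.21875
    print(bubble_fraction(8, 32, chunks_per_device=4))   # 0.0546875
```

Under these assumptions, interleaving trades a smaller bubble for more frequent communication between stages, so the benefit is most visible when the pipeline is deep relative to the number of microbatches.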

Inference highlights: Flash Attention kernel support in the Transformers NeuronX library enables LLM inference for context lengths of up to 32k. This release also adds [Beta] support for continuous batching with mistralai/Mistral-7B-v0.2 in Transformers NeuronX.

Tools and Neuron DLAMI/DLC highlights: This release introduces the new Neuron Node Problem Detector and Recovery plugin for EKS-supported Kubernetes environments, a tool that monitors the health of Neuron instances and triggers automatic node replacement upon detecting an unrecoverable error. Neuron 2.19 introduces the new Neuron Monitor container to enable easy monitoring of Neuron metrics in Kubernetes, and adds monitoring support with Prometheus and Grafana. This release also introduces new PyTorch 2.1 and PyTorch 1.13 single framework DLAMIs for Ubuntu 22. Neuron DLAMIs and Neuron DLCs are also updated to support this release (Neuron 2.19).

26 Apr 01:12

trsharm25

v2.18.2

d4f1951

Neuron SDK Release - April 25, 2024

Patch release with minor Neuron Compiler bug fixes and enhancements. See more in Neuron Compiler (neuronx-cc) release notes.

11 Apr 00:49

awsjoshir

v2.18.1

710a67a

Neuron SDK Release - April 10, 2024

The Neuron 2.18.1 release introduces continuous batching (beta) and Neuron vLLM integration (beta) support in the Transformers NeuronX library, which improve LLM inference throughput. This release also fixes hang issues related to Triton Inference Server and updates Neuron DLAMIs and DLCs to this release (2.18.1). See more in Transformers Neuron (transformers-neuronx) release notes and Neuron Compiler (neuronx-cc) release notes.

02 Apr 01:34

aws-rxgupta

v2.18.0

af96728

Neuron SDK Release - April 1, 2024

What's New

Neuron 2.18 release introduces stable support (out of beta) for PyTorch 2.1, introduces new features and performance improvements to LLM training and inference, and updates Neuron DLAMIs and Neuron DLCs to support this release (Neuron 2.18).

Training highlights: LLM model training user experience using NeuronX Distributed (NxD) is improved by introducing asynchronous checkpointing. This release also adds support for auto partitioning pipeline parallelism in NxD and introduces Pipeline Parallelism in PyTorch Lightning Trainer (beta).

Inference highlights: Speculative Decoding support (beta) in the TNx library improves LLM inference throughput and time per output token (TPOT) by up to 25% (for LLMs such as Llama-2-70B). TNx also improves weight loading performance by adding support for the SafeTensor checkpoint format. Inference using bucketing in PyTorch NeuronX and NeuronX Distributed is improved by the new auto-bucketing feature. This release also adds new samples for Mixtral-8x7B-v0.1 and mistralai/Mistral-7B-Instruct-v0.2 in TNx.
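For readers unfamiliar with bucketing, the general idea is to compile a model for a small set of fixed input sizes and pad each request up to the nearest one. The sketch below shows that idea in plain Python. It does not use the PyTorch NeuronX or NeuronX Distributed APIs; the helper names and the power-of-two bucket spacing are assumptions chosen for illustration, not the behavior of the auto-bucketing feature.

```python
import bisect


def make_buckets(min_len: int, max_len: int) -> list[int]:
    """Hypothetical bucket generator: doubling sizes from min_len up to max_len."""
    if min_len < 1 or max_len < min_len:
        raise ValueError("require 1 <= min_len <= max_len")
    buckets = []
    size = min_len
    while size < max_len:
        buckets.append(size)
        size *= 2
    buckets.append(max_len)  # always include the largest supported length
    return buckets


def pick_bucket(buckets: list[int], seq_len: int) -> int:
    """Return the smallest bucket that can hold seq_len tokens."""
    i = bisect.bisect_left(buckets, seq_len)
    if i == len(buckets):
        raise ValueError(f"sequence length {seq_len} exceeds largest bucket")
    return buckets[i]


def pad_to_bucket(token_ids: list[int], buckets: list[int], pad_id: int = 0) -> list[int]:
    """Right-pad token_ids so its length matches the selected bucket."""
    target = pick_bucket(buckets, len(token_ids))
    return token_ids + [pad_id] * (target - len(token_ids))


if __name__ == "__main__":
    buckets = make_buckets(128, 1024)
    print(buckets)                    # [128, 256, 512, 1024]
    print(pick_bucket(buckets, 300))  # 512
```

More buckets mean less wasted padding per request but more compiled variants to build and store, which is the trade-off any bucket-selection scheme has to balance.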

Neuron DLAMI and Neuron DLC support highlights: This release introduces a new Multi Framework DLAMI for Ubuntu 22 that customers can use to easily get started with the latest Neuron SDK on the multiple frameworks that Neuron supports, as well as SSM parameter support for DLAMIs to automate retrieval of the latest DLAMI ID in cloud automation flows. It also adds support for new Neuron Training and Inference Deep Learning Containers (DLCs) for PyTorch 2.1, a new dedicated GitHub repository to host Neuron container dockerfiles, and a public Neuron container registry to host Neuron container images.

14 Feb 02:27

aws-mesharma

v2.17.0

82ffe52

Neuron SDK Release - February 13, 2024

What's New

The Neuron 2.17 release improves the performance of small collective communication operators (smaller than 16MB) by up to 30%, which improves large language model (LLM) inference performance by up to 10%. This release also includes improvements in Neuron Profiler and other minor enhancements and bug fixes.

For more detailed release notes of the new features and resolved issues, see the Neuron Components Release Notes.

To learn about the model architectures currently supported on Inf1, Inf2, Trn1 and Trn1n instances, please see :ref:`model_architecture_fit`.

Neuron Components Release Notes

Inf1, Trn1/Trn1n and Inf2 common packages

Component	Instance/s	Package/s	Details
Neuron Runtime	Trn1/Trn1n, Inf1, Inf2	Trn1/Trn1n: aws-neuronx-runtime-lib (.deb, .rpm) Inf1: Runtime is linked into the ML frameworks packages	:ref:`neuron-runtime-rn`
Neuron Runtime Driver	Trn1/Trn1n, Inf1, Inf2	aws-neuronx-dkms (.deb, .rpm)	:ref:`neuron-driver-release-notes`
Neuron System Tools	Trn1/Trn1n, Inf1, Inf2	aws-neuronx-tools (.deb, .rpm)	:ref:`neuron-tools-rn`
Containers	Trn1/Trn1n, Inf1, Inf2	aws-neuronx-k8-plugin (.deb, .rpm) aws-neuronx-k8-scheduler (.deb, .rpm) aws-neuronx-oci-hooks (.deb, .rpm)	:ref:`neuron-k8-rn` :ref:`neuron-containers-release-notes`
NeuronPerf (Inference only)	Trn1/Trn1n, Inf1, Inf2	neuronperf (.whl)	:ref:`neuronperf_rn`
TensorFlow Model Server Neuron	Trn1/Trn1n, Inf1, Inf2	tensorflow-model-server-neuronx (.deb, .rpm)	:ref:`tensorflow-modeslserver-neuronx-rn`
Neuron Documentation	Trn1/Trn1n, Inf1, Inf2		:ref:`neuron-documentation-rn`

18 Jan 23:51

aws-mesharma

v2.16.1

1351ee1

Neuron SDK Release - January 18, 2024

Patch release with compiler bug fixes and updates to the Neuron Device Plugin and Neuron Kubernetes Scheduler.
