Commit a574acb

ivashkst, jeffhataws, natemail-aws, ahsan-z-khan, and mounchin committed
Neuron SDK Release 2.20.2
Release notes for Neuron SDK Release 2.20.2

Co-authored-by: Jeffrey Huynh
Co-authored-by: Nathan Mailhot
Co-authored-by: Ahsan Khan
Co-authored-by: Mounik Chinthapanti
Co-authored-by: Ryan King
Co-authored-by: musunita
Co-authored-by: Vikas Paliwal
Co-authored-by: Pradeep Roy
Co-authored-by: Esha Lakhotia
Co-authored-by: Nicholas Waldron
Co-authored-by: Roopnath
1 parent 79a71b5 commit a574acb

13 files changed

+164
-16
lines changed

conf.py

Lines changed: 1 addition & 1 deletion
@@ -157,7 +157,7 @@
157157

158158
#top_banner_message="<span>&#9888;</span><a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/setup-troubleshooting.html#gpg-key-update'> Neuron repository GPG key for Ubuntu installation has expired, see instructions how to update! </a>"
159159

160-
top_banner_message="Neuron 2.20.1 is released! check <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release'> What's New </a> and <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/announcements/index.html'> Announcements </a>"
160+
top_banner_message="Neuron 2.20.2 is released! check <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release'> What's New </a> and <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/announcements/index.html'> Announcements </a>"
161161

162162
html_theme = "sphinx_book_theme"
163163
html_theme_options = {

release-notes/containers/neuron-dlc.rst

Lines changed: 7 additions & 0 deletions
@@ -7,11 +7,18 @@ Neuron DLC Release Notes
77
:local:
88
:depth: 1
99

10+
Neuron 2.20.2
11+
-------------
12+
Date: 11/20/2024
13+
14+
- Neuron 2.20.2 DLC fixes a dependency bug for the NxDT use case by pinning the correct torch version.
15+
1016

1117
Neuron 2.20.1
1218
-------------
1319

1420
Date: 10/25/2024
21+
1522
- Neuron 2.20.1 DLC includes prerequisites for :ref:`nxdt_installation_guide`. Customers can expect to use NxDT out of the box.
1623

1724

release-notes/containers/neuron-k8.rst

Lines changed: 10 additions & 0 deletions
@@ -35,6 +35,16 @@ To Pull the Images from ECR:
3535
docker pull public.ecr.aws/neuron/neuron-scheduler:2.x.y.z
3636

3737

38+
Neuron K8 release [2.22.20.0]
39+
=============================
40+
41+
Date: 11/20/2024
42+
43+
Bug fixes
44+
---------
45+
46+
- This release addresses a stability issue in the Neuron Scheduler Extension that previously caused crashes shortly after installation.
47+
3848
Neuron K8 release [2.22.4.0]
3949
============================
4050

release-notes/index.rst

Lines changed: 22 additions & 5 deletions
@@ -11,6 +11,23 @@ What's New
1111
.. _neuron-2.20.0-whatsnew:
1212

1313

14+
Neuron 2.20.2 (11/20/2024)
15+
---------------------------
16+
17+
Neuron 2.20.2 release fixes a stability issue in Neuron Scheduler Extension that previously caused crashes in Kubernetes (K8) deployments. See :ref:`neuron-k8-rn`.
18+
19+
This release also includes a security patch for the Neuron Driver that fixes a kernel address leak issue.
20+
See more on :ref:`neuron-driver-release-notes` and :ref:`neuron-runtime-rn`.
21+
22+
Additionally, the Neuron 2.20.2 release updates the ``torch-neuronx`` and ``libneuronxla`` packages to add support for the ``torch-xla`` 2.1.5 package,
23+
which fixes checkpoint loading issues with Zero Redundancy Optimizer (ZeRO-1). See :ref:`torch-neuronx-rn` and :ref:`libneuronxla-rn`.
24+
25+
Neuron supported DLAMIs and DLCs are updated with this release (Neuron 2.20.2 SDK). The Training DLC is also updated to address the
26+
version dependency issues in NxD Training library. See :ref:`neuron-dlc-release-notes`.
27+
28+
NxD Training library in Neuron 2.20.2 release is updated to transformers 4.36.0 package. See :ref:`neuronx-distributed-training-rn`.
29+
30+
1431
Neuron 2.20.1 (10/25/2024)
1532
---------------------------
1633

@@ -399,27 +416,27 @@ Release Artifacts
399416
Trn1 packages
400417
^^^^^^^^^^^^^^
401418

402-
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=trn1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
419+
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=trn1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2
403420

404421
Inf2 packages
405422
^^^^^^^^^^^^^^
406423

407-
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
424+
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2
408425

409426
Inf1 packages
410427
^^^^^^^^^^^^^^
411428

412-
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
429+
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2
413430

414431
Supported Python Versions for Inf1 packages
415432
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
416433

417-
.. program-output:: python3 src/helperscripts/n2-helper.py --list=pyversions --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
434+
.. program-output:: python3 src/helperscripts/n2-helper.py --list=pyversions --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2
418435

419436
Supported Python Versions for Inf2/Trn1 packages
420437
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
421438

422-
.. program-output:: python3 src/helperscripts/n2-helper.py --list=pyversions --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
439+
.. program-output:: python3 src/helperscripts/n2-helper.py --list=pyversions --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.2
423440

424441
Supported Numpy Versions
425442
^^^^^^^^^^^^^^^^^^^^^^^^
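The ``program-output`` directives changed above differ only in the ``--list``, ``--instance``, and ``--neuron-version`` values. As a rough illustration (not part of this commit), the sketch below rebuilds those directive lines from a single version string. The function names ``helper_command`` and ``directive`` are hypothetical; only the script path, manifest path, and flags shown in the diff are used, and nothing is executed.

```python
# Hypothetical helpers: they only assemble strings from values visible in the diff.
HELPER = "src/helperscripts/n2-helper.py"
MANIFEST = "src/helperscripts/n2-manifest.json"


def helper_command(list_kind: str, instance: str, neuron_version: str) -> list:
    """Return the argv list for one n2-helper.py invocation."""
    return [
        "python3",
        HELPER,
        f"--list={list_kind}",
        f"--instance={instance}",
        f"--file={MANIFEST}",
        f"--neuron-version={neuron_version}",
    ]


def directive(list_kind: str, instance: str, neuron_version: str) -> str:
    """Return the reStructuredText directive line that wraps the command."""
    return ".. program-output:: " + " ".join(
        helper_command(list_kind, instance, neuron_version)
    )


if __name__ == "__main__":
    # Print the three package-listing directives for the new release.
    for inst in ("trn1", "inf2", "inf1"):
        print(directive("packages", inst, "2.20.2"))
```

Generating the lines from one version value would avoid editing each directive by hand on every release, at the cost of an extra build step; whether that trade-off suits this repository is an open question.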

release-notes/libneuronxla/index.rst

Lines changed: 9 additions & 0 deletions
@@ -15,6 +15,15 @@ the `PJRT <https://openxla.org/xla/pjrt_integration>`__ runtime, built using
1515
the `PJRT C-API plugin <https://github.com/openxla/xla/blob/5564a9220af230c6c194e37b37938fb40692cfc7/xla/pjrt/c/docs/pjrt_integration_guide.md>`__
1616
mechanism.
1717

18+
Release [2.0.5347.0]
19+
--------------------
20+
Date: 11/20/2024
21+
22+
Summary
23+
~~~~~~~
24+
25+
Add support for torch-xla 2.1.5, which fixes the "list index out of range" error when loading checkpoints with the Zero Redundancy Optimizer (ZeRO-1).
26+
1827
Release [2.0.4986.0]
1928
--------------------
2029
Date: 10/25/2024
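Because the ZeRO-1 checkpoint fix depends on torch-xla 2.1.5, it can be useful to confirm which version an environment actually has. The following is a minimal sketch, assuming the distribution is installed under the name ``torch-xla``; the helper names ``release_tuple``, ``meets_minimum``, and ``installed_torch_xla_ok`` are hypothetical and not part of any Neuron package. It compares only the leading numeric components, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata

# Minimum torch-xla version named in the release note above.
MIN_TORCH_XLA = (2, 1, 5)


def release_tuple(version: str) -> tuple:
    """Return the leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version prefix in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: tuple = MIN_TORCH_XLA) -> bool:
    """True if the numeric part of `version` is at least `minimum`."""
    return release_tuple(version) >= minimum


def installed_torch_xla_ok():
    """Check the installed torch-xla; returns None when it is not installed."""
    try:
        # Assumption: the distribution name is "torch-xla".
        return meets_minimum(metadata.version("torch-xla"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("torch-xla >= 2.1.5:", installed_torch_xla_ok())
```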

release-notes/neuronx-distributed-training/neuronx-distributed-training.rst

Lines changed: 14 additions & 3 deletions
@@ -10,14 +10,25 @@ NxD Training Release Notes (``neuronx-distributed-training``)
1010

1111
This document lists the release notes for Neuronx Distributed Training library.
1212

13-
.. _neuronx-distributed-rn-1-0-0:
13+
.. _neuronx-distributed-training-rn-1-0-1:
14+
15+
Neuronx Distributed Training [1.0.1]
16+
17+
Date: 11/20/2024
18+
19+
Features in this release
20+
------------------------
21+
22+
* Added support for transformers 4.36.0
23+
24+
.. _neuronx-distributed-training-rn-1-0-0:
1425

1526
Neuronx Distributed Training [1.0.0]
1627

1728
Date: 09/16/2024
1829

19-
Features this release
20-
---------------------
30+
Features in this release
31+
------------------------
2132

2233
This is the first release of NxD Training (NxDT). NxDT is a PyTorch-based library that provides a user-friendly distributed training experience through a YAML configuration file compatible with NeMo, allowing users to easily set up their training workflows. At the same time, NxDT maintains flexibility, enabling users to choose between using the YAML configuration file, PyTorch Lightning Trainer, or writing their own custom training script using the NxD Core.
2334
The library supports PyTorch model classes including Hugging Face and Megatron-LM. Additionally, it leverages NeMo's data engineering and data science modules, enabling end-to-end training workflows on NxDT and providing compatibility with NeMo through minimal changes to the YAML configuration file for models that are already supported in NxDT. Furthermore, the functionality of the Neuron NeMo Megatron (NNM) library is now part of NxDT, ensuring a smooth migration path from NNM to NxDT.

release-notes/prev/content.rst

Lines changed: 15 additions & 0 deletions
@@ -7,6 +7,21 @@ Previous Releases Artifacts (Neuron 2.x)
77
:local:
88
:depth: 1
99

10+
Neuron 2.20.1 (10/25/2024)
11+
---------------------------
12+
13+
Trn1 packages
14+
^^^^^^^^^^^^^
15+
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=trn1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
16+
17+
Inf2 packages
18+
^^^^^^^^^^^^^
19+
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf2 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
20+
21+
Inf1 packages
22+
^^^^^^^^^^^^^
23+
.. program-output:: python3 src/helperscripts/n2-helper.py --list=packages --instance=inf1 --file=src/helperscripts/n2-manifest.json --neuron-version=2.20.1
24+
1025
Neuron 2.20.0 (09/16/2024)
1126
---------------------------
1227

release-notes/runtime/aws-neuronx-dkms/index.rst

Lines changed: 9 additions & 0 deletions
@@ -15,6 +15,15 @@ Updated : 04/29/2022
1515

1616
- In rare cases, multi-process applications running under heavy stress may hit a model load failure. Reloading the Neuron Driver may be required as a workaround.
1717

18+
19+
Neuron Driver release [2.18.20.0]
20+
--------------------------------
21+
Date: 11/20/2024
22+
23+
Bug Fixes
24+
^^^^^^^^^
25+
* This release addresses an issue in the Neuron Driver that could let a user-space application either gain access to kernel addresses or provide the driver with spoofed memory handles (kernel addresses), which could potentially be used to gain elevated privileges. We would like to thank `Cossack9989 <https://github.com/Cossack9989>`_ for reporting and collaborating on this issue.
26+
1827
Neuron Driver release [2.18.12.0]
1928
--------------------------------
2029

release-notes/runtime/aws-neuronx-runtime-lib/index.rst

Lines changed: 8 additions & 0 deletions
@@ -31,6 +31,14 @@ NEFF Version Runtime Version Range Notes
3131
2.0 >= 1.6.5.0 Starting support for 2.0 NEFFs
3232
============ ===================== ===================================
3333

34+
Neuron Runtime Library [2.22.19.0]
35+
----------------------------------
36+
Date: 11/20/2024
37+
38+
New in this release
39+
^^^^^^^^^^^^^^^^^^^
40+
* Minor improvements and bug fixes
41+
3442
Neuron Runtime Library [2.22.14.0]
3543
---------------------------------
3644
Date: 09/16/2024

release-notes/torch/torch-neuronx/index.rst

Lines changed: 10 additions & 0 deletions
@@ -14,6 +14,16 @@ PyTorch Neuron for |Trn1|/|Inf2| is a software package that enables PyTorch
1414
users to train, evaluate, and perform inference on second-generation Neuron
1515
hardware (See: :ref:`NeuronCore-v2 <neuroncores-v2-arch>`).
1616

17+
Release [2.1.2.2.3.2]
18+
----------------------
19+
Date: 11/20/2024
20+
21+
Summary
22+
~~~~~~~
23+
24+
This patch narrows the range of dependent libneuronxla versions to support minor version bumps
25+
and fixes the "list index out of range" error when loading checkpoints with the Zero Redundancy Optimizer (ZeRO-1).
26+
1727
Release [2.1.2.2.3.1]
1828
----------------------
1929
Date: 10/25/2024
