Commit 9f6387f

Merge pull request #1125 from aws-neuron/release_222
Release 2.22.0 (#2216)
2 parents 82231d6 + 2ae3525

File tree: 129 files changed, +4971 additions, −1248 deletions


_ext/neuron_tag.py

Lines changed: 10 additions & 5 deletions

@@ -22,9 +22,9 @@
     'general/announcements/index',
     'frameworks/tensorflow/tensorflow-neuron/'
 ]
-add_trn1_tag = ['frameworks/neuron-customops/','frameworks/torch/inference-torch-neuronx', 'libraries/nemo-megatron/']
-# add_trn2_tag = ['']
-add_neuronx_tag = ['frameworks/torch/torch-neuronx/','frameworks/tensorflow/tensorflow-neuronx/','frameworks/torch/inference-torch-neuronx/','libraries/transformers-neuronx/','libraries/neuronx-distributed/','neuronx-distributed/nxd-training', 'general/setup/tensorflow-neuronx']
+add_trn1_tag = ['frameworks/neuron-customops/','frameworks/torch/inference-torch-neuronx', 'libraries/nemo-megatron/', 'libraries/nxd-training/']
+add_trn2_tag = ['libraries/nxd-training/']
+add_neuronx_tag = ['frameworks/torch/torch-neuronx/','frameworks/tensorflow/tensorflow-neuronx/','frameworks/torch/inference-torch-neuronx/','libraries/transformers-neuronx/','libraries/neuronx-distributed/','libraries/nxd-training', 'general/setup/tensorflow-neuronx']
 clear_inf1_tag = ['general/arch/neuron-features/neuron-caching',
     'general/arch/neuron-features/eager-debug-mode',
     'general/arch/neuron-features/collective-communication-operations',
@@ -82,7 +82,8 @@
     'general/arch/neuron-hardware/trn2-arch',
     'general/arch/neuron-hardware/neuron-core-v3',
     '/general/announcements/neuron2.x/announce-neuron-trn2',
-    '/general/arch/neuron-features/logical-neuroncore-config'
+    '/general/arch/neuron-features/logical-neuroncore-config',
+    'libraries/nxd-training/'
 ]

 clear_inf2_tag = ['frameworks/torch/torch-neuronx/training',
@@ -99,7 +100,8 @@
     'general/arch/neuron-hardware/trn2-arch',
     'general/arch/neuron-hardware/neuron-core-v3',
     '/general/announcements/neuron2.x/announce-neuron-trn2',
-    '/general/arch/neuron-features/logical-neuroncore-config'
+    '/general/arch/neuron-features/logical-neuroncore-config',
+    'libraries/nxd-training/'
 ]

@@ -265,6 +267,9 @@ def run(self):
         return_instances.append('Inf2')
         return_instances.append('Trn1')
         return_instances.append('Trn2')
+
+    if cur_file=='general/appnotes/neuronx-distributed/introducing-nxdt-training':
+        return_instances = ['Trn1','Trn2']

     # generate text from instances list if the list is not empty.
     return_instances = sorted(set(return_instances))
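The tag lists in this extension drive a simple prefix match of the current document path against per-instance path lists, with a hard override for the NxD Training app note. A minimal sketch of that matching logic (the helper name and the dictionary shape are illustrative, not the extension's actual API):

```python
# Illustrative sketch of the prefix-based tagging in _ext/neuron_tag.py;
# only the matching idea is taken from the diff, the names are hypothetical.
add_trn1_tag = ['frameworks/neuron-customops/', 'libraries/nxd-training/']
add_trn2_tag = ['libraries/nxd-training/']

def instances_for(cur_file, prefix_lists):
    """Collect instance tags whose path-prefix list matches cur_file."""
    instances = []
    for tag, prefixes in prefix_lists.items():
        if any(cur_file.startswith(p) for p in prefixes):
            instances.append(tag)
    # Special-case override, as done for the NxD Training app note.
    if cur_file == 'general/appnotes/neuronx-distributed/introducing-nxdt-training':
        instances = ['Trn1', 'Trn2']
    return sorted(set(instances))

print(instances_for('libraries/nxd-training/overview',
                    {'Trn1': add_trn1_tag, 'Trn2': add_trn2_tag}))
# → ['Trn1', 'Trn2']
```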

compiler/neuronx-cc/api-reference-guide/neuron-compiler-cli-reference-guide.rst

Lines changed: 2 additions & 2 deletions

@@ -122,7 +122,7 @@ Common parameters for the Neuron CLI:

 - ``llm-training``: Enable the compiler to perform optimizations applicable to large language model (LLM) training runs that shard parameters, gradients, and optimizer states across data-parallel workers. This is equivalent to the previously documented option argument value of ``NEMO``, which will be deprecated in a future release.

-- :option:`--logical-nc-config <shard_degree>`: Instructs the compiler to shard the input graph across physical NeuronCore accelerators. Possible numeric values are {1, 2}. (only available on trn2; Default: ``2``)
+- :option:`--logical-nc-config <shard_degree>`: Instructs the compiler to shard the input graph across physical NeuronCore accelerators. Possible numeric values are {1, 2}. (Only available on trn2; Default: ``2``)

 Valid values:

@@ -141,7 +141,7 @@ Common parameters for the Neuron CLI:

 - :option:`--enable-mixed-precision-accumulation`: Perform intermediate calculations of accumulation operators (such as softmax and layernorm) in FP32 and cast the result to the model-designated datatype. This improves the operator's resulting accuracy.

-- :option:`--enable-saturate-infinity`: Convert +/- infinity values to MAX/MIN_FLOAT for compiler-introduced matrix-multiply transpose computations that have a high risk of generating Not-a-Number (NaN) values. There is a potential performance impact during model execution when this conversion is enabled.
+- :option:`--enable-saturate-infinity`: Convert +/- infinity values to MAX/MIN_FLOAT for compiler-introduced matrix-multiply transpose computations that have a high risk of generating Not-a-Number (NaN) values. There is a potential performance impact during model execution when this conversion is enabled. (Only needed on trn1; while the trn2 compiler will accept this flag for compatibility reasons, it has no effect on the compilation.)

 - :option:`--enable-fast-context-switch`: Optimize for faster model switching rather than execution latency.
   This option will defer loading some weight constants until the start of model execution. This results in overall faster system performance when your application switches between models frequently on the same Neuron Core (or set of cores).
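As a sketch, the flags documented in this section combine on a single compile invocation roughly as follows. The module and output file names are placeholders, and the exact flag set you need depends on your model; this assumes a Trn2 instance with the Neuron SDK installed:

```shell
# Hypothetical invocation: compile an XLA HLO module for Trn2,
# sharding the graph across physical NeuronCores (--logical-nc-config 2
# is the trn2 default) and accumulating in FP32 for accuracy.
neuronx-cc compile model.hlo \
    --framework XLA \
    --target trn2 \
    --logical-nc-config 2 \
    --enable-mixed-precision-accumulation \
    --output model.neff
```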

conf.py

Lines changed: 1 addition & 1 deletion

@@ -195,7 +195,7 @@

 #top_banner_message="<span>&#9888;</span><a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/setup-troubleshooting.html#gpg-key-update'> Neuron repository GPG key for Ubuntu installation has expired, see instructions how to update! </a>"

-top_banner_message="Neuron 2.21.1 is released! check <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release'> What's New </a> and <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/announcements/index.html'> Announcements </a>"
+top_banner_message="Neuron 2.22.0 is released! check <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release'> What's New </a> and <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/announcements/index.html'> Announcements </a>"

 html_theme = "sphinx_book_theme"
 html_theme_options = {

dlami/index.rst

Lines changed: 97 additions & 27 deletions

@@ -26,9 +26,31 @@ comes pre-installed with all the Neuron libraries including Neuron compiler and

    Tensorflow-neuron 2.10 (inf1) released in SDK v2.20.2 is not compatible with the latest runtime in v2.21 SDK.
    Code that compiles will face runtime errors with the latest SDK 2.21.1 version.

-   Neuron team is aware of this issue and it will be fixed in the next minor release.
+   Neuron team is aware of this issue and we will ship a single-framework AMI for TF 2.10 inf1 in a future release.
+
+   You can use multi-framework DLAMIs from Neuron SDK v2.20.0 for inf1 workloads to avoid this issue. For example:
+
+   Deep Learning AMI Neuron (Ubuntu 22.04/AL2023) 20241027
+
+   | Ubuntu22: ami-017ff4652165fd617
+   | AL2023: ami-06fdb253ce8a32239
+
+   .. code-block:: shell
+
+      aws ec2 run-instances --image-id <ami-id>
+
+
+   Alternatively, you can use the latest Neuron DLAMIs on Ubuntu and run this command as a work-around:

-   Please refer to `this page <https://github.com/aws-neuron/aws-neuron-sdk/issues/1071>`_ for more information on the issue and a temporary work-around.
+   .. code-block:: shell
+
+      sudo apt-get remove -y aws-neuronx-dkms aws-neuronx-collectives aws-neuronx-runtime-lib aws-neuronx-tools
+      sudo apt-get install aws-neuronx-dkms=2.18.* -y
+      sudo apt-get install aws-neuronx-collectives=2.22.* -y
+      sudo apt-get install aws-neuronx-runtime-lib=2.22.* -y
+      sudo apt-get install aws-neuronx-tools=2.19.* -y
+
+   See https://github.com/aws-neuron/aws-neuron-sdk/issues/1071 for more information on the issue.

 Multi Framework DLAMIs supported
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -44,10 +66,11 @@ Multi Framework DLAMIs supported
      - DLAMI Name

    * - Ubuntu 22.04
-     - Inf2, Trn1, Trn1n, Trn2
+     - Inf1, Inf2, Trn1, Trn1n, Trn2
      - Deep Learning AMI Neuron (Ubuntu 22.04)
+
    * - Amazon Linux 2023
-     - Inf2, Trn1, Trn1n, Trn2
+     - Inf1, Inf2, Trn1, Trn1n, Trn2
      - Deep Learning AMI Neuron (Amazon Linux 2023)


@@ -72,18 +95,21 @@ Virtual Environments pre-installed

    * - PyTorch 2.5 NxD Inference, Torch NeuronX
      - /opt/aws_neuronx_venv_pytorch_2_5_nxd_inference
+
+   * - Transformers NeuronX (PyTorch 2.5)
+     - /opt/aws_neuronx_venv_pytorch_2_5_transformers

    * - JAX 0.4 NeuronX
      - /opt/aws_neuronx_venv_jax_0_4

    * - Tensorflow 2.10 NeuronX
      - /opt/aws_neuronx_venv_tensorflow_2_10

-   * - Transformers NeuronX (PyTorch 2.5)
-     - /opt/aws_neuronx_venv_pytorch_2_5_transformers
-
    * - Tensorflow 2.10 Neuron (Inf1)
      - /opt/aws_neuron_venv_tensorflow_2_10_inf1
+
+   * - PyTorch 1.13 Neuron (Inf1)
+     - /opt/aws_neuron_venv_pytorch_1_13_inf1

 You can easily get started with the multi-framework DLAMI through AWS console by following this :ref:`setup guide <setup-ubuntu22-multi-framework-dlami>`. If you are looking to
 use the Neuron DLAMI in your cloud automation flows, Neuron also supports :ref:`SSM parameters <ssm-parameter-neuron-dlami>` to easily retrieve the latest DLAMI id.
@@ -109,15 +135,30 @@ Single Framework DLAMIs supported
      - Neuron Instances Supported
      - DLAMI Name

-   * - Tensorflow 2.10
+   * - PyTorch 2.5
      - Ubuntu 22.04
-     - Inf1, Inf2, Trn1, Trn1n, Trn2
-     - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 22.04)
+     - Inf2, Trn1, Trn1n, Trn2
+     - Deep Learning AMI Neuron PyTorch 2.5 (Ubuntu 22.04)
+
+   * - PyTorch 2.5
+     - Amazon Linux 2023
+     - Inf2, Trn1, Trn1n, Trn2
+     - Deep Learning AMI Neuron PyTorch 2.5 (Amazon Linux 2023)

    * - Tensorflow 2.10
-     - Ubuntu 20.04
-     - Inf2, Trn1, Trn1n
-     - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 20.04)
+     - Ubuntu 22.04
+     - Inf2, Trn1, Trn1n, Trn2
+     - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 22.04)
+
+   * - Tensorflow 2.10 (Inf1)
+     - Ubuntu 22.04
+     - Inf1
+     - Deep Learning AMI Neuron TensorFlow 2.10 Inf1 (Ubuntu 22.04)
+
+   * - PyTorch 1.13 (Inf1)
+     - Ubuntu 22.04
+     - Inf1
+     - Deep Learning AMI Neuron PyTorch 1.13 Inf1 (Ubuntu 22.04)


 Virtual Environments pre-installed
@@ -133,19 +174,36 @@ Virtual Environments pre-installed
      - Neuron Libraries supported
      - Virtual Environment

-   * - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 22.04)
-     - tensorflow-neuronx
-     - /opt/aws_neuronx_venv_tensorflow_2_10
+   * - Deep Learning AMI Neuron PyTorch 2.5 (Ubuntu 22.04, Amazon Linux 2023)
+     - PyTorch 2.5 Torch NeuronX, NxD Core
+     - /opt/aws_neuronx_venv_pytorch_2_5

-   * - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 20.04)
-     - tensorflow-neuronx
-     - /opt/aws_neuron_venv_tensorflow_2_10
+   * - Deep Learning AMI Neuron PyTorch 2.5 (Ubuntu 22.04, Amazon Linux 2023)
+     - PyTorch 2.5 NxD Training, Torch NeuronX
+     - /opt/aws_neuronx_venv_pytorch_2_5_nxd_training

+   * - Deep Learning AMI Neuron PyTorch 2.5 (Ubuntu 22.04, Amazon Linux 2023)
+     - PyTorch 2.5 NxD Inference, Torch NeuronX
+     - /opt/aws_neuronx_venv_pytorch_2_5_nxd_inference
+
+   * - Deep Learning AMI Neuron PyTorch 2.5 (Ubuntu 22.04, Amazon Linux 2023)
+     - Transformers NeuronX PyTorch 2.5
+     - /opt/aws_neuronx_venv_pytorch_2_5_transformers
+
+   * - Deep Learning AMI Neuron PyTorch 1.13 (Ubuntu 22.04)
+     - Pytorch Neuron (Inf1)
+     - /opt/aws_neuron_venv_pytorch_1_13_inf1
+
+   * - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 22.04)
+     - Tensorflow Neuronx
+     - /opt/aws_neuronx_venv_tensorflow_2_10
+
    * - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 22.04)
-     - tensorflow-neuron (Inf1)
+     - Tensorflow Neuron (Inf1)
      - /opt/aws_neuron_venv_tensorflow_2_10_inf1
-
-You can easily get started with the single framework DLAMI through AWS console by following one of the corresponding setup guides. If you are looking to
+
+
+You can easily get started with the single framework DLAMI through AWS console by following one of the corresponding setup guides. If you are looking to
 use the Neuron DLAMI in your cloud automation flows, Neuron also supports :ref:`SSM parameters <ssm-parameter-neuron-dlami>` to easily retrieve the latest DLAMI id.

 Neuron Base DLAMI
@@ -166,14 +224,14 @@ Base DLAMIs supported
      - Neuron Instances Supported
      - DLAMI Name

+   * - Amazon Linux 2023
+     - Inf1, Inf2, Trn1n, Trn1, Trn2
+     - Deep Learning Base Neuron AMI (Amazon Linux 2023)
+
    * - Ubuntu 22.04
-     - Inf1, Inf2, Trn1, Trn1n
+     - Inf1, Inf2, Trn1n, Trn1, Trn2
      - Deep Learning Base Neuron AMI (Ubuntu 22.04)

-   * - Ubuntu 20.04
-     - Inf1, Inf2, Trn1, Trn1n
-     - Deep Learning Base Neuron AMI (Ubuntu 20.04)
-

 .. _ssm-parameter-neuron-dlami:

@@ -221,12 +279,24 @@ SSM Parameter Prefix
    * - Deep Learning AMI Neuron (Amazon Linux 2023)
      - /aws/service/neuron/dlami/multi-framework/amazon-linux-2023

+   * - Deep Learning AMI Neuron PyTorch 2.5 (Ubuntu 22.04)
+     - /aws/service/neuron/dlami/pytorch-2.5/ubuntu-22.04
+
+   * - Deep Learning AMI Neuron PyTorch 2.5 (Amazon Linux 2023)
+     - /aws/service/neuron/dlami/pytorch-2.5/amazon-linux-2023
+
+   * - Deep Learning AMI Neuron PyTorch 1.13 Inf1 (Ubuntu 22.04)
+     - /aws/service/neuron/dlami/pytorch-1.13-inf1/ubuntu-22.04
+
    * - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 22.04)
      - /aws/service/neuron/dlami/tensorflow-2.10/ubuntu-22.04

    * - Deep Learning AMI Neuron TensorFlow 2.10 (Ubuntu 20.04)
      - /aws/service/neuron/dlami/tensorflow-2.10/ubuntu-20.04

+   * - Deep Learning Base Neuron AMI (Amazon Linux 2023)
+     - /aws/service/neuron/dlami/base/amazon-linux-2023
+
    * - Deep Learning Base Neuron AMI (Ubuntu 22.04)
      - /aws/service/neuron/dlami/base/ubuntu-22.04
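The SSM parameter prefixes in this file can be combined with the AWS CLI to resolve the latest AMI id in automation flows. A sketch, assuming the documented ``<prefix>/latest/image_id`` convention and a configured AWS CLI; the instance type and key name are placeholders:

```shell
# Resolve the latest multi-framework Neuron DLAMI for Ubuntu 22.04.
# The /latest/image_id suffix is assumed here from the Neuron SSM convention.
AMI_ID=$(aws ssm get-parameter \
    --name /aws/service/neuron/dlami/multi-framework/ubuntu-22.04/latest/image_id \
    --query Parameter.Value --output text)

# Launch an instance from the resolved AMI.
aws ec2 run-instances --image-id "$AMI_ID" --instance-type trn1.2xlarge --key-name my-key
```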

frameworks/jax/api-reference-guide/index.rst (new file)

Lines changed: 12 additions & 0 deletions

.. _jax-neuronx-api-reference-guide:

API Reference Guide for JAX NeuronX
====================================================

.. toctree::
    :maxdepth: 1
    :hidden:

    /frameworks/jax/api-reference-guide/neuron-envvars

* :ref:`jax-neuronx-envvars`
frameworks/jax/api-reference-guide/neuron-envvars.rst (new file)

Lines changed: 69 additions & 0 deletions

.. _jax-neuronx-envvars:

JAX NeuronX Environment Variables
======================================

Environment variables allow modifications to JAX NeuronX behavior
without requiring code changes to the user script. It is recommended to set
them in code or just before invoking the Python process, for example
``NEURON_RT_VISIBLE_CORES=8 python3 <script>``, to avoid inadvertently
changing behavior for other scripts. Environment variables specific to
JAX NeuronX are:

``NEURON_CC_FLAGS``

- Compiler options. Full compiler options are described in :ref:`mixed-precision-casting-options`.

``XLA_FLAGS``

- When set to ``"--xla_dump_hlo_snapshots --xla_dump_to=<dir>"``, this environment variable enables dumping snapshots to the ``<dir>`` directory. See the :ref:`torch-neuronx-snapshotting` section for more information. The snapshotting interfaces for JAX and PyTorch are identical.
- When set to ``"--xla_dump_hlo_as_text --xla_dump_hlo_as_proto --xla_dump_to=<dir> --xla_dump_hlo_pass_re='.*'"``, this environment variable enables dumping HLOs in proto and text formats after each XLA pass. The dumped ``*.hlo.pb`` files are in HloProto format.

``NEURON_FORCE_PJRT_PLUGIN_REGISTRATION``

- When ``NEURON_FORCE_PJRT_PLUGIN_REGISTRATION=1``, the Neuron PJRT plugin will be registered in JAX regardless of the instance type.

``NEURON_RUN_TRIVIAL_COMPUTATION_ON_CPU``

- When ``NEURON_RUN_TRIVIAL_COMPUTATION_ON_CPU=1``, the Neuron PJRT plugin will compile and execute "trivial" computations on CPU instead of NeuronCores. A "trivial" computation is defined as an HLO program that does not contain any collective-compute instructions. The HLO program will be compiled by the XLA CPU compiler, and outputs of the computation will be allocated on NeuronCores. The following HLO instructions are considered collective-compute instructions:

  - ``all-gather``
  - ``all-gather-done``
  - ``all-gather-start``
  - ``all-reduce-done``
  - ``all-reduce-start``
  - ``all-to-all``
  - ``collective-permute``
  - ``partition-id``
  - ``replica-id``
  - ``recv``
  - ``recv-done``
  - ``reduce-scatter``
  - ``send``
  - ``send-done``

``NEURON_PJRT_PROCESSES_NUM_DEVICES``

- A comma-separated list stating the number of NeuronCores used by each worker process. It is used to construct a global device array whose size equals the sum of the list, which is reported to the XLA PJRT runtime when requested. Must be set for multi-process executions. It can be used in conjunction with ``NEURON_RT_VISIBLE_CORES`` to expose a limited number of NeuronCores to each worker process. If ``NEURON_RT_VISIBLE_CORES`` is not set, it should be set to the number of NeuronCores available on the host. The number of devices listed for each process in ``NEURON_PJRT_PROCESSES_NUM_DEVICES`` must be less than or equal to the number of NeuronCores exposed by ``NEURON_RT_VISIBLE_CORES``.

``NEURON_PJRT_PROCESS_INDEX``

- An integer stating the index (or rank) of the current worker process. This is required for multi-process environments where all workers need information on all participating processes. Must be set for multi-process executions. The value should be between ``0`` and the number of worker processes minus 1.

``NEURON_RT_STOCHASTIC_ROUNDING_EN`` **[Neuron Runtime]**

- When ``NEURON_RT_STOCHASTIC_ROUNDING_EN=1``, JAX Neuron will use stochastic rounding instead of
  round-nearest-even for all internal rounding operations when casting from FP32 to a reduced-precision data type (FP16, BF16, FP8, TF32).
  This feature has been shown to improve
  training convergence for reduced-precision training jobs.
  To switch to round-nearest-even mode, set ``NEURON_RT_STOCHASTIC_ROUNDING_EN=0``.

``NEURON_RT_STOCHASTIC_ROUNDING_SEED`` **[Neuron Runtime]**

- Sets the seed for the random number generator used in stochastic rounding (see the previous section). If this environment variable is not set, the seed defaults to 0. Set ``NEURON_RT_STOCHASTIC_ROUNDING_SEED`` to a fixed value to ensure reproducibility between runs.

``NEURON_RT_VISIBLE_CORES`` **[Neuron Runtime]**

- Integer range of specific NeuronCores needed by the process (for example, 0-3 specifies NeuronCores 0, 1, 2, and 3). Use this environment variable when launching processes to limit the launched process to specific consecutive NeuronCores.

Additional Neuron runtime environment variables are described in :ref:`nrt-configuration`.
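To make the interplay of ``NEURON_PJRT_PROCESSES_NUM_DEVICES`` and ``NEURON_PJRT_PROCESS_INDEX`` concrete, here is a small sketch (not part of the plugin) of how a launcher might parse and validate the two variables according to the semantics described above:

```python
import os

# Hypothetical launcher-side validation of the multi-process settings;
# it mirrors the documented semantics, not actual Neuron PJRT plugin code.
os.environ["NEURON_PJRT_PROCESSES_NUM_DEVICES"] = "2,2,4"  # 3 worker processes
os.environ["NEURON_PJRT_PROCESS_INDEX"] = "1"

def parse_pjrt_topology():
    """Return (per_process_devices, global_device_count, process_index)."""
    per_process = [int(n) for n in
                   os.environ["NEURON_PJRT_PROCESSES_NUM_DEVICES"].split(",")]
    index = int(os.environ["NEURON_PJRT_PROCESS_INDEX"])
    # The index must address one of the listed worker processes.
    if not 0 <= index < len(per_process):
        raise ValueError("NEURON_PJRT_PROCESS_INDEX out of range")
    # The global device array size is the sum over all workers.
    return per_process, sum(per_process), index

devices, global_count, rank = parse_pjrt_topology()
print(devices, global_count, rank)  # → [2, 2, 4] 8 1
```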

frameworks/jax/index.rst

Lines changed: 3 additions & 1 deletion

@@ -15,6 +15,8 @@ a tested combination of the ``jax-neuronx``, ``jax``, ``jaxlib``, ``libneuronxla``

     /frameworks/jax/setup/jax-setup
     /frameworks/jax/setup/jax-neuronx-known-issues
+    /frameworks/jax/api-reference-guide/index

 * :ref:`jax-neuron-setup`
-* :ref:`jax_neuron-known-issues`
+* :ref:`jax-neuron-known-issues`
+* :ref:`jax-neuronx-api-reference-guide`
