Commit af96728

Neuron SDK 2.18.0 updates (#1618) (#857)
Release notes updates for Neuron SDK 2.18.0
1 parent e84f22b commit af96728

125 files changed, with 3506 additions and 1017 deletions.

compiler/neuronx-cc/api-reference-guide/neuron-compiler-cli-reference-guide.rst

Lines changed: 4 additions & 1 deletion
@@ -60,6 +60,7 @@ Available Commands:
 [--auto-cast-type <data_type>]
 [--distribution-strategy <distribution_type>]
 [--optlevel <opt-level>], or [-O <opt-level>]
+[--enable-mixed-precision-accumulation]
 [--enable-saturate-infinity]
 [--enable-fast-context-switch>]
 [--enable-fast-loading-neuron-binaries]
@@ -135,7 +136,9 @@ Available Commands:

 .. note:: This option supercedes, and deprecates, the ``—enable-experimental-O1`` option introduced in an earlier release.

-- :option:`--enable-saturate-infinity`: Convert +/- infinity values to MAX/MIN_FLOAT for certain computations that have a high risk of generating Not-a-Number (NaN) values. There is a potential performance impact during model execution when this conversion is enabled.
+- :option:`--enable-mixed-precision-accumulation`: Perform intermediate calculations of accumulation operators (such as softmax and layernorm) in FP32 and cast the result to the model-designated datatype. This improves the operator's resulting accuracy.
+
+- :option:`--enable-saturate-infinity`: Convert +/- infinity values to MAX/MIN_FLOAT for compiler-introduced matrix-multiply transpose computations that have a high risk of generating Not-a-Number (NaN) values. There is a potential performance impact during model execution when this conversion is enabled.

 - :option:`--enable-fast-context-switch`: Optimize for faster model switching rather than execution latency.
 This option will defer loading some weight constants until the start of model execution. This results in overall faster system performance when your application switches between models frequently on the same Neuron Core (or set of cores).
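
Reviewer note: the two numeric options touched in this hunk can be illustrated with a small, self-contained sketch. It is not the compiler's implementation. It only mimics the arithmetic in plain Python, assuming FP16 as the model-designated datatype and assuming that MAX/MIN_FLOAT means the largest finite FP32 magnitude. The helper names (`to_fp16`, `to_fp32`, `sum_fp16`, `sum_mixed`, `saturate`) are hypothetical and exist only for this example.

```python
import math
import struct

# Assumption for this sketch: MAX_FLOAT is the largest finite FP32 value.
FP32_MAX = 3.4028234663852886e38


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sum_fp16(values):
    """Accumulate entirely in FP16: every partial sum is rounded to FP16."""
    acc = 0.0
    for v in values:
        acc = to_fp16(acc + to_fp16(v))
    return acc


def sum_mixed(values):
    """Accumulate in FP32, then cast only the final result to FP16."""
    acc = 0.0
    for v in values:
        acc = to_fp32(acc + to_fp16(v))
    return to_fp16(acc)


def saturate(x: float) -> float:
    """Replace +/- infinity with +/- FP32_MAX; leave other values unchanged."""
    if math.isinf(x):
        return math.copysign(FP32_MAX, x)
    return x


if __name__ == "__main__":
    ones = [1.0] * 4096
    # FP16 has a spacing of 2 above 2048, so adding 1.0 stops changing the sum.
    print(sum_fp16(ones))   # 2048.0
    print(sum_mixed(ones))  # 4096.0
    # inf * 0 is NaN, while the saturated value stays finite.
    print(float("inf") * 0.0)            # nan
    print(saturate(float("inf")) * 0.0)  # 0.0
```

The FP16-only sum stalls at 2048 because 2049 is not representable in half precision, which is the kind of accuracy loss that accumulating in FP32 avoids. The saturation step trades exact infinity semantics for a finite result.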

conf.py

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@
 #top_banner_message="<span>&#9888;</span><a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/setup/setup-troubleshooting.html#gpg-key-update'> Neuron repository GPG key for Ubuntu installation has expired, see instructions how to update! </a>"


-top_banner_message="Neuron 2.17.0 is released! check <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release'> What's New </a> and <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/announcements/index.html'> Announcements </a>"
+top_banner_message="Neuron 2.18.0 is released! check <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/index.html#latest-neuron-release'> What's New </a> and <a class='reference internal' style='color:white;' href='https://awsdocs-neuron.readthedocs-hosted.com/en/latest/general/announcements/index.html'> Announcements </a>"


 html_theme = "sphinx_book_theme"

containers/developerflows.rst

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Containers - Developer Flows
 /containers/dlc-then-ecs-devflow
 /containers/dlc-then-eks-devflow
 /containers/container-sm-hosting-devflow
+/containers/dlc-then-customize-devflow


containers/developerflows.txt

Lines changed: 1 addition & 0 deletions
@@ -6,3 +6,4 @@
 * :ref:`containers-dlc-then-ecs-devflow`
 * :ref:`containers-dlc-then-eks-devflow`
 * :ref:`containers-byoc-hosting-devflow`
+* :ref:`containers-dlc-then-customize-devflow`

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+.. _containers-dlc-then-customize-devflow:
+
+.. include:: /general/devflows/dlc-then-customize-devflow.rst

Lines changed: 18 additions & 4 deletions
@@ -1,12 +1,26 @@
-.. tab-set::
+.. tab-set::

-.. tab-item:: Latest Neuron DLC images
+.. tab-item:: Introduction

+The Pytorch Neuron DLC images are published to ECR Public, which is the recommended URL to use for most cases. If you are working within AWS SageMaker, you should use the Amazon ECR URL instead of the Amazon ECR Public one because of the restriction of Sagemaker. TensorFlow DLCs are not updated with the latest release. For earlier releases please check `here <https://github.com/aws/deep-learning-containers/blob/master/available_images.md#neuron-containers>`_.
+
+.. tab-set::
+
+.. tab-item:: Neuron DLC images in Amazon ECR Public
+
+.. df-table::
+:header-rows: 1
+
+df = pd.read_csv('neuron_dlc_images.csv')
+
+.. tab-set::
+
+.. tab-item:: Latest Neuron DLC images in Amazon ECR

 Find latest `Neuron DLC images <https://github.com/aws/deep-learning-containers/blob/master/available_images.md#user-content-neuron-containers>`_.

-.. tab-set::
+.. tab-set::

-.. tab-item:: Locate specific Neuron DLC release
+.. tab-item:: Locate specific Neuron DLC release in Amazon ECR

 In the `DLC release page <https://github.com/aws/deep-learning-containers/releases>`_ do a search for Neuron to get the ECR repo location of specific Neuron DLC release.

containers/neuron_dlc_images.csv

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+Framework,Neuron Package,Job Type,Supported EC2 Instance Types,Python Version Options,ECR Public Repo URL,Image Details,Other Packages
+PyTorch 2.1.2,"aws-neuronx-tools, neuronx_distributed, torch-neuronx, transformers-neuronx",inference,trn1 and inf2,3.10 (py310),https://gallery.ecr.aws/neuron/pytorch-inference-neuronx,https://github.com/aws-neuron/deep-learning-containers#pytorch-inference-neuronx,torchserve
+PyTorch 2.1.2,"aws-neuronx-tools, neuronx_distributed, torch-neuronx",training,trn1 and inf2,3.10 (py310),https://gallery.ecr.aws/neuron/pytorch-training-neuronx,https://github.com/aws-neuron/deep-learning-containers#pytorch-training-neuronx,
+PyTorch 1.13.1,"aws-neuronx-tools, torch-neuron",inference,inf1,3.10 (py310),https://gallery.ecr.aws/neuron/pytorch-inference-neuron,https://github.com/aws-neuron/deep-learning-containers#pytorch-inference-neuron,torchserve
+PyTorch 1.13.1,"aws-neuronx-tools, neuronx_distributed, torch-neuronx, transformers-neuronx",inference,trn1 and inf2,3.10 (py310),https://gallery.ecr.aws/neuron/pytorch-inference-neuronx,https://github.com/aws-neuron/deep-learning-containers#pytorch-inference-neuronx,torchserve
+PyTorch 1.13.1,"aws-neuronx-tools, neuronx_distributed, torch-neuronx",training,trn1 and inf2,3.10 (py310),https://gallery.ecr.aws/neuron/pytorch-training-neuronx,https://github.com/aws-neuron/deep-learning-containers#pytorch-training-neuronx,
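
Reviewer note: the new CSV quotes the "Neuron Package" column because its values contain commas, so it has to be read with a real CSV parser rather than a plain split on commas. The sketch below parses two rows copied from the file above using only the Python standard library and filters them by job type and instance family. The `SAMPLE` constant and the `find_images` helper are hypothetical names for this illustration. The docs build itself loads the file through `pd.read_csv`, as the `df-table` directive in this commit shows.

```python
import csv
import io

# Header plus two data rows copied from containers/neuron_dlc_images.csv.
SAMPLE = (
    "Framework,Neuron Package,Job Type,Supported EC2 Instance Types,"
    "Python Version Options,ECR Public Repo URL,Image Details,Other Packages\n"
    'PyTorch 2.1.2,"aws-neuronx-tools, neuronx_distributed, torch-neuronx, '
    'transformers-neuronx",inference,trn1 and inf2,3.10 (py310),'
    "https://gallery.ecr.aws/neuron/pytorch-inference-neuronx,"
    "https://github.com/aws-neuron/deep-learning-containers"
    "#pytorch-inference-neuronx,torchserve\n"
    'PyTorch 1.13.1,"aws-neuronx-tools, torch-neuron",inference,inf1,'
    "3.10 (py310),https://gallery.ecr.aws/neuron/pytorch-inference-neuron,"
    "https://github.com/aws-neuron/deep-learning-containers"
    "#pytorch-inference-neuron,torchserve\n"
)


def find_images(csv_text, job_type, instance):
    """Return rows matching a job type whose instance column mentions `instance`."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        row
        for row in rows
        if row["Job Type"] == job_type
        and instance in row["Supported EC2 Instance Types"]
    ]


if __name__ == "__main__":
    for row in find_images(SAMPLE, "inference", "inf2"):
        # The quoted field is one column; split it into individual packages.
        packages = row["Neuron Package"].split(", ")
        print(row["Framework"], row["ECR Public Repo URL"], packages)
```

Matching on a substring of "Supported EC2 Instance Types" is a shortcut that works for the values in this file (`trn1 and inf2`, `inf1`). A stricter lookup would split that column into individual instance families first.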
