Commit 886d5b8

Update support_stack.rst (#1006)

1 parent 79a6613 commit 886d5b8

File tree

3 files changed: +23 -21 lines changed


README.md

Lines changed: 10 additions & 10 deletions
@@ -1,6 +1,6 @@
 ## [NVTabular](https://github.com/NVIDIA/NVTabular) | [Documentation](https://nvidia.github.io/NVTabular/main/Introduction.html)

-[NVTabular](https://github.com/NVIDIA/NVTabular) is a feature engineering and preprocessing library for tabular data that is designed to quickly and easily manipulate terabyte scale datasets and train deep learning (DL) based recommender systems. It provides high-level abstraction to simplify code and accelerates computation on the GPU using the [RAPIDS Dask-cuDF](https://github.com/rapidsai/cudf/tree/main/python/dask_cudf) library. NVTabular is designed to be interoperable with both PyTorch and TensorFlow using dataloaders that have been developed as extensions of native framework code. In our experiments, we were able to speed up existing TensorFlow pipelines by 9 times and existing PyTorch pipelines by 5 times with our highly optimized dataloaders.
+[NVTabular](https://github.com/NVIDIA/NVTabular) is a feature engineering and preprocessing library for tabular data that is designed to easily manipulate terabyte scale datasets and train deep learning (DL) based recommender systems. It provides high-level abstraction to simplify code and accelerates computation on the GPU using the [RAPIDS Dask-cuDF](https://github.com/rapidsai/cudf/tree/main/python/dask_cudf) library. NVTabular is designed to be interoperable with both PyTorch and TensorFlow using dataloaders that have been developed as extensions of native framework code. In our experiments, we were able to speed up existing TensorFlow pipelines by nine times and existing PyTorch pipelines by five times with our highly-optimized dataloaders.

 NVTabular is a component of [NVIDIA Merlin Open Beta](https://developer.nvidia.com/nvidia-merlin). NVIDIA Merlin is used for building large-scale recommender systems. With NVTabular being a part of the Merlin ecosystem, it also works with the other Merlin components including [HugeCTR](https://github.com/NVIDIA/HugeCTR) and [Triton Inference Server](https://github.com/NVIDIA/tensorrt-inference-server) to provide end-to-end acceleration of recommender systems on the GPU. Extending beyond model training, with NVIDIA’s Triton Inference Server, the feature engineering and preprocessing steps performed on the data during training can be automatically applied to incoming data during inference.

@@ -36,13 +36,13 @@ To learn more about NVTabular's core features, see the following:

 ### Performance

-When running NVTabular on the Criteo 1TB Click Logs Dataset using a single V100 32GB GPU, feature engineering and preprocessing was able to be completed in 13 minutes. Futhermore, when running NVTabular on a DGX-1 cluster with eight V100 GPUs, feature engineering and preprocessing was able to be completed within 3 minutes. Combined with [HugeCTR](http://www.github.com/NVIDIA/HugeCTR/), the dataset can be processed and a full model can be trained in only 6 minutes.
+When running NVTabular on the Criteo 1TB Click Logs Dataset using a single V100 32GB GPU, feature engineering and preprocessing was able to be completed in 13 minutes. Futhermore, when running NVTabular on a DGX-1 cluster with eight V100 GPUs, feature engineering and preprocessing was able to be completed within three minutes. Combined with [HugeCTR](http://www.github.com/NVIDIA/HugeCTR/), the dataset can be processed and a full model can be trained in only six minutes.

 The performance of the Criteo DRLM workflow also demonstrates the effectiveness of the NVTabular library. The original ETL script provided in Numpy took over five days to complete. Combined with CPU training, the total iteration time is over one week. By optimizing the ETL code in Spark and running on a DGX-1 equivalent cluster, the time to complete feature engineering and preprocessing was reduced to three hours. Meanwhile, training was completed in one hour.

 ### Installation

-To install NVTabular, ensure that you meet the following prerequisites:
+Prior to installing NVTabular, ensure that you meet the following prerequisites:

 * CUDA version 10.1+
 * Python version 3.7+
@@ -78,12 +78,12 @@ NVTabular Docker containers are available in the [NVIDIA Merlin container reposi

 | Container Name | Container Location | Functionality |
 | -------------------------- | ------------------ | ------------- |
-| merlin-training | https://ngc.nvidia.com/catalog/containers/nvidia:merlin:merlin-training | NVTabular and HugeCTR |
+| merlin-inference | https://ngc.nvidia.com/catalog/containers/nvidia:merlin:merlin-inference | NVTabular, HugeCTR, and Triton Inference |
+| merlin-training | https://ngc.nvidia.com/catalog/containers/nvidia:merlin:merlin-training | NVTabular and HugeCTR |
 | merlin-tensorflow-training | https://ngc.nvidia.com/catalog/containers/nvidia:merlin:merlin-tensorflow-training | NVTabular, TensorFlow, and HugeCTR Tensorflow Embedding plugin |
-| merlin-pytorch-training | https://ngc.nvidia.com/catalog/containers/nvidia:merlin:merlin-pytorch-training | NVTabular and PyTorch |
-| merlin-inference | https://ngc.nvidia.com/catalog/containers/nvidia:merlin:merlin-inference | NVTabular, HugeCTR, and Triton Inference |
+| merlin-pytorch-training | https://ngc.nvidia.com/catalog/containers/nvidia:merlin:merlin-pytorch-training | NVTabular and PyTorch |

-To use these Docker containers, you'll first need to install the [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-docker) to provide GPU support for Docker. You can use the NGC links referenced in the table above to obtain more information about how to launch and run these containers.
+To use these Docker containers, you'll first need to install the [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-docker) to provide GPU support for Docker. You can use the NGC links referenced in the table above to obtain more information about how to launch and run these containers. To obtain more information about the software and model versions that NVTabular supports per container, see [Support Matrix](https://github.com/NVIDIA/NVTabular/blob/main/docs/source/resources/support_stack.rst).

 ### Notebook Examples and Tutorials

@@ -106,7 +106,7 @@ Each Jupyter notebook covers the following:

 ### Feedback and Support

-If you'd like to contribute to the library directly, please see the [Contributing.md](https://github.com/NVIDIA/NVTabular/blob/main/CONTRIBUTING.md). We're particularly interested in contributions or feature requests for our feature engineering and preprocessing operations. To further advance our Merlin Roadmap, we encourage you to share all the details regarding your recommender system pipeline using this [survey](https://developer.nvidia.com/merlin-devzone-survey).
+If you'd like to contribute to the library directly, see the [Contributing.md](https://github.com/NVIDIA/NVTabular/blob/main/CONTRIBUTING.md). We're particularly interested in contributions or feature requests for our feature engineering and preprocessing operations. To further advance our Merlin Roadmap, we encourage you to share all the details regarding your recommender system pipeline in this [survey](https://developer.nvidia.com/merlin-devzone-survey).

-If you're interested in learning more about how NVTabular works see
-[our documentation](https://nvidia.github.io/NVTabular/main/Introduction.html). We also have [API documentation](https://nvidia.github.io/NVTabular/main/api/index.html) that outlines the specifics of the available calls within the library.
+If you're interested in learning more about how NVTabular works, see
+[our NVTabular documentation](https://nvidia.github.io/NVTabular/main/Introduction.html). We also have [API documentation](https://nvidia.github.io/NVTabular/main/api/index.html) that outlines the specifics of the available calls within the library.
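
The README hunks above quote a "Python version 3.7+" prerequisite. A minimal sketch of checking it at runtime (the helper name `meets_python_prereq` is illustrative and not part of NVTabular; the CUDA 10.1+ check is omitted since it depends on local driver tooling):

```python
import sys

# Minimum Python version from the README prerequisites ("Python version 3.7+").
MIN_PYTHON = (3, 7)

def meets_python_prereq(version_info=None, minimum=MIN_PYTHON):
    """Return True when the interpreter satisfies the Python 3.7+ prerequisite.

    Illustrative helper only -- not part of the NVTabular API.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum

print(meets_python_prereq((3, 8, 0)))  # True: 3.8 satisfies the 3.7+ floor
print(meets_python_prereq((3, 6, 9)))  # False: 3.6 predates the requirement
```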

docs/source/resources/index.rst

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ Additional Resources
 .. toctree::
    :maxdepth: 2

-   Support Stack <support_stack>
+   Support Matrix <support_matrix>
    Architecture <architecture>
    Cloud Integration <cloud_integration>
    Troubleshooting <troubleshooting>

docs/source/resources/support_stack.rst renamed to docs/source/resources/support_matrix.rst

Lines changed: 12 additions & 10 deletions
@@ -1,23 +1,23 @@
-Support Stack
-==============
+NVTabular Support Matrix
+========================

 .. role:: raw-html(raw)
    :format: html

-We offer the following support stacks:
+We offer the following containers:

 * `Merlin Inference <#table-1>`_: Allows you to deploy NVTabular workflows and HugeCTR or TensorFlow models to the Triton Inference server for production.
 * `Merlin Training <#table-2>`_: Allows you to do preprocessing and feature engineering with NVTabular so that you can train a deep learning recommendation model with HugeCTR.
 * `Merlin TensorFlow Training <#table-3>`_: Allows you to do preprocessing and feature engineering with NVTabular so that you can train a deep learning recommendation model with TensorFlow.
 * `Merlin PyTorch Training <#table-4>`_: Allows you to do preprocessing and feature engineering with NVTabular so that you can train a deep learning recommendation model with PyTorch.

-The following tables provide the software and model versions that NVTabular version 0.6 supports.
+The following tables provide the software and model versions that NVTabular version 0.6 supports per container.

 :raw-html:`<br/>`

 .. _table-1:

-:raw-html:`<p align="center"><b>Table 1: Support stack matrix for the Merlin Inference (merlin-inference) image</b></p>`
+:raw-html:`<p align="center"><b>Table 1: Support matrix for the Merlin Inference (merlin-inference) container</b></p>`

 +-----------------------------------------------------+------------------------------------------------------------------------+
 | **DGX** |
@@ -75,7 +75,7 @@ The following tables provide the software and model versions that NVTabular vers

 .. _table-2:

-:raw-html:`<p align="center"><b>Table 2: Support stack matrix for the Merlin Training (merlin-training) image</b></p>`
+:raw-html:`<p align="center"><b>Table 2: Support matrix for the Merlin Training (merlin-training) container</b></p>`

 +-----------------------------------------------------+------------------------------------------------------------------------+
 | **DGX** |
@@ -124,7 +124,7 @@ The following tables provide the software and model versions that NVTabular vers
 +-----------------------------------------------------+------------------------------------------------------------------------+
 | cuDNN | 8.2.2 |
 +-----------------------------------------------------+------------------------------------------------------------------------+
-| HugeCTR | N/A |
+| HugeCTR | 3.1 |
 +-----------------------------------------------------+------------------------------------------------------------------------+
 | NVTabular | 0.6 |
 +-----------------------------------------------------+------------------------------------------------------------------------+
@@ -133,7 +133,7 @@ The following tables provide the software and model versions that NVTabular vers

 .. _table-3:

-:raw-html:`<p align="center"><b>Table 3: Support stack matrix for the Merlin TensorFlow Training (merlin-tensorflow-training) image</b></p>`
+:raw-html:`<p align="center"><b>Table 3: Support matrix for the Merlin TensorFlow Training (merlin-tensorflow-training) container</b></p>`

 +-----------------------------------------------------+------------------------------------------------------------------------+
 | **DGX** |
@@ -173,7 +173,9 @@ The following tables provide the software and model versions that NVTabular vers
 | Container Operating System | Ubuntu version 20.04 |
 +-----------------------------------------------------+------------------------------------------------------------------------+
 | Base Container | `nvcr.io/nvidia/tensorflow:21.07-tf2-py3 |
-| | <https://nvcr.io/nvidia/tensorflow:21.06-tf2-py3>`_ |
+| | <https://nvcr.io/nvidia/tensorflow:21.07-tf2-py3>`_ |
+| | |
+| | \*Customized with TensorFlow version 2.4.2 |
 +-----------------------------------------------------+------------------------------------------------------------------------+
 | CUDA | 11.4 |
 +-----------------------------------------------------+------------------------------------------------------------------------+
@@ -192,7 +194,7 @@ The following tables provide the software and model versions that NVTabular vers

 .. _table-4:

-:raw-html:`<p align="center"><b>Table 4: Support stack matrix for the Merlin PyTorch Training (merlin-pytorch-training) image</b></p>`
+:raw-html:`<p align="center"><b>Table 4: Support matrix for the Merlin PyTorch Training (merlin-pytorch-training) container</b></p>`

 +-----------------------------------------------------+------------------------------------------------------------------------+
 | **DGX** |
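
The version rows visible in the merlin-training hunks above (cuDNN 8.2.2, HugeCTR 3.1, NVTabular 0.6) can be collected into a small lookup when scripting against the support matrix. A sketch only: the mapping covers just the rows shown in this diff, and `supported_version` is an illustrative helper, not an NVTabular API:

```python
# Version rows for the merlin-training container, taken only from the
# hunks visible in this diff (the full support matrix lists more rows).
MERLIN_TRAINING_VERSIONS = {
    "cuDNN": "8.2.2",
    "HugeCTR": "3.1",   # updated from "N/A" in this commit
    "NVTabular": "0.6",
}

def supported_version(component, matrix=MERLIN_TRAINING_VERSIONS):
    """Illustrative lookup helper (not part of NVTabular)."""
    return matrix.get(component, "unknown")

print(supported_version("HugeCTR"))   # 3.1
print(supported_version("TensorRT"))  # unknown (not listed in these hunks)
```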
