From 5a9718b7830875dde3e687cdd57be569645886b8 Mon Sep 17 00:00:00 2001
From: MaraschinoGirl
Date: Thu, 14 Aug 2025 17:13:07 +0200
Subject: [PATCH] Docs: improve install guide, add examples & troubleshooting

---
 README.md | 99 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 87 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 254a9c34..e6506096 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,4 @@
+
 # 🤗 Optimum ONNX
@@ -8,41 +9,77 @@
+---
+
+## Installation
+
+Before you begin, make sure you have **Python 3.9 or higher** installed.
-### Installation
+### 1. Create a virtual environment (recommended)
+```bash
+python -m venv .venv
+source .venv/bin/activate   # macOS / Linux
+.venv\Scripts\activate      # Windows
+```
-Before you begin, make sure you install all necessary libraries by running:
+### 2. Install Optimum ONNX (CPU version)
-```bash
+```bash
 pip install "optimum-onnx[onnxruntime]"@git+https://github.com/huggingface/optimum-onnx.git
 ```
-If you want to use the [GPU version of ONNX Runtime](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#cuda-execution-provider), make sure the CUDA and cuDNN [requirements](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements) are satisfied, and install the additional dependencies by running :
+### 3. Install Optimum ONNX (GPU version)
-```bash
+Before installing, ensure your CUDA and cuDNN versions match the [ONNX Runtime GPU requirements](https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements).
+
+```bash
+pip uninstall onnxruntime  # avoid conflicts with the CPU build
 pip install "optimum-onnx[onnxruntime-gpu]"@git+https://github.com/huggingface/optimum-onnx.git
 ```
-To avoid conflicts between `onnxruntime` and `onnxruntime-gpu`, make sure the package `onnxruntime` is not installed by running `pip uninstall onnxruntime` prior to installing Optimum.
+---
+
+## ONNX Export
-### ONNX export
+It is possible to export 🤗 Transformers, Diffusers, Timm, and Sentence Transformers models to the [ONNX](https://onnx.ai/) format and perform graph optimization as well as quantization easily.
-It is possible to export 🤗 Transformers, Diffusers, Timm and Sentence Transformers models to the [ONNX](https://onnx.ai/) format and perform graph optimization as well as quantization easily:
+Example: Export **Llama-3.2-1B** to ONNX:
-```bash
+```bash
 optimum-cli export onnx --model meta-llama/Llama-3.2-1B onnx_llama/
 ```
+
 The model can also be optimized and quantized with `onnxruntime`.
+### Additional Examples
+
+**DistilBERT for text classification**
+
+```bash
+optimum-cli export onnx --model distilbert-base-uncased-finetuned-sst-2-english distilbert_onnx/
+```
+
+**Whisper for speech-to-text**
+
+```bash
+optimum-cli export onnx --model openai/whisper-small whisper_onnx/
+```
+
+**Gemma for general-purpose LLM tasks**
+
+```bash
+optimum-cli export onnx --model google/gemma-2b gemma_onnx/
+```
+
 For more information on the ONNX export, please check the [documentation](https://huggingface.co/docs/optimum/exporters/onnx/usage_guides/export_a_model).
-#### Inference
+---
-Once the model is exported to the ONNX format, we provide Python classes enabling you to run the exported ONNX model in a seemless manner using [ONNX Runtime](https://onnxruntime.ai/) in the backend:
+## Inference
+
+Once the model is exported to the ONNX format, we provide Python classes enabling you to run the exported ONNX model seamlessly using [ONNX Runtime](https://onnxruntime.ai/) in the backend.
 ```diff
 - from transformers import AutoTokenizer, pipeline
 - from transformers import AutoModelForCausalLM
 + from optimum.onnxruntime import ORTModelForCausalLM
@@ -56,3 +93,41 @@ Once the model is exported to the ONNX format, we provide Python classes enablin
 ```
 More details on how to run ONNX models with `ORTModelForXXX` classes [here](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/models).
+
+---
+
+## Troubleshooting
+
+**1. `ModuleNotFoundError: No module named 'onnxruntime'`**
+Ensure you have installed either the CPU or the GPU extra, which pulls in `onnxruntime` or `onnxruntime-gpu` respectively:
+
+```bash
+pip install "optimum-onnx[onnxruntime]"@git+https://github.com/huggingface/optimum-onnx.git      # CPU
+pip install "optimum-onnx[onnxruntime-gpu]"@git+https://github.com/huggingface/optimum-onnx.git  # GPU
+```
+
+---
+
+**2. CUDA/cuDNN not found**
+Verify that your `nvcc --version` output matches the ONNX Runtime GPU requirements.
+Install the correct CUDA and cuDNN versions before retrying.
+
+---
+
+**3. Out-of-memory errors**
+Use smaller models (e.g., `distilbert-base-uncased`) or quantize the exported model:
+
+```bash
+optimum-cli onnxruntime quantize --onnx_model distilbert_onnx/ --avx2 -o distilbert_quant/
+```
+
+---
+
+**4. `onnxruntime` and `onnxruntime-gpu` conflict**
+Uninstall the CPU version before installing the GPU version:
+
+```bash
+pip uninstall onnxruntime
+```
+
+---
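Editor's note for the troubleshooting section above: the checks for a missing or conflicting ONNX Runtime install can be scripted. Below is a minimal, stdlib-only sketch; the helper names are illustrative and not part of Optimum, and only `onnxruntime.get_available_providers()` is assumed from the real library.

```python
# Minimal environment check for troubleshooting items 1 and 4:
# which ONNX Runtime builds are installed (both at once causes the
# conflict in item 4), and which execution providers are available.
from importlib import metadata


def installed_ort_builds():
    """Return the ONNX Runtime distributions found in this environment."""
    builds = []
    for name in ("onnxruntime", "onnxruntime-gpu"):
        try:
            metadata.version(name)
            builds.append(name)
        except metadata.PackageNotFoundError:
            pass
    return builds


def available_providers():
    """Return ONNX Runtime's execution providers, or None if it is missing."""
    try:
        import onnxruntime as ort
    except ImportError:
        # Matches the ModuleNotFoundError described in item 1.
        return None
    return ort.get_available_providers()


if __name__ == "__main__":
    builds = installed_ort_builds()
    if len(builds) > 1:
        print("Conflict: run `pip uninstall onnxruntime` and keep one build")
    print(builds, available_providers())
```

With the GPU build correctly installed, `available_providers()` should include `CUDAExecutionProvider`; if it only lists `CPUExecutionProvider`, revisit item 2 (CUDA/cuDNN versions).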