ONNX models generated by llm_export.py are missing some input and output nodes #1147

@idruker-cerence

Describe the bug

I am using the Model-Optimizer/examples/torch_onnx/llm_export.py script to convert a .safetensors LLM to the ONNX format and quantize it. The ONNX model is then supposed to be converted to the TRT format for use with TensorRT. The resulting ONNX model has "input_ids", "logits" and "present_key_values*", but is missing the "position_ids", "attention_mask" and "past_kv*" nodes.

Steps/Code to reproduce bug

Install packages

python -m pip install nvidia-modelopt[all]
python -m pip install onnx==1.18.0
python -m pip install onnxruntime[gpu]==1.23.0

Install any other packages on demand, as llm_export.py asks for them when it runs. Set up paths:

export LD_LIBRARY_PATH=<path/to/cuda/libs>:<path/to/cudnn/lib>
export PATH=<path/to/cuda/bin>:$PATH

Clone the Model-Optimizer repo to use the example scripts

git clone https://github.com/NVIDIA/Model-Optimizer.git

Navigate to the torch_onnx example

cd Model-Optimizer/examples/torch_onnx

and launch the conversion of the HF model to ONNX with INT4 quantization:

python llm_export.py --hf_model_path=meta-llama/Llama-3.1-8B-Instruct --dtype=int4_awq --calib_size=512 --output_dir=models/Llama-3.1-8B-Instruct-ONNX-INT4

Result: the produced ONNX model is missing "position_ids", "attention_mask" and "past_kv*" nodes.
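
To make the result easier to verify, here is a minimal sketch of how the graph inputs and outputs could be listed and compared against the names mentioned in this report. It assumes the `onnx` Python package is installed. The helper names (`graph_io_names`, `missing_patterns`) and the two pattern lists are made up for illustration and are not part of llm_export.py or ModelOpt. The `*` wildcard is handled with `fnmatchcase`.

```python
from fnmatch import fnmatchcase

# Names taken from this report; the names an exporter actually uses may differ.
EXPECTED_INPUTS = ["input_ids", "position_ids", "attention_mask", "past_kv*"]
EXPECTED_OUTPUTS = ["logits", "present_key_values*"]


def graph_io_names(model_path):
    """Return (input_names, output_names) of the ONNX graph at model_path."""
    import onnx  # imported lazily so missing_patterns() works without onnx

    # Only the graph structure is needed, so skip loading external weight files.
    model = onnx.load(model_path, load_external_data=False)
    # Some ONNX files list initializers among the graph inputs; filter them out.
    initializers = {init.name for init in model.graph.initializer}
    inputs = [i.name for i in model.graph.input if i.name not in initializers]
    outputs = [o.name for o in model.graph.output]
    return inputs, outputs


def missing_patterns(names, patterns):
    """Return the patterns that match none of the given names."""
    return [p for p in patterns if not any(fnmatchcase(n, p) for n in names)]
```

A possible usage, where `<path/to/model.onnx>` stands for the file written to the output directory: call `inputs, outputs = graph_io_names("<path/to/model.onnx>")`, then print `missing_patterns(inputs, EXPECTED_INPUTS)` and `missing_patterns(outputs, EXPECTED_OUTPUTS)`. For a model whose only input is "input_ids", the first call would report "position_ids", "attention_mask" and "past_kv*" as missing.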

Expected behavior

A typical LLM ONNX model is expected to have "input_ids", "attention_mask", "logits", and past and present kv-cache nodes. In the exported model, "attention_mask" and the past kv-cache nodes (as well as "position_ids") are missing.

System information

  • OS: Ubuntu 20.04
  • CPU architecture (x86_64, aarch64): x86_64
  • GPU memory size: enough
  • Library versions (if applicable):
    • Python: 3.12
    • ModelOpt version or commit hash: >=0.39
    • CUDA: 12.3
    • PyTorch: 2.7.1+cu118
    • Transformers: 4.57.3
    • onnxruntime-gpu: 1.23.0
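
For anyone trying to reproduce this, the library versions listed above could be collected with a short script like the sketch below. It uses only the standard-library `importlib.metadata` module. The distribution names are assumed to be the PyPI names matching the install commands in this report, and the helper names `collect_versions` and `format_report` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed PyPI distribution names; adjust if your environment uses other variants.
PACKAGES = ["nvidia-modelopt", "onnx", "onnxruntime-gpu", "torch", "transformers"]


def collect_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def format_report(versions):
    """Render one 'name: version' line per package."""
    lines = []
    for name, ver in versions.items():
        lines.append(f"{name}: {ver if ver is not None else 'not installed'}")
    return "\n".join(lines)
```

Running `print(format_report(collect_versions()))` in the environment used for the export prints one line per package, which can be pasted into a report like this one.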

Labels: bug, triaged
