WebGPU plugin EP Python packaging #28226

Draft
edgchen1 wants to merge 50 commits into main from edgchen1/webgpu_packaging_python

Conversation

@edgchen1 (Contributor)

Description

  • Add the WebGPU plugin EP Python package build to the existing packaging pipeline.
  • Add a WebGPU plugin EP packaging test pipeline. For now, it contains simple Python package tests.

Motivation and Context

Put together the WebGPU plugin EP Python package.

edgchen1 added 30 commits April 10, 2026 15:20
Add standalone onnxruntime-ep-webgpu Python package that bundles the WebGPU plugin EP native binary (+ DXC deps on Windows). The package provides get_library_path() and get_ep_name() helpers for registering the EP with ONNX Runtime.

New files in plugin-ep-webgpu/: VERSION_NUMBER, pyproject.toml, setup.py, __init__.py, build_wheel.py (handles binary copying, version stamping, auditwheel repair on Linux, and wheel verification), requirements-build-wheel.txt, and a smoke test that validates import, EP registration, and inference.
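The two helpers named above can be sketched as follows. This is an illustrative sketch only: the library filenames and the EP name string are assumptions, not the package's actual values, and the real `__init__.py` may resolve the path differently.

```python
# Hypothetical sketch of onnxruntime_ep_webgpu's helpers, assuming the native
# binary is bundled next to __init__.py. Filenames and the EP name below are
# placeholders, not the real package's values.
import os
import platform

_EP_NAME = "WebGpuEp"  # assumed name; the real value comes from the plugin EP


def _library_filename() -> str:
    # Map the current OS to the platform-specific shared library name.
    system = platform.system()
    if system == "Windows":
        return "onnxruntime_providers_webgpu.dll"
    if system == "Darwin":
        return "libonnxruntime_providers_webgpu.dylib"
    return "libonnxruntime_providers_webgpu.so"


def get_library_path() -> str:
    """Absolute path to the bundled WebGPU plugin EP native library."""
    return os.path.join(os.path.dirname(os.path.abspath(__file__)), _library_filename())


def get_ep_name() -> str:
    """EP name to use when registering the plugin with ONNX Runtime."""
    return _EP_NAME
```

A caller would pass `get_ep_name()` and `get_library_path()` to ONNX Runtime's plugin EP registration API before creating a session.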

Pipeline changes: added Python_Package (CPU) and Python_Test (GPU) jobs to each platform stage (Windows, Linux, macOS). Added PluginPythonPackageVersion (PEP 440) output to set-plugin-build-variables-step.yml, sourced from plugin-ep-webgpu/VERSION_NUMBER.
Move user-facing README (installation, usage) into onnxruntime_ep_webgpu/ so it is bundled in the wheel and shown on PyPI. Add developer-facing README in plugin-ep-webgpu/python/ with build and test instructions.
The package has no CPython extension modules, only pre-built native libraries, so a single wheel works across all Python versions. Override bdist_wheel.get_tag() to produce py3-none-{platform} instead of cp3XX-cp3XX-{platform}.
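The `get_tag()` override can be sketched like this. The subclass name is hypothetical; the tag-mapping function is pulled out so it can be read (and tested) without the `wheel` package installed.

```python
# Sketch of producing a py3-none-{platform} tag for a wheel that ships only
# pre-built native libraries (no CPython extension modules). The class name is
# illustrative; only the get_tag() hook is the standard wheel API.
def plugin_wheel_tag(platform_tag: str) -> tuple:
    # No extension modules -> one wheel works across all Python 3 versions,
    # but the bundled native library still pins the platform tag.
    return ("py3", "none", platform_tag)


try:
    from wheel.bdist_wheel import bdist_wheel

    class BdistWheelPluginEP(bdist_wheel):
        def get_tag(self):
            # Keep the platform tag wheel computed (manylinux/win/macos),
            # but replace the cp3XX-cp3XX interpreter/ABI tags.
            _, _, plat = super().get_tag()
            return plugin_wheel_tag(plat)
except ImportError:
    # wheel not installed; the pure function above still shows the mapping.
    pass
```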
…version stamp

- Build wheel in a temporary directory instead of mutating the source tree

- Copy only the files needed (pyproject.toml, setup.py, onnxruntime_ep_webgpu/) instead of using an exclude list

- Change version placeholder to VERSION_PLACEHOLDER and fail hard if not found

- Disable CPU EP fallback in test to ensure WebGPU EP runs the model

- Simplify docstring and README descriptions
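The fail-hard placeholder stamping from the list above can be sketched as a small function; the file layout and placeholder handling here are assumptions, not the actual build_wheel.py code.

```python
# Illustrative version-stamping step: replace VERSION_PLACEHOLDER and fail hard
# if it is missing, instead of silently shipping an unstamped wheel.
from pathlib import Path

PLACEHOLDER = "VERSION_PLACEHOLDER"


def stamp_version(setup_py: Path, version: str) -> None:
    text = setup_py.read_text()
    if PLACEHOLDER not in text:
        # A missing placeholder means the copy step or template changed;
        # surface that immediately rather than producing a bad wheel.
        raise RuntimeError(f"{PLACEHOLDER} not found in {setup_py}")
    setup_py.write_text(text.replace(PLACEHOLDER, version))
```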
- macOS/Windows: add setup-build-tools.yml to Python Package and Test jobs

- Linux: run Python packaging and testing inside Docker for manylinux compatibility and auditwheel support

- Windows: skip Python package/test jobs for arm64 (cross-compiled, can't run on x64 agents)

- Linux: add gpu_machine_pool parameter for test job pool
Print environment info (Python version, platform, ORT version, relevant env vars), package directory contents, library file size, device enumeration details, session providers, and full tracebacks on failure.
Apple Silicon requires all executable code to be signed. Without this, dlopen triggers a SIGBUS (bus error) when loading the unsigned dylib.
ESRP requires .zip or .dmg input. Zip the dylib before signing, then unzip the signed result and verify.
The Docker image does not have pip pre-installed. Use ensurepip to bootstrap it before installing wheel build dependencies.
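The bootstrap step can be sketched as below; guarding on an existing pip keeps the snippet safe to run in images that already ship it.

```shell
# Bootstrap pip via the stdlib ensurepip module when the image lacks it.
# Only run ensurepip if pip is actually missing.
python3 -m pip --version >/dev/null 2>&1 || python3 -m ensurepip --upgrade
# pip is now available for installing the wheel build dependencies.
python3 -m pip --version
```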
Use python -u in all three platform pipelines so prints are flushed immediately, even if the process crashes during native DLL load. Add ort.set_default_logger_severity(0) in the test script for verbose ORT logging.
Remove the reinterpret_cast of OrtKernelInfo* to the internal OpKernelInfo*, which breaks ABI across DLL boundaries (vtable mismatch between the plugin EP and ORT core).

- KernelInfoCache: use Ort::ConstKernelInfo::GetEp() instead of casting to OpKernelInfo* and calling GetExecutionProvider()->GetOrtEp()

- GetAllocator: use C API KernelInfoGetAllocator + IAllocatorImplWrappingOrtAllocator instead of casting to OpKernelInfo*

- Remove #include core/framework/op_kernel_info.h (no longer needed)

- Add #include core/session/allocator_adapters.h for IAllocatorImplWrappingOrtAllocator
…ast_issue' into edgchen1/webgpu_packaging_python_fix
Materialize the glob generator to a list so the emptiness check works, and delete each raw wheel after auditwheel repair so only the manylinux wheel remains.
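The bug behind this fix: `Path.glob()` returns a generator object, and a generator is always truthy, so `if not dist_dir.glob("*.whl")` can never detect the empty case. A sketch of the corrected shape, with the repair step injected as a callable so the structure is visible without auditwheel:

```python
# Sketch of the fixed wheel-repair loop. The repair callable stands in for
# invoking `auditwheel repair` on Linux; the real build_wheel.py differs.
from pathlib import Path
from typing import Callable


def repair_and_prune(dist_dir: Path, repair: Callable[[Path], None]) -> None:
    # Materialize the generator: a generator object is always truthy, so the
    # emptiness check below would otherwise never fire.
    raw_wheels = list(dist_dir.glob("*.whl"))
    if not raw_wheels:
        raise RuntimeError(f"no wheels found in {dist_dir}")
    for whl in raw_wheels:
        repair(whl)   # e.g. run auditwheel repair to produce the manylinux wheel
        whl.unlink()  # delete the raw wheel so only the repaired one remains
```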
Add the missing set-nightly-build-option-variable-step.yml template to all three platform Python_Package jobs for consistency with the Build jobs.
edgchen1 added 17 commits April 21, 2026 14:28
The WebGPU EP has no CUDA dependency (Dawn uses Vulkan on Linux), so
having the plugin-linux-webgpu-stage.yml pipeline reuse the CUDA
inference Dockerfile pulled in TensorRT/cuDNN unnecessarily and was
missing libvulkan.so.1, causing the test job to fail with:
  Couldn't load Vulkan: libvulkan.so.1: cannot open shared object file

Add a new Dockerfile under inference/x86_64/python/webgpu/ modeled on
the CPU Dockerfile, based on the CPU build-cache image, with an
additional 'dnf install vulkan-loader' step so Dawn can reach the
GPU's Vulkan ICD (injected by the NVIDIA Container Toolkit via
--gpus all at runtime).

Update all three jobs (build, package, test) in
plugin-linux-webgpu-stage.yml to use the new Dockerfile and switch
the docker_base_image default to the CPU base image.
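The shape of the new Dockerfile might look like the following sketch. The base image reference is a placeholder, not the pipeline's actual value.

```dockerfile
# Illustrative sketch of the WebGPU test image; the base image tag is a
# placeholder, not the actual CPU build-cache image reference.
FROM <cpu-build-cache-image>

# Dawn reaches the GPU through the Vulkan loader. The NVIDIA Vulkan ICD itself
# is injected at runtime by the NVIDIA Container Toolkit (`docker run --gpus all`).
RUN dnf install -y vulkan-loader && dnf clean all
```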
The WebGPU plugin's Python test job was failing with:

  setup_loader_term_phys_devs: Failed to detect any valid GPUs in the
  current config

The NVIDIA Container Toolkit only injects the CUDA portions of the
driver by default (capabilities = utility,compute). Vulkan additionally
requires the 'graphics' capability, which injects libGLX_nvidia, the
NVIDIA Vulkan ICD JSON, and associated userspace libs.

- Set NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics and
  NVIDIA_VISIBLE_DEVICES=all in the WebGPU plugin image so every
  'docker run --gpus all' gets a working Vulkan ICD.
- Install vulkan-tools in the image for diagnostic purposes (to be
  removed once the test is stable).
- Add a temporary diagnostics block to the Python test job that dumps
  the relevant NVIDIA/Vulkan state (env vars, nvidia-smi, ICD search
  paths, libnvidia/libGLX_nvidia locations, vulkaninfo --summary)
  before running the actual test.
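The image-side change described above amounts to two ENV lines (values taken from the description; placement in the Dockerfile is illustrative):

```dockerfile
# Request the 'graphics' capability so the NVIDIA Container Toolkit injects
# the NVIDIA Vulkan ICD and userspace libs alongside the CUDA bits.
ENV NVIDIA_VISIBLE_DEVICES=all \
    NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics
```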
The NVIDIA Container Toolkit on the GPU CI pool injects NVIDIA libraries
but not the Vulkan ICD JSON, so Dawn fails with 'Failed to detect any
valid GPUs'. Switch the plugin Python test job to Mesa lavapipe (a CPU
Vulkan implementation) instead. This unblocks CI and lets the job run
on the standard CPU pool.

- Dockerfile: add mesa-vulkan-drivers (provides lavapipe). Drop NVIDIA
  env vars. Leave VK_ICD_FILENAMES to the caller so the image stays
  reusable for a future real-GPU test job.
- plugin-linux-webgpu-stage.yml: switch test job to the CPU pool, drop
  --gpus all, and set VK_ICD_FILENAMES / VK_DRIVER_FILES to the lavapipe
  ICD on the docker run command line. Trim diagnostics.
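Pointing the Vulkan loader at lavapipe can be sketched as below. The manifest path is the usual Mesa location but may differ per distro or package version.

```shell
# Point the Vulkan loader at the Mesa lavapipe ICD manifest (assumed path).
export VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/lvp_icd.x86_64.json
# Newer Vulkan loaders read VK_DRIVER_FILES instead; set both for compatibility.
export VK_DRIVER_FILES="$VK_ICD_FILENAMES"
```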
Move the Python package test jobs out of the packaging pipeline
(plugin-webgpu-pipeline.yml) into a new resource-triggered pipeline
(plugin-webgpu-test-pipeline.yml), mirroring the
py-packaging-pipeline / py-package-test-pipeline split.

The test pipeline consumes artifacts from the packaging pipeline run
that triggered it (or a run selected at queue time), so the test side
(Dockerfile, Vulkan setup, test script) can be iterated on without
rebuilding Dawn/WebGPU from source.

- New stages/plugin-{linux,win,mac}-webgpu-test-stage.yml with the
  test jobs, downloading the wheel artifact from the 'build' pipeline
  resource.
- Corresponding test jobs removed from
  stages/plugin-{linux,win,mac}-webgpu-stage.yml.
- New top-level plugin-webgpu-test-pipeline.yml wires the platform
  test stages together and declares the packaging pipeline as a
  resource trigger.
Mirrors the structure of plugin-webgpu-pipeline.yml by extending v1/1ES.Official.PipelineTemplate.yml@1esPipelines. Sets sdl.sourceAnalysisPool explicitly since there is no top-level pool; stage templates pin their own pools. Omits codeSignValidation since this pipeline does not produce or publish binaries.
The AlmaLinux 8 based test image ships an old Mesa lavapipe that returns
VK_ERROR_INCOMPATIBLE_DRIVER when Dawn requests a Vulkan 1.3 instance,
causing the WebGPU plugin EP Python test to fail with 'Found no drivers!'
on hosted CI agents.

Build SwiftShader (Google's software Vulkan ICD, used by Dawn for headless
CI) from source in a multi-stage Dockerfile and install it to
/opt/swiftshader. Pin to the commit SHA referenced by Dawn's DEPS.

Update the test stage to point VK_ICD_FILENAMES / VK_DRIVER_FILES at the
SwiftShader ICD instead of lavapipe.

Verified locally: vulkaninfo --summary reports SwiftShader Device
(DRIVER_ID_GOOGLE_SWIFTSHADER) with Vulkan 1.3.
Replace the hardcoded plugin-ep-webgpu/VERSION_NUMBER path in set-plugin-build-variables-step.yml with a required version_file parameter, threaded from the top-level pipeline (epVersionFile variable) through the packaging stage down to each platform stage.
Contributor Author:
The changes in this file are a version of the fix from #28081. Revert once that PR is merged.

Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Fixed
@github-actions bot left a comment:

You can commit the suggested changes from lintrunner.

Comment thread plugin-ep-webgpu/python/build_wheel.py Outdated
Comment thread plugin-ep-webgpu/python/build_wheel.py Outdated
Comment thread plugin-ep-webgpu/python/build_wheel.py Outdated
Comment thread plugin-ep-webgpu/python/onnxruntime_ep_webgpu/__init__.py Outdated
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Outdated
Comment thread plugin-ep-webgpu/python/build_wheel.py Fixed
Comment thread plugin-ep-webgpu/python/onnxruntime_ep_webgpu/__init__.py Fixed
Comment thread plugin-ep-webgpu/python/onnxruntime_ep_webgpu/__init__.py Fixed
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Fixed
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Fixed
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Fixed
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Fixed
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Fixed
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Fixed
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Fixed
Comment thread plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py Dismissed
2 participants