Add standalone onnxruntime-ep-webgpu Python package that bundles the WebGPU plugin EP native binary (plus DXC dependencies on Windows). The package provides get_library_path() and get_ep_name() helpers for registering the EP with ONNX Runtime.

New files in plugin-ep-webgpu/: VERSION_NUMBER, pyproject.toml, setup.py, __init__.py, build_wheel.py (handles binary copying, version stamping, auditwheel repair on Linux, and wheel verification), requirements-build-wheel.txt, and a smoke test that validates import, EP registration, and inference.

Pipeline changes: added Python_Package (CPU) and Python_Test (GPU) jobs to each platform stage (Windows, Linux, macOS), and added a PluginPythonPackageVersion (PEP 440) output to set-plugin-build-variables-step.yml, sourced from plugin-ep-webgpu/VERSION_NUMBER.
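For reference, a minimal sketch of how a consumer would use those helpers — assuming ONNX Runtime's plugin EP registration entry point onnxruntime.register_execution_provider_library (available in recent ORT releases; the exact session wiring varies by ORT version, so check the matching docs):

```python
# Sketch: register the bundled WebGPU plugin EP and run with it.
# Assumes onnxruntime.register_execution_provider_library (plugin EP API
# in recent ORT releases) and that the registered EP name can be passed
# in the providers list; "model.onnx" is a hypothetical model path.
import onnxruntime as ort
import onnxruntime_ep_webgpu

ort.register_execution_provider_library(
    onnxruntime_ep_webgpu.get_ep_name(),
    onnxruntime_ep_webgpu.get_library_path(),
)

session = ort.InferenceSession(
    "model.onnx",
    providers=[onnxruntime_ep_webgpu.get_ep_name()],
)
```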
Move user-facing README (installation, usage) into onnxruntime_ep_webgpu/ so it is bundled in the wheel and shown on PyPI. Add developer-facing README in plugin-ep-webgpu/python/ with build and test instructions.
The package has no CPython extension modules, only pre-built native libraries, so a single wheel works across all Python versions. Override bdist_wheel.get_tag() to produce py3-none-{platform} instead of cp3XX-cp3XX-{platform}.
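A minimal sketch of that override, using the standard bdist_wheel customization idiom (class name illustrative, not the exact setup.py contents):

```python
from setuptools import setup

try:
    # setuptools >= 70.1 vendors bdist_wheel
    from setuptools.command.bdist_wheel import bdist_wheel
except ImportError:
    from wheel.bdist_wheel import bdist_wheel


class bdist_wheel_py3_none(bdist_wheel):
    def finalize_options(self):
        super().finalize_options()
        # Not pure Python: keep the platform tag so the wheel only
        # installs on the OS/arch the native library was built for.
        self.root_is_pure = False

    def get_tag(self):
        # Drop the CPython-specific interpreter/ABI tags, keep the
        # platform tag: cp3XX-cp3XX-{platform} -> py3-none-{platform}.
        _, _, plat = super().get_tag()
        return "py3", "none", plat


setup(cmdclass={"bdist_wheel": bdist_wheel_py3_none})
```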
…version stamp
- Build wheel in a temporary directory instead of mutating the source tree
- Copy only the files needed (pyproject.toml, setup.py, onnxruntime_ep_webgpu/) instead of using an exclude list
- Change version placeholder to VERSION_PLACEHOLDER and fail hard if it is not found (see the sketch below)
- Disable CPU EP fallback in the test to ensure the WebGPU EP runs the model
- Simplify docstring and README descriptions
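The fail-hard stamping amounts to something like this (function and file names illustrative):

```python
from pathlib import Path

PLACEHOLDER = "VERSION_PLACEHOLDER"


def stamp_version(path: Path, version: str) -> None:
    """Replace the version placeholder, failing hard if it is absent."""
    text = path.read_text()
    if PLACEHOLDER not in text:
        raise RuntimeError(f"{PLACEHOLDER} not found in {path}")
    path.write_text(text.replace(PLACEHOLDER, version))
```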
- macOS/Windows: add setup-build-tools.yml to the Python Package and Test jobs
- Linux: run Python packaging and testing inside Docker for manylinux compatibility and auditwheel support
- Windows: skip the Python package/test jobs for arm64 (cross-compiled; can't run on x64 agents)
- Linux: add a gpu_machine_pool parameter for the test job pool
Print environment info (Python version, platform, ORT version, relevant env vars), package directory contents, library file size, device enumeration details, session providers, and full tracebacks on failure.
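Roughly the shape of that preamble (a sketch, not the actual test script; the env var list is illustrative):

```python
import os
import platform
import sys
from pathlib import Path

import onnxruntime as ort
import onnxruntime_ep_webgpu

print(f"Python:      {sys.version}")
print(f"Platform:    {platform.platform()}")
print(f"onnxruntime: {ort.__version__}")
# Dump env vars relevant to GPU/Vulkan setup (illustrative subset).
for var in ("VK_ICD_FILENAMES", "VK_DRIVER_FILES", "NVIDIA_DRIVER_CAPABILITIES"):
    print(f"{var}={os.environ.get(var)}")

pkg_dir = Path(onnxruntime_ep_webgpu.__file__).parent
print(f"Package dir contents: {sorted(p.name for p in pkg_dir.iterdir())}")

lib = Path(onnxruntime_ep_webgpu.get_library_path())
print(f"EP library: {lib} ({lib.stat().st_size} bytes)")
```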
Apple Silicon requires all executable code to be signed. Without this, dlopen triggers a SIGBUS (bus error) when loading the unsigned dylib.
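For local builds, an ad-hoc signature is sufficient to satisfy this; release builds go through ESRP as described next. A hedged sketch of the local workaround:

```python
import subprocess
import sys


def adhoc_sign(dylib_path: str) -> None:
    """Ad-hoc sign a dylib on macOS so dlopen succeeds on Apple Silicon.

    '--sign -' uses the ad-hoc identity; this is a local-development
    workaround only, not the ESRP signing path used in CI.
    """
    if sys.platform == "darwin":
        subprocess.run(
            ["codesign", "--force", "--sign", "-", dylib_path], check=True
        )
```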
ESRP requires .zip or .dmg input. Zip the dylib before signing, then unzip the signed result and verify.
The Docker image does not have pip pre-installed. Use ensurepip to bootstrap it before installing wheel build dependencies.
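The bootstrap, expressed in Python for consistency with the other sketches (the pipeline step itself runs the shell equivalent, python3 -m ensurepip --upgrade):

```python
# Bootstrap pip inside the container, then install wheel build deps.
import subprocess
import sys

subprocess.run([sys.executable, "-m", "ensurepip", "--upgrade"], check=True)
subprocess.run(
    [sys.executable, "-m", "pip", "install", "-r", "requirements-build-wheel.txt"],
    check=True,
)
```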
Use python -u in all three platform pipelines so prints are flushed immediately, even if the process crashes during native DLL load. Add ort.set_default_logger_severity(0) in the test script for verbose ORT logging.
Remove reinterpret_cast of OrtKernelInfo* to the internal OpKernelInfo*, which breaks ABI across DLL boundaries (vtable mismatch between the plugin EP and ORT core).
- KernelInfoCache: use Ort::ConstKernelInfo::GetEp() instead of casting to OpKernelInfo* and calling GetExecutionProvider()->GetOrtEp()
- GetAllocator: use the C API KernelInfoGetAllocator + IAllocatorImplWrappingOrtAllocator instead of casting to OpKernelInfo*
- Remove #include "core/framework/op_kernel_info.h" (no longer needed)
- Add #include "core/session/allocator_adapters.h" for IAllocatorImplWrappingOrtAllocator
…p_adapter_cast_issue
…ast_issue' into edgchen1/webgpu_packaging_python_fix
Materialize the glob generator into a list so the emptiness check works (a generator object is always truthy, so the check could never fire), and delete each raw wheel after auditwheel repair so only the manylinux wheel remains. A sketch of the fixed flow is below.
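Sketch of the repair loop (directory name illustrative):

```python
import subprocess
from pathlib import Path

dist_dir = Path("dist")  # illustrative wheel output directory

# Materialize: Path.glob returns a generator, which is always truthy,
# so 'if not wheels' would never fire without the list().
wheels = list(dist_dir.glob("*.whl"))
if not wheels:
    raise RuntimeError("no wheels found to repair")

for wheel in wheels:
    subprocess.run(
        ["auditwheel", "repair", str(wheel), "--wheel-dir", str(dist_dir)],
        check=True,
    )
    wheel.unlink()  # drop the raw wheel; keep only the manylinux wheel
```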
Add the missing set-nightly-build-option-variable-step.yml template to all three platform Python_Package jobs for consistency with the Build jobs.
The WebGPU EP has no CUDA dependency (Dawn uses Vulkan on Linux), so having the plugin-linux-webgpu-stage.yml pipeline reuse the CUDA inference Dockerfile pulled in TensorRT/cuDNN unnecessarily and was missing libvulkan.so.1, causing the test job to fail with:

Couldn't load Vulkan: libvulkan.so.1: cannot open shared object file

Add a new Dockerfile under inference/x86_64/python/webgpu/ modeled on the CPU Dockerfile, based on the CPU build-cache image, with an additional 'dnf install vulkan-loader' step so Dawn can reach the GPU's Vulkan ICD (injected by the NVIDIA Container Toolkit via --gpus all at runtime). Update all three jobs (build, package, test) in plugin-linux-webgpu-stage.yml to use the new Dockerfile and switch the docker_base_image default to the CPU base image.
The WebGPU plugin's Python test job was failing with:

setup_loader_term_phys_devs: Failed to detect any valid GPUs in the current config

The NVIDIA Container Toolkit only injects the CUDA portions of the driver by default (capabilities = utility,compute). Vulkan additionally requires the 'graphics' capability, which injects libGLX_nvidia, the NVIDIA Vulkan ICD JSON, and associated userspace libs.
- Set NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics and NVIDIA_VISIBLE_DEVICES=all in the WebGPU plugin image so every 'docker run --gpus all' gets a working Vulkan ICD.
- Install vulkan-tools in the image for diagnostic purposes (to be removed once the test is stable).
- Add a temporary diagnostics block to the Python test job that dumps the relevant NVIDIA/Vulkan state (env vars, nvidia-smi, ICD search paths, libnvidia/libGLX_nvidia locations, vulkaninfo --summary) before running the actual test.
The NVIDIA Container Toolkit on the GPU CI pool injects NVIDIA libraries but not the Vulkan ICD JSON, so Dawn fails with 'Failed to detect any valid GPUs'. Switch the plugin Python test job to Mesa lavapipe (a CPU Vulkan implementation) instead. This unblocks CI and lets the job run on the standard CPU pool.
- Dockerfile: add mesa-vulkan-drivers (provides lavapipe). Drop the NVIDIA env vars. Leave VK_ICD_FILENAMES to the caller so the image stays reusable for a future real-GPU test job.
- plugin-linux-webgpu-stage.yml: switch the test job to the CPU pool, drop --gpus all, and set VK_ICD_FILENAMES / VK_DRIVER_FILES to the lavapipe ICD on the docker run command line. Trim diagnostics.
Move the Python package test jobs out of the packaging pipeline
(plugin-webgpu-pipeline.yml) into a new resource-triggered pipeline
(plugin-webgpu-test-pipeline.yml), mirroring the
py-packaging-pipeline / py-package-test-pipeline split.
The test pipeline consumes artifacts from the packaging pipeline run
that triggered it (or a run selected at queue time), so the test side
(Dockerfile, Vulkan setup, test script) can be iterated on without
rebuilding Dawn/WebGPU from source.
- New stages/plugin-{linux,win,mac}-webgpu-test-stage.yml with the
test jobs, downloading the wheel artifact from the 'build' pipeline
resource.
- Corresponding test jobs removed from
stages/plugin-{linux,win,mac}-webgpu-stage.yml.
- New top-level plugin-webgpu-test-pipeline.yml wires the platform
test stages together and declares the packaging pipeline as a
resource trigger.
Mirrors the structure of plugin-webgpu-pipeline.yml by extending v1/1ES.Official.PipelineTemplate.yml@1esPipelines. Sets sdl.sourceAnalysisPool explicitly since there is no top-level pool; stage templates pin their own pools. Omits codeSignValidation since this pipeline does not produce or publish binaries.
The AlmaLinux 8 based test image ships an old Mesa lavapipe that returns VK_ERROR_INCOMPATIBLE_DRIVER when Dawn requests a Vulkan 1.3 instance, causing the WebGPU plugin EP Python test to fail with 'Found no drivers!' on hosted CI agents.

Build SwiftShader (Google's software Vulkan ICD, used by Dawn for headless CI) from source in a multi-stage Dockerfile and install it to /opt/swiftshader, pinned to the commit SHA referenced by Dawn's DEPS. Update the test stage to point VK_ICD_FILENAMES / VK_DRIVER_FILES at the SwiftShader ICD instead of lavapipe.

Verified locally: vulkaninfo --summary reports SwiftShader Device (DRIVER_ID_GOOGLE_SWIFTSHADER) with Vulkan 1.3.
This reverts commit 1c47acf.
Replace the hardcoded plugin-ep-webgpu/VERSION_NUMBER path in set-plugin-build-variables-step.yml with a required version_file parameter, threaded from the top-level pipeline (epVersionFile variable) through the packaging stage down to each platform stage.
…ild-variables-step.yml
edgchen1 (Author) commented on Apr 24, 2026:
| #include "core/common/narrow.h" | ||
| #include "core/common/status.h" | ||
| #include "core/framework/config_options.h" | ||
| #include "core/framework/op_kernel_info.h" |
The changes in this file are a version of the fix from #28081; revert once that PR is merged.
Description
Motivation and Context
Put together the WebGPU plugin EP Python package.