Python API for HOST_ACCESSIBLE OrtValue allocation #28038

Open: ericcraw wants to merge 4 commits into microsoft:main from ericcraw:python-host-accessible-api

Contributor

@ericcraw ericcraw commented Apr 10, 2026

Description

Adds a memory_info= parameter to OrtValue.ortvalue_from_shape_and_type(), backed by two new C-level factory methods that look up the registered shared allocator via the full OrtMemoryInfo (including mem_type).

This is required because the current shared allocator query doesn't include the memory type, which makes HOST_ACCESSIBLE allocators invisible to Python. UsesCpuMemory() is used in GetPyObjFromTensor so that tensors in HOST_ACCESSIBLE memory are returned as zero-copy numpy views.
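To illustrate why the lookup key matters, here is a minimal toy model in pure Python. It is not ONNX Runtime's implementation: `MemType`, `MemoryInfo`, `Registry`, and both lookup methods are hypothetical names invented for this sketch. It only shows that a query keyed on device fields alone cannot distinguish two allocators registered for the same device, while a query on the full memory info (including `mem_type`) can.

```python
from dataclasses import dataclass
from enum import Enum


class MemType(Enum):
    # Hypothetical stand-ins for the memory types discussed in the PR.
    DEFAULT = "default"
    HOST_ACCESSIBLE = "host_accessible"


@dataclass(frozen=True)
class MemoryInfo:
    # Hypothetical, simplified model of an OrtMemoryInfo-like record.
    device_type: str
    device_id: int
    vendor_id: int
    mem_type: MemType


class Registry:
    """Toy shared-allocator registry (assumption: first match wins)."""

    def __init__(self):
        self._allocators = []  # list of (MemoryInfo, allocator name)

    def register(self, info, name):
        self._allocators.append((info, name))

    def find_by_device(self, device_type, device_id, vendor_id):
        # Query that ignores mem_type: it returns the first allocator
        # registered for the device, whatever its memory type is.
        for info, name in self._allocators:
            key = (info.device_type, info.device_id, info.vendor_id)
            if key == (device_type, device_id, vendor_id):
                return name
        return None

    def find_by_memory_info(self, wanted):
        # Query on the full record, so mem_type takes part in the match.
        for info, name in self._allocators:
            if info == wanted:
                return name
        return None


registry = Registry()
default_info = MemoryInfo("gpu", 0, 0x1234, MemType.DEFAULT)
host_info = MemoryInfo("gpu", 0, 0x1234, MemType.HOST_ACCESSIBLE)
registry.register(default_info, "device-local allocator")
registry.register(host_info, "host-accessible allocator")

by_device = registry.find_by_device("gpu", 0, 0x1234)
by_info = registry.find_by_memory_info(host_info)
```

In this model `by_device` can only ever name the device-local allocator, whereas `by_info` reaches the host-accessible one, which mirrors the gap the description refers to.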

Motivation and Context

Enable zero-copy interop between NumPy and OrtValue.

This is a follow-up to #28037.

:param memory_info: An OrtMemoryInfo from an OrtEpDevice (e.g. via ep_device.memory_info(OrtDeviceMemoryType.HOST_ACCESSIBLE)). When provided, the allocator matching this memory info is used directly, which allows allocating HOST_ACCESSIBLE memory for zero-copy numpy interop. The device_type, device_id, and vendor_id parameters are ignored when memory_info is provided.
"""

if memory_info is not None:
Member

if memory_info is not None:

When memory_info is not None, the other device parameters are silently ignored. The docstring documents this. This is acceptable, but a warnings.warn() or a check that the caller didn't set both memory_info and non-default device params would be more user-friendly.

Contributor Author

Added a warning.
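For readers following this thread, the kind of check suggested above could look roughly like the sketch below. It is not the code added in this PR: `choose_allocation_path` is a hypothetical helper, and the default values (`"cpu"`, `0`, `-1`) are assumptions made for illustration. It warns only when memory_info is passed together with a device parameter that differs from its default.

```python
import warnings

# Assumed defaults for the device parameters (illustrative only).
_DEVICE_DEFAULTS = {"device_type": "cpu", "device_id": 0, "vendor_id": -1}


def choose_allocation_path(device_type="cpu", device_id=0, vendor_id=-1, memory_info=None):
    """Hypothetical helper: decide which allocation path a call would take."""
    if memory_info is None:
        # No memory_info: fall back to the device-based path.
        return ("device", device_type, device_id, vendor_id)

    passed = {"device_type": device_type, "device_id": device_id, "vendor_id": vendor_id}
    # Collect device parameters the caller changed from their defaults.
    ignored = [name for name, value in passed.items() if value != _DEVICE_DEFAULTS[name]]
    if ignored:
        warnings.warn(
            f"memory_info was provided; ignoring {', '.join(ignored)}.",
            UserWarning,
            stacklevel=2,
        )
    return ("memory_info", memory_info)
```

One limitation of comparing against defaults is that a caller who explicitly passes a default value (for example `device_id=0`) is indistinguishable from one who passed nothing, so no warning is raised in that case.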

@yuslepukhin
Member

There is no Python test that exercises the new memory_info= parameter or verifies that HOST_ACCESSIBLE OrtValues produce zero-copy numpy views.
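As a rough idea of what the zero-copy half of such a test could assert, the sketch below uses only NumPy. It is an assumption-laden stand-in: allocating a real HOST_ACCESSIBLE OrtValue needs a plugin EP that registers such an allocator, so a plain NumPy array plays the role of the backing buffer here, and `is_zero_copy_view` is a hypothetical helper name.

```python
import numpy as np


def is_zero_copy_view(candidate, owner):
    """Return True if the two arrays overlap in memory (no copy was made)."""
    return np.shares_memory(candidate, owner)


# Stand-in for the buffer owned by a host-accessible tensor (assumption).
backing = np.zeros((2, 3), dtype=np.float32)

view = backing.view()   # shares the same memory as `backing`
copy = backing.copy()   # owns a separate allocation

# A write through a zero-copy view must be visible in the backing buffer.
view[0, 0] = 42.0
```

A real test would replace `backing` and `view` with the OrtValue's storage and the array returned from it, then apply the same two checks: memory is shared, and writes through the view are visible to the owner.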

Contributor

Copilot AI left a comment

Pull request overview

Adds Python-level support for allocating OrtValue tensors using an explicit OrtMemoryInfo (including mem_type) so plugin EP HOST_ACCESSIBLE shared allocators can be selected, enabling zero-copy numpy interop for those tensors.

Changes:

  • Update tensor-to-numpy conversion to treat HOST_ACCESSIBLE tensors as CPU-memory-compatible via OrtDevice::UsesCpuMemory().
  • Add new pybind factory methods to allocate OrtValue from shape/type using a full OrtMemoryInfo lookup.
  • Extend OrtValue.ortvalue_from_shape_and_type() Python API with an optional memory_info= parameter to route allocations through those new factories.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| onnxruntime/python/onnxruntime_pybind_state.cc | Enables zero-copy numpy views for HOST_ACCESSIBLE tensors via UsesCpuMemory(). |
| onnxruntime/python/onnxruntime_pybind_ortvalue.cc | Adds OrtMemoryInfo-based OrtValue allocation factories using shared allocator lookup. |
| onnxruntime/python/onnxruntime_inference_collection.py | Exposes memory_info= on OrtValue.ortvalue_from_shape_and_type() and dispatches to new C++ factories. |

Comment thread onnxruntime/python/onnxruntime_pybind_ortvalue.cc
Comment thread onnxruntime/python/onnxruntime_inference_collection.py
Comment thread onnxruntime/python/onnxruntime_pybind_state.cc Outdated
@ericcraw ericcraw marked this pull request as ready for review May 4, 2026 16:29
@ericcraw ericcraw requested review from Copilot and yuslepukhin May 4, 2026 16:29
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.


@yuslepukhin
Member

Should we wait for #28037 to merge?

@ericcraw
Contributor Author

ericcraw commented May 6, 2026

I think this can go in parallel with #28037. We need a way to allocate an OrtValue with a host-accessible device allocator and create numpy views into it. We just may not avoid all copies until #28037 merges.
