Python API for HOST_ACCESSIBLE OrtValue allocation #28038
Conversation
Adds a memory_info= parameter to OrtValue.ortvalue_from_shape_and_type(), backed by two new C-level factory methods that look up the registered shared allocator via the full OrtMemoryInfo (including mem_type). This is required because the current shared-allocator query does not include the memory type, which makes HOST_ACCESSIBLE allocators invisible to Python. UsesCpuMemory() is used in GetPyObjFromTensor so that tensors in HOST_ACCESSIBLE memory are returned as zero-copy numpy views.
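The routing the description implies can be sketched as plain Python. This is a hypothetical mock, not the PR's actual binding code: the factory names (_from_memory_info, _from_device) are invented stand-ins for the two new pybind entry points; only the dispatch rule (memory_info takes precedence and bypasses device_type/device_id) is taken from the description above.

```python
class _MockFactory:
    """Stand-in for a pybind-level OrtValue factory (name hypothetical)."""

    def __init__(self, name):
        self.name = name

    def __call__(self, shape, element_type, *args):
        # Return a record of which factory handled the call, for illustration.
        return {"factory": self.name, "shape": shape, "element_type": element_type}


# Assumed C-level entry points; the real binding names may differ.
_from_memory_info = _MockFactory("memory_info")
_from_device = _MockFactory("device")


def ortvalue_from_shape_and_type(shape, element_type, device_type="cpu",
                                 device_id=0, memory_info=None):
    # When memory_info is given, route through the full OrtMemoryInfo lookup
    # (including mem_type) so HOST_ACCESSIBLE shared allocators can be found;
    # device_type/device_id are ignored in that case, per the description.
    if memory_info is not None:
        return _from_memory_info(shape, element_type, memory_info)
    return _from_device(shape, element_type, device_type, device_id)
```

The point of the precedence rule is that a full OrtMemoryInfo already pins down device, id, and mem_type, so the scalar parameters would be redundant or contradictory.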
:param memory_info: An OrtMemoryInfo from an OrtEpDevice (e.g. via ep_device.memory_info(OrtDeviceMemoryType.HOST_ACCESSIBLE)). When provided, the allocator matching this memory info is used directly, which allows allocating HOST_ACCESSIBLE memory for zero-copy numpy interop. The device_type, device_id, and vendor_id parameters are ignored when memory_info is provided.
"""

if memory_info is not None:
No Python test exercises the new memory_info= parameter or verifies that HOST_ACCESSIBLE OrtValues produce zero-copy numpy views.
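A test for the zero-copy property could be built around numpy's shares_memory check. Since exercising the memory_info= path requires a plugin EP with a HOST_ACCESSIBLE allocator, the snippet below only demonstrates the verification pattern on plain numpy arrays; the OrtValue construction itself is omitted and would replace the src/view setup in a real test.

```python
import numpy as np

# Verification pattern: a zero-copy view shares its buffer with the source,
# while a copy does not. A real test would compare the numpy array returned
# for a HOST_ACCESSIBLE OrtValue against the buffer it was created from.
src = np.arange(6, dtype=np.float32).reshape(2, 3)

view = src.reshape(3, 2)   # zero-copy: same underlying buffer
copy = src.copy()          # independent buffer

assert np.shares_memory(src, view)
assert not np.shares_memory(src, copy)

# Mutations through a view are visible in the source (zero-copy evidence).
view[0, 0] = 42.0
assert src[0, 0] == 42.0
```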
Pull request overview
Adds Python-level support for allocating OrtValue tensors using an explicit OrtMemoryInfo (including mem_type) so plugin EP HOST_ACCESSIBLE shared allocators can be selected, enabling zero-copy numpy interop for those tensors.
Changes:
- Update tensor-to-numpy conversion to treat HOST_ACCESSIBLE tensors as CPU-memory-compatible via OrtDevice::UsesCpuMemory().
- Add new pybind factory methods to allocate OrtValue from shape/type using a full OrtMemoryInfo lookup.
- Extend the OrtValue.ortvalue_from_shape_and_type() Python API with an optional memory_info= parameter to route allocations through those new factories.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| onnxruntime/python/onnxruntime_pybind_state.cc | Enables zero-copy numpy views for HOST_ACCESSIBLE tensors via UsesCpuMemory(). |
| onnxruntime/python/onnxruntime_pybind_ortvalue.cc | Adds OrtMemoryInfo-based OrtValue allocation factories using shared allocator lookup. |
| onnxruntime/python/onnxruntime_inference_collection.py | Exposes memory_info= on OrtValue.ortvalue_from_shape_and_type() and dispatches to new C++ factories. |
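The onnxruntime_pybind_state.cc change above can be illustrated with a small Python sketch. This is an assumption-laden mock: the real check is the C++ OrtDevice::UsesCpuMemory() method, and the MemType enum values here are illustrative, not onnxruntime's actual enum.

```python
from enum import Enum


class MemType(Enum):
    """Illustrative memory types; not onnxruntime's real enum."""
    DEFAULT_CPU = 0
    DEFAULT_DEVICE = 1
    HOST_ACCESSIBLE = 2


def uses_cpu_memory(mem_type: MemType) -> bool:
    # Mirrors the intent of OrtDevice::UsesCpuMemory(): CPU memory and
    # HOST_ACCESSIBLE device memory can both back a zero-copy numpy view.
    return mem_type in (MemType.DEFAULT_CPU, MemType.HOST_ACCESSIBLE)


def to_numpy_strategy(mem_type: MemType) -> str:
    # GetPyObjFromTensor-style decision: view when CPU-addressable, else copy.
    return "zero-copy view" if uses_cpu_memory(mem_type) else "copy to host"
```

The design point is that HOST_ACCESSIBLE memory is device-owned but CPU-addressable, so it can take the same zero-copy path as ordinary CPU tensors instead of the device-to-host copy path.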
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.
Should we wait for #28037 to merge?
Motivation and Context
Enable zero-copy interop between numpy and OrtValue.
This is a follow-up to #28037.