Skip to content

Conversation

@gmarkall
Copy link
Contributor

Public APIs were modified in #536, creating knock-on effects for users of the External Memory Management plugin interface - e.g. Arrow and RMM. See e.g. https://github.com/apache/arrow/pull/48259/files#diff-d57b920879b21c3290244b0fe8cfee56c2581fe9b8e9d7792fead53801d51383

This PR restores the public APIs and replaces the usage of the new APIs with no context parameter with internal APIs prefixed by an underscore.

cc @pitrou

@copy-pr-bot
Copy link

copy-pr-bot bot commented Nov 26, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@gmarkall
Copy link
Contributor Author

/ok to test

@pitrou
Copy link

pitrou commented Nov 26, 2025

Note that there is still the context attribute that's been removed from those classes:
apache/arrow#48265

PyArrow needs an actual CUDA context to wrap the Numba-originating pointer:
https://github.com/apache/arrow/blob/60b976bbb7bae76bd59ae37f53bf017c995e6884/python/pyarrow/_cuda.pyx#L453-L456

@gmarkall gmarkall added the 3 - Ready for Review Ready for review by team label Nov 26, 2025
@gmarkall
Copy link
Contributor Author

PyArrow needs an actual CUDA context to wrap the Numba-originating pointer

Thanks for pointing to this - I will consider whether it's better to just revert #536 instead and hold onto the context again.

@pitrou
Copy link

pitrou commented Nov 26, 2025

Or perhaps there is a way for PyArrow to get a context from Numba that's suitable for the pointer?

@gmarkall
Copy link
Contributor Author

I can't see a way to get an appropriate context given just the pointer.

@kkraus14
Copy link
Contributor

kkraus14 commented Nov 26, 2025

You can use cuPointerGetAttribute (https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__UNIFIED.html#group__CUDA__UNIFIED_1g0c28ed0aff848042bc0533110e45820c) to get the Context from a given pointer, but the call is somewhat expensive IIRC

@gmarkall
Copy link
Contributor Author

Given that even with this PR, the API is still changed (as memory pointers won't be holding a context) and changes in Arrow would still be needed, with perhaps non-trivial impact, it may be better to revert the commit that caused this. I have a PR that does this in #611.

@pitrou
Copy link

pitrou commented Nov 27, 2025

I'm curious about the performance reasons: is it because of weakref.proxy? I can't see how storing an additional attribute would be so costly otherwise?

@gmarkall
Copy link
Contributor Author

It's because every time we handle an input to a kernel that is not a "native" Numba DeviceNDArray, the conversion from the CUDA Array Interface requires a CUDA context (the require_context decorator). This is because creating the MemoryPointer requires the context: https://github.com/NVIDIA/numba-cuda/pull/611/files#diff-4a44bf244e6d5f2531d1adc89386bed1911b1781190177a2c3a9b10e543e94a8R52. This introduces a noticeable per-argument overhead.

I suspect that we could avoid the overhead whilst still keeping the context attribute of MemoryPointer, but the use of from_cuda_array_interface would need to be audited to ensure there aren't other uses of it where it would produce a MemoryPointer that is expected to have a valid context. I'm leaning towards pushing this work further down the line and reverting the whole change as in #611, just because the current release is breaking the API and the simple way to make it right again is to revert the whole change, especially as the performance improvement is one that users would not have observed yet.

@gmarkall
Copy link
Contributor Author

Closing as #611 has gone in instead.

@gmarkall gmarkall closed this Nov 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3 - Ready for Review Ready for review by team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants