* Use NVRTC for supported CCs
Changes to the compute capability support matrix in nvvm.py would continue to be needed for every new CUDA version if we kept maintaining a list of explicitly-supported compute capabilities. NVRTC can report the supported list programmatically, so we switch to using it instead.
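As a rough illustration of the approach (not Numba's actual binding code), NVRTC exposes `nvrtcGetNumSupportedArchs()` and `nvrtcGetSupportedArchs()`, which can be queried along these lines via ctypes; the library name and error handling below are simplified assumptions:

```python
import ctypes

# Assumed library name on Linux; real code resolves this per platform.
nvrtc = ctypes.CDLL("libnvrtc.so")

def get_supported_ccs():
    # NVRTC reports architectures as integers, e.g. 75 for compute capability 7.5.
    num = ctypes.c_int()
    if nvrtc.nvrtcGetNumSupportedArchs(ctypes.byref(num)) != 0:
        raise RuntimeError("nvrtcGetNumSupportedArchs failed")
    archs = (ctypes.c_int * num.value)()
    if nvrtc.nvrtcGetSupportedArchs(archs) != 0:
        raise RuntimeError("nvrtcGetSupportedArchs failed")
    return [divmod(arch, 10) for arch in archs]  # e.g. 75 -> (7, 5)

print(get_supported_ccs())  # e.g. [(5, 0), (5, 2), ..., (9, 0)]
```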
This does assume that the user's environment has a consistent set of
components (NVVM, NVRTC, etc.) - this is generally expected to be the
case with recent developments in package management, and there's little
we can do about an inconsistent environment anyway.
Changes outside of nvvm.py / nvrtc.py are to accommodate the movement of
this functionality. A major side effect is that we no longer need to
initialize the list of supported CCs prior to forking, because we don't
need to use the CUDA runtime to populate the supported CC list.
* Use NVRTC to get runtime version
We only used the CUDA runtime library to get the runtime version so that we could populate the list of supported compute capabilities in nvvm.py. Now that we no longer do this, and NVRTC provides the CUDA toolkit version, there is no need to use the CUDA runtime API at all.
The Numba API for the runtime version is not deleted in case it was used
by external code - instead, it uses NVRTC to obtain the toolkit version.
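A minimal sketch of obtaining the toolkit version from NVRTC via `nvrtcVersion()`, assuming a `libnvrtc` loaded through ctypes as above (the library name is again a simplified assumption):

```python
import ctypes

nvrtc = ctypes.CDLL("libnvrtc.so")  # assumed library name on Linux

def get_toolkit_version():
    # nvrtcVersion fills in the major and minor version of the NVRTC library,
    # which tracks the CUDA toolkit version it shipped with.
    major, minor = ctypes.c_int(), ctypes.c_int()
    if nvrtc.nvrtcVersion(ctypes.byref(major), ctypes.byref(minor)) != 0:
        raise RuntimeError("nvrtcVersion failed")
    return major.value, minor.value

print(get_toolkit_version())  # e.g. (12, 4)
```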
Because the NVRTC binding used the runtime version to determine which prototypes to bind, we need to stop doing that to avoid a circular dependency / deadlock. Instead of checking the runtime version and creating the list of prototypes accordingly, we try to bind all known prototypes and ignore errors for those related to LTOIR, which can occur with CUDA 11, where they were not present.
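A minimal sketch of the "bind everything, tolerate missing symbols" approach; the prototype table below is an illustrative subset, not Numba's actual table:

```python
import ctypes

PROTOTYPES = {
    # name: (restype, argtypes) - an illustrative subset only
    "nvrtcVersion": (ctypes.c_int,
                     [ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int)]),
    # LTOIR-related entries are absent from CUDA 11's NVRTC
    "nvrtcGetLTOIRSize": (ctypes.c_int,
                          [ctypes.c_void_p, ctypes.POINTER(ctypes.c_size_t)]),
}

def bind_all(lib):
    bound = {}
    for name, (restype, argtypes) in PROTOTYPES.items():
        try:
            fn = getattr(lib, name)
        except AttributeError:
            # Symbol missing in this NVRTC (e.g. LTOIR functions on CUDA 11):
            # skip it rather than failing the whole binding step.
            continue
        fn.restype = restype
        fn.argtypes = argtypes
        bound[name] = fn
    return bound
```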
The `runtime.is_supported_version()` API and its test is removed - it
would always have been `False` on CUDA 12 (incorrectly) and this has
never been reported as an issue, so it seems very unlikely that anyone
was using it.
* Update for new cccl search location
Recent toolkits move the CCCL headers into their own subdirectory, so we need to add this subdirectory to the include path so that headers such as `cuda/atomic` can be located successfully in all cases.
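A minimal sketch of the include-path handling, with illustrative paths, assuming newer toolkits place the headers under an extra `cccl` subdirectory of the toolkit include directory:

```python
import os

def include_flags(cuda_include_dir):
    flags = [f"-I{cuda_include_dir}"]
    cccl_dir = os.path.join(cuda_include_dir, "cccl")
    if os.path.isdir(cccl_dir):
        # Newer layout: CCCL headers live under <include>/cccl/cuda/..., so
        # the subdirectory itself must also be on the include path.
        flags.append(f"-I{cccl_dir}")
    return flags

print(include_flags("/usr/local/cuda/include"))
```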
* Handle variants of cuCtxCreate()
The most recent `cuCtxCreate()` API in the CUDA bindings requires an additional, optional parameter. We never need to supply a value for it other than `None`, but we do need to pass the argument on binding versions where it is required.
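A minimal sketch of dispatching between the two signatures; the position of the extra creation-parameters argument is an assumption here, and the installed binding's signature is authoritative:

```python
def create_context(cuCtxCreate, flags, device):
    try:
        # Older bindings: cuCtxCreate(flags, dev)
        return cuCtxCreate(flags, device)
    except TypeError:
        # Newer bindings require an extra, optional creation-parameters
        # argument, for which we only ever pass None.
        return cuCtxCreate(flags, None, device)
```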
* Delete docs on runtime binding
* Default to at least compute capability 7.5
The change to use NVRTC for the supported compute capabilities also had
the implicit effect of making the default compute capability the lowest
supported by the installed NVRTC version. We need it to default to at
least 7.5 (unless specified higher by the user) to maintain the
behaviour of the compute capability logic from nvvm.py that was
replaced.
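A minimal sketch of the clamping logic (the names are illustrative, not Numba's actual configuration):

```python
MINIMUM_DEFAULT_CC = (7, 5)

def default_compute_capability(supported_ccs):
    # Follow the lowest CC that NVRTC reports, but never default below 7.5;
    # an explicit user setting would take precedence over this default.
    return max(min(supported_ccs), MINIMUM_DEFAULT_CC)

# e.g. an NVRTC that supports CCs down to 5.0 still yields a default of 7.5:
print(default_compute_capability([(5, 0), (7, 5), (9, 0)]))  # (7, 5)
```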
* Locate NVRTC DLL by searching possible paths
We use NVRTC to get the CUDA version, so we can no longer use the CUDA version to determine the NVRTC DLL / SO name. Instead, we check for the presence of each known version, preferring the highest.
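A minimal sketch of probing candidate library names, newest first; the candidate names below are illustrative examples for Windows and Linux:

```python
import ctypes

CANDIDATES = [
    "nvrtc64_120_0.dll",  # CUDA 12.x on Windows
    "nvrtc64_112_0.dll",  # CUDA 11.2+ on Windows
    "libnvrtc.so.12",     # CUDA 12.x on Linux
    "libnvrtc.so.11.2",   # CUDA 11.2+ on Linux
]

def open_nvrtc():
    # Try each candidate in descending version order and use the first one
    # that loads successfully.
    for name in CANDIDATES:
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    raise RuntimeError("No suitable NVRTC library found")
```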