Commit a6a6374

isVoid and gmarkall authored
Add Bfloat16 Low++ Bindings (#166)
This PR introduces low++ bfloat16 bindings to Numba-CUDA. They are exposed via the `numba.cuda.cuda_bf16` module. Available intrinsic functions include:

* htrunc
* hceil
* hfloor
* hrint
* hsqrt
* hrsqrt
* hrcp
* hlog
* hlog2
* hlog10
* hcos
* hsin
* hexp
* hexp2
* hexp10

This also vendors bfloat16 headers for CUDA 11 and 12, and float16 headers with separate versions for 11 and 12 (previously only the CUDA 11 float16 headers were vendored).

`StringIO` and `BytesIO` objects can be used in linkable code, in order to defer the generation of the data for a linkable code object until it is actually needed.

A workaround for numba/numba#10047, "fix `IntEnumMember` conversion", is needed. The issue is unrelated to this PR but is triggered by it. This is a temporary workaround until the Numba PR is merged and available in released and supported versions.

---------

Co-authored-by: Graham Markall <[email protected]>
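As a reading aid for the intrinsic names listed in the commit message, the following host-side sketch pairs each name with the math operation it conventionally denotes in the CUDA bfloat16 headers. It is not the binding code from this PR and does not call `numba.cuda.cuda_bf16`; the helper names `round_to_bfloat16`, `REFERENCE`, and `reference` are hypothetical, and the results are only an approximation of what the device intrinsics return (no bit-exactness is implied).

```python
import math
import struct


def round_to_bfloat16(x: float) -> float:
    """Round a Python float to the nearest bfloat16 value (ties to even).

    bfloat16 is the upper 16 bits of an IEEE-754 float32, so the value is
    first converted to float32 and the lower 16 bits are then rounded away.
    """
    if math.isnan(x) or math.isinf(x):
        return x
    try:
        bits = struct.unpack("<I", struct.pack("<f", x))[0]
    except OverflowError:
        # Magnitude too large for float32: treat as infinity of the same sign.
        return math.copysign(math.inf, x)
    # Add a rounding bias so that dropping the low 16 bits rounds to nearest-even.
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Hypothetical reference table: intrinsic name -> plain-Python equivalent.
REFERENCE = {
    "htrunc": lambda x: float(math.trunc(x)),  # round toward zero
    "hceil": lambda x: float(math.ceil(x)),    # round up
    "hfloor": lambda x: float(math.floor(x)),  # round down
    "hrint": lambda x: float(round(x)),        # nearest integer, ties to even
    "hsqrt": math.sqrt,                        # square root
    "hrsqrt": lambda x: 1.0 / math.sqrt(x),    # reciprocal square root
    "hrcp": lambda x: 1.0 / x,                 # reciprocal
    "hlog": math.log,                          # natural logarithm
    "hlog2": math.log2,                        # base-2 logarithm
    "hlog10": math.log10,                      # base-10 logarithm
    "hcos": math.cos,                          # cosine
    "hsin": math.sin,                          # sine
    "hexp": math.exp,                          # e ** x
    "hexp2": lambda x: 2.0 ** x,               # 2 ** x
    "hexp10": lambda x: 10.0 ** x,             # 10 ** x
}


def reference(name: str, x: float) -> float:
    """Apply the named operation to a bfloat16-rounded input and round the result."""
    return round_to_bfloat16(REFERENCE[name](round_to_bfloat16(x)))
```

For example, `reference("hrcp", 4.0)` evaluates the reciprocal on the host and rounds it to bfloat16 precision. Domain errors (such as the logarithm of zero) are left to raise Python exceptions in this sketch, which differs from device behaviour.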
1 parent 9c727b0 commit a6a6374

16 files changed: +30780 -708 lines changed

configs/cuda_bf16.yml

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+Name: Numba Bfloat16
+Version: 0.0.1
+Entry Point: ./numba_cuda/numba/cuda/include/12/cuda_bf16.h
+File List:
+  - ./numba_cuda/numba/cuda/include/12/cuda_bf16.h
+Exclude: {}
+Types:
+  __nv_bfloat16_raw: Number
+  __nv_bfloat16: Number
+  __nv_bfloat162_raw: Type
+  __nv_bfloat162: Type
+  nv_bfloat16: Number
+  nv_bfloat162: Type
+Data Models:
+  __nv_bfloat16_raw: PrimitiveModel
+  __nv_bfloat16: PrimitiveModel
+  nv_bfloat16: PrimitiveModel
+  __nv_bfloat162_raw: StructModel
+  __nv_bfloat162: StructModel
+  nv_bfloat162: StructModel
+Shim Include Override: "\"cuda_bf16.h\""
+Additional Import:
+  - os
+Require Pynvjitlink: False
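In the config above, every scalar type is declared as `Number` and given a `PrimitiveModel`, while every two-element type is declared as `Type` and given a `StructModel`. The sketch below transcribes those two sections into plain Python dicts and checks that pairing. The pairing rule is an observation drawn from this one file, not a documented requirement of the binding generator, and the names `TYPES`, `DATA_MODELS`, `EXPECTED_MODEL`, and `mismatches` are hypothetical.

```python
# Values transcribed from the "Types" section of configs/cuda_bf16.yml.
TYPES = {
    "__nv_bfloat16_raw": "Number",
    "__nv_bfloat16": "Number",
    "__nv_bfloat162_raw": "Type",
    "__nv_bfloat162": "Type",
    "nv_bfloat16": "Number",
    "nv_bfloat162": "Type",
}

# Values transcribed from the "Data Models" section of the same file.
DATA_MODELS = {
    "__nv_bfloat16_raw": "PrimitiveModel",
    "__nv_bfloat16": "PrimitiveModel",
    "nv_bfloat16": "PrimitiveModel",
    "__nv_bfloat162_raw": "StructModel",
    "__nv_bfloat162": "StructModel",
    "nv_bfloat162": "StructModel",
}

# Assumed pairing, inferred only from the entries above.
EXPECTED_MODEL = {"Number": "PrimitiveModel", "Type": "StructModel"}


def mismatches(types, data_models):
    """Return (name, kind, model) for each type whose data model breaks the assumed pairing."""
    problems = []
    for name, kind in types.items():
        model = data_models.get(name)  # None if the type has no data model entry
        if model != EXPECTED_MODEL[kind]:
            problems.append((name, kind, model))
    return problems
```

Calling `mismatches(TYPES, DATA_MODELS)` on the transcribed values reports no problems, which is one way to sanity-check a hand-edited config before regenerating bindings.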
