[BUG] Warp vote operations must use a constant int for the mode #592

@gmarkall

Describe the bug

NVVM IR generation for warp sync intrinsics was fixed by #231 (for Issue #228). That change only fixed the issue for some intrinsics; the warp vote intrinsics, documented at https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html?highlight=data%2520movement#vote, also require the mode parameter to be a constant int.

These are used in the implementation of the following functions in Numba-CUDA:

  • cuda.all_sync()
  • cuda.any_sync()
  • cuda.eq_sync()
  • cuda.ballot_sync()

Steps/Code to reproduce bug

This issue can be observed for all_sync() (others are similar) by running:

from numba import cuda, types

def use_all_sync(ary):
    i = cuda.grid(1)
    # all_sync() lowers to llvm.nvvm.vote.sync with mode 0
    val = cuda.all_sync(0xFFFFFFFF, ary[i])
    ary[i] = val


args = (types.int32[::1],)
ptx, resty = cuda.compile_ptx(use_all_sync, args, cc=(10, 0))

print(ptx)

with

NUMBA_DUMP_LLVM=1 python repro.py

The output will contain something similar to:

  %".5" = trunc i64 4294967295 to i32
  %".6" = trunc i64 0 to i32
  %".7" = icmp ne i32 %"predicate", 0
  %".8" = call {i32, i1} @"llvm.nvvm.vote.sync"(i32 %".5", i32 %".6", i1 %".7")

to set up the parameters for, and then call, llvm.nvvm.vote.sync(). The parameter %.6 is the mode parameter. Instead of being a value produced by a trunc instruction, it should be the constant i32 0. So the generated IR should look something like:

  %".5" = trunc i64 4294967295 to i32
  %".6" = icmp ne i32 %"predicate", 0
  %".7" = call {i32, i1} @"llvm.nvvm.vote.sync"(i32 %".5", i32 0, i1 %".6")

(Note that the values are renumbered because the trunc of the i64 0 value is no longer needed.)

The example may succeed and print the output PTX; however, prior to the fix there is also a chance that it crashes NVVM.

Expected behavior

Constant parameters should be generated for the mode parameter, as seen in the corrected output in the example above.
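
One way to check this expectation against a NUMBA_DUMP_LLVM dump is to scan the IR text for calls to llvm.nvvm.vote.sync and flag any whose mode operand is not an integer literal. The following is only a minimal sketch: the helper name non_constant_vote_modes is hypothetical (it is not part of Numba-CUDA), and it assumes the calls are printed on one line in the same form as the dumps shown above.

```python
import re

# Matches a *call* (not a declaration) to llvm.nvvm.vote.sync and captures
# the second operand, which is the mode parameter.
VOTE_CALL = re.compile(
    r'\bcall\b[^@\n]*@"?llvm\.nvvm\.vote\.sync"?\s*\(\s*'
    r'i32\s+[^,]+,\s*'             # first operand: member mask
    r'i32\s+(?P<mode>[^,]+?)\s*,'  # second operand: mode
)


def non_constant_vote_modes(llvm_ir: str) -> list[str]:
    """Return the mode operands of vote.sync calls that are not integer literals.

    Hypothetical helper for illustration; an empty list means every call
    found in the text passes a constant mode.
    """
    bad = []
    for match in VOTE_CALL.finditer(llvm_ir):
        mode = match.group("mode")
        if not re.fullmatch(r"-?\d+", mode):
            bad.append(mode)
    return bad


# The two call lines from the dumps in this issue.
broken = '%".8" = call {i32, i1} @"llvm.nvvm.vote.sync"(i32 %".5", i32 %".6", i1 %".7")'
fixed = '%".7" = call {i32, i1} @"llvm.nvvm.vote.sync"(i32 %".5", i32 0, i1 %".6")'

print(non_constant_vote_modes(broken))  # ['%".6"']
print(non_constant_vote_modes(fixed))   # []
```

A text-level check like this is deliberately loose; it only looks at the printed IR and does not verify anything about how NVVM consumes it.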

Additional context

The fix will look very similar to that for #288.

I think that there will be no need for the intrinsic_wrapper.py file after this fix, and it can be removed.

NVBug: 5122224.

Labels: bug
