sortperm! takes indices as inputs and performs @inbounds accesses that can crash the GPU:
julia> CUDA.@sync sortperm!(cu([1,10000000]), CUDA.rand(2); initialized=true);
ERROR: CUDA error: an illegal memory access was encountered (code 700, ERROR_ILLEGAL_ADDRESS)
Running this with --check-bounds=yes reveals that the bitonic sort implementation takes these input indices and blindly uses them to index shared memory. That's bad.
Normally we solve this by marking the function @propagate_inbounds instead of @inbounds, and expecting the user to vouch for the inputs. However, @inbounds doesn't propagate across kernel launch boundaries, so a caller's @inbounds never reaches the accesses inside the kernel.
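
For reference, here is a minimal CPU-only sketch of the @propagate_inbounds pattern described above. The names gather and gather_trusted are hypothetical helpers made up for illustration; they are not part of CUDA.jl or Base, and this is not how the bitonic sort is implemented. It only shows how the checks stay on by default and are dropped once a caller vouches for the indices.

```julia
# Hypothetical helper for illustration; not part of CUDA.jl or Base.
# With @propagate_inbounds, the bounds checks inside are only elided
# when the *caller* wraps the call in @inbounds.
Base.@propagate_inbounds function gather(v::AbstractVector,
                                         ix::AbstractVector{<:Integer},
                                         i::Integer)
    return v[ix[i]]
end

# Hypothetical caller that validates the indices once, then vouches for
# them with @inbounds so the per-element checks can be skipped.
function gather_trusted(v::AbstractVector, ix::AbstractVector{<:Integer})
    all(i -> checkbounds(Bool, v, i), ix) ||
        throw(ArgumentError("index vector contains out-of-range entries"))
    return [@inbounds(gather(v, ix, i)) for i in eachindex(ix)]
end

v  = [0.25, 0.75]
ix = [1, 10_000_000]   # second index is far out of range, as in the report

# Called without @inbounds, the checks stay in place and the bad index throws.
threw = try
    gather(v, ix, 2)
    false
catch err
    err isa BoundsError
end

ok = gather_trusted(v, [2, 1])   # validated indices, checks elided
```

On the CPU this works because the caller's @inbounds is visible when gather is inlined into it. For sortperm! the indexing happens inside a launched kernel, which is exactly the boundary this mechanism does not cross.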