Thrust does not expose block size or shared-memory allocations to users; these are hardcoded per algorithm and architecture. If you want this level of control, you'll need to write a custom CUDA kernel outside of Thrust.
Is there a way to customize the kernel launch parameters for Thrust algorithms?
`thrust::for_each` always launches 512 CUDA threads per block. Is this something the user can customize for performance tuning?

Also related to launch parameters, but possibly a new topic entirely: is it possible to use shared memory in a functor passed to `thrust::for_each`, and if so, is dynamic shared memory possible and how do I specify its size?
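For reference, a minimal sketch of the kind of call being asked about (the functor and sizes are illustrative): the only launch-related knob `thrust::for_each` exposes is the execution policy, e.g. which stream to run on; block size and shared memory are chosen internally.

```cuda
#include <thrust/device_vector.h>
#include <thrust/for_each.h>
#include <thrust/execution_policy.h>

// Illustrative functor: scales each element in place.
struct Scale {
    float factor;
    __host__ __device__ void operator()(float& x) const { x *= factor; }
};

int main() {
    thrust::device_vector<float> v(1 << 20, 1.0f);

    // The execution policy selects where/how the algorithm runs
    // (device, host, a CUDA stream), but there is no parameter here
    // for threads-per-block or shared-memory size.
    thrust::for_each(thrust::device, v.begin(), v.end(), Scale{2.0f});
    return 0;
}
```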