Enable PrefixSumCUDA to work on system with multiple GPUs #2

Open
asnaylor wants to merge 2 commits into lxxue:master from asnaylor:master

@asnaylor
By default, PrefixSumCUDA always uses GPU 0. This is a problem when the at::Tensor is not on GPU 0 (e.g. when the code uses multiple GPUs).

I have added a cudaSetDevice call to scan, which tells CUDA to use the device on which the at::Tensor is located.
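For readers who want to see the idea in isolation, here is a minimal sketch of selecting the tensor's device before doing the scan work. It is not the code from this PR: `fake_cuda::get_device` and `fake_cuda::set_device` are hypothetical stand-ins for `cudaGetDevice` and `cudaSetDevice` so the example runs without a GPU, `FakeTensor` is a hypothetical stand-in for at::Tensor, and the scan body is a plain CPU loop. Restoring the previous device on exit is an extra assumption of this sketch; the PR description only mentions calling cudaSetDevice.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the CUDA runtime, so the sketch runs without a GPU.
// In real code these would be cudaGetDevice / cudaSetDevice from <cuda_runtime.h>.
namespace fake_cuda {
inline int& current_device() {
  static int device = 0;  // like CUDA, device 0 is the default
  return device;
}
inline int get_device() { return current_device(); }
inline void set_device(int device) { current_device() = device; }
}  // namespace fake_cuda

// Hypothetical stand-in for at::Tensor: records only the owning device and the data.
struct FakeTensor {
  int device;
  std::vector<int> data;
};

// RAII helper: switches to `device` on construction and restores the
// previously active device on destruction (an addition in this sketch).
class DeviceGuard {
 public:
  explicit DeviceGuard(int device) : previous_(fake_cuda::get_device()) {
    if (device != previous_) fake_cuda::set_device(device);
  }
  ~DeviceGuard() { fake_cuda::set_device(previous_); }
  DeviceGuard(const DeviceGuard&) = delete;
  DeviceGuard& operator=(const DeviceGuard&) = delete;

 private:
  int previous_;
};

// Inclusive prefix sum that first selects the tensor's device.
// Returns the device that was active while the work ran.
int scan(FakeTensor& tensor) {
  assert(tensor.device >= 0);
  DeviceGuard guard(tensor.device);  // the real fix would call cudaSetDevice here
  int running = 0;
  for (int& value : tensor.data) {
    running += value;
    value = running;
  }
  return fake_cuda::get_device();
}
```

In a PyTorch extension, `at::cuda::CUDAGuard` from `c10/cuda/CUDAGuard.h` offers the same switch-and-restore behaviour and may be worth considering in place of a hand-written guard.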

This change is backwards compatible (so it won't break anyone's code) and has been tested with both single-GPU and multi-GPU setups.
