Replies: 1 comment
For the existing code, I changed the first part in PR #655.
🐛 Describe the bug
TL;DR
Use `tensor.mutable_data_ptr<T>()` and `tensor.const_data_ptr<T>()`. Stop using `tensor.data_ptr<scalar_t>()` and `tensor.data_ptr()`. Avoid using a `char*` data pointer for buffers: either use `int*` or use the allocator directly for buffer access.
1. Use template data_ptr API for tensor
1.1 Existing code usage and its problem
Existing code uses raw pointers like this:
The above usage doesn't check whether the `tensor` is initialized. When the `tensor`'s storage is not initialized (for example, a FakeTensor), its `data_ptr` is null, and the kernel may try to write to uninitialized memory, causing a page fault.
1.2 Template data_ptr
The template `data_ptr` performs additional checks, such as `has_storage()` and `storage_initialized()`. Please refer to PyTorch's TensorImpl.h for details. In view of that, change the old usages to the safer form.
1.3 New APIs: mutable_data_ptr() and const_data_ptr()
PyTorch has introduced new APIs to replace `tensor.data_ptr()`. Their usage is the same as the template `data_ptr`, and their meaning is just what the names suggest.
2. Avoid using a char* data ptr for buffers
The existing code has the following pattern:
For example, the code snippet in torch-xpu-ops/...Reduce.h:
The template `tensor.data_ptr<T>` does not support the `char` type, so we should avoid this pattern. We can align with PyTorch's usage in either of the following ways:
2.1 Use int* instead of char*
For example, PyTorch has the following code in Normalization.cuh:
However, the above usage is not the best practice: it constructs a tensor only to use the data inside it. We could use an allocator directly instead; see section 2.2.
2.2 Use the allocator directly.
For example, consider the code snippet in Reduce.cuh:
It uses the allocator directly rather than constructing a tensor, and I believe this usage could yield some performance gain.