gpu: intel: ocl: allow unlimited allocations #4354
83b8209 to bdf6bda
make test
src/gpu/intel/ocl/usm_utils.cpp (Outdated)
    bool large_buffer = size
            > utils::downcast<const xpu::ocl::engine_impl_t *>(engine->impl())
                      ->max_allocation_size();
    static cl_bitfield properties[]
Minor: I'm not sure static is a good fit for a properties object that's passed to CL calls, especially if it needs to be extended in the future. The current code also doesn't provide a scalable way to expand it; a TODO noting that could help point in that direction when/if needed.
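One way to read the suggestion above is to build the property list per call instead of keeping it in a static array. The sketch below is only an illustration under stated assumptions: `bitfield_t` stands in for `cl_bitfield`, `kAllocFlagsKey` and `kUnrestrictedSizeBit` are hypothetical placeholder values (real code would take the corresponding constants from the OpenCL headers), and `make_alloc_properties` is a made-up helper name, not code from this PR.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for OpenCL's cl_bitfield (a 64-bit unsigned integer).
using bitfield_t = std::uint64_t;

// Hypothetical placeholder values for illustration only; real code would use
// the property key and flag constants defined in the OpenCL headers.
constexpr bitfield_t kAllocFlagsKey = 0x1;
constexpr bitfield_t kUnrestrictedSizeBit = bitfield_t {1} << 1;

// Builds a zero-terminated {key, value, ..., 0} property list on each call,
// so nothing is shared between calls and new pairs are easy to append.
std::vector<bitfield_t> make_alloc_properties(bool large_buffer) {
    std::vector<bitfield_t> props;
    if (large_buffer) {
        props.push_back(kAllocFlagsKey);
        props.push_back(kUnrestrictedSizeBit);
    }
    // TODO: append further key/value pairs here if more properties are needed.
    props.push_back(0); // list terminator
    return props;
}
```

The caller would pass `props.data()` to the allocation call; because the vector is local, extending the list later does not require touching shared state.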
    #endif

    int get_gpu_ram_sizes(size_t &ram_size, size_t &max_alloc_size) {
    int get_gpu_ram_size(size_t &ram_size) {
Did you have a chance to verify that the change works with all 4 supported memory kinds, in both correctness and fast performance modes, where a different approach is used for memory object management?
As part of this question: should this call be updated?
Well, it seems that CL_MEM_ALLOW_UNRESTRICTED_SIZE_INTEL is truly unrestricted, allowing allocations of any size, even beyond the GPU VRAM. So I am going to have to add an additional guard.
NVM, it looks like a driver bug.
According to the documentation, the flag only applies to clCreateBuffer, clCreateBufferWithProperties, clCreateBufferWithPropertiesINTEL, clSVMAlloc, clSharedMemAllocINTEL, clDeviceMemAllocINTEL and clHostMemAllocINTEL.
f3bf210 to 79caa93
79caa93 to 179f908
make test
make test perf-gpu
Allow allocation of buffers bigger than CL_DEVICE_MAX_MEM_ALLOC_SIZE, limited by CL_DEVICE_GLOBAL_MEM_SIZE.
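The size rule in the description can be sketched as a small classification step. This is a minimal illustration, not the PR's implementation: `alloc_kind_t` and `classify_allocation` are hypothetical names, and the two limits are assumed to have been queried from the device beforehand (`CL_DEVICE_MAX_MEM_ALLOC_SIZE` and `CL_DEVICE_GLOBAL_MEM_SIZE`).

```cpp
#include <cstddef>

// Hypothetical result type: how a requested allocation should be handled.
enum class alloc_kind_t {
    regular, // fits within the per-allocation limit
    unrestricted, // exceeds the per-allocation limit, fits in global memory
    too_large // exceeds global memory, should be rejected
};

// Classifies a request against the two device limits. The limits are passed
// in as plain values, assumed to come from earlier device queries.
alloc_kind_t classify_allocation(std::size_t size, std::size_t max_alloc_size,
        std::size_t global_mem_size) {
    if (size > global_mem_size) return alloc_kind_t::too_large;
    if (size > max_alloc_size) return alloc_kind_t::unrestricted;
    return alloc_kind_t::regular;
}
```

Only the `unrestricted` case would need the unrestricted-size flag; checking the global memory bound first keeps an explicit upper guard in place even when the flag itself imposes none.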