-
Notifications
You must be signed in to change notification settings - Fork 12.1k
vulkan : add GGML_VK_FORCE_HEAP_INDEX env var #9734
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Some vulkan devices (namely integrated graphics cards) have multiple memory heaps: a smaller dedicated memory and a larger shared memory. ggml uses the first usable memory type, which usually resides on the smaller dedicated memory heap. This can likely cause allocation failures. This patch adds an environment variable that forces allocation on a specific memory heap.
buf->memory_property_flags = req_flags; | ||
|
||
if (memory_type_index == UINT32_MAX && fallback_flags) { | ||
memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags); | ||
memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags, device->force_heap_index); | ||
buf->memory_property_flags = fallback_flags; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this fallback already handle that situation? That's basically what it was introduced for. You have a smaller (DEVICE_LOCAL) and a larger heap (DEVICE_LOCAL, HOST_VISIBLE and HOST_COHERENT), so once the small one is exhausted you fall back to the larger one.
Can you add the vulkaninfo of the device where this fallback does not work and describe why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that find_properties
doesn't take in account about memory usage (as vk::PhysicalDeviceMemoryProperties
reports total mem, not available mem)
On boot, with a few chrome tabs open, my laptop consumes 2.8/4GiB of the smaller heap, so if it tries to allocate 2GiB, it will fail.
One could argue the better fix is to either
- use
VK_EXT_memory_budget
for querying actual memory usage infind_properties
, so it doesn't select a memory type that does not have enough available memory - use a loop while allocating, if the allocation fails with
ErrorOutOfDeviceMemory
, try the next available memory type - or a combination of both
I did the minimal viable solution here, which is to force allocation from a known bigger heap (in my case, heap 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gyf304 Thank you.
Take a look at ggml-org/whisper.cpp#2451. This might be the kind of fallback that you described in your second point. Can you check whether that would fix your issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I just pulled ggml-org/whisper.cpp#2451 and will soon sync it upstream with ggml
and llama.cpp
. Not sure if it fixes the problem here, but it worked for a couple of whisper.cpp
-related issues.
Some vulkan devices (namely integrated graphics cards) have multiple memory heaps: a smaller dedicated memory and a larger shared memory.
ggml uses the first usable memory type, which usually resides on the smaller dedicated memory heap. This can likely cause allocation failures. This patch adds an environment variable that forces allocation on a specific memory heap.