vulkan : add GGML_VK_FORCE_HEAP_INDEX env var #9734


Open · wants to merge 1 commit into master

Conversation


@gyf304 gyf304 commented Oct 4, 2024

Some Vulkan devices (notably integrated graphics cards) have multiple memory heaps: a smaller dedicated heap and a larger shared heap.

ggml uses the first usable memory type, which usually resides on the smaller dedicated heap. This can cause allocation failures. This patch adds an environment variable that forces allocation on a specific memory heap.
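The heap-forcing idea can be sketched as follows. This is not the actual ggml code, just a minimal model of Vulkan's memory-type table (each memory type references a heap via its heap index) with the extra heap filter the PR adds; `MemoryType`, `read_force_heap_index`, and `find_memory_type` are illustrative names.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Minimal model of a Vulkan memory type: a property bitmask plus the
// index of the heap it allocates from.
struct MemoryType {
    uint32_t property_flags; // e.g. DEVICE_LOCAL = 0x1
    uint32_t heap_index;     // which heap this type draws from
};

constexpr uint32_t FORCE_HEAP_NONE = UINT32_MAX;

// Read the env var once; absence means "no forced heap".
uint32_t read_force_heap_index() {
    const char * s = std::getenv("GGML_VK_FORCE_HEAP_INDEX");
    return s ? static_cast<uint32_t>(std::atoi(s)) : FORCE_HEAP_NONE;
}

// Pick the first memory type matching the requested flags, optionally
// restricted to a single heap (the behavior this patch adds).
uint32_t find_memory_type(const std::vector<MemoryType> & types,
                          uint32_t required_flags,
                          uint32_t force_heap_index) {
    for (uint32_t i = 0; i < types.size(); i++) {
        if ((types[i].property_flags & required_flags) != required_flags) {
            continue; // missing a required property bit
        }
        if (force_heap_index != FORCE_HEAP_NONE &&
            types[i].heap_index != force_heap_index) {
            continue; // not on the forced heap
        }
        return i;
    }
    return UINT32_MAX; // no suitable memory type
}
```

With a patch along these lines, running with `GGML_VK_FORCE_HEAP_INDEX=1` would pin allocations to heap 1.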

@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Oct 4, 2024
@0cc4m 0cc4m self-requested a review October 4, 2024 19:20
      buf->memory_property_flags = req_flags;

      if (memory_type_index == UINT32_MAX && fallback_flags) {
-         memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags);
+         memory_type_index = find_properties(&mem_props, &mem_req, fallback_flags, device->force_heap_index);
          buf->memory_property_flags = fallback_flags;
      }
Collaborator:
Shouldn't the existing fallback already handle that situation? That's basically what it was introduced for: you have a smaller heap (DEVICE_LOCAL) and a larger heap (DEVICE_LOCAL, HOST_VISIBLE and HOST_COHERENT), so once the small one is exhausted, allocation falls back to the larger one.
Can you add the vulkaninfo output of the device where this fallback does not work and describe why?
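The flag-based fallback described above can be illustrated with the bitmasks involved (a sketch; the flag bit values follow the Vulkan specification, the `matches` helper is illustrative):

```cpp
#include <cstdint>

// Vulkan memory property flag bits (values per the Vulkan specification).
constexpr uint32_t DEVICE_LOCAL  = 0x1;
constexpr uint32_t HOST_VISIBLE  = 0x2;
constexpr uint32_t HOST_COHERENT = 0x4;

// A memory type satisfies a request when it carries at least all the
// requested property bits.
bool matches(uint32_t type_flags, uint32_t requested_flags) {
    return (type_flags & requested_flags) == requested_flags;
}
```

A primary request for DEVICE_LOCAL matches both heaps' memory types, so first-match selection lands on the small heap; fallback flags of HOST_VISIBLE | HOST_COHERENT match only the larger heap's type.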

gyf304 (Author), Oct 4, 2024:

vulkaninfo.txt

The problem is that find_properties doesn't take memory usage into account (vk::PhysicalDeviceMemoryProperties reports total memory, not available memory).
On boot, with a few Chrome tabs open, my laptop already uses 2.8 of the 4 GiB in the smaller heap, so an attempt to allocate 2 GiB from it will fail.

One could argue the better fix is to either:

  • use VK_EXT_memory_budget in find_properties to query actual memory usage, so it doesn't select a memory type without enough available memory;
  • loop while allocating: if an allocation fails with ErrorOutOfDeviceMemory, try the next available memory type;
  • or a combination of both.

I went with the minimal viable solution here, which is to force allocation from a known larger heap (in my case, heap 1).
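The second alternative above (retry on out-of-memory) can be sketched as a simple loop. This is not the ggml implementation; `alloc_with_fallback` is a hypothetical helper and `try_alloc` stands in for a vkAllocateMemory call that fails when the chosen type's heap is exhausted.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Walk the candidate memory types in order and retry on allocation
// failure, instead of forcing a single heap up front.
int32_t alloc_with_fallback(const std::vector<uint32_t> & candidate_types,
                            const std::function<bool(uint32_t)> & try_alloc) {
    for (uint32_t type_index : candidate_types) {
        if (try_alloc(type_index)) {
            return static_cast<int32_t>(type_index); // success on this type
        }
        // Allocation failed (e.g. ErrorOutOfDeviceMemory): try the next
        // candidate memory type, which may live on a different heap.
    }
    return -1; // every candidate heap was exhausted
}
```

The advantage over an env var is that no user configuration is needed; the cost is one failed allocation attempt per exhausted heap.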

Collaborator:
@gyf304 Thank you.

Take a look at ggml-org/whisper.cpp#2451. That might be the kind of fallback you described in your second point. Can you check whether it would fix your issue?

Member:

FYI, I just pulled ggml-org/whisper.cpp#2451 and will soon sync it with ggml and llama.cpp upstream. Not sure if it fixes the problem here, but it resolved a couple of whisper.cpp-related issues.
