Description
Name and Version
version: 6240 (54a241f)
built with Apple clang version 17.0.0 (clang-1700.0.13.5) for x86_64-apple-darwin24.6.0 molten-vk/1.4.0 vulkan-headers/1.4.325/
./build/bin/llama-cli -ngl 99 --no-warmup --temp 0 --seed 123 -n 64 -p "what is the capital of France?" -no-cnv -m ~/Models/Qwen3-Coder-30B-A3B-Instruct-1M-BF16-00001-of-00002.gguf
Other tested models showing the same error:
- gpt-oss-20b-F16.gguf
- Qwen3-8B-Q4_K_M.gguf
- Qwen3-Coder-30B-A3B-Instruct-1M-Q8_0.gguf
Error msg:
ggml_vulkan: Compute pipeline creation failed for multi_add_f32_2
ggml_vulkan: Compute pipeline creation failed for multi_add_f32_6
No issue with other models such as THUDM_GLM-Z1-32B-0414-Q8_0.gguf or llama-2-7b.Q4_0.gguf.
No issue with this earlier version: 6140 (9ebfcceb) built with Apple clang version 17.0.0 (clang-1700.0.13.5) for x86_64-apple-darwin24.6.0 molten-vk/1.4.0 vulkan-headers/1.4.325/
Operating systems
Mac
Which llama.cpp modules do you know to be affected?
llama-cli
Command line
./build/bin/llama-cli -ngl 99 --no-warmup --temp 0 --seed 123 -n 64 -p "what is the capital of France?" -no-cnv -m ~/Models/Qwen3-Coder-30B-A3B-Instruct-1M-BF16-00001-of-00002.gguf
Problem description & steps to reproduce
Running the command above fails with:
vk::Device::createComputePipeline: ErrorInitializationFailed
First Bad Commit
No response
Relevant log output
Error msg:
ggml_vulkan: Compute pipeline creation failed for multi_add_f32_2
ggml_vulkan: Compute pipeline creation failed for multi_add_f32_6
ggml_vulkan: vk::Device::createComputePipeline: ErrorInitializationFailed
ggml_vulkan: vk::Device::createComputePipeline: ErrorInitializationFailed
zsh: segmentation fault ./build/bin/llama-cli -ngl 99 --no-warmup --temp 0 --seed 123 -n 64 -p -m