forked from ggml-org/llama.cpp
Open
Description
Name and Version
Test device: Snapdragon 8 Gen 3
Test model: qwen2.5-1.5b-instruct-fp16.gguf
Operating systems
Android
GGML backends
QNN
Hardware
Snapdragon 8 Gen 3
Models
No response
Problem description & steps to reproduce
Same as the title (inference works correctly when assigned to qnn_gpu).
First Bad Commit
No response
Relevant log output
[qnn-npu][MUL_MATf32_1536x8960f32_1536x42f32#MUL(SILU,MUL_MAT)#MUL_MAT(NONE,MUL)#ADD(MUL_MAT,ADD)f32_1536x42f32][execute]error: QNN_GRAPH_ERROR_INVALID_HANDLE
graph_compute: ggml_backend_sched_graph_compute_async failed with error -1
process_ubatch: failed to compute graph, compute status: -1
decode: removing KV cache entries for seq_id = 0, pos = [0, +inf)
llama_decode: failed to decode, ret = -3
main : failed to eval
idx 1, name:qnn-gpu
idx 2, name:qnn-npu
FORTIFY: pthread_mutex_destroy called on a destroyed mutex (0xb4000070ae23a450)
Aborted (core dumped)
chraac
Status
Backlog