
[qnn][bug] Inference crashes when an FP16 matmul is assigned to qnn_npu #46

@finneyyan

Description


Name and Version

Test device: Snapdragon 8 Gen 3
Test model: qwen2.5-1.5b-instruct-fp16.gguf

Operating systems

Android

GGML backends

QNN

Hardware

Snapdragon 8 Gen 3

Models

No response

Problem description & steps to reproduce

Same as the title: inference crashes when the FP16 matmul is assigned to qnn_npu, but runs normally when it is assigned to qnn_gpu.

First Bad Commit

No response

Relevant log output

[qnn-npu][MUL_MATf32_1536x8960f32_1536x42f32#MUL(SILU,MUL_MAT)#MUL_MAT(NONE,MUL)#ADD(MUL_MAT,ADD)f32_1536x42f32][execute]error: QNN_GRAPH_ERROR_INVALID_HANDLE
graph_compute: ggml_backend_sched_graph_compute_async failed with error -1
process_ubatch: failed to compute graph, compute status: -1
decode: removing KV cache entries for seq_id = 0, pos = [0, +inf)
llama_decode: failed to decode, ret = -3
main : failed to eval
idx 1, name:qnn-gpu
idx 2, name:qnn-npu
FORTIFY: pthread_mutex_destroy called on a destroyed mutex (0xb4000070ae23a450)
Aborted (core dumped)

Metadata


Assignees

Labels

bug (Something isn't working), qnn

Projects

Status

Backlog

Milestone

No milestone
