Releases: ngxson/llama.cpp

b5966

23 Jul 04:26
14c28df
CANN: weight format to NZ for Ascend310P3 (#14407)

* weight format to nz for 310p

* remove quant weight format to nz

* clean code

* fix

* make the conditions for converting weights to NZ format consistent

* clean code

b5965

23 Jul 02:13
8c988fa
CUDA: add fused rms norm (#14800)

b5963

22 Jul 15:58
84712b6
vulkan: fix rms_norm_mul to handle broadcasting dim0 (#14817)

b5962

22 Jul 15:25
d4d1522
llama : add model type detection for rwkv7 7B&14B (#14816)

Signed-off-by: Molly Sophia <[email protected]>

b5961

22 Jul 12:52
d1aa0cc
imatrix: add option to display importance score statistics for a give…

b5960

22 Jul 11:16
c8ade30
Mtmd: add a way to select device for vision encoder (#14236)

* Mtmd: add a way to select device for vision encoder

* simplify

* format

* Warn user if manual device selection failed

* initialize backend to nullptr

b5959

22 Jul 10:50
e28c0b8
cuda : implement bf16 cpy ops and enable bf16 cont (#14763)

* implement bf16 cpy ops and enable bf16 cont

* deduplicate copy functions

* deduplicate checks

b5958

22 Jul 07:16
8e6f8bc
opencl: remove unreachable `return` (#14806)

b5957

22 Jul 01:42
adef817
server : allow setting `--reverse-prompt` arg (#14799)

Signed-off-by: Molly Sophia <[email protected]>

b5956

22 Jul 00:08
48b86c4
cuda: remove linking to cublasLt (#14790)

Signed-off-by: Xiaodong Ye <[email protected]>