Releases: allozaur/llama.cpp
b6637
Fix thinking blocks with quotes + add handling `[THINK]...[/THINK]` b…
b6625
fix: preserved zero values in chat settings inputs and textareas by s…
b6623
perplexity : show more kl-divergence data (#16321)

Adds additional percentile data displayed in the output of `llama-perplexity --kl-divergence`:

- Added 95 percentile (mirroring existing 5 percentile)
- Added 0.1 percentile (mirroring existing 99.9 percentile)
b6567
model : add label for LiquidAI LFM2-2.6B model (#16204)

* model : add label for LiquidAI LFM2-2.6B model

HF link: [LiquidAI/LFM2-2.6B](https://huggingface.co/LiquidAI/LFM2-2.6B). Support for GGUF conversion and inference was added in #14620. However, due to a similar `n_embd`, it identifies as a 1.2B model. Fix the label by using `n_ff` to identify the model instead.

Output of `llama-bench`:

```
| model         |       size |   params | backend | threads |  test |             t/s |
| ------------- | ---------: | -------: | ------- | ------: | ----: | --------------: |
| lfm2 1.2B F16 |   2.18 GiB |   1.17 B | CPU     |      10 | pp512 |   223.97 ± 5.32 |
| lfm2 2.6B F16 |   4.79 GiB |   2.57 B | CPU     |      10 | pp512 |    92.53 ± 4.14 |
| lfm2 350M F16 | 676.25 MiB | 354.48 M | CPU     |      10 | pp512 |  725.52 ± 11.70 |
| lfm2 700M F16 |   1.38 GiB | 742.49 M | CPU     |      10 | pp512 |  336.22 ± 12.93 |
```

* Update src/llama-model.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

Co-authored-by: Sigbjørn Skjæret <[email protected]>
b6565
common : add missing chrono header for common.cpp (#16211) Signed-off-by: Uilian Ries <[email protected]>
b6556
zdnn: refactor codebase + add docs (#16178)

* zdnn: initial matmul refactor
* ggml-zdnn: rm static from funcs
* ggml-zdnn: update ggml-zdnn.h
* ggml-zdnn: change header files to hpp
* ggml-zdnn: switch to common.hpp
* ggml-zdnn: move mulmat forward around
* ggml-zdnn: rm inline from utils
* ggml-zdnn: code cleanup
* docs: add zDNN docs

Signed-off-by: Aaron Teo <[email protected]>
b6520
feat: Improve mobile UI for Settings Dialog (#16084)

* feat: Improve mobile UI for Settings Dialog
* chore: update webui build output
* fix: Linting errors
* chore: update webui build output
b6517
ggml-amx : fix ggml_amx_init() on generic Linux (#16049)
Generalize Linux check to `__linux__` to support non-glibc systems (like musl).
Also, return `false` on unknown/untested OS.
Without this commit, the code compiles (with warnings) but fails at runtime:

```
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (Intel(R) Xeon(R) Platinum 8488C)
build: 6487 (51c4cac6) with x86_64-linux-musl-gcc (GCC) 15.1.0 for x86_64-linux-musl (debug)
system info: n_threads = 8, n_threads_batch = 8, total_threads = 16
...
print_info: n_ctx_orig_yarn = 262144
print_info: rope_finetuned = unknown
print_info: model type = 4B
Illegal instruction (core dumped)
```
Signed-off-by: Adrien Gallouët <[email protected]>
b6501
metal : refactor + optimize v2 (#15995)

* metal : improve naming
* metal : refactor device
* cont : props
* metal : apply ggml_mem_ranges_t
* metal : remove GGML_METAL_USE_BF16
* metal : refactor device buffer
* cont : fix naming
* metal : sync before destroying the backend
* metal : refactor context
* metal : migrate ggml-metal.m to ggml-metal.cpp
* metal : adjust ops API
* metal : use C++ to store pipelines
* metal : migrate ops to separate functions
* metal : add ggml_metal_library_t
* metal : improve naming
* metal : cleanup
* metal : add support for GGML_OP_LOG
* metal : fix error handling
b6393
Implement --log-colors with always/never/auto (#15792)

With auto by default.

Signed-off-by: Eric Curtin <[email protected]>