Pull requests: ggml-org/llama.cpp
ggml-cuda: refactor cuda graph usage
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18637
opened Jan 6, 2026 by
am17an
Added note for compiling on integrated GPUs
documentation
Improvements or additions to documentation
#18633
opened Jan 6, 2026 by
alosslessdev
Draft
vulkan: optimize ssm_scan
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18630
opened Jan 5, 2026 by
jeffbolznv
ggml-webgpu: Fix GGML_MEM_ALIGN to 8 for emscripten.
ggml
changes relating to the ggml tensor library for machine learning
#18628
opened Jan 5, 2026 by
yomaytk
rpc : implement event and async backend APIs
ggml
changes relating to the ggml tensor library for machine learning
CANN: Remove unused functions
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#18625
opened Jan 5, 2026 by
rauletorresc
CANN: Rename get_env to get_env_as_lowercase
Ascend NPU
issues specific to Ascend NPUs
ggml
changes relating to the ggml tensor library for machine learning
#18624
opened Jan 5, 2026 by
rauletorresc
Hexagon add support for f16/f32 flash attention, scale, set-rows and improve f16/32 matmul
ggml
changes relating to the ggml tensor library for machine learning
#18611
opened Jan 5, 2026 by
max-krasnyansky
ggml webgpu: initial flashattention implementation
ggml
changes relating to the ggml tensor library for machine learning
#18610
opened Jan 5, 2026 by
reeselevine
Fix grammar parsing issues to prevent stack overflow and hangs
testing
Everything test related
#18604
opened Jan 5, 2026 by
aagit
common: build as shared library when BUILD_SHARED_LIBS is ON
#18602
opened Jan 5, 2026 by
rsauciuc
ggml: fix assertion in ggml_build_backward_expand for inplace operations
ggml
changes relating to the ggml tensor library for machine learning
#18589
opened Jan 4, 2026 by
nlasky2000-dot
mtmd : fix integer overflow when n_tokens equals INT32_MIN
examples
#18588
opened Jan 4, 2026 by
ylwango613
Fix division by zero vulnerability in gguf_init_from_file_impl
ggml
changes relating to the ggml tensor library for machine learning
#18586
opened Jan 4, 2026 by
ylwango613
add option --tensor-type-file to llama-quantize
examples
#18572
opened Jan 3, 2026 by
EugeoSynthesisThirtyTwo
llama: max ctx by default, fix fit magic number
examples
testing
Everything test related
#18567
opened Jan 3, 2026 by
JohannesGaessler
cuda : check src shapes for CUDA graphs
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs