
Fix missing MMA flash attention instances in build #16

Merged
volkermauel merged 1 commit into main from
codex/fix-undefined-reference-errors-in-cuda-build
Aug 3, 2025

Conversation

@volkermauel
Owner

Summary

  • Ensure the CUDA flash attention MMA template instances are compiled, fixing the undefined-reference link errors (see the sketch below)
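
For context, a minimal CUDA sketch of the failure mode this class of fix addresses (hypothetical names, not the actual llama.cpp sources): a kernel template emits no device code on its own, so every concrete instance that host code launches must be explicitly instantiated in some .cu file the build actually compiles, or the link step fails with an undefined reference.

    #include <cuda_fp16.h>
    #include <cuda_runtime.h>

    // Hypothetical templated flash-attention kernel, parameterized on head size.
    template <int head_size>
    __global__ void flash_attn_mma_f16(const half * Q, const half * K,
                                       const half * V, half * O, int n_tokens) {
        // tensor-core (MMA) attention body elided in this sketch
    }

    // Explicit instantiations: without lines like these in a compiled .cu
    // file, host code launching flash_attn_mma_f16<64> or <128> links
    // against nothing -> "undefined reference" at build time.
    template __global__ void flash_attn_mma_f16<64>(const half *, const half *,
                                                    const half *, half *, int);
    template __global__ void flash_attn_mma_f16<128>(const half *, const half *,
                                                     const half *, half *, int);

The PR's change follows this pattern on the llama.cpp side: it makes sure the MMA flash attention instances actually end up in a compiled translation unit.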

Testing

  • make llama-server GGML_CUDA=1 (fails in this environment: Could not find compiler "nvcc"; CUDA_DOCKER_ARCH must be set for CUDA versions < 11.7)

https://chatgpt.com/codex/tasks/task_b_688f2ec7e34883258a7adcd4174e3a6a

volkermauel merged commit 8721a3b into main on Aug 3, 2025
0 of 2 checks passed
volkermauel deleted the codex/fix-undefined-reference-errors-in-cuda-build branch on August 3, 2025 10:41