Maxhbr/diffusion models by maxhbr · Pull Request #56 · maxhbr/myconfig

maxhbr · 2026-06-19T15:16:53Z

Signed-off-by: Maximilian Huber <oss@maximilian-huber.de>

Add diffusiongemma-26B-A4B-it-GGUF from unsloth as an RTX model with 262k context, Q6_K quantization, and alias diffusiongemma-26B.

- Create diffusionllama-cpp overlay on thing host, building llama.cpp from PR #24423 (diffusion-gemma GPU offload support)
- Add diffusionLlamaCpp option to myconfig.ai.llama-cpp module
- Add diffusionCUDA device prefix support to devices.nix:
  - mkGuardDevice maps diffusionCUDA to "nvidia" GPU variant
  - backendForDevice returns "diffusion" backend tag
  - llamaServerForDiffusion / llamaBenchForDiffusion helpers
- Wire diffusion package through scripts.nix, router.nix, llama-swap.nix via lib/default.nix parameter threading
- Deploy diffusiongemma-26B-A4B-it-Q6_K on diffusionCUDA0 device

diffusionCUDA0 sets LLAMA_ARG_DEVICE=diffusionCUDA0 so the patched llama.cpp binary routes the model through the diffusion-gemma GPU offload path.
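
For illustration, the device prefix handling described for devices.nix might look roughly like the sketch below. This is not the code from this PR: only the names mkGuardDevice and backendForDevice and the "nvidia" and "diffusion" return values come from the description above. The isDiffusion helper, the "cuda" and "cpu" fallback values, and the overall file shape are assumptions.

```nix
# Hypothetical sketch of devices.nix; the real file in this PR may differ.
{ lib }:
let
  # Assumed helper: true for device names such as "diffusionCUDA0".
  isDiffusion = device: lib.hasPrefix "diffusionCUDA" device;
in
{
  # diffusionCUDA devices use the same "nvidia" GPU variant as plain CUDA ones.
  # The "cpu" fallback is an assumption made for this sketch.
  mkGuardDevice = device:
    if isDiffusion device || lib.hasPrefix "CUDA" device then "nvidia"
    else "cpu";

  # The backend tag decides which llama.cpp build serves the device.
  # Only the "diffusion" tag is taken from the PR description.
  backendForDevice = device:
    if isDiffusion device then "diffusion"
    else if lib.hasPrefix "CUDA" device then "cuda"
    else "cpu";
}
```

With this shape, backendForDevice "diffusionCUDA0" evaluates to "diffusion", which the rest of the module could use to select the patched package.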

Use the real sha256 from the PR #24423 tarball instead of lib.fakeHash.

The diffusionCUDA prefix is only for Nix-side routing to pick the patched binary. The actual llama.cpp env var expects standard CUDA0.
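
A minimal sketch of that resolution step, assuming a helper named resolveLlamaDevice (a hypothetical name, not taken from the PR), is to strip the routing prefix before the value reaches LLAMA_ARG_DEVICE:

```nix
# Hypothetical helper; the actual implementation in this PR may differ.
{ lib }:
{
  # "diffusionCUDA0" becomes "CUDA0". Names without the "diffusion" prefix
  # are returned unchanged, because lib.removePrefix only strips a prefix
  # that is actually present.
  resolveLlamaDevice = device: lib.removePrefix "diffusion" device;
}
```

The Nix-side name keeps its diffusionCUDA prefix for choosing the binary, and only the environment variable receives the stripped value.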

Override postBuild to compile the llama-diffusion-cli target from PR #24423, and route diffusionCUDA devices through it instead of the standard llama-server binary.
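
As a rough sketch of such an override, assuming the overlay builds on the llama-cpp package from nixpkgs, the extra target could be built like this. The src override that points at PR #24423 is omitted, and the cmake invocation is an assumption about how the target is built rather than the PR's actual postBuild.

```nix
# Hypothetical overlay sketch; src and hash overrides are intentionally left out.
final: prev: {
  diffusionllama-cpp = prev.llama-cpp.overrideAttrs (old: {
    # Keep any existing postBuild and additionally build the diffusion CLI.
    # The nixpkgs cmake hook runs the build phase inside the build directory,
    # so "." refers to that directory here.
    postBuild = (old.postBuild or "") + ''
      cmake --build . --target llama-diffusion-cli
    '';
  });
}
```

Whether the resulting binary is installed by the default install phase depends on the upstream CMake install rules, so an extra install step may be needed.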

Maximilian Huber added 7 commits June 19, 2026 14:17

drop local model files

b4ca9cf

Signed-off-by: Maximilian Huber <oss@maximilian-huber.de>

ai: add diffusiongemma-26B-A4B-it-Q6_K model on thing

7631fa7

Add diffusiongemma-26B-A4B-it-GGUF from unsloth as an RTX model with 262k context, Q6_K quantization, and alias diffusiongemma-26B.

ai: fix diffusionllama-cpp source hash

3eec01d

Use the real sha256 from the PR #24423 tarball instead of lib.fakeHash.

ai: fix diffusionllama-cpp npmDeps hash

c02d2fe

ai: resolve diffusionCUDA0 to CUDA0 for LLAMA_ARG_DEVICE

78c52f6

The diffusionCUDA prefix is only for Nix-side routing to pick the patched binary. The actual llama.cpp env var expects standard CUDA0.

ai: build llama-diffusion-cli target in diffusionllama-cpp

13d91dd

Override postBuild to compile the llama-diffusion-cli target from PR #24423, and route diffusionCUDA devices through it instead of the standard llama-server binary.

maxhbr wants to merge 7 commits into master from maxhbr/diffusion_models
