feat(audio-car-cockpit): Improved ROCm GPU automatic detection and added CPU override by ThomasGmeinder · Pull Request #74 · Liquid4All/cookbook

ThomasGmeinder · 2026-03-13T21:04:44Z

Summary

Auto-detect AMD ROCm GPUs via rocm-smi and build llama.cpp with HIP acceleration when available, falling back to CPU-only build otherwise
Auto-detect GPU architecture (e.g. gfx1151) from rocm-smi instead of hardcoding gfx1150, and pass --n-gpu-layers 9999 to offload model layers to the GPU
Fix audio streaming to handle both the ROCm-built audio server (delta.audio, int16 PCM) and the pre-built CPU binary (delta.audio_chunk, float32 PCM)
Add CPU=1 make option to force a CPU-only build even when ROCm is available
Add make clean target and check_system.sh diagnostic script

Test report

Tested on AMD Ryzen with Radeon iGPU (gfx1151) using ROCm 7.2
Tested CPU-only build with make CPU=1 audioserver && make CPU=1 serve
Audio playback verified on both ROCm and CPU paths

Made with Cursor

…ers for ROCm builds

Auto-detect HIP_ARCH from rocminfo instead of hardcoding gfx1150, with fallback if detection fails. Pass --n-gpu-layers 9999 to the audio server when ROCm is present so model layers are offloaded to the GPU. Also add a clean target and a check_system.sh helper script. Made-with: Cursor

… server

The audio server built from PR #18641 (ROCm path) returns the audio field as delta.audio with int16 PCM format, while the pre-built CPU binary uses delta.audio_chunk with float32 PCM. Handle both field names in server.py and auto-detect the PCM format (int16 vs float32) in the browser for correct playback on both build paths. Made-with: Cursor

…ng HIP build

Replace the /opt/rocm directory check with rocm-smi --showproductname to verify a GPU is actually present and the driver is loaded. Falls back to CPU build if ROCm is installed but not functional. Also use rocm-smi for HIP_ARCH detection instead of rocminfo. Made-with: Cursor
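The detection flow in this commit can be sketched in Python. This is an illustration only, not the Makefile logic from the PR: the function names (`parse_gfx_arch`, `probe_rocm`, `choose_build`), the regular expression, the choice to treat a missing binary or non-zero exit code as "no usable GPU", and the `gfx1150` fallback value are all assumptions.

```python
import re
import shutil
import subprocess

# Assumption: the architecture appears somewhere in the probe output as a
# token such as "gfx1151". The real rocm-smi output layout is not shown here.
GFX_PATTERN = re.compile(r"\bgfx[0-9a-f]{3,5}\b")


def parse_gfx_arch(text):
    """Return the first gfx target found in text, or None if there is none."""
    match = GFX_PATTERN.search(text)
    return match.group(0) if match else None


def probe_rocm(command=("rocm-smi", "--showproductname")):
    """Run the probe command and return its stdout, or None if it is unusable."""
    if shutil.which(command[0]) is None:
        return None  # ROCm tools are not installed or not on PATH
    try:
        result = subprocess.run(
            list(command), capture_output=True, text=True, timeout=10
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    # Assumption: a non-zero exit code means the GPU or driver is not functional.
    return result.stdout if result.returncode == 0 else None


def choose_build(probe_output, cpu_override=False, fallback_arch="gfx1150"):
    """Pick ("rocm", arch) or ("cpu", None) from the probe result."""
    if cpu_override or probe_output is None:
        return ("cpu", None)
    # fallback_arch is an assumed default, used when no gfx token is found.
    return ("rocm", parse_gfx_arch(probe_output) or fallback_arch)
```

Calling `choose_build(probe_rocm())` would then select the build flavour, and passing `cpu_override=True` mirrors the `make CPU=1` override.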

The ROCm-built audio server (PR #18641) returns delta.audio with int16 PCM, while the pre-built CPU binary returns delta.audio_chunk with float32 PCM. Use the field name to set the format explicitly instead of fragile auto-detection probing that failed on quiet audio chunks. Made-with: Cursor
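A minimal sketch of the field-name mapping described above, in Python. The helper name `extract_audio`, the `FIELD_FORMATS` table, and the shape of the `delta` dictionary are hypothetical; the actual handling in server.py may differ, and the payload is treated as opaque because its encoding is not stated here.

```python
# Assumed mapping, taken from the commit message: the field name implies the
# PCM sample format, so no probing of the audio data is needed.
FIELD_FORMATS = {
    "audio": "int16",          # ROCm-built audio server
    "audio_chunk": "float32",  # pre-built CPU binary
}


def extract_audio(delta):
    """Return (payload, pcm_format) for a streaming delta, or None.

    Empty or missing payloads are skipped so that callers never try to
    play a zero-length chunk.
    """
    for field, pcm_format in FIELD_FORMATS.items():
        payload = delta.get(field)
        if payload:
            return payload, pcm_format
    return None
```

Keying the format off the field name keeps quiet chunks from being misclassified, which is the failure mode the commit attributes to data probing.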

Allow users to bypass ROCm GPU detection and force a CPU-only build with make CPU=1. Useful for testing or when the GPU build is unwanted. Tested on ROCm (gfx1151) and on CPU with CPU=1 override. Made-with: Cursor

…at via env

Separate build directories (llama.cpp-rocm, llama.cpp-cpu) and binaries (llama-server-rocm, llama-server-cpu) so switching between ROCm and CPU no longer requires make clean. Both builds coexist on disk.

The Makefile passes AUDIO_PCM_FORMAT (int16 for ROCm, float32 for CPU) to server.py via env var. server.py sends it to the browser as a config message on websocket connect. The JS uses it directly, with no data probing. This resolves the audio distortion issues caused by the two audio server binary versions returning different PCM encodings (int16 vs float32) while both reporting format as "pcm".

Tested on ROCm (gfx1151) and CPU (make CPU=1); audio works on both. Made-with: Cursor
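The env-var handoff and the two PCM encodings can be illustrated with a short Python sketch. Everything except the `AUDIO_PCM_FORMAT` name and its two values is an assumption: the `config_message` and `decode_pcm` helpers, the JSON message shape, the `float32` default, and little-endian byte order are hypothetical. In the PR the decoding happens in browser JavaScript, not in Python.

```python
import json
import os
import struct


def config_message(environ=os.environ):
    """Build the config message sent once when the websocket connects."""
    # Assumed default when the variable is unset.
    pcm_format = environ.get("AUDIO_PCM_FORMAT", "float32")
    if pcm_format not in ("int16", "float32"):
        raise ValueError(f"unsupported AUDIO_PCM_FORMAT: {pcm_format!r}")
    return json.dumps({"type": "config", "pcm_format": pcm_format})


def decode_pcm(payload, pcm_format):
    """Decode raw PCM bytes into floats, assuming little-endian samples."""
    if pcm_format == "int16":
        count = len(payload) // 2
        samples = struct.unpack(f"<{count}h", payload[: count * 2])
        # Scale signed 16-bit integers into the [-1.0, 1.0) range.
        return [sample / 32768.0 for sample in samples]
    if pcm_format == "float32":
        count = len(payload) // 4
        return list(struct.unpack(f"<{count}f", payload[: count * 4]))
    raise ValueError(f"unsupported PCM format: {pcm_format!r}")
```

Decoding int16 bytes as float32 (or the reverse) yields values far outside the expected range, which is consistent with the distortion the commit describes.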

Server-side: log time-to-first-audio-byte (ASR + tool calling + TTS first decode) and total end-to-end latency from ASR receive to last TTS chunk. Client-side: log TTFA in the browser console measuring from button release to first audio chunk received, capturing the full user-perceived latency including client overhead. Also skip empty audio chunks to avoid createBuffer errors. Made-with: Cursor
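The server-side measurement could look roughly like the following Python sketch. The `LatencyTracker` class and its method names are hypothetical and not taken from server.py; the clock is injectable only to make the example easy to exercise.

```python
import time


class LatencyTracker:
    """Track time-to-first-audio and total latency for one request."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()  # taken when the ASR input is received
        self._first_audio = None
        self._last_audio = None

    def on_audio_chunk(self, chunk):
        """Record a TTS chunk; return False for empty chunks, which are skipped."""
        if not chunk:
            return False
        now = self._clock()
        if self._first_audio is None:
            self._first_audio = now
        self._last_audio = now
        return True

    def ttfa(self):
        """Seconds from start to the first non-empty audio chunk, or None."""
        if self._first_audio is None:
            return None
        return self._first_audio - self._start

    def total(self):
        """Seconds from start to the last non-empty audio chunk, or None."""
        if self._last_audio is None:
            return None
        return self._last_audio - self._start
```

A monotonic clock is used so that wall-clock adjustments cannot produce negative or inflated latencies.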

ROCm <= 7.2 ships rocBLAS Tensile kernels for gfx1100/1101/1102/1150/1151/1200/1201 but NOT gfx1153 (Ryzen AI 7 / Krackan). The audio server's multimodal warmup (mmproj + vocoder + speaker tokenizer) dispatches GEMM shapes that have no matching gfx1153 kernel and segfaults at `common_init_from_params: warming up the model`.

Set HSA_OVERRIDE_GFX_VERSION=11.5.0 (gfx1150, binary-compatible RDNA 3.5) as a recipe-line prefix on the `audioserver` target only. Applying it globally is not an option: the same override crashes the tool model (llama-server-rocm) instead, so the env var is intentionally NOT exported and does not leak via `serve` to spawn_server's child process.

A `$(wildcard /opt/rocm*/lib/rocblas/library/*gfx1153*)` check makes the override self-disable once a future ROCm release ships the missing kernels; the audio server will then run with native gfx1153 dispatch without further Makefile edits. Made-with: Cursor
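The scoping of the override can be sketched in Python as a per-process environment rather than a global export. This is not the Makefile recipe from the PR: the helper names are hypothetical, and gating the override on the detected architecture being gfx1153 is an assumption (the commit message only states the kernel-file check).

```python
import glob

# Same pattern as the Makefile wildcard check quoted in the commit message.
KERNEL_GLOB = "/opt/rocm*/lib/rocblas/library/*gfx1153*"


def needs_gfx_override(arch, kernel_files):
    """True when the GPU is gfx1153 and no gfx1153 rocBLAS kernels are installed."""
    return arch == "gfx1153" and not kernel_files


def audio_server_env(base_env, arch, kernel_files=None):
    """Return a copy of base_env for the audio server process only.

    The caller's environment is left untouched, so the override does not
    leak to other child processes such as the tool-model server.
    """
    if kernel_files is None:
        kernel_files = glob.glob(KERNEL_GLOB)
    env = dict(base_env)
    if needs_gfx_override(arch, kernel_files):
        env["HSA_OVERRIDE_GFX_VERSION"] = "11.5.0"
    return env
```

The returned dictionary could be passed as `env=` to `subprocess.Popen` for the audio server alone, which matches the intent of prefixing only the `audioserver` recipe line.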

Paulescu · 2026-04-28T11:57:58Z

Hi @ThomasGmeinder,

What is the main intent of this PR?

Was the code not working on your AMD machine and you needed to fix it? Or are these 2nd-order optimizations to speed up inference?

Pau

ThomasGmeinder · 2026-04-30T18:20:19Z

Hi Pau
I added support for ROCm and AMD GPUs in PR #41, which was merged on 25th Feb 2026. The intent of this PR is to add improvements on the same branch, ROCm_support.

I have tested this on a wide range of Ryzen AI devices:
Krackan2e (4 CU iGPU), Strix (16 CU iGPU) and Strix Halo (40 CU iGPU)

Kind Regards, Thomas

Thomas Gmeinder and others added 9 commits March 13, 2026 20:25

feat(audio-car-cockpit): add CPU=1 option to force CPU-only build

0581e7f

Merge audio_latency_measurement into ROCm_support

06a585a

Made-with: Cursor

ThomasGmeinder wants to merge 9 commits into Liquid4All:main from ThomasGmeinder:ROCm_support
