
feat(windows): Native AMD ROCm GPU Acceleration (Resolves #531)#538

Open
JuanCMPDev wants to merge 2 commits into jamiepine:main from JuanCMPDev:feature/rocm-windows-support

Conversation


@JuanCMPDev JuanCMPDev commented Apr 22, 2026

Description

This PR introduces full native AMD ROCm support for Windows, allowing users with Radeon GPUs to leverage hardware acceleration via MIOpen. It mirrors the existing CUDA architecture while introducing robust fallbacks and architectural fixes.

Note on #531: Contrary to popular belief, ROCm is available on Windows. Instead of relying on DirectML (which suffers from performance overhead), this PR dynamically downloads and bundles the official AMD Radeon PyTorch wheels (rocm-rel-7.2.1), bringing near-native Linux performance to Windows AMD users.

Key Changes

  1. Backend Build Pipeline: Added --rocm to build_binary.py. It pulls official AMD Radeon wheels, bundles the server via --onedir, and safely restores the developer's CPU/CUDA environment post-build. Included --no-deps to prevent NumPy 2.x upgrades from breaking Numba/Qwen3-TTS.
  2. Platform Detection: Added is_amd_gpu_windows() using CIM queries. Tensor routing works transparently via PyTorch's cuda namespace abstraction, and the UI gracefully identifies the backend as ROCm instead of CUDA.
  3. Dynamic Downloads: Added services/rocm.py and the respective FastAPI endpoints for SSE progress tracking and binary extraction.
  4. Tauri Priority Launcher: Modified main.rs to detect and spawn voicebox-server-rocm.exe with priority (ROCm > CUDA > CPU).
  5. Architectural Fixes:
    • Switch to CPU Lock: Added backend_override state to Tauri. Switching to CPU now passes an override flag before restarting, bypassing Windows OS-level file locks on active GPU binaries (no destructive uninstallation required).
    • PyInstaller rocm_sdk Crash: Added a runtime hook (pyi_rth_rocm_sdk.py) and explicit --collect-all flags to prevent silent UnboundLocalError crashes when PyInstaller's static analyzer misses AMD's dynamic DLL imports (a minimal sketch of the hook idea follows this list).
    • HF_HUB_OFFLINE Catch-22: Patched _is_model_cached in qwen_custom_voice_backend.py to ensure AutoProcessor caching (specifically preprocessor_config.json for the Base dependency) resolves correctly before forcing offline mode, preventing generation crashes.
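
To make the runtime-hook idea in item 5 concrete, here is a minimal sketch of what such a hook could look like. The find_libraries(shortname) signature and the bundled-DLL layout are assumptions inferred from the review discussion below, not the actual contents of pyi_rth_rocm_sdk.py.

```python
# Hypothetical sketch of a PyInstaller runtime hook in the spirit of
# pyi_rth_rocm_sdk.py. Assumptions: rocm_sdk.find_libraries takes a library
# shortname, and the DLLs are collected under <bundle>/rocm_sdk/.
import os
import sys

def _patched_find_libraries(shortname):
    """Resolve a ROCm library from the PyInstaller bundle, not site-packages."""
    bundle_root = getattr(sys, "_MEIPASS", os.path.dirname(sys.executable))
    candidate = os.path.join(bundle_root, "rocm_sdk", f"{shortname}.dll")
    if not os.path.isfile(candidate):
        raise ModuleNotFoundError(f"Unknown rocm library '{shortname}'")
    return [candidate]

try:
    import rocm_sdk
    rocm_sdk.find_libraries = _patched_find_libraries  # redirect dynamic lookups
except ImportError:
    pass  # CPU/CUDA builds ship without rocm_sdk
```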

Testing

Closes #531

Summary by CodeRabbit

Release Notes

  • New Features

    • Added AMD GPU (ROCm) backend support alongside existing CUDA acceleration
    • Users can now download, enable, and switch between CPU, CUDA, and ROCm backends
    • Enhanced GPU detection and real-time status reporting
  • Bug Fixes

    • Improved AMD GPU detection on Windows systems
    • Enhanced CUDA compatibility validation for better error handling


coderabbitai Bot commented Apr 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

This PR introduces ROCm backend support for AMD GPUs alongside existing CUDA support. It adds ROCm service management, API endpoints, frontend UI components, build configuration, and comprehensive tests spanning frontend, backend, and build layers.

Changes

  • Frontend GPU Components & Tests (app/src/components/ServerSettings/GpuAcceleration.test.tsx, app/src/components/ServerSettings/GpuAcceleration.tsx, app/src/components/ServerTab/GpuPage.tsx)
    Adds a Vitest suite for GpuAcceleration with React Query mocks. Refactors components to support ROCm as a first-class backend alongside CUDA: adds ROCm download/progress tracking, an SSE listener for /backend/rocm-progress, separate handlers for CUDA/ROCm operations, and updated UI gating for both backends.
  • Frontend API & Types (app/src/lib/api/client.ts, app/src/lib/api/types.ts, app/src/lib/api/models/ModelStatus.ts)
    Adds ApiClient methods getRocmStatus(), downloadRocmBackend(), deleteRocmBackend(). Introduces RocmStatus and RocmDownloadProgress TypeScript interfaces mirroring the CUDA types. Minor comment formatting fix in ModelStatus.
  • Frontend Build & Test Setup (app/vite.config.ts, app/src/test/setup.ts, app/package.json)
    Configures Vitest with the jsdom environment, global test APIs, and a setup module. Adds testing dependencies (@testing-library/react, @testing-library/jest-dom, jsdom, vitest) and a test npm script.
  • Frontend Internationalization (app/src/i18n/locales/en/translation.json)
    Adds ROCm translation keys for backend control (activeTitle, description, downloading, etc.) and actions (downloadRocm, switchToRocm, removeRocm). Updates the CUDA and general GPU sections with activeTitle and refactored CPU-switch/error messages.
  • Frontend Platform Lifecycle (tauri/src/platform/lifecycle.ts, app/src/platform/types.ts)
    Adds a setBackendOverride(backend?: string | null) lifecycle method and the corresponding platform type.
  • Backend ROCm Service (backend/services/rocm.py)
    Comprehensive ROCm backend service managing on-disk extraction, status reporting, async download with incremental progress, integrity verification, atomic directory swapping, and startup auto-update. Includes helpers for binary versioning and active-state detection.
  • Backend ROCm Routes (backend/routes/rocm.py)
    Adds FastAPI endpoints: /backend/rocm-status (GET), /backend/download-rocm (POST), /backend/rocm (DELETE), /backend/rocm-progress (GET SSE stream). Implements the background download task, progress subscription, and conflict checking.
  • Backend Startup & Detection (backend/app.py, backend/server.py, backend/routes/health.py)
    Moves ROCm environment variables behind conditional only-when-ROCm gating. Launches a check_and_update_rocm_binary() background task at startup alongside the CUDA update. Detects ROCm via torch.version.hip in health/startup checks. Matches the frozen executable name to set the backend variant.
  • Backend Compatibility & Dtype (backend/backends/base.py, backend/backends/hume_backend.py)
    Early-returns from the CUDA compatibility check when HIP/ROCm is detected, skipping NVIDIA-specific validation. Adds an exception-safe try/catch wrapper around the torch.cuda.is_bf16_supported() call.
  • Backend Model Loading (backend/backends/qwen_custom_voice_backend.py)
    Updates CustomVoice model caching to verify the matching Qwen3 TTS base repo is cached. Passes explicit HF_HUB_CACHE and cache_dir to handle offline-mode constraints.
  • Build Configuration (backend/build_binary.py)
    Adds the --rocm CLI flag and a rocm parameter to build_server(). Implements ROCm binary naming (voicebox-server-rocm), onedir packaging, ROCm-specific hidden imports, runtime hook collection, and Windows torch distribution swapping. Prevents simultaneous CUDA+ROCm builds.
  • PyInstaller Runtime Hook (backend/pyi_rth_rocm_sdk.py, backend/requirements-rocm.txt)
    Adds a runtime hook patching rocm_sdk.find_libraries() to resolve bundled ROCm backend packages and DLL/SO files, including traversal protection and detailed error reporting. Adds a requirements file pinning ROCm 7.2.1 torch wheels.
  • Tauri Backend Control (tauri/src-tauri/src/main.rs)
    Introduces ServerState.backend_override persistence and a set_backend_override command. Adds a shared probe_binary_version() helper for version verification. Extends binary discovery to include ROCm, attempts a ROCm-first spawn flow, and updates restart logic.
  • Platform Detection & Routes (backend/utils/platform_detect.py, backend/routes/__init__.py)
    Adds the Windows AMD GPU detector is_amd_gpu_windows() via PowerShell/WMI with a torch fallback. Registers the new ROCm router.
  • Test Suites (backend/tests/test_amd_gpu_detect.py, backend/tests/test_rocm_backends.py, backend/tests/test_rocm_build.py, backend/tests/test_rocm_download.py, backend/tests/test_rocm_requirements.py)
    Adds comprehensive test coverage: AMD GPU detection on Windows, CUDA compatibility with HIP skip-through, ROCm build configuration and CLI, end-to-end ROCm download with mocked HTTP, and ROCm requirements file validation with an optional hardware test.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Frontend as React UI<br/>(GpuAcceleration)
    participant Tauri as Tauri<br/>(setBackendOverride)
    participant API as FastAPI<br/>(rocm routes)
    participant Service as ROCm<br/>Service
    participant Progress as Progress<br/>Manager

    User->>Frontend: Click "Download ROCm"
    Frontend->>API: POST /backend/download-rocm
    API->>API: Check if download already in progress
    API->>Service: Spawn background task: download_rocm_binary()
    API-->>Frontend: Return {message, progress_key}
    Frontend->>Progress: Subscribe to rocm-backend SSE stream
    Service->>Service: Stream tarball download
    Service->>Progress: Emit progress events (percent, filename)
    Progress-->>Frontend: SSE stream progress updates
    Service->>Service: Extract archive atomically
    Service->>Service: Write rocm-libs.json manifest
    Service->>Progress: Emit complete/error status
    Progress-->>Frontend: Final status event
    
    User->>Frontend: Click "Switch to ROCm"
    Frontend->>API: GET /backend/rocm-status
    API->>Service: get_rocm_status()
    Service-->>API: {available: true, active: false}
    API-->>Frontend: ROCm status
    Frontend->>Tauri: setBackendOverride('rocm')
    Tauri->>Tauri: Persist backend_override
    Frontend->>Tauri: restart_server()
    Tauri->>Tauri: start_server() with override
    Tauri->>Tauri: Probe ROCm binary version
    Tauri->>Tauri: Spawn voicebox-server-rocm
    Tauri-->>Frontend: Server restarted
    Frontend->>API: GET /health
    API-->>Frontend: {gpu_type: "ROCm", backend_variant: "rocm"}

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • PR #252: Implements parallel CUDA backend-download architecture (server health polling, SSE progress, API client methods, Tauri lifecycle) that the main PR extends with ROCm-specific counterparts.
  • PR #285: Refactors backend/app.py, backend/routes/health.py, build_binary.py, and Tauri main.rs for backend detection and variant handling; this PR's ROCm additions directly overlap those changed files.

Poem

🐰 Hops with joy at GPU gains,
ROCm paths through AMD lanes,
Downloads dance on SSE streams,
Progress bars fulfill our dreams!
CUDA and AMDs now align, 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 61.46%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and accurately summarizes the main change: adding native AMD ROCm GPU acceleration for Windows, which directly addresses the linked issue #531.
  • Linked Issues Check: ✅ Passed. The PR fully implements ROCm support for Windows AMD GPUs, providing downloadable backend binaries and GPU acceleration via MIOpen, directly resolving issue #531's request for AMD GPU acceleration on Windows.
  • Out of Scope Changes Check: ✅ Passed. All changes are scoped to ROCm backend support (platform detection, build pipeline, services, routes, Tauri launcher, tests) and supporting infrastructure (model caching, CUDA compatibility, environment variables). No unrelated functionality modifications were introduced.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 20

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/app.py (1)

38-42: ⚠️ Potential issue | 🟠 Major

Gate AMD environment variable overrides on VOICEBOX_BACKEND_VARIANT == "rocm" and detect GPU architecture.

HSA_OVERRIDE_GFX_VERSION=10.3.0 and MIOPEN_LOG_LEVEL are applied unconditionally on all platforms and backends. They set RDNA2 overrides globally for Linux, NVIDIA, and CPU users regardless of whether ROCm is active. The hardcoded 10.3.0 is only correct for RDNA2; RX 7800 XT (RDNA3/gfx1101) users will silently mismatch with MIOpen.

Since VOICEBOX_BACKEND_VARIANT is already used to detect the active backend (set in backend/server.py based on the binary name), gate this block on os.environ.get("VOICEBOX_BACKEND_VARIANT") == "rocm" before setting these overrides. Additionally, detect the GPU architecture (or at least validate against known RDNA2 GFX codes) rather than blindly applying the RDNA2-specific override; the existing issue #469 documents this harms RDNA3/4 cards.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/app.py` around lines 38 - 42, Gate the AMD-specific environment
overrides on the backend being ROCm by checking
os.environ.get("VOICEBOX_BACKEND_VARIANT") == "rocm" before touching
HSA_OVERRIDE_GFX_VERSION and MIOPEN_LOG_LEVEL; then detect or validate the GPU
architecture (e.g., by probing ROCm-provided env vars or system GPU info) and
only set HSA_OVERRIDE_GFX_VERSION="10.3.0" when the detected GPU matches known
RDNA2 gfx codes, otherwise skip setting the override and log a warning; update
the block that sets HSA_OVERRIDE_GFX_VERSION and MIOPEN_LOG_LEVEL to perform
these checks and avoid applying RDNA2-specific overrides on non-rocm backends or
non-RDNA2 GPUs.
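
A minimal sketch of the gating this comment asks for, assuming some helper has already detected the GPU's gfx architecture; the RDNA2 code list and the MIOPEN_LOG_LEVEL value are illustrative, not taken from the PR:

```python
import logging
import os

logger = logging.getLogger(__name__)

RDNA2_GFX_CODES = {"gfx1030", "gfx1031", "gfx1032", "gfx1034"}  # assumed set

def apply_rocm_env_overrides(detected_gfx: str | None) -> None:
    if os.environ.get("VOICEBOX_BACKEND_VARIANT") != "rocm":
        return  # never touch HSA/MIOpen variables on CPU or CUDA builds
    os.environ.setdefault("MIOPEN_LOG_LEVEL", "3")  # value is illustrative
    if detected_gfx in RDNA2_GFX_CODES:
        # The 10.3.0 override only makes sense for RDNA2 parts.
        os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")
    else:
        logger.warning("Not applying RDNA2 override for gfx arch %r", detected_gfx)
```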
backend/build_binary.py (1)

455-481: ⚠️ Potential issue | 🟠 Major

Add ROCm torch detection to CPU build and implement ROCm restoration.

The CPU build torch detection at line 460 only checks torch.version.cuda, missing ROCm Torch. A Windows venv with only ROCm Torch installed will skip the CPU-wheel swap and bundle ROCm DLLs into the CPU binary. Additionally, the finally block (lines 541-580) lacks a restoration path for ROCm Torch.

Required fixes:

  1. Extend line 460 to check both torch.version.cuda and torch.version.hip
  2. Set restore_torch = "rocm" if only HIP Torch is detected
  3. Add elif restore_torch == "rocm": handler in the finally block to reinstall ROCm packages (or set to "cpu" if CPU swap was intended as the fallback for non-CUDA GPU variants)

Also applies to: 541-581

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/build_binary.py` around lines 455 - 481, The CPU-build detection only
checks torch.version.cuda; update the detection in the subprocess.run snippet
that sets has_cuda_torch to instead inspect both torch.version.cuda and
torch.version.hip (e.g., treat hip non-empty as ROCm), and set restore_torch =
"rocm" when only HIP is present; then extend the finally block that currently
handles restore_torch == "cuda" to also handle restore_torch == "rocm" by
reinstalling the ROCm-specific torch packages (or restoring to the intended CPU
fallback) using the same pip subprocess pattern used for CUDA restoration so
ROCm-only venvs are properly restored.
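
A sketch of the dual CUDA/HIP detection described above; the probe mirrors the subprocess pattern the comment references, but the exact strings and the surrounding build logic are assumptions:

```python
import subprocess
import sys

def detect_torch_variant() -> str:
    """Classify the venv's installed torch as 'cuda', 'rocm', or 'cpu'."""
    probe = (
        "import torch; "
        "print('cuda' if torch.version.cuda "
        "else 'rocm' if getattr(torch.version, 'hip', None) else 'cpu')"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe], capture_output=True, text=True
    )
    return result.stdout.strip() or "cpu"  # treat a failed probe as CPU

# The finally block can then branch on this value to reinstall the matching
# wheels (restore_torch == "cuda" or "rocm") after the build finishes.
restore_torch = detect_torch_variant()
```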
🧹 Nitpick comments (8)
backend/utils/platform_detect.py (1)

54-62: Fallback may misreport on multi-GPU/hybrid systems.

torch.cuda.get_device_name(0) only inspects device 0. On a laptop with integrated AMD Radeon graphics + discrete NVIDIA, device 0 under a CUDA-built torch will be the NVIDIA GPU and this returns False even though an AMD adapter is present; conversely, under a ROCm-built torch the integrated AMD iGPU can shadow a more capable discrete card. If the intent is "is ROCm appropriate?", consider iterating range(torch.cuda.device_count()) and matching any device, or documenting that this function reflects "the torch-visible device 0" rather than "the primary system GPU" as the docstring claims.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/utils/platform_detect.py` around lines 54 - 62, The current check
uses torch.cuda.get_device_name(0) and thus only inspects device 0, which can
misreport on multi-GPU/hybrid systems; update the logic in platform_detect.py
(the torch import block using torch.cuda.is_available and
torch.cuda.get_device_name) to iterate over all visible CUDA devices via
torch.cuda.device_count() and call torch.cuda.get_device_name(i) for each i,
returning True if any device name contains "Radeon" or "AMD", or alternatively
adjust the docstring to explicitly state the function only reflects the
torch-visible device 0 rather than the primary system GPU.
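
A sketch of the any-device fallback proposed here, under the assumption that AMD adapters surface through the torch.cuda namespace (as they do on ROCm builds):

```python
def any_amd_device_visible() -> bool:
    """Return True if any torch-visible device is an AMD/Radeon adapter."""
    try:
        import torch
        if not torch.cuda.is_available():
            return False
        return any(
            vendor in torch.cuda.get_device_name(i)
            for i in range(torch.cuda.device_count())
            for vendor in ("AMD", "Radeon")
        )
    except Exception:
        return False
```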
backend/routes/health.py (1)

104-117: LGTM — consistent ROCm labeling.

hasattr(torch.version, "hip") and torch.version.hip matches the detection pattern already in backend/app.py::_get_gpu_status and backend/backends/base.py::check_cuda_compatibility, so the gpu_type label stays aligned across the three call sites. Since check_cuda_compatibility() now short-circuits to (True, None) on ROCm, gpu_compatibility_warning will correctly be absent for AMD builds.

Optional nit: the ROCm-vs-CUDA label computation now exists in three places (this file, app.py, and implicitly base). A tiny shared helper (e.g., get_cuda_flavor_name() returning "ROCm" or "CUDA") would eliminate drift risk if the detection pattern ever tightens.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/health.py` around lines 104 - 117, Duplicate ROCm-vs-CUDA
detection logic appears in backend/routes/health.py (gpu_type assignment),
backend/app.py::_get_gpu_status, and
backend/backends/base.py::check_cuda_compatibility; extract that logic into a
shared helper (e.g., get_cuda_flavor_name() returning "ROCm" or "CUDA" or None)
placed in a common module, then replace the inline checks in those three
locations to call the helper (use its return to build the gpu_type string where
needed) so labeling stays consistent and avoids drift.
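
A minimal version of the suggested helper; the review does not specify which module it should live in, so treat the placement as an assumption:

```python
import torch

def get_cuda_flavor_name() -> str | None:
    """Return "ROCm" on HIP builds, "CUDA" on NVIDIA builds, else None."""
    if not torch.cuda.is_available():
        return None
    return "ROCm" if getattr(torch.version, "hip", None) else "CUDA"
```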
backend/tests/test_amd_gpu_detect.py (1)

12-12: Unused pytest import.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_amd_gpu_detect.py` at line 12, The file imports pytest but
never uses it; remove the unused import statement (the top-level "import
pytest") from backend/tests/test_amd_gpu_detect.py or, if you intended to use
pytest fixtures or marks, reference them in the test functions (e.g., add a
pytest.mark or fixture parameter) so the import is actually used; update the
import accordingly to eliminate the linter/test warning.
backend/tests/test_rocm_backends.py (2)

13-13: Unused pytest import.

pytest is imported but never referenced (no markers/fixtures used). Safe to drop.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_backends.py` at line 13, Remove the unused pytest
import from the test module; locate the top-level import statement "import
pytest" in backend/tests/test_rocm_backends.py and delete it since no pytest
markers, fixtures, or functions reference it.

27-35: torch.version.hip patch relies on attribute existing.

On non-ROCm torch builds torch.version.hip exists and is None, so patch(..., "6.2.41133") works. If a future CUDA-only wheel omits the attribute, this test will fail with AttributeError from mock. Consider adding create=True (or patching via patch.object(torch.version, "hip", ...)) for robustness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_backends.py` around lines 27 - 35, The test
test_rocm_skips_compute_check patches torch.version.hip in a way that can raise
AttributeError if the attribute is missing; update the patch to create the
attribute when absent (e.g., use patch("torch.version.hip", "6.2.41133",
create=True) or patch.object(torch.version, "hip", "6.2.41133")) so the test
reliably exercises check_cuda_compatibility() on nonstandard torch builds
without failing due to missing attributes.
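
For reference, the patch form the comment mentions; create=True fabricates the attribute when it is absent, so the test survives a torch build that omits it:

```python
from unittest.mock import patch

with patch("torch.version.hip", "6.2.41133", create=True):
    ...  # exercise check_cuda_compatibility() here
```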
backend/server.py (1)

267-276: Substring match on binary name is fragile.

"rocm" in binary_name / "cuda" in binary_name will match any install path-derived basename that incidentally contains those substrings (e.g., a user-renamed binary, or future variants like voicebox-server-rocmprof.exe). Prefer an explicit suffix/regex check against the known set of variant names, e.g.:

-        binary_name = os.path.basename(sys.executable).lower()
-        if "rocm" in binary_name:
+        binary_name = os.path.basename(sys.executable).lower()
+        # Match the documented variant suffixes only.
+        if re.search(r"voicebox-server-rocm(\.exe)?$", binary_name):
             os.environ["VOICEBOX_BACKEND_VARIANT"] = "rocm"
             logger.info("Backend variant: ROCm")
-        elif "cuda" in binary_name:
+        elif re.search(r"voicebox-server-cuda(\.exe)?$", binary_name):
             os.environ["VOICEBOX_BACKEND_VARIANT"] = "cuda"

Also worth noting: because is_rocm_active() and the 409 Cannot delete while active guard in backend/routes/rocm.py rely exclusively on this env var, a misdetection here turns into a user-facing deletion lock bug.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 267 - 276, The current detection uses fragile
substring checks on binary_name and sets VOICEBOX_BACKEND_VARIANT, which can
misdetect and break is_rocm_active()/backend/routes/rocm.py deletion guards;
replace the substring logic in the block that computes binary_name to perform an
explicit match (e.g., compare the basename against a whitelist of allowed
variants or use a strict regex that matches whole-word/suffix like
^(rocm|cuda|cpu)$) and only set os.environ["VOICEBOX_BACKEND_VARIANT"] when the
match is exact; ensure the code documents/uses the same canonical variant
strings ("rocm","cuda","cpu") so is_rocm_active() and the ROCm route rely on the
exact env var value.
backend/routes/rocm.py (1)

43-44: Use rocm.PROGRESS_KEY instead of duplicating the progress key.

The route already depends on services.rocm; using the constant avoids breaking download responses or SSE subscriptions if the service key changes.

Proposed refactor
 @router.post("/backend/download-rocm")
 async def download_rocm_backend():
     """Download the ROCm backend binary."""
     from ..services import rocm
@@
     create_background_task(_download())
-    return {"message": "ROCm backend download started", "progress_key": "rocm-backend"}
+    return {"message": "ROCm backend download started", "progress_key": rocm.PROGRESS_KEY}
@@
 @router.get("/backend/rocm-progress")
 async def get_rocm_download_progress():
     """Get ROCm backend download progress via Server-Sent Events."""
+    from ..services import rocm
+
     progress_manager = get_progress_manager()
 
     async def event_generator():
-        async for event in progress_manager.subscribe("rocm-backend"):
+        async for event in progress_manager.subscribe(rocm.PROGRESS_KEY):
             yield event

Also applies to: 65-72

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/routes/rocm.py` around lines 43 - 44, The route is returning the
literal progress key "rocm-backend" instead of using the service constant;
replace the hard-coded string with the service's constant (use rocm.PROGRESS_KEY
from services.rocm) in both places where the route returns {"message": ...,
"progress_key": "rocm-backend"} (the block that calls
create_background_task(_download()) and the similar block around lines 65-72) so
responses and SSE subscriptions use the canonical key; ensure you reference the
existing rocm import/variable used by the module (e.g., services.rocm or the
local rocm name) when substituting the value.
app/src/i18n/locales/en/translation.json (1)

686-753: Consider symmetry between CUDA and ROCm key groups.

The CUDA backend uses flat keys (settings.gpu.download, settings.gpu.remove) while ROCm uses suffixed keys (settings.gpu.downloadRocm, settings.gpu.removeRocm). Similarly, switchToCpu description hardcodes "You can re-download CUDA later." which is now inaccurate when the user is switching off ROCm.

Consider either:

  • Renaming CUDA keys to downloadCuda/removeCuda for consistency, or
  • Making switchToCpu.description backend-agnostic (e.g., "You can re-download the GPU backend later.") so it reads correctly regardless of whether CUDA or ROCm is active.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/i18n/locales/en/translation.json` around lines 686 - 753, The
localization keys are inconsistent between CUDA and ROCm and the switchToCpu
description is backend-specific; update either by renaming the CUDA keys
"settings.gpu.download" and "settings.gpu.remove" to "settings.gpu.downloadCuda"
and "settings.gpu.removeCuda" to match
"settings.gpu.downloadRocm"/"settings.gpu.removeRocm", or (simpler) make
"settings.gpu.switchToCpu.description" backend-agnostic by replacing "You can
re-download CUDA later." with a neutral string like "You can re-download the GPU
backend later."; after changing keys, update all code/UI references that use
"download"/"remove" or the old description to the new keys/phrase so
translations remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/components/ServerSettings/GpuAcceleration.test.tsx`:
- Around line 165-172: The test replaces the global EventSource via (globalThis
as any).EventSource = vi.fn(() => mockEventSource) which leaks to other tests;
restore the original global after the SSE test by using
vi.stubGlobal('EventSource', ...) or save the original EventSource and restore
it in an afterEach/afterAll hook (referencing the mockEventSource and the direct
assignment to globalThis.EventSource in the test), or call vi.unstubAllGlobals
in afterEach to ensure the EventSource mock is removed.

In `@app/src/components/ServerTab/GpuPage.tsx`:
- Around line 548-552: The new hard-coded strings in the conditional render
around isCurrentlyCuda/isCurrentlyRocm and platform.metadata.isTauri (the
SettingSection title and description) must be moved to i18n; replace the
literals with calls to the translation function used elsewhere (e.g., t('...'))
and add matching keys to the locale resource files. Update the SettingSection
usage in GpuPage (the block referencing isCurrentlyCuda, isCurrentlyRocm,
platform.metadata.isTauri) to use i18n keys like 'gpu.backendActiveTitle.cuda',
'gpu.backendActiveTitle.rocm' and 'gpu.backendActiveDescription', and add those
keys and translations to the project's localization files.
- Around line 438-448: The “Switch to CUDA” button only restarts but doesn't
explicitly set CUDA as the target backend, so when both ROCm and CUDA are
downloaded the launcher may still choose ROCm; before calling handleRestart
(from the SettingRow action), persist an explicit backend override (e.g., call
the existing preference/update function to set preferred backend to "cuda" or
set an env/launcher override) so the launcher will choose CUDA on next start,
and additionally gate or adjust the UI when ROCm is present (check
rocmDownloaded/rocmAvailable) to either show a confirmation or disable the
button unless the override is applied; update the logic around cudaAvailable,
isCurrentlyCuda and platform.metadata.isTauri to perform the override prior to
restart.

In `@backend/pyi_rth_rocm_sdk.py`:
- Around line 35-38: The KeyError raised when indexing _dist_info.ALL_LIBRARIES
with shortname should be suppressed so we raise a clean ModuleNotFoundError;
modify the except block around the lookup of lib_entry (the try/except using
_dist_info.ALL_LIBRARIES[shortname]) to re-raise ModuleNotFoundError(f"Unknown
rocm library '{shortname}'") from None, ensuring the original KeyError context
is removed.
- Around line 48-52: The current except block swallows ModuleNotFoundError for
transitive imports; change it to "except ModuleNotFoundError as e:" and only
treat the backend package as missing when the exception refers to the requested
module (check e.name == py_package_name or compare the missing module string to
py_package_name); if it matches, add package.logical_name to missing_extras and
continue, otherwise re-raise the exception so transitive dependency errors are
not misattributed.
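
A sketch combining both suggestions; all_libraries, shortname, and py_package_name are hypothetical stand-ins for the hook's internals (_dist_info.ALL_LIBRARIES and friends in pyi_rth_rocm_sdk.py):

```python
import importlib

def resolve_backend_package(all_libraries: dict, shortname: str, py_package_name: str):
    try:
        lib_entry = all_libraries[shortname]
    except KeyError:
        # "from None" drops the KeyError context so callers see a clean error.
        raise ModuleNotFoundError(f"Unknown rocm library '{shortname}'") from None
    try:
        return lib_entry, importlib.import_module(py_package_name)
    except ModuleNotFoundError as e:
        if e.name == py_package_name:
            return lib_entry, None  # the requested package itself is missing
        raise  # a transitive dependency failed; don't misattribute it
```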

In `@backend/requirements-rocm.txt`:
- Around line 4-7: Update requirements-rocm.txt to pin ROCm-matched wheels and
use the Windows AMD wheel index: replace the loose `torch>=2.2.0`, `torchaudio`,
`torchvision` entries with exact ROCm pins `torch==2.9.1+rocm7.2.1`,
`torchaudio==2.9.1+rocm7.2.1`, `torchvision==0.24.1+rocm7.2.1` and swap the
current `--extra-index-url https://download.pytorch.org/whl/rocm7.2` with the
AMD Radeon Windows ROCm index `--extra-index-url
https://repo.radeon.com/rocm/windows/rocm-rel-7.2.1/` so Windows `win_amd64`
wheels are available for the test_rocm_requirements.py WMI hardware detection
path.
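
Assembled verbatim from the pins in the comment above, the resulting requirements-rocm.txt would read:

```text
--extra-index-url https://repo.radeon.com/rocm/windows/rocm-rel-7.2.1/
torch==2.9.1+rocm7.2.1
torchaudio==2.9.1+rocm7.2.1
torchvision==0.24.1+rocm7.2.1
```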

In `@backend/services/rocm.py`:
- Around line 288-290: The libs archive URL is incorrectly tied to the app
release tag via base_url and will break if ROCM_LIBS_VERSION isn't re-uploaded
under that tag; update the download logic so libs_archive (and its fetch) uses a
dedicated libs_base_url or falls back to a URL containing ROCM_LIBS_VERSION
(instead of using base_url/version), e.g. construct a separate libs_url derived
from ROCM_LIBS_VERSION and use that for downloading
rocm-libs-{ROCM_LIBS_VERSION}.tar.gz while keeping server_archive fetched from
the original base_url/version; modify the code paths that reference
base_url/version so only server_archive uses it and functions/methods that
download libs reference the new libs URL/fallback.
- Around line 292-348: Refactor the download/extract flow to use a temporary
staging directory instead of writing directly into rocm_dir: create a temp dir
(e.g., via tempfile.TemporaryDirectory or pathlib + uuid) and pass that staging
path as dest_dir to _download_and_extract_archive for both server and libs;
after both downloads, write the manifest to staging using
get_rocm_libs_manifest_path() semantics (but targeting the staging path), make
the server exe executable in staging using get_rocm_exe_name(), and finally
atomically replace the active rocm_dir with the staging dir (shutil.replace or
os.replace) so the swap only happens on complete success; ensure exceptions
trigger cleanup of the staging dir and that progress.mark_complete(PROGRESS_KEY)
and logger.info(...) run only after the atomic swap.
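
A sketch of the staging flow this comment describes; download_into stands in for the service's download/extract helpers, and the same-filesystem rename assumption is what makes the final swap effectively atomic:

```python
import shutil
import tempfile
from pathlib import Path

def install_rocm_atomically(rocm_dir: Path, download_into) -> None:
    """Stage the download in a temp dir and only swap it in on full success."""
    # Staging lives next to rocm_dir so the rename stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix="rocm-staging-", dir=rocm_dir.parent))
    backup = rocm_dir.parent / (rocm_dir.name + ".old")
    try:
        download_into(staging)               # extract server + libs archives here
        shutil.rmtree(backup, ignore_errors=True)
        if rocm_dir.exists():
            rocm_dir.rename(backup)          # keep the old install until the swap lands
        staging.rename(rocm_dir)             # same-filesystem rename: effectively atomic
        shutil.rmtree(backup, ignore_errors=True)
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)  # never leave a half-written dir
        raise
```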

In `@backend/tests/test_amd_gpu_detect.py`:
- Around line 86-89: The test test_torch_not_installed is not reliably forcing
an ImportError because setting sys.modules["torch"]=None can be bypassed if
torch is already imported; update the test to either patch builtins.__import__
to raise ImportError when name == "torch" (e.g., patch("builtins.__import__",
side_effect=import_side_effect)) or explicitly patch the module-level reference
used by is_amd_gpu_windows (patch "backend.utils.platform_detect.torch" to raise
or be None) so that the is_amd_gpu_windows function cannot resolve a torch
module and will deterministically exercise the import-failure path.
- Around line 14-89: Update the module import and all patch targets to the
package-qualified path so tests resolve when run from repo root: change the
import "from utils.platform_detect import is_amd_gpu_windows" to "from
backend.utils.platform_detect import is_amd_gpu_windows" and update every `@patch`
target that currently begins with "utils.platform_detect." to
"backend.utils.platform_detect." (this applies to the subprocess.run patches and
any other patched attributes referencing utils.platform_detect).

In `@backend/tests/test_rocm_build.py`:
- Around line 79-92: The ROCm E2E test
TestRocmBuildE2E::test_rocm_binary_compiles_and_runs assumes a Windows-only
PyInstaller output by hard-coding "voicebox-server-rocm.exe"; update the test to
either skip unless running on Windows or compute the executable name from
sys.platform. Specifically, add a platform guard using
pytest.mark.skipif(sys.platform != "win32", reason=...) on the TestRocmBuildE2E
class or test method, or replace the hard-coded binary_exe construction in
test_rocm_binary_compiles_and_runs with logic like exe_name =
"voicebox-server-rocm.exe" if sys.platform == "win32" else
"voicebox-server-rocm" and use exe_name to build binary_exe so the test only
runs/looks for the correct artifact per platform.

In `@backend/tests/test_rocm_download.py`:
- Around line 139-146: The two dictionary keys in the responses mapping use
unnecessary f-strings (no interpolation) causing Ruff F541; replace the
f-strings with plain string literals for the keys that reference the ROCm
release URLs (the entries creating FakeResponse with content=fake_tar_gz and
content=f"{server_sha}  voicebox-server-rocm.tar.gz\n".encode()). Update the
keys in the responses dict (and leave the FakeResponse, fake_tar_gz, server_sha
usages unchanged) so the URL strings are regular quoted strings rather than
f"..." templates.
- Around line 103-112: The nested async context manager class _Streamer uses
parameter name "s" for its methods which triggers N805; rename the receiver to
"self" in the methods __aenter__, __aexit__, and aiter_bytes within class
_Streamer so method signatures become async def __aenter__(self), async def
__aexit__(self, *args), and async def aiter_bytes(self, chunk_size: int = 1024)
and update any internal references accordingly to satisfy the linter.

In `@backend/tests/test_rocm_requirements.py`:
- Around line 54-60: The test_requirements_file_content currently searches the
whole file string (variable content) so "torch" can be matched inside
"torchaudio" or "torchvision"; update the assertion logic to parse non-comment
requirement lines from req_file (ignore lines starting with # and blank lines)
or use anchored regexes to match package names at the start of a requirement
line (e.g., ^torch(?:[=<>!~].*)?$) and similarly for torchaudio and torchvision,
then assert each package appears as a standalone requirement in the parsed lines
rather than as a substring.
- Around line 62-122: The test_rocm_torch_installs_and_detects_amd currently
always runs and performs heavy network/ROCm installs; gate it behind an explicit
integration opt-in by skipping unless an env var or pytest marker is set. Update
the test to check for a flag like os.environ.get("RUN_ROCM_INTEGRATION") == "1"
or use pytest.config.getoption/pytestmark for a custom --run-integration option,
and call pytest.skip("skipping ROCm integration test") when not enabled;
reference the test function name test_rocm_torch_installs_and_detects_amd, the
has_amd check, and req_file to locate where to add the guard and early skip so
the venv creation and pip installs only run when explicitly opted-in.

In `@backend/utils/platform_detect.py`:
- Around line 35-51: The PowerShell call in platform_detect.py can hang and can
briefly show a console window; update the subprocess.run invocation used to
probe Win32_VideoController to (1) add a timeout parameter (e.g., 1–3 seconds)
and handle subprocess.TimeoutExpired so the function returns False on timeout,
(2) include "-NoProfile" in the PowerShell args to speed startup, and (3) pass
Windows-specific flags to suppress a console window (use
subprocess.CREATE_NO_WINDOW or the STARTUPINFO/STARTF_USESHOWWINDOW/ SW_HIDE
pattern) so no window flashes; ensure the existing result parsing
(int(result.stdout.strip())) remains guarded by the new exception handling and
that other exceptions still return False.

In `@tauri/src-tauri/src/main.rs`:
- Around line 260-296: The ROCm version probe currently uses Command::output()
which can block; change the probe around exe_path in main.rs to spawn the child
via
std::process::Command::new(&exe_path).arg("--version").current_dir(&rocm_dir).spawn(),
then wait for completion with a bounded timeout (e.g. use child.wait_timeout
from the wait-timeout crate or implement a thread + join with Duration). If the
wait times out, kill the child (child.kill()/child.wait()), log a timeout
message and set version_ok = false; if the child exits in time, collect its
stdout/stderr (child.wait_with_output or read pipes) and perform the same
version string parsing and comparison against app.config().version to set
version_ok. Ensure all error paths (spawn error, timeout, kill failure)
consistently fall back to CPU.
- Around line 421-435: The current logic assigns spawn_result from the ROCm
branch so an immediate spawn() error blocks fallback; refactor launching into an
ordered candidate loop (ROCm → CUDA → CPU) that checks backend_override (skip
GPU candidates when backend_override.as_deref() == Some("cpu")), attempts to
build and spawn each candidate using the same setup steps (the
app.shell().command(...) flow, args including "--data-dir", "--port",
"--parent-pid", optional "--host" when is_remote, and VOICEBOX_MODELS_DIR from
effective_models_dir), and on Err from spawn() continue to the next candidate
instead of returning early, only returning/propagating an error if all
candidates fail; use the existing symbols rocm_binary, cuda_binary,
backend_override, spawn_result, app.shell().command, and effective_models_dir to
locate and implement the change.

In `@tauri/src/platform/lifecycle.ts`:
- Around line 55-60: The setBackendOverride method currently swallows errors
from invoke('set_backend_override') so callers never see failures; modify async
setBackendOverride(backend?: string | null) to propagate the error (after
optionally logging) by re-throwing the caught error or removing the try/catch so
the rejected promise surfaces to callers, ensuring callers of setBackendOverride
can abort and surface the failure.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1ee16310-cf77-4b25-8f09-af8393f21fb0

📥 Commits

Reviewing files that changed from the base of the PR and between ed2eec5 and 50e0b26.

📒 Files selected for processing (30)
  • app/src/components/ServerSettings/GpuAcceleration.test.tsx
  • app/src/components/ServerSettings/GpuAcceleration.tsx
  • app/src/components/ServerTab/GpuPage.tsx
  • app/src/i18n/locales/en/translation.json
  • app/src/lib/api/client.ts
  • app/src/lib/api/models/ModelStatus.ts
  • app/src/lib/api/types.ts
  • app/src/platform/types.ts
  • app/src/test/setup.ts
  • app/vite.config.ts
  • backend/app.py
  • backend/backends/base.py
  • backend/backends/hume_backend.py
  • backend/backends/qwen_custom_voice_backend.py
  • backend/build_binary.py
  • backend/pyi_rth_rocm_sdk.py
  • backend/requirements-rocm.txt
  • backend/routes/__init__.py
  • backend/routes/health.py
  • backend/routes/rocm.py
  • backend/server.py
  • backend/services/rocm.py
  • backend/tests/test_amd_gpu_detect.py
  • backend/tests/test_rocm_backends.py
  • backend/tests/test_rocm_build.py
  • backend/tests/test_rocm_download.py
  • backend/tests/test_rocm_requirements.py
  • backend/utils/platform_detect.py
  • tauri/src-tauri/src/main.rs
  • tauri/src/platform/lifecycle.ts

Comment on lines +35 to +51
try:
    result = subprocess.run(
        [
            "powershell",
            "-Command",
            "Get-CimInstance Win32_VideoController | "
            "Where-Object {$_.AdapterCompatibility -like '*AMD*'} | "
            "Measure-Object | Select-Object -ExpandProperty Count",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    if int(result.stdout.strip()) > 0:
        return True
except Exception:
    pass

⚠️ Potential issue | 🟡 Minor

Add a timeout and suppress the console window on the PowerShell call.

Two practical issues with the WMI probe in a GUI/Tauri context:

  1. No timeout=. If PowerShell or WMI hangs (domain environments, corrupt CIM repository, AV hooks), subprocess.run blocks indefinitely. The broad except won't help because nothing is raised.
  2. Console window flash on Windows. When the Tauri-launched server process invokes powershell.exe without CREATE_NO_WINDOW (or STARTF_USESHOWWINDOW / SW_HIDE), a console window can briefly pop up — visible to end users on the exact platform this PR targets.

Also worth considering: PowerShell cold start is slow (~0.5–2s). If this is ever called outside tests (e.g., on a startup/health path), prefer wmic or a cached result.

🛠️ Proposed fix
     # Primary method: WMI query for AMD adapters
     try:
+        creationflags = 0
+        if hasattr(subprocess, "CREATE_NO_WINDOW"):
+            creationflags = subprocess.CREATE_NO_WINDOW  # Windows: suppress console flash
         result = subprocess.run(
             [
                 "powershell",
+                "-NoProfile",
                 "-Command",
                 "Get-CimInstance Win32_VideoController | "
                 "Where-Object {$_.AdapterCompatibility -like '*AMD*'} | "
                 "Measure-Object | Select-Object -ExpandProperty Count",
             ],
             capture_output=True,
             text=True,
             check=True,
+            timeout=10,
+            creationflags=creationflags,
         )
         if int(result.stdout.strip()) > 0:
             return True
     except Exception:
         pass

-NoProfile additionally shaves startup time by skipping user/system PowerShell profiles.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/utils/platform_detect.py` around lines 35 - 51, The PowerShell call
in platform_detect.py can hang and can briefly show a console window; update the
subprocess.run invocation used to probe Win32_VideoController to (1) add a
timeout parameter (e.g., 1–3 seconds) and handle subprocess.TimeoutExpired so
the function returns False on timeout, (2) include "-NoProfile" in the
PowerShell args to speed startup, and (3) pass Windows-specific flags to
suppress a console window (use subprocess.CREATE_NO_WINDOW or the
STARTUPINFO/STARTF_USESHOWWINDOW/ SW_HIDE pattern) so no window flashes; ensure
the existing result parsing (int(result.stdout.strip())) remains guarded by the
new exception handling and that other exceptions still return False.

@JuanCMPDev force-pushed the feature/rocm-windows-support branch from 50e0b26 to a9ffccc on April 22, 2026 at 21:14

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
app/vite.config.ts (1)

4-18: ⚠️ Potential issue | 🔴 Critical

Add missing vitest and jsdom devDependencies.

The code imports defineConfig from 'vitest/config' and configures the test environment as 'jsdom', but neither vitest nor jsdom are declared in app/package.json. These must be added to devDependencies for the tests to run.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/vite.config.ts` around lines 4 - 18, The project config (defineConfig in
app/vite.config.ts) uses vitest and sets the test environment to 'jsdom' but
package.json is missing those devDependencies; update app/package.json
devDependencies to include "vitest" and "jsdom" (add compatible versions or
match the repo's toolchain), run npm/yarn install, and ensure any test setup
referencing setupFiles remains intact so vitest can run with the jsdom
environment.
backend/server.py (1)

263-277: ⚠️ Potential issue | 🟠 Major

Set VOICEBOX_BACKEND_VARIANT before importing backend.main.

Line 263 detects the backend after from backend.main import app has already run, so import-time ROCm setup in backend/app.py sees no "rocm" variant and skips HSA_OVERRIDE_GFX_VERSION / MIOPEN_LOG_LEVEL.

Proposed direction
+def _set_backend_variant_from_executable():
+    import re
+
+    binary_name = os.path.basename(sys.executable).lower()
+    if re.search(r"voicebox-server-rocm(\.exe)?$", binary_name):
+        os.environ["VOICEBOX_BACKEND_VARIANT"] = "rocm"
+        return "ROCm"
+    if re.search(r"voicebox-server-cuda(\.exe)?$", binary_name):
+        os.environ["VOICEBOX_BACKEND_VARIANT"] = "cuda"
+        return "CUDA"
+    os.environ["VOICEBOX_BACKEND_VARIANT"] = "cpu"
+    return "CPU"
+
+# Must run before importing backend.main/app.
+_backend_variant_label = _set_backend_variant_from_executable()
+logger.info(f"Backend variant: {_backend_variant_label}")
+
 try:
     logger.info("Importing argparse...")

Then remove the later duplicate detection block in __main__.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/server.py` around lines 263 - 277, Move the backend-variant detection
so VOICEBOX_BACKEND_VARIANT is set before backend.main is imported: extract the
detection logic (the code that inspects sys.executable and sets
os.environ["VOICEBOX_BACKEND_VARIANT"]) into a small helper or place it at the
top of backend/server.py and ensure it runs before the line "from backend.main
import app" so backend/app.py can see "rocm" and apply HSA_OVERRIDE_GFX_VERSION
/ MIOPEN_LOG_LEVEL at import time; then remove the later duplicate detection
block in __main__ to avoid redundant/late detection.
♻️ Duplicate comments (9)
backend/tests/test_rocm_download.py (2)

103-112: ⚠️ Potential issue | 🟡 Minor

Use self for the nested async context manager receiver.

Ruff flags these method receivers as N805; rename s to self.

Proposed fix
         class _Streamer:
-            async def __aenter__(s):
+            async def __aenter__(self):
                 return resp
 
-            async def __aexit__(s, *args):
+            async def __aexit__(self, *args):
                 return False
 
-            async def aiter_bytes(s, chunk_size: int = 1024):
+            async def aiter_bytes(self, chunk_size: int = 1024):
                 for i in range(0, len(resp.content), chunk_size):
                     yield resp.content[i : i + chunk_size]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_download.py` around lines 103 - 112, Rename the
nested async context manager receiver from `s` to `self` in class `_Streamer`:
update the method signatures for `__aenter__`, `__aexit__`, and `aiter_bytes` to
use `self` instead of `s` so they follow the expected instance naming convention
and satisfy the N805 lint rule.

139-146: ⚠️ Potential issue | 🟡 Minor

Remove the placeholder-free f-strings.

Lines 140 and 144 have no interpolation and will trigger Ruff F541.

Proposed fix
-        f"https://github.com/jamiepine/voicebox/releases/download/v0.2.3/voicebox-server-rocm.tar.gz": FakeResponse(
+        "https://github.com/jamiepine/voicebox/releases/download/v0.2.3/voicebox-server-rocm.tar.gz": FakeResponse(
             content=fake_tar_gz,
             headers={"content-length": str(len(fake_tar_gz))},
         ),
-        f"https://github.com/jamiepine/voicebox/releases/download/v0.2.3/voicebox-server-rocm.tar.gz.sha256": FakeResponse(
+        "https://github.com/jamiepine/voicebox/releases/download/v0.2.3/voicebox-server-rocm.tar.gz.sha256": FakeResponse(
             content=f"{server_sha}  voicebox-server-rocm.tar.gz\n".encode(),
         ),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_download.py` around lines 139 - 146, The tests use
unnecessary f-strings for static URLs which triggers Ruff F541; in the responses
mapping (the dict assigning to responses) replace the placeholder-free f-strings
for the keys pointing to the ROCm tarball and its .sha256 (the entries that
currently construct URLs with f".../voicebox-server-rocm.tar.gz" and
f".../voicebox-server-rocm.tar.gz.sha256") with plain string literals, keeping
the same content and preserving the FakeResponse(...) values that reference
fake_tar_gz and server_sha.
backend/tests/test_rocm_requirements.py (2)

54-60: ⚠️ Potential issue | 🟡 Minor

Make these assertions package-aware.

"torch" in content can be satisfied by torchaudio or torchvision; parse requirement names or use anchored matches.

Proposed fix
     def test_requirements_file_content(self, backend_dir):
         req_file = backend_dir / "requirements-rocm.txt"
         content = req_file.read_text()
+        requirement_lines = {
+            line.strip().split("==", 1)[0].split(">=", 1)[0].split("~=", 1)[0]
+            for line in content.splitlines()
+            if line.strip() and not line.lstrip().startswith(("#", "--"))
+        }
         assert "rocm7.2" in content, "Must point to ROCm 7.2 extra index"
-        assert "torch" in content, "Must include torch"
-        assert "torchaudio" in content, "Must include torchaudio"
-        assert "torchvision" in content, "Must include torchvision"
+        assert "torch" in requirement_lines, "Must include torch"
+        assert "torchaudio" in requirement_lines, "Must include torchaudio"
+        assert "torchvision" in requirement_lines, "Must include torchvision"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_requirements.py` around lines 54 - 60, The assertions
in test_requirements_file_content are too broad because checks like "torch" in
content can match "torchaudio" or "torchvision"; update the test to parse the
requirements file (req_file/content) into individual package names (split by
lines, strip comments/whitespace and version specifiers/extras) or use anchored
regex matches (e.g., line-start/name boundary) to assert presence of exact
packages "torch", "torchaudio", and "torchvision" so each assertion targets the
canonical package name rather than substring matches.

62-122: ⚠️ Potential issue | 🟠 Major

Gate this heavyweight ROCm install test behind an explicit opt-in.

This creates a venv, upgrades pip, and installs ROCm wheels from the network during normal test runs, which makes CI platform/network dependent.

Proposed fix
 import platform
+import os
 import subprocess
 import sys
@@
     @pytest.mark.timeout(900)
     def test_rocm_torch_installs_and_detects_amd(self, backend_dir):
         """
         Create a temporary venv, install requirements-rocm.txt, and verify
         torch.cuda.is_available() returns True on AMD hardware.
         """
+        if os.environ.get("RUN_ROCM_INSTALL_TEST") != "1":
+            pytest.skip("Set RUN_ROCM_INSTALL_TEST=1 to run ROCm install integration test")
+
         req_file = backend_dir / "requirements-rocm.txt"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_requirements.py` around lines 62 - 122, Gate the
heavyweight test test_rocm_torch_installs_and_detects_amd behind an explicit
opt-in: at the top of the test (in backend/tests/test_rocm_requirements.py)
check for an env var like RUN_ROCM_TESTS or a pytest marker and call pytest.skip
with a clear message if not set; alternatively decorate the function with
@pytest.mark.skipif(os.getenv("RUN_ROCM_TESTS") is None, reason="heavy ROCm
install test; opt-in via RUN_ROCM_TESTS=1 or -m rocm") and document that CI must
set RUN_ROCM_TESTS=1 or use -m rocm to run this test. Ensure you import os and
pytest and update any CI configs to set the opt-in when desired.
tauri/src-tauri/src/main.rs (2)

421-435: ⚠️ Potential issue | 🟠 Major

Fall back to CUDA/CPU when ROCm spawn fails immediately.

If the ROCm binary exists but cmd.spawn() returns Err, startup fails instead of trying the available CUDA or CPU backend.

Proposed direction

Refactor launch into an ordered candidate loop: ROCm → CUDA → CPU, skipping GPU candidates when backend_override == Some("cpu"), and continue to the next candidate on immediate spawn() errors.

-            cmd.spawn()
+            match cmd.spawn() {
+                Ok(result) => Ok(result),
+                Err(e) => {
+                    eprintln!("Failed to spawn ROCm backend, trying fallback: {}", e);
+                    // continue with CUDA, then CPU
+                    Err(e)
+                }
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tauri/src-tauri/src/main.rs` around lines 421 - 435, The ROCm launch block
(checking backend_override, rocm_binary, building cmd and calling cmd.spawn())
treats an immediate cmd.spawn() Err as fatal and prevents trying cuda_binary or
CPU; change the logic so spawn errors for a candidate backend are handled by
logging/returning the Err locally and continuing to the next candidate (CUDA
then CPU) rather than failing the whole startup. Refactor the launch into an
ordered candidate loop (or sequential attempts) that respects backend_override
== Some("cpu") to skip GPU candidates, attempts ROCm via rocm_binary then CUDA
via cuda_binary then CPU, and on each candidate wraps cmd.spawn() so Err results
do not short-circuit but cause moving to the next candidate while
preserving/aggregating errors for a final failure; ensure environment handling
(effective_models_dir, is_remote), cwd (rocm_path.parent()), and cmd
construction using app.shell().command(...) remain per-candidate.

272-276: ⚠️ Potential issue | 🟠 Major

Add a timeout around the ROCm --version probe.

A corrupted or incompatible ROCm binary can hang this synchronous .output() call and block start_server before CUDA/CPU fallback.

Proposed direction

Spawn the probe process with piped output, wait with a small bounded timeout, kill it on expiry, and treat timeout as version_ok = false.

-            let version_ok = match std::process::Command::new(&exe_path)
-                .arg("--version")
-                .current_dir(&rocm_dir)
-                .output()
-            {
+            let version_ok = match run_version_probe_with_timeout(
+                &exe_path,
+                &rocm_dir,
+                std::time::Duration::from_secs(5),
+            ) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tauri/src-tauri/src/main.rs` around lines 272 - 276, The synchronous
Command::output() call used to set version_ok
(Command::new(&exe_path).arg("--version").current_dir(&rocm_dir).output()) can
hang; replace it by spawning the process with piped stdout/stderr (use .spawn()
and Stdio::piped()), wait for completion with a bounded timeout (e.g., a few
seconds) and if the timeout expires call child.kill() and treat the probe as
failed (version_ok = false); on successful completion collect the output and set
version_ok based on the exit status/output. Target the code that defines
version_ok (the probe using exe_path and rocm_dir in start_server/main.rs) and
ensure proper cleanup of the child process and handling of errors/timeouts.
backend/tests/test_rocm_build.py (1)

79-92: ⚠️ Potential issue | 🟠 Major

Gate the ROCm E2E build test to Windows.

This still hard-codes voicebox-server-rocm.exe, so non-Windows slow test runs will fail even before validating ROCm packaging.

🐛 Proposed fix
-@pytest.mark.slow()
+@pytest.mark.slow
+@pytest.mark.skipif(sys.platform != "win32", reason="ROCm E2E build uses Windows ROCm wheels/output")
 class TestRocmBuildE2E:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_build.py` around lines 79 - 92, The test
TestRocmBuildE2E::test_rocm_binary_compiles_and_runs hard-codes a Windows
executable name (binary_exe = ... "voicebox-server-rocm.exe") which will fail on
non-Windows runners; update the test to gate this E2E to Windows or compute the
executable name based on platform (use sys.platform or os.name) so binary_exe is
"voicebox-server-rocm.exe" on Windows and "voicebox-server-rocm" elsewhere,
and/or skip the test when not running on Windows (e.g., pytest.skip or
pytest.mark.skipif) so the ROCm packaging check only runs where the .exe name is
valid, modifying the variables binary_exe and any related assertions in
test_rocm_binary_compiles_and_runs accordingly.
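
For illustration, a platform-aware variant of the binary-name lookup (the helper name and the dist_dir parameter are hypothetical, not from the PR):

import sys
from pathlib import Path

def expected_rocm_binary(dist_dir: Path) -> Path:
    # PyInstaller appends .exe only on Windows builds.
    name = "voicebox-server-rocm.exe" if sys.platform == "win32" else "voicebox-server-rocm"
    return dist_dir / name
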
backend/services/rocm.py (2)

293-312: ⚠️ Potential issue | 🟠 Major

Stage libs-only updates too.

The server update path is staged, but when only need_libs is true this still extracts directly into the active backend. A failed libs extraction can corrupt the installed ROCm runtime while get_rocm_binary_path() still reports the backend as available.

🐛 Proposed fix
-    # When re-downloading the server core, extract into a staging directory and
+    # Extract updates into a staging directory and
     # atomically rename it over rocm_dir on success.  This prevents a failed
     # mid-extraction from leaving rocm_dir in a partially-installed state that
     # passes the get_rocm_binary_path() existence check.
-    #
-    # If only the libs need updating (server already installed), we skip staging
-    # and extract directly so we don't have to copy the existing server files.
-    use_staging = need_server
+    use_staging = need_server or need_libs
     staging_dir = get_backends_dir() / "rocm-staging"
 
     if use_staging:
         if staging_dir.exists():
             shutil.rmtree(staging_dir)
         staging_dir.mkdir(parents=True, exist_ok=True)
-        # Preserve already-cached libs so we don't force a re-download
-        if not need_libs and rocm_dir.exists():
+        # Preserve the existing backend; updated archives overwrite staged files.
+        if rocm_dir.exists():
             shutil.copytree(rocm_dir, staging_dir, dirs_exist_ok=True)
         extract_dir = staging_dir

Also applies to: 353-383

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/rocm.py` around lines 293 - 312, The staging logic should
also run when only libraries are being updated: change the use_staging
assignment to use_staging = need_server or need_libs so libs-only updates
extract into the staging_dir, and when preparing the staging_dir always copy the
existing rocm_dir into staging (i.e., replace the current conditional
shutil.copytree(rocm_dir, staging_dir, dirs_exist_ok=True) which checks not
need_libs with a plain rocm_dir.exists() check) so existing server files are
preserved in staging; make the same change for the second occurrence around the
block that mirrors lines 353-383 (references: need_server, need_libs,
use_staging, staging_dir, get_backends_dir(), rocm_dir, get_rocm_binary_path())
so a failed libs extraction cannot corrupt the active backend.

289-291: ⚠️ Potential issue | 🔴 Critical

Define libs_base_url before using it to download ROCm libraries.

libs_base_url is never assigned, so when ROCm libs need to be downloaded, lines 356-357 raise NameError. This blocks any fresh ROCm installation or update requiring libraries.

🐛 Proposed fix
     base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    libs_base_url = f"{GITHUB_RELEASES_URL}/{ROCM_LIBS_VERSION}"
     server_archive = "voicebox-server-rocm.tar.gz"
     libs_archive = f"rocm-libs-{ROCM_LIBS_VERSION}.tar.gz"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/rocm.py` around lines 289 - 291, The code sets base_url,
server_archive and libs_archive but never defines libs_base_url, causing a
NameError when attempting to download ROCm libs; add an assignment for
libs_base_url (e.g., derive it from GITHUB_RELEASES_URL and version or reuse
base_url) near where base_url/server_archive/libs_archive are declared so the
later download calls that reference libs_base_url succeed; update the same block
that defines base_url/server_archive/libs_archive to include libs_base_url.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/components/ServerTab/GpuPage.tsx`:
- Around line 320-338: There are duplicate handler declarations for
handleSwitchToCuda and handleSwitchToRocm; remove the earlier definitions (the
versions in the first block using platform.lifecycle.setBackendOverride and
restartServerWithPolling) and keep the later versions that include
setRestartPhase('stopping') before calling restartServerWithPolling; ensure only
one declaration of handleSwitchToCuda and one of handleSwitchToRocm remains and
that they call platform.lifecycle.setBackendOverride(...) and
restartServerWithPolling(...) as in the later implementations.
- Around line 344-345: The restart is happening even when setBackendOverride
fails because setBackendOverride in tauri/src/platform/lifecycle.ts currently
swallows errors; update setBackendOverride to rethrow any caught errors (or
remove the silent catch) so callers like GpuPage.tsx (where setBackendOverride
is invoked before restartServerWithPolling for CPU/CUDA/ROCm switches) only
proceed to await restartServerWithPolling when the backend override truly
succeeded; ensure the thrown error propagates to the GpuPage handlers so the
restart is gated on success and can show the failure to the user.

In `@backend/build_binary.py`:
- Around line 528-557: Add a fail-fast preflight that checks interpreter
implementation and version before the ROCm wheel install subprocess.run blocks:
verify sys.implementation.name == "cpython" and sys.version_info[:2] == (3, 12)
(or otherwise match the cp312 tag) and exit with a clear error if the check
fails; place this check immediately before the subprocess.run calls that install
the ROCm wheel URLs (the pip install lists containing
"torch-2.9.1%2Brocm7.2.1-cp312..." and the other rocm_* wheel/tar.gz URLs) so
the script aborts before partially modifying the environment when run under
non-CPython or a different minor Python version.

In `@backend/pyi_rth_rocm_sdk.py`:
- Around line 20-24: The current broad except hides all import errors; limit
suppression to only the absent package by replacing the single try/except
Exception around "import rocm_sdk" and "from rocm_sdk import _dist_info" with a
targeted import: first try "import rocm_sdk" and catch only ModuleNotFoundError
(or check e.name == 'rocm_sdk') and return, then perform "from rocm_sdk import
_dist_info" in a separate statement so any import failures for the submodule or
other exceptions propagate; ensure the code that patches find_libraries still
runs when rocm_sdk is present.
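
A minimal sketch of the targeted-import pattern described above, assuming the hook's actual patching body goes where the placeholder comment sits:

def _patch_rocm_sdk() -> None:
    try:
        import rocm_sdk  # noqa: F401
    except ModuleNotFoundError as e:
        # Suppress only a genuinely absent rocm_sdk (CPU/CUDA builds);
        # a missing dependency *of* rocm_sdk should still propagate.
        if e.name == "rocm_sdk":
            return
        raise
    # Outside the guard, so submodule import failures surface normally.
    from rocm_sdk import _dist_info  # noqa: F401
    # ...patch _dist_info.find_libraries here, as the runtime hook does.

_patch_rocm_sdk()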

In `@backend/routes/rocm.py`:
- Around line 32-35: The current check only treats "downloading" as in-progress
so a concurrent request can proceed during extraction; update the check around
progress_manager.get_progress(rocm.PROGRESS_KEY) (using get_progress_manager() /
progress_manager) to consider extraction as in-progress too—e.g., treat status
== "downloading" OR status == "extracting" (or generally any non-terminal
status) as a conflict and raise the HTTPException to prevent a second request
from starting.
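
A hedged sketch of the widened guard, assuming get_progress returns a dict-like record with a status field:

from fastapi import HTTPException

IN_PROGRESS_STATES = {"downloading", "extracting"}

def ensure_no_active_rocm_job(progress_manager, progress_key: str) -> None:
    progress = progress_manager.get_progress(progress_key)
    if progress and progress.get("status") in IN_PROGRESS_STATES:
        # Block concurrent requests during extraction as well as download.
        raise HTTPException(status_code=409, detail="ROCm download already in progress")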

---

Outside diff comments:
In `@app/vite.config.ts`:
- Around line 4-18: The project config (defineConfig in app/vite.config.ts) uses
vitest and sets the test environment to 'jsdom' but package.json is missing
those devDependencies; update app/package.json devDependencies to include
"vitest" and "jsdom" (add compatible versions or match the repo's toolchain),
run npm/yarn install, and ensure any test setup referencing setupFiles remains
intact so vitest can run with the jsdom environment.

In `@backend/server.py`:
- Around line 263-277: Move the backend-variant detection so
VOICEBOX_BACKEND_VARIANT is set before backend.main is imported: extract the
detection logic (the code that inspects sys.executable and sets
os.environ["VOICEBOX_BACKEND_VARIANT"]) into a small helper or place it at the
top of backend/server.py and ensure it runs before the line "from backend.main
import app" so backend/app.py can see "rocm" and apply HSA_OVERRIDE_GFX_VERSION
/ MIOPEN_LOG_LEVEL at import time; then remove the later duplicate detection
block in __main__ to avoid redundant/late detection.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: df166566-e448-4606-933b-dd1baa2232b1

📥 Commits

Reviewing files that changed from the base of the PR and between 50e0b26 and a9ffccc.

📒 Files selected for processing (30)
  • app/src/components/ServerSettings/GpuAcceleration.test.tsx
  • app/src/components/ServerSettings/GpuAcceleration.tsx
  • app/src/components/ServerTab/GpuPage.tsx
  • app/src/i18n/locales/en/translation.json
  • app/src/lib/api/client.ts
  • app/src/lib/api/models/ModelStatus.ts
  • app/src/lib/api/types.ts
  • app/src/platform/types.ts
  • app/src/test/setup.ts
  • app/vite.config.ts
  • backend/app.py
  • backend/backends/base.py
  • backend/backends/hume_backend.py
  • backend/backends/qwen_custom_voice_backend.py
  • backend/build_binary.py
  • backend/pyi_rth_rocm_sdk.py
  • backend/requirements-rocm.txt
  • backend/routes/__init__.py
  • backend/routes/health.py
  • backend/routes/rocm.py
  • backend/server.py
  • backend/services/rocm.py
  • backend/tests/test_amd_gpu_detect.py
  • backend/tests/test_rocm_backends.py
  • backend/tests/test_rocm_build.py
  • backend/tests/test_rocm_download.py
  • backend/tests/test_rocm_requirements.py
  • backend/utils/platform_detect.py
  • tauri/src-tauri/src/main.rs
  • tauri/src/platform/lifecycle.ts
✅ Files skipped from review due to trivial changes (5)
  • app/src/test/setup.ts
  • backend/routes/__init__.py
  • app/src/lib/api/models/ModelStatus.ts
  • app/src/lib/api/types.ts
  • app/src/lib/api/client.ts
🚧 Files skipped from review as they are similar to previous changes (10)
  • app/src/platform/types.ts
  • backend/requirements-rocm.txt
  • backend/backends/base.py
  • tauri/src/platform/lifecycle.ts
  • backend/app.py
  • backend/routes/health.py
  • backend/utils/platform_detect.py
  • app/src/components/ServerSettings/GpuAcceleration.test.tsx
  • app/src/i18n/locales/en/translation.json
  • app/src/components/ServerSettings/GpuAcceleration.tsx


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/backends/qwen_custom_voice_backend.py`:
- Around line 110-130: Import the helper force_offline_if_cached at the top of
the module and ensure you only enter the offline patch when the model is
actually cached: wrap the Qwen3TTSModel.from_pretrained calls with
force_offline_if_cached(is_cached, model_name) as already attempted, but first
add the missing import (from backend.utils.hf_offline_patch import
force_offline_if_cached) and adjust backend/utils/hf_offline_patch.py so its
patching logic is gated behind the context manager (i.e., do not apply global HF
offline constants on import; apply them only inside the force_offline_if_cached
context) so _get_model_path + Qwen3TTSModel.from_pretrained run online for
uncached models without raising NameError or blocking downloads.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cc776e73-0e93-4093-844a-dcf9e125781a

📥 Commits

Reviewing files that changed from the base of the PR and between a9ffccc and cc8c501.

📒 Files selected for processing (1)
  • backend/backends/qwen_custom_voice_backend.py

Comment on lines +110 to +130
+            from huggingface_hub import constants as hf_constants
+            tts_cache_dir = hf_constants.HF_HUB_CACHE

             model_path = self._get_model_path(model_size)
             logger.info("Loading Qwen CustomVoice %s on %s...", model_size, self.device)

-            if self.device == "cpu":
-                self.model = Qwen3TTSModel.from_pretrained(
-                    model_path,
-                    torch_dtype=torch.float32,
-                    low_cpu_mem_usage=False,
-                )
-            else:
-                self.model = Qwen3TTSModel.from_pretrained(
-                    model_path,
-                    device_map=self.device,
-                    torch_dtype=torch.bfloat16,
-                )
+            with force_offline_if_cached(is_cached, model_name):
+                if self.device == "cpu":
+                    self.model = Qwen3TTSModel.from_pretrained(
+                        model_path,
+                        cache_dir=tts_cache_dir,
+                        torch_dtype=torch.float32,
+                        low_cpu_mem_usage=False,
+                    )
+                else:
+                    self.model = Qwen3TTSModel.from_pretrained(
+                        model_path,
+                        cache_dir=tts_cache_dir,
+                        device_map=self.device,
+                        torch_dtype=torch.bfloat16,
+                    )

⚠️ Potential issue | 🔴 Critical

Import and gate force_offline_if_cached before wrapping model load.

Line 116 currently raises NameError because force_offline_if_cached is not imported. Also, the current implementation in backend/utils/hf_offline_patch.py patches offline constants before checking is_cached, so entering it for uncached models can still break online downloads.

🐛 Proposed fix
 import asyncio
+from contextlib import nullcontext
 import logging
 from typing import Optional
 
 import numpy as np
 import torch
 
+from ..utils.hf_offline_patch import force_offline_if_cached
 from . import TTSBackend, LANGUAGE_CODE_TO_NAME
 from .base import (
     is_model_cached,
     get_torch_device,
     combine_voice_prompts as _combine_voice_prompts,
@@
-            with force_offline_if_cached(is_cached, model_name):
+            offline_guard = (
+                force_offline_if_cached(True, model_name)
+                if is_cached
+                else nullcontext()
+            )
+
+            with offline_guard:
                 if self.device == "cpu":
                     self.model = Qwen3TTSModel.from_pretrained(
                         model_path,
🧰 Tools
🪛 Ruff (0.15.10)

[error] 116-116: Undefined name force_offline_if_cached

(F821)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/qwen_custom_voice_backend.py` around lines 110 - 130, Import
the helper force_offline_if_cached at the top of the module and ensure you only
enter the offline patch when the model is actually cached: wrap the
Qwen3TTSModel.from_pretrained calls with force_offline_if_cached(is_cached,
model_name) as already attempted, but first add the missing import (from
backend.utils.hf_offline_patch import force_offline_if_cached) and adjust
backend/utils/hf_offline_patch.py so its patching logic is gated behind the
context manager (i.e., do not apply global HF offline constants on import; apply
them only inside the force_offline_if_cached context) so _get_model_path +
Qwen3TTSModel.from_pretrained run online for uncached models without raising
NameError or blocking downloads.
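
For reference, a minimal sketch of a gated hf_offline_patch; it toggles the HF_HUB_OFFLINE environment variable for illustration, whereas the real module may patch huggingface_hub constants directly:

import os
from contextlib import contextmanager

@contextmanager
def force_offline_if_cached(is_cached: bool, model_name: str):
    # model_name is kept for parity with the call site; unused here.
    if not is_cached:
        # Uncached model: stay online so from_pretrained can download it.
        yield
        return
    previous = os.environ.get("HF_HUB_OFFLINE")
    os.environ["HF_HUB_OFFLINE"] = "1"  # resolve strictly from the local cache
    try:
        yield
    finally:
        # Restore the caller's environment even if loading raised.
        if previous is None:
            os.environ.pop("HF_HUB_OFFLINE", None)
        else:
            os.environ["HF_HUB_OFFLINE"] = previous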

@JuanCMPDev JuanCMPDev force-pushed the feature/rocm-windows-support branch from cc8c501 to bdf476f on April 22, 2026 at 21:50

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
backend/build_binary.py (1)

455-628: ⚠️ Potential issue | 🟠 Major

Restore torch even when the pre-build ROCm install fails.

The finally block starts only at Line 566, so failures during the ROCm pip installs at Lines 534 or 549 can leave the developer environment partially switched. Wrap the torch swap/install phase in the same try/finally, and add --no-deps to restore installs to preserve the NumPy protection this PR adds.

Proposed structure
     restore_torch = None
-    if not cuda and not rocm and platform.system() == "Windows":
-        import subprocess
+    try:
+        if not cuda and not rocm and platform.system() == "Windows":
+            import subprocess
 
-        cuda_result = subprocess.run(
+            cuda_result = subprocess.run(
             [sys.executable, "-c", "import torch; print(torch.version.cuda or '')"], capture_output=True, text=True
-        )
+            )
@@
-    if rocm and platform.system() == "Windows":
-        import subprocess
+        if rocm and platform.system() == "Windows":
+            import subprocess
@@
-    # Run PyInstaller
-    try:
         PyInstaller.__main__.run(args)
     finally:
         # Restore torch if we swapped it out (even on build failure)
@@
                     "--index-url",
                     "https://download.pytorch.org/whl/cu128",
                     "--force-reinstall",
+                    "--no-deps",
                     "-q",
@@
                     "--index-url",
                     "https://download.pytorch.org/whl/cpu",
                     "--force-reinstall",
+                    "--no-deps",
                     "-q",

Apply the same wrapping to the full CPU/ROCm swap blocks and keep --no-deps on all restore paths.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/build_binary.py` around lines 455 - 628, The restore logic currently
in the finally block (checking restore_torch and running pip to restore
CPU/CUDA/ROCm) does not protect against failures during the pre-build torch
swap/install steps (the subprocess.run pip installs inside the Windows CPU/ROCm
blocks and the ROCm torch install sequence), so move/extend the try/finally to
wrap the entire torch-swap/install phase (the blocks that set restore_torch and
call subprocess.run to install CPU or ROCm wheels) so that the finally always
executes even if those pip installs fail; ensure restore_torch is determined
before any environment-overwriting installs (as already done in the ROCm block),
and when performing the restore pip installs in the finally (the branches that
install torch/torchvision/torchaudio for "cpu","cuda","rocm") include the
--no-deps flag on ROCm/restore installs where applicable (and keep existing
--index-url for CPU/CUDA restores) so NumPy protection is preserved; keep
PyInstaller.__main__.run(args) inside the try so finally still runs after the
build attempt.
app/src/components/ServerTab/GpuPage.tsx (1)

320-355: ⚠️ Potential issue | 🟠 Major

restartPhase stays on 'stopping' when setBackendOverride rejects.

All three switch handlers call setRestartPhase('stopping') before await platform.lifecycle.setBackendOverride(...). If the override call throws (now that it rethrows per the earlier review fix), control jumps to the catch which sets an error and refetches status but never resets restartPhase. The page is then stuck rendering the "Stopping server…" spinner with no way out until navigation/remount. restartServerWithPolling's own catch block resets the phase, so only the pre-restart failure window is affected.

Affects handleSwitchToCpu (lines 320-331), handleSwitchToCuda (333-343), and handleSwitchToRocm (345-355).

🐛 Proposed fix (apply to all three handlers)
   const handleSwitchToCpu = async () => {
     setError(null);
     setRestartPhase('stopping');
     try {
       await platform.lifecycle.setBackendOverride('cpu');
       await restartServerWithPolling(t('settings.gpu.errors.switchCpu'));
     } catch (e: unknown) {
+      setRestartPhase('idle');
       setError(e instanceof Error ? e.message : t('settings.gpu.errors.switchCpu'));
       refetchCudaStatus();
       refetchRocmStatus();
     }
   };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/ServerTab/GpuPage.tsx` around lines 320 - 355, The
handlers handleSwitchToCpu, handleSwitchToCuda and handleSwitchToRocm set
restartPhase('stopping') before awaiting platform.lifecycle.setBackendOverride,
so if setBackendOverride throws the catch never resets restartPhase and the UI
remains stuck; fix by moving setRestartPhase('stopping') to after a successful
await platform.lifecycle.setBackendOverride (i.e., just before calling
restartServerWithPolling) or alternately wrap the setBackendOverride call in its
own try/catch that resets restartPhase on error — apply this change to
handleSwitchToCpu, handleSwitchToCuda and handleSwitchToRocm and keep their
existing error/refetch behavior.
♻️ Duplicate comments (4)
backend/backends/qwen_custom_voice_backend.py (1)

108-130: ⚠️ Potential issue | 🔴 Critical

Missing import: force_offline_if_cached will raise NameError at load time.

force_offline_if_cached is referenced on line 116 but never imported in this module, so _load_model_sync will fail with NameError the first time a model load is attempted (Ruff F821 confirms). Add the import from ..utils.hf_offline_patch at the top of the file.

🐛 Proposed fix
 from . import TTSBackend, LANGUAGE_CODE_TO_NAME
 from .base import (
     is_model_cached,
     get_torch_device,
     combine_voice_prompts as _combine_voice_prompts,
     model_load_progress,
 )
+from ..utils.hf_offline_patch import force_offline_if_cached
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/qwen_custom_voice_backend.py` around lines 108 - 130, The
module references force_offline_if_cached inside the model loading block (used
in _load_model_sync / the Qwen3TTSModel loading logic) but never imports it; add
an import for force_offline_if_cached from ..utils.hf_offline_patch at the top
of the file so the call in the with force_offline_if_cached(is_cached,
model_name): block resolves and avoids a NameError at runtime.
backend/build_binary.py (1)

509-513: ⚠️ Potential issue | 🟡 Minor

Also reject non-CPython 3.12 for the cp312 ROCm wheels.

The hard-coded wheels are cp312-cp312-win_amd64; checking only sys.version_info still allows non-CPython 3.12 interpreters to reach the pip install path.

Proposed guard
-        if sys.version_info[:2] != (3, 12):
+        if sys.implementation.name != "cpython" or sys.version_info[:2] != (3, 12):
             raise RuntimeError(
-                f"ROCm wheels are cp312-specific; got Python {sys.version_info[:2]}. "
-                "Use Python 3.12 to build the ROCm binary."
+                "ROCm wheels are cp312-cp312-specific; "
+                f"got {sys.implementation.name} {sys.version.split()[0]}. "
+                "Use CPython 3.12 to build the ROCm binary."
             )

Verification:

#!/bin/bash
python - <<'PY'
import sys
print("implementation:", sys.implementation.name)
print("version:", sys.version.split()[0])
print("expected:", "cpython 3.12.x")
PY
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/build_binary.py` around lines 509 - 513, The current guard only
checks sys.version_info and allows non-CPython interpreters; update the check
around the existing sys.version_info[:2] comparison to also verify
sys.implementation.name == "cpython" (or equivalent) and raise the same
RuntimeError when either the Python major/minor isn't 3.12 or the implementation
is not CPython so that only CPython 3.12 interpreters proceed to build/install
the cp312 ROCm wheels; locate and modify the block that currently inspects
sys.version_info[:2] in build_binary.py to enforce both conditions.
backend/services/rocm.py (2)

369-373: ⚠️ Potential issue | 🟠 Major

Avoid deleting the working ROCm backend before the replacement is installed.

shutil.rmtree(rocm_dir) followed by staging_dir.rename(rocm_dir) is not atomic; if the rename fails after deletion, the previous working backend is gone. Rename the current dir to a backup first, then roll back on failure.

Proposed safer swap
         # Atomic swap: replace rocm_dir with the fully-extracted staging dir
         if use_staging:
+            backup_dir = get_backends_dir() / "rocm-backup"
+            if backup_dir.exists():
+                shutil.rmtree(backup_dir)
             if rocm_dir.exists():
-                shutil.rmtree(rocm_dir)
-            staging_dir.rename(rocm_dir)
+                rocm_dir.rename(backup_dir)
+            try:
+                staging_dir.rename(rocm_dir)
+            except Exception:
+                if backup_dir.exists() and not rocm_dir.exists():
+                    backup_dir.rename(rocm_dir)
+                raise
+            else:
+                if backup_dir.exists():
+                    shutil.rmtree(backup_dir)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/rocm.py` around lines 369 - 373, The current non-atomic swap
deletes rocm_dir then renames staging_dir, risking total loss if rename fails;
instead, when use_staging is true perform an atomic-safe swap by first renaming
existing rocm_dir to a temporary backup name (e.g., rocm_dir.with_suffix(".bak")
or similar), then rename staging_dir to rocm_dir, and on any failure attempt to
rename the backup back to rocm_dir; finally remove the backup only after
successful rename; update the block referencing use_staging, rocm_dir,
staging_dir to implement this rename-to-backup -> rename-staging ->
cleanup-or-rollback sequence and avoid using shutil.rmtree before a successful
replacement.

289-358: ⚠️ Potential issue | 🟠 Major

Keep ROCm libs downloads tied to ROCM_LIBS_VERSION, not the app release tag.

rocm-libs-{ROCM_LIBS_VERSION}.tar.gz is still fetched from {GITHUB_RELEASES_URL}/{version}. A new app release without re-uploading the same libs archive will break fresh ROCm installs.

Proposed fix
-    base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    server_base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    libs_base_url = f"{GITHUB_RELEASES_URL}/{ROCM_LIBS_VERSION}"
     server_archive = "voicebox-server-rocm.tar.gz"
     libs_archive = f"rocm-libs-{ROCM_LIBS_VERSION}.tar.gz"
@@
-                    head = await client.head(f"{base_url}/{server_archive}")
+                    head = await client.head(f"{server_base_url}/{server_archive}")
@@
-                    head = await client.head(f"{base_url}/{libs_archive}")
+                    head = await client.head(f"{libs_base_url}/{libs_archive}")
@@
-                    url=f"{base_url}/{server_archive}",
-                    sha256_url=f"{base_url}/{server_archive}.sha256",
+                    url=f"{server_base_url}/{server_archive}",
+                    sha256_url=f"{server_base_url}/{server_archive}.sha256",
@@
-                    url=f"{base_url}/{libs_archive}",
-                    sha256_url=f"{base_url}/{libs_archive}.sha256",
+                    url=f"{libs_base_url}/{libs_archive}",
+                    sha256_url=f"{libs_base_url}/{libs_archive}.sha256",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/rocm.py` around lines 289 - 358, The code builds base_url
from the app release `version` and uses it for both server and libs downloads,
which can break ROCm installs when libs are versioned separately; change the
logic so `rocm-libs-{ROCM_LIBS_VERSION}.tar.gz` is downloaded from a
libs-specific release URL (e.g. build a separate libs_base_url using
`ROCM_LIBS_VERSION`) and use that URL for the HEAD and download calls related to
the libs archive while keeping server downloads on the original `base_url`;
update references in the download/extract sequence (look for `base_url`,
`libs_archive`, the head/download calls and the call to
`_download_and_extract_archive` guarded by `need_libs`) so libs use the new
`libs_base_url` and server continues to use `base_url`.
🧹 Nitpick comments (3)
app/src/components/ServerTab/GpuPage.tsx (1)

206-238: ROCm SSE effect may miss the initial progress events.

The effect only opens the EventSource when rocmDownloading is already true, which depends on the 10-second rocm-status poll picking up the download. When the user clicks "Download", the POST returns immediately but rocmDownloading won't flip to true until either the next poll or the explicit refetchRocmStatus() in handleDownloadRocm (line 300) completes. During that window, early progress events are lost and the progress bar can appear to start mid-download. Same pattern applies to the CUDA SSE effect.

Consider triggering the SSE subscription right after kicking off the download (e.g. via a dedicated isStreaming state flag set in the download handler) rather than gating it solely on the server-reported status.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/ServerTab/GpuPage.tsx` around lines 206 - 238, The SSE
subscription in the useEffect for ROCm (the EventSource created when
rocmDownloading && serverUrl) can miss early progress events; modify the
download flow by adding a dedicated isStreaming boolean state (e.g.,
rocmStreaming) that you set to true in handleDownloadRocm immediately after the
POST returns, and change the useEffect to open/close the EventSource based on
rocmStreaming && serverUrl (and still clear it when progress is complete/error
and set rocmStreaming to false); apply the same pattern to the CUDA SSE effect
so streaming starts as soon as the user initiates the download rather than
waiting for the server-status poll or refetchRocmStatus.
backend/services/rocm.py (1)

226-230: Remove the obsolete Python-version branch and always use the safe tar filter.

Ruff (UP036) confirms this version guard is outdated for the project's minimum Python version, so the fallback can be removed and extraction can consistently use filter="data".

Proposed cleanup
         )
         with tarfile.open(temp_path, "r:gz") as tar:
-            if sys.version_info >= (3, 12):
-                tar.extractall(path=dest_dir, filter="data")
-            else:
-                tar.extractall(path=dest_dir)
+            tar.extractall(path=dest_dir, filter="data")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/rocm.py` around lines 226 - 230, Remove the obsolete
Python-version branch in the tar extraction block and always call tar.extractall
with the safe filter: replace the if/else around sys.version_info in the with
tarfile.open(temp_path, "r:gz") as tar: block so it always uses
tar.extractall(path=dest_dir, filter="data"); also audit and remove the
now-unused sys import if present (references: temp_path, dest_dir,
tar.extractall).
backend/tests/test_rocm_build.py (1)

25-32: Remove empty parentheses from fixture decorator and consolidate nested context managers.

Ruff identifies two lint issues:

  • PT001: @pytest.fixture() should be @pytest.fixture (line 25)
  • SIM117: Combine three nested with statements into a single statement with multiple contexts (lines 28-30)
Proposed cleanup
-    @pytest.fixture()
+    @pytest.fixture
     def captured_args(self):
         """Run build_server(rocm=True) with mocked PyInstaller and return args."""
-        with patch("build_binary.PyInstaller.__main__.run") as mock_run:
-            with patch("build_binary.platform.system", return_value="Linux"):
-                with patch("build_binary.os.chdir"):
-                    build_server(rocm=True)
+        with (
+            patch("build_binary.PyInstaller.__main__.run") as mock_run,
+            patch("build_binary.platform.system", return_value="Linux"),
+            patch("build_binary.os.chdir"),
+        ):
+            build_server(rocm=True)
         return mock_run.call_args[0][0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_build.py` around lines 25 - 32, Fix the fixture
decorator and collapse nested context managers: change the `@pytest.fixture()` on
the captured_args fixture to `@pytest.fixture`, and combine the three nested
with() patches into a single with statement that patches
"build_binary.PyInstaller.__main__.run" as mock_run,
"build_binary.platform.system" returning "Linux", and "build_binary.os.chdir" in
one line; inside that single context call build_server(rocm=True) and then
return mock_run.call_args[0][0]. Ensure you reference the captured_args fixture,
the build_server call, and the three patch targets when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/settings.local.json:
- Around line 1-7: The .claude/settings.local.json file with the "permissions":
{ "allow": [...] } entry is a machine-specific local settings file and should
not be included in the PR; remove it from the commit and ensure it is ignored
going forward by deleting the file from the repo index (while keeping it
locally) and adding .claude/settings.local.json to .gitignore (or updating an
existing ignore rule for the .claude directory) so the "permissions.allow" entry
is not tracked.

In `@app/src/components/ServerSettings/GpuAcceleration.test.tsx`:
- Around line 64-66: The test teardown currently calls vi.restoreAllMocks() in
the afterEach block but does not undo globals stubbed with vi.stubGlobal, so
stubbed EventSource persists across tests; update the afterEach in
GpuAcceleration.test.tsx to call vi.unstubAllGlobals() (in addition to
vi.restoreAllMocks()) to ensure vi.stubGlobal('EventSource', ...) is torn down
between tests and does not leak into later tests or workers.

In `@backend/app.py`:
- Around line 38-45: Change the ROCm shim and MIOpen logging to opt-in: stop
unconditionally setting HSA_OVERRIDE_GFX_VERSION and MIOPEN_LOG_LEVEL when
VOICEBOX_BACKEND_VARIANT == "rocm"; instead only set
HSA_OVERRIDE_GFX_VERSION="10.3.0" if running on Linux and an explicit opt-in env
(e.g., VOICEBOX_ROCM_FORCE_GFX1030=="1") is present, and only set
MIOPEN_LOG_LEVEL (or any non-default value) if a separate opt-in env like
VOICEBOX_ROCM_ENABLE_MIOPEN_LOG=="1" is present; keep the existing
VOICEBOX_BACKEND_VARIANT check and allow users to override these env vars
externally.
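
A sketch of that opt-in gating (the VOICEBOX_ROCM_* flag names come from this comment, and the MIOpen level shown is only an example):

import os
import platform

if os.environ.get("VOICEBOX_BACKEND_VARIANT") == "rocm":
    # Force the gfx1030 shim only when explicitly requested on Linux,
    # and never clobber a value the user already exported.
    if platform.system() == "Linux" and os.environ.get("VOICEBOX_ROCM_FORCE_GFX1030") == "1":
        os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")
    if os.environ.get("VOICEBOX_ROCM_ENABLE_MIOPEN_LOG") == "1":
        os.environ.setdefault("MIOPEN_LOG_LEVEL", "5")  # illustrative level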

In `@backend/routes/rocm.py`:
- Around line 29-35: The current early 409 check using
rocm.get_rocm_binary_path() prevents triggering download_rocm_binary() for
upgrades; remove that conditional branch so the route only blocks when a
progress entry exists (use progress_manager.get_progress with rocm.PROGRESS_KEY
and its "downloading"/"extracting" states) and allow download_rocm_binary() to
decide via its internal _needs_server_download / _needs_rocm_libs_download logic
whether work is required; in short, delete the get_rocm_binary_path() is not
None 409 path and rely solely on the existing progress-state check to prevent
concurrent downloads.

In `@backend/server.py`:
- Around line 50-60: The current logic in server.py imports re locally and
always assigns VOICEBOX_BACKEND_VARIANT based on sys.executable, which clobbers
explicit overrides; move the re import to the module top-level (so Ruff E402 is
satisfied) and change the fallback assignment to use
os.environ.setdefault("VOICEBOX_BACKEND_VARIANT", "cpu") instead of
unconditional os.environ[...] = "cpu", keeping the existing checks for
"rocm"/"cuda" but only setting "cpu" if the env var is not already present;
remove the local del of _re/_binary_name and keep _binary_name as a transient
variable scoped to the block.
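
Sketched out, with the detection heuristic assumed from the binary names used elsewhere in this PR:

import os
import re
import sys

def _detect_backend_variant() -> None:
    # Must run before `from backend.main import app` so app.py sees it.
    binary_name = os.path.basename(sys.executable).lower()
    if re.search(r"rocm", binary_name):
        os.environ["VOICEBOX_BACKEND_VARIANT"] = "rocm"
    elif re.search(r"cuda", binary_name):
        os.environ["VOICEBOX_BACKEND_VARIANT"] = "cuda"
    else:
        # Fallback only; an explicit external override wins.
        os.environ.setdefault("VOICEBOX_BACKEND_VARIANT", "cpu")

_detect_backend_variant()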

In `@backend/services/rocm.py`:
- Around line 410-438: Add a guard in check_and_update_rocm_binary to skip
auto-update when the ROCm backend is active: after resolving rocm_path (and
before calling _needs_server_download/_needs_rocm_libs_download and before
awaiting download_rocm_binary), call an existing or new helper like
is_rocm_active() (or implement one that detects the running ROCm process/locks
on Windows) and if it returns True, log that update is deferred and return early
so download_rocm_binary() is not invoked from inside the active ROCm process;
keep the rest of the logic (need_server/need_libs checks) unchanged so updates
can be performed later from the CPU backend flow.
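
A sketch of the deferral guard; is_rocm_active() is the hypothetical helper the comment proposes, stubbed here via the variant env var:

import logging
import os

logger = logging.getLogger(__name__)

def is_rocm_active() -> bool:
    # "Active" means this process *is* the ROCm backend, whose binaries
    # Windows keeps locked while it runs.
    return os.environ.get("VOICEBOX_BACKEND_VARIANT") == "rocm"

async def check_and_update_rocm_binary() -> None:
    if is_rocm_active():
        logger.info("ROCm backend is running; deferring self-update to the CPU flow.")
        return
    # ...existing need_server / need_libs checks and download_rocm_binary() here.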

In `@tauri/src-tauri/src/main.rs`:
- Around line 294-316: The current version check treats empty strings as valid
matches because app.config().version.unwrap_or_default() and
probe_binary_version(...).await.unwrap_or_default() both return "" on
missing/failure; update the logic in the ROCm branch around app_version,
binary_version and version_ok (and mirror the same change in the CUDA block) so
that you treat empty versions as invalid: compute app_version_opt =
app.config().version.clone() (Option) and binary_version_opt =
probe_binary_version(&exe_path, &rocm_dir).await (Option or Result) and only set
version_ok to true when both are present/non-empty and equal; otherwise log the
appropriate mismatch/failure (including error when available) and return None
instead of launching exe_path.
- Around line 740-750: The restart_server function currently clears the user's
backend override after a successful restart by setting
*state.backend_override.lock().unwrap() = None in restart_server; change this so
restart_server does not unconditionally clear state.backend_override—either
remove that line or only clear the override when an explicit "clear"
flag/command is passed from the UI (e.g., add a parameter to restart_server or a
separate clearBackendOverride handler), and ensure the UI (GpuAcceleration.tsx
setBackendOverride / handleRestart) is relied upon to clear the override when
the user explicitly requests it.
- Around line 104-122: probe_binary_version currently spawns a blocking
std::process::Command inside tokio::task::spawn_blocking which is not
cancel-safe and will leak threads/child processes when tokio::time::timeout
fires; replace the spawn_blocking + std::process::Command approach with
tokio::process::Command (used from within the async context of
probe_binary_version) and set .kill_on_drop(true) on the spawned child so that
when the timeout cancels/wraps the future the underlying child process is
terminated; keep the same semantics of capturing stdout, parsing the version
string, and retaining the timeout duration and error branches but move to using
tokio::process::Command and its async .output() call inside
probe_binary_version.

---

Outside diff comments:
In `@app/src/components/ServerTab/GpuPage.tsx`:
- Around line 320-355: The handlers handleSwitchToCpu, handleSwitchToCuda and
handleSwitchToRocm set restartPhase('stopping') before awaiting
platform.lifecycle.setBackendOverride, so if setBackendOverride throws the catch
never resets restartPhase and the UI remains stuck; fix by moving
setRestartPhase('stopping') to after a successful await
platform.lifecycle.setBackendOverride (i.e., just before calling
restartServerWithPolling) or alternately wrap the setBackendOverride call in its
own try/catch that resets restartPhase on error — apply this change to
handleSwitchToCpu, handleSwitchToCuda and handleSwitchToRocm and keep their
existing error/refetch behavior.

In `@backend/build_binary.py`:
- Around line 455-628: The restore logic currently in the finally block
(checking restore_torch and running pip to restore CPU/CUDA/ROCm) does not
protect against failures during the pre-build torch swap/install steps (the
subprocess.run pip installs inside the Windows CPU/ROCm blocks and the ROCm
torch install sequence), so move/extend the try/finally to wrap the entire
torch-swap/install phase (the blocks that set restore_torch and call
subprocess.run to install CPU or ROCm wheels) so that the finally always
executes even if those pip installs fail; ensure restore_torch is determined
before any environment-overwriting installs (as already done in the ROCm block),
and when performing the restore pip installs in the finally (the branches that
install torch/torchvision/torchaudio for "cpu","cuda","rocm") include the
--no-deps flag on ROCm/restore installs where applicable (and keep existing
--index-url for CPU/CUDA restores) so NumPy protection is preserved; keep
PyInstaller.__main__.run(args) inside the try so finally still runs after the
build attempt.

---

Duplicate comments:
In `@backend/backends/qwen_custom_voice_backend.py`:
- Around line 108-130: The module references force_offline_if_cached inside the
model loading block (used in _load_model_sync / the Qwen3TTSModel loading logic)
but never imports it; add an import for force_offline_if_cached from
..utils.hf_offline_patch at the top of the file so the call in the with
force_offline_if_cached(is_cached, model_name): block resolves and avoids a
NameError at runtime.

In `@backend/build_binary.py`:
- Around line 509-513: The current guard only checks sys.version_info and allows
non-CPython interpreters; update the check around the existing
sys.version_info[:2] comparison to also verify sys.implementation.name ==
"cpython" (or equivalent) and raise the same RuntimeError when either the Python
major/minor isn't 3.12 or the implementation is not CPython so that only CPython
3.12 interpreters proceed to build/install the cp312 ROCm wheels; locate and
modify the block that currently inspects sys.version_info[:2] in build_binary.py
to enforce both conditions.

In `@backend/services/rocm.py`:
- Around line 369-373: The current non-atomic swap deletes rocm_dir then renames
staging_dir, risking total loss if rename fails; instead, when use_staging is
true perform an atomic-safe swap by first renaming existing rocm_dir to a
temporary backup name (e.g., rocm_dir.with_suffix(".bak") or similar), then
rename staging_dir to rocm_dir, and on any failure attempt to rename the backup
back to rocm_dir; finally remove the backup only after successful rename; update
the block referencing use_staging, rocm_dir, staging_dir to implement this
rename-to-backup -> rename-staging -> cleanup-or-rollback sequence and avoid
using shutil.rmtree before a successful replacement.
- Around line 289-358: The code builds base_url from the app release `version`
and uses it for both server and libs downloads, which can break ROCm installs
when libs are versioned separately; change the logic so
`rocm-libs-{ROCM_LIBS_VERSION}.tar.gz` is downloaded from a libs-specific
release URL (e.g. build a separate libs_base_url using `ROCM_LIBS_VERSION`) and
use that URL for the HEAD and download calls related to the libs archive while
keeping server downloads on the original `base_url`; update references in the
download/extract sequence (look for `base_url`, `libs_archive`, the
head/download calls and the call to `_download_and_extract_archive` guarded by
`need_libs`) so libs use the new `libs_base_url` and server continues to use
`base_url`.

---

Nitpick comments:
In `@app/src/components/ServerTab/GpuPage.tsx`:
- Around line 206-238: The SSE subscription in the useEffect for ROCm (the
EventSource created when rocmDownloading && serverUrl) can miss early progress
events; modify the download flow by adding a dedicated isStreaming boolean state
(e.g., rocmStreaming) that you set to true in handleDownloadRocm immediately
after the POST returns, and change the useEffect to open/close the EventSource
based on rocmStreaming && serverUrl (and still clear it when progress is
complete/error and set rocmStreaming to false); apply the same pattern to the
CUDA SSE effect so streaming starts as soon as the user initiates the download
rather than waiting for the server-status poll or refetchRocmStatus.

In `@backend/services/rocm.py`:
- Around lines 226-230: Remove the obsolete Python-version branch in the tar
extraction block: inside the with tarfile.open(temp_path, "r:gz") as tar: block,
replace the if/else around sys.version_info with a single
tar.extractall(path=dest_dir, filter="data") call, and drop the sys import if it
is now unused.
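
A sketch of the simplified extraction; hard-coding the filter is safe here because the build already pins CPython 3.12, where tarfile's extraction-filter API always exists:

import tarfile
from pathlib import Path

def extract_archive(temp_path: Path, dest_dir: Path) -> None:
    with tarfile.open(temp_path, "r:gz") as tar:
        # "data" rejects absolute paths, parent-directory traversal,
        # and special files during extraction.
        tar.extractall(path=dest_dir, filter="data")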

In `@backend/tests/test_rocm_build.py`:
- Around lines 25-32: Change the @pytest.fixture() decorator on the
captured_args fixture to @pytest.fixture, and collapse the three nested with
patches into a single with statement that patches
"build_binary.PyInstaller.__main__.run" as mock_run,
"build_binary.platform.system" (returning "Linux"), and "build_binary.os.chdir".
Inside that single context, call build_server(rocm=True) and then return
mock_run.call_args[0][0].

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dc0c916c-ae3f-4c1e-a8b3-b298e124d8a7

📥 Commits

Reviewing files that changed from the base of the PR and between cc8c501 and acb5416.

📒 Files selected for processing (32)
  • .claude/settings.local.json
  • app/package.json
  • app/src/components/ServerSettings/GpuAcceleration.test.tsx
  • app/src/components/ServerSettings/GpuAcceleration.tsx
  • app/src/components/ServerTab/GpuPage.tsx
  • app/src/i18n/locales/en/translation.json
  • app/src/lib/api/client.ts
  • app/src/lib/api/models/ModelStatus.ts
  • app/src/lib/api/types.ts
  • app/src/platform/types.ts
  • app/src/test/setup.ts
  • app/vite.config.ts
  • backend/app.py
  • backend/backends/base.py
  • backend/backends/hume_backend.py
  • backend/backends/qwen_custom_voice_backend.py
  • backend/build_binary.py
  • backend/pyi_rth_rocm_sdk.py
  • backend/requirements-rocm.txt
  • backend/routes/__init__.py
  • backend/routes/health.py
  • backend/routes/rocm.py
  • backend/server.py
  • backend/services/rocm.py
  • backend/tests/test_amd_gpu_detect.py
  • backend/tests/test_rocm_backends.py
  • backend/tests/test_rocm_build.py
  • backend/tests/test_rocm_download.py
  • backend/tests/test_rocm_requirements.py
  • backend/utils/platform_detect.py
  • tauri/src-tauri/src/main.rs
  • tauri/src/platform/lifecycle.ts
✅ Files skipped from review due to trivial changes (7)
  • backend/routes/__init__.py
  • app/src/test/setup.ts
  • app/src/lib/api/models/ModelStatus.ts
  • backend/requirements-rocm.txt
  • app/src/lib/api/types.ts
  • app/src/lib/api/client.ts
  • app/src/components/ServerSettings/GpuAcceleration.tsx
🚧 Files skipped from review as they are similar to previous changes (5)
  • backend/routes/health.py
  • app/vite.config.ts
  • backend/utils/platform_detect.py
  • backend/pyi_rth_rocm_sdk.py
  • app/src/i18n/locales/en/translation.json

Comment thread .claude/settings.local.json Outdated
Comment on lines +1 to +7
{
  "permissions": {
    "allow": [
      "Bash(grep -E \"lock|yarn|pnpm\" echo \"---\" grep -n \"testing-library\\\\|vitest\\\\|jsdom\" /e/VoiceBox/voicebox/app/package.json)"
    ]
  }
}

⚠️ Potential issue | 🟡 Minor

Remove local Claude permissions from the PR.

This file is machine-specific (/e/VoiceBox/...) and grants local tool execution permissions. It should stay untracked, and .claude/settings.local.json should be ignored if contributors use it locally.

Suggested cleanup
-.claude/settings.local.json

If the repo does not already ignore local Claude settings:

+# Local AI assistant workspace settings
+.claude/settings.local.json
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/settings.local.json around lines 1 - 7, The
.claude/settings.local.json file with the "permissions": { "allow": [...] }
entry is a machine-specific local settings file and should not be included in
the PR; remove it from the commit and ensure it is ignored going forward by
deleting the file from the repo index (while keeping it locally) and adding
.claude/settings.local.json to .gitignore (or updating an existing ignore rule
for the .claude directory) so the "permissions.allow" entry is not tracked.

Comment thread app/src/components/ServerSettings/GpuAcceleration.test.tsx
Comment thread backend/app.py
Comment thread backend/routes/rocm.py Outdated
Comment thread backend/server.py Outdated
Comment thread backend/services/rocm.py
Comment thread tauri/src-tauri/src/main.rs
Comment thread tauri/src-tauri/src/main.rs
Comment thread tauri/src-tauri/src/main.rs Outdated
Implements native ROCm architecture for Windows.

- Adds backend build pipeline for voicebox-server-rocm.exe
- Detects AMD GPUs dynamically and routes PyTorch allocations
- Adds automatic download and update logic for ROCm dependencies
- Refactors UI in GpuPage.tsx and GpuAcceleration.tsx to add AMD flows
- Fixes 'Switch to CPU' lock on Windows via Tauri backend_override state
- Resolves PyInstaller/rocm_sdk UnboundLocalError silent crashes
- Resolves Numba/NumPy 2.x incompatibilities during Qwen3-TTS load
- Resolves HF_HUB_OFFLINE Catch-22 for CustomVoice processor caching
@JuanCMPDev JuanCMPDev force-pushed the feature/rocm-windows-support branch from acb5416 to 81133ba on April 22, 2026 at 23:34
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

♻️ Duplicate comments (2)
app/src/components/ServerSettings/GpuAcceleration.tsx (1)

202-226: ⚠️ Potential issue | 🟠 Major

Make GPU switches update the backend override before restarting.

After a user switches to CPU, setBackendOverride('cpu') persists, but both GPU buttons only call handleRestart(). That restart can keep launching CPU instead of the selected GPU backend. Also, when both GPU backends are installed, “Switch to CUDA Backend” is still ambiguous because the launcher prefers ROCm first.

Set/clear an explicit override for the selected backend before restarting, and make the Tauri launcher honor those values.

🛠️ Proposed direction
-  const handleRestart = async () => {
+  const handleRestart = async (backend?: 'cuda' | 'rocm') => {
     setError(null);
     setRestartPhase('stopping');

     try {
+      if (backend) {
+        await platform.lifecycle.setBackendOverride(backend);
+      } else {
+        await platform.lifecycle.setBackendOverride(null);
+      }
       setRestartPhase('waiting');
       startHealthPolling();
       await platform.lifecycle.restartServer();
-                      <Button onClick={handleRestart} className="w-full" size="sm">
+                      <Button onClick={() => handleRestart('cuda')} className="w-full" size="sm">
-                      <Button onClick={handleRestart} className="w-full" size="sm">
+                      <Button onClick={() => handleRestart('rocm')} className="w-full" size="sm">

Also applies to: 483-492, 565-574

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/ServerSettings/GpuAcceleration.tsx` around lines 202 -
226, The GPU switch handlers must set an explicit backend override before
restarting so the launcher uses the selected backend: update the CUDA and ROCm
button callbacks to call setBackendOverride('cuda') and
setBackendOverride('rocm') respectively (and ensure the CPU button calls
setBackendOverride('cpu') or the "Auto" button clears the override via
setBackendOverride(null or '') as appropriate) immediately before calling
handleRestart(); keep the existing restart flow (handleRestart,
startHealthPolling, healthPollRef cleanup, queryClient.invalidateQueries)
unchanged but ensure the Tauri launcher reads and honors the stored override
value when platform.lifecycle.restartServer() is invoked.
tauri/src-tauri/src/main.rs (1)

417-478: ⚠️ Potential issue | 🟠 Major

Honor explicit cuda/rocm overrides, not just cpu.

set_backend_override accepts arbitrary backend names, but start_server only special-cases "cpu"; "cuda" and "rocm" fall back to auto-detect order. That prevents the UI from reliably switching back from CPU to GPU, and still makes CUDA lose to ROCm when both are installed.

Validate and honor "cpu" | "cuda" | "rocm" in the launcher ordering.

🛠️ Proposed direction
-    let spawn_result = if backend_override.as_deref() != Some("cpu") {
+    let spawn_result = match backend_override.as_deref() {
+        Some("cpu") => {
+            println!("Backend override=cpu: using bundled CPU sidecar");
+            sidecar = sidecar.args(["--data-dir", &data_dir_str, "--port", &port_str, "--parent-pid", &parent_pid_str]);
+            if is_remote { sidecar = sidecar.args(["--host", "0.0.0.0"]); }
+            if let Some(ref dir) = effective_models_dir { sidecar = sidecar.env("VOICEBOX_MODELS_DIR", dir); }
+            sidecar.spawn()
+        }
+        Some("cuda") => {
+            // Try CUDA first, then CPU fallback.
+            // Keep ROCm out of this path so "Switch to CUDA" is deterministic.
+            /* existing CUDA spawn setup, else CPU */
+        }
+        Some("rocm") => {
+            // Try ROCm first, then CPU fallback.
+            /* existing ROCm spawn setup, else CPU */
+        }
+        _ => {
+            // Existing auto order: ROCm -> CUDA -> CPU.
+        }
-    } else {
-        // Override forces CPU — use bundled sidecar, GPU binary stays on disk
-        println!("Backend override=cpu: using bundled CPU sidecar");
-        ...
-        sidecar.spawn()
-    };
+    };

Also applies to: 759-763

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tauri/src-tauri/src/main.rs` around lines 417 - 478, The launcher currently
only treats "cpu" specially via backend_override but ignores explicit
"cuda"/"rocm"; update the logic in start_server to parse backend_override
(backend_override.as_deref()) and branch for three explicit values: "cpu" =>
force using bundled sidecar (sidecar.spawn()), "rocm" => try rocm_binary first
then cuda_binary then CPU sidecar, and "cuda" => try cuda_binary first then
rocm_binary then CPU sidecar. Keep existing behavior for the spawn attempts (use
rocm_binary, cuda_binary, sidecar, setting current_dir, args, env, and printing
on failure) but change the spawn_result selection to follow the requested
override order; modify the block that references backend_override, rocm_binary,
cuda_binary, sidecar, and gpu_spawn to implement this prioritized ordering.
🧹 Nitpick comments (1)
app/src/lib/api/types.ts (1)

171-189: Optional: consider deduplicating with CudaDownloadProgress.

RocmDownloadProgress is structurally identical to CudaDownloadProgress (lines 152-161). A shared GpuDownloadProgress (or a generic BackendDownloadProgress) would reduce drift risk as new fields get added to one variant but not the other. Not blocking — current split keeps API surface explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/lib/api/types.ts` around lines 171 - 189, RocmDownloadProgress
duplicates CudaDownloadProgress; create a shared interface (e.g.,
GpuDownloadProgress or BackendDownloadProgress) and replace both
RocmDownloadProgress and CudaDownloadProgress with that shared type, updating
references in RocmStatus and CUDA-related types to use the new shared interface
(ensure fields status, filename, error, timestamp, model_name, current, total,
progress remain identical and exported under the new name).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/components/ServerSettings/GpuAcceleration.tsx`:
- Around line 519-547: The ROCm progress block should use
rocmStatus.download_progress as a fallback when rocmDownloadProgress is null;
update the conditional render and all uses of rocmDownloadProgress (filename,
total, progress, current) to read from a merged/fallback object (e.g., const
progress = rocmDownloadProgress || rocmStatus.download_progress) so the UI shows
progress if rocmDownloading is true but no SSE event has arrived yet; ensure
Progress value uses progress.progress and the bytes display uses
progress.current/ progress.total and maintain existing formatting via
formatBytes, keeping existing elements Loader2, Progress and text-muted spans.

In `@backend/pyi_rth_rocm_sdk.py`:
- Around line 67-70: The error message includes a misplaced comma inside the
quoted path in the FileNotFoundError f-string; update the raise statement that
references shortname, relpath, and entry_pattern to remove the comma from inside
the quoted '{relpath},' so the path prints as '{relpath}' (or place the comma
outside the quotes if you want a trailing comma), ensuring the f-string reads
cleanly and uses the existing variables shortname, relpath, and entry_pattern.

In `@backend/tests/test_rocm_build.py`:
- Around line 81-90: The ROCm E2E class TestRocmBuildE2E (and its
test_rocm_binary_compiles_and_runs) is only marked as `@pytest.mark.slow` but not
skipped by default; make it explicitly opt-in by adding a skip condition that
checks an environment variable or pytest marker (e.g., skip if
os.environ.get("RUN_ROCM_E2E") != "1") so the test only runs when the operator
intentionally enables it; update the test file to import os and apply
`@pytest.mark.skipif`(...) to TestRocmBuildE2E (or set pytestmark at class scope)
and document the required env var/marker in the test comment.
- Around line 19-34: The test imports and patch targets are module-relative and
fail when running pytest from the repo root; change the top import to use the
package-qualified module (import build_server from backend.build_binary) and
update all patch target strings inside the captured_args fixture to refer to the
package-qualified module (e.g.,
patch("backend.build_binary.PyInstaller.__main__.run"),
patch("backend.build_binary.platform.system", ...) and
patch("backend.build_binary.os.chdir")). Keep the same fixture name
(captured_args) and function (build_server) references so behaviour is
unchanged.

In `@backend/tests/test_rocm_requirements.py`:
- Around line 66-71: The test test_rocm_torch_installs_and_detects_amd currently
only gates by VOICEBOX_TEST_ROCM_INSTALL but will fail on Linux/macOS because
the ROCm wheel index is Windows-specific; update the skip condition to also
require running on Windows (e.g., os.name == "nt" or platform.system() ==
"Windows") so the test only runs when VOICEBOX_TEST_ROCM_INSTALL is set and the
platform is Windows; modify the `@pytest.mark.skipif` decorator around
test_rocm_torch_installs_and_detects_amd to include this platform check while
keeping the existing environment variable check.

---

Duplicate comments:
In `@app/src/components/ServerSettings/GpuAcceleration.tsx`:
- Around line 202-226: The GPU switch handlers must set an explicit backend
override before restarting so the launcher uses the selected backend: update the
CUDA and ROCm button callbacks to call setBackendOverride('cuda') and
setBackendOverride('rocm') respectively (and ensure the CPU button calls
setBackendOverride('cpu') or the "Auto" button clears the override via
setBackendOverride(null or '') as appropriate) immediately before calling
handleRestart(); keep the existing restart flow (handleRestart,
startHealthPolling, healthPollRef cleanup, queryClient.invalidateQueries)
unchanged but ensure the Tauri launcher reads and honors the stored override
value when platform.lifecycle.restartServer() is invoked.

In `@tauri/src-tauri/src/main.rs`:
- Around line 417-478: The launcher currently only treats "cpu" specially via
backend_override but ignores explicit "cuda"/"rocm"; update the logic in
start_server to parse backend_override (backend_override.as_deref()) and branch
for three explicit values: "cpu" => force using bundled sidecar
(sidecar.spawn()), "rocm" => try rocm_binary first then cuda_binary then CPU
sidecar, and "cuda" => try cuda_binary first then rocm_binary then CPU sidecar.
Keep existing behavior for the spawn attempts (use rocm_binary, cuda_binary,
sidecar, setting current_dir, args, env, and printing on failure) but change the
spawn_result selection to follow the requested override order; modify the block
that references backend_override, rocm_binary, cuda_binary, sidecar, and
gpu_spawn to implement this prioritized ordering.

---

Nitpick comments:
In `@app/src/lib/api/types.ts`:
- Around line 171-189: RocmDownloadProgress duplicates CudaDownloadProgress;
create a shared interface (e.g., GpuDownloadProgress or BackendDownloadProgress)
and replace both RocmDownloadProgress and CudaDownloadProgress with that shared
type, updating references in RocmStatus and CUDA-related types to use the new
shared interface (ensure fields status, filename, error, timestamp, model_name,
current, total, progress remain identical and exported under the new name).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 46b8d504-0212-44f8-b1e3-e0a8022f6b6a

📥 Commits

Reviewing files that changed from the base of the PR and between acb5416 and dea5e7b.

📒 Files selected for processing (32)
  • .gitignore
  • app/package.json
  • app/src/components/ServerSettings/GpuAcceleration.test.tsx
  • app/src/components/ServerSettings/GpuAcceleration.tsx
  • app/src/components/ServerTab/GpuPage.tsx
  • app/src/i18n/locales/en/translation.json
  • app/src/lib/api/client.ts
  • app/src/lib/api/models/ModelStatus.ts
  • app/src/lib/api/types.ts
  • app/src/platform/types.ts
  • app/src/test/setup.ts
  • app/vite.config.ts
  • backend/app.py
  • backend/backends/base.py
  • backend/backends/hume_backend.py
  • backend/backends/qwen_custom_voice_backend.py
  • backend/build_binary.py
  • backend/pyi_rth_rocm_sdk.py
  • backend/requirements-rocm.txt
  • backend/routes/__init__.py
  • backend/routes/health.py
  • backend/routes/rocm.py
  • backend/server.py
  • backend/services/rocm.py
  • backend/tests/test_amd_gpu_detect.py
  • backend/tests/test_rocm_backends.py
  • backend/tests/test_rocm_build.py
  • backend/tests/test_rocm_download.py
  • backend/tests/test_rocm_requirements.py
  • backend/utils/platform_detect.py
  • tauri/src-tauri/src/main.rs
  • tauri/src/platform/lifecycle.ts
✅ Files skipped from review due to trivial changes (8)
  • app/src/test/setup.ts
  • backend/routes/__init__.py
  • app/src/lib/api/models/ModelStatus.ts
  • backend/requirements-rocm.txt
  • backend/routes/rocm.py
  • backend/tests/test_rocm_download.py
  • backend/build_binary.py
  • app/src/components/ServerTab/GpuPage.tsx
🚧 Files skipped from review as they are similar to previous changes (5)
  • app/src/platform/types.ts
  • backend/backends/base.py
  • backend/server.py
  • backend/utils/platform_detect.py
  • app/src/i18n/locales/en/translation.json

Comment on lines +519 to +547
{rocmDownloading && rocmDownloadProgress && (
  <div className="space-y-2">
    <div className="flex items-center justify-between text-sm">
      <div className="flex items-center gap-2">
        <Loader2 className="h-4 w-4 animate-spin" />
        <span>
          {rocmDownloadProgress.filename ||
            (rocmAvailable
              ? 'Updating ROCm backend...'
              : 'Downloading ROCm backend...')}
        </span>
      </div>
      {rocmDownloadProgress.total > 0 && (
        <span className="text-muted-foreground">
          {rocmDownloadProgress.progress.toFixed(1)}%
        </span>
      )}
    </div>
    {rocmDownloadProgress.total > 0 && (
      <>
        <Progress value={rocmDownloadProgress.progress} className="h-2" />
        <div className="text-xs text-muted-foreground">
          {formatBytes(rocmDownloadProgress.current)} /{' '}
          {formatBytes(rocmDownloadProgress.total)}
        </div>
      </>
    )}
  </div>
)}

⚠️ Potential issue | 🟡 Minor

Use rocmStatus.download_progress as the initial progress fallback.

If the page renders while a download is already in progress, rocmDownloading can be true while rocmDownloadProgress is still null, so the ROCm progress block stays hidden until a new SSE event arrives. The status response already carries download_progress; use it as the display fallback.

🧩 Proposed fix
+  const displayedRocmDownloadProgress =
+    rocmDownloadProgress ?? rocmStatus?.download_progress ?? null;
+
...
-              {rocmDownloading && rocmDownloadProgress && (
+              {rocmDownloading && displayedRocmDownloadProgress && (
                 <div className="space-y-2">
...
-                        {rocmDownloadProgress.filename ||
+                        {displayedRocmDownloadProgress.filename ||
                           (rocmAvailable
                             ? 'Updating ROCm backend...'
                             : 'Downloading ROCm backend...')}
...
-                    {rocmDownloadProgress.total > 0 && (
+                    {displayedRocmDownloadProgress.total > 0 && (
                       <span className="text-muted-foreground">
-                        {rocmDownloadProgress.progress.toFixed(1)}%
+                        {displayedRocmDownloadProgress.progress.toFixed(1)}%
                       </span>
                     )}
...
-                  {rocmDownloadProgress.total > 0 && (
+                  {displayedRocmDownloadProgress.total > 0 && (
                     <>
-                      <Progress value={rocmDownloadProgress.progress} className="h-2" />
+                      <Progress value={displayedRocmDownloadProgress.progress} className="h-2" />
                       <div className="text-xs text-muted-foreground">
-                        {formatBytes(rocmDownloadProgress.current)} /{' '}
-                        {formatBytes(rocmDownloadProgress.total)}
+                        {formatBytes(displayedRocmDownloadProgress.current)} /{' '}
+                        {formatBytes(displayedRocmDownloadProgress.total)}
                       </div>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/ServerSettings/GpuAcceleration.tsx` around lines 519 -
547, The ROCm progress block should use rocmStatus.download_progress as a
fallback when rocmDownloadProgress is null; update the conditional render and
all uses of rocmDownloadProgress (filename, total, progress, current) to read
from a merged/fallback object (e.g., const progress = rocmDownloadProgress ||
rocmStatus.download_progress) so the UI shows progress if rocmDownloading is
true but no SSE event has arrived yet; ensure Progress value uses
progress.progress and the bytes display uses progress.current/ progress.total
and maintain existing formatting via formatBytes, keeping existing elements
Loader2, Progress and text-muted spans.

Comment on lines +67 to +70
raise FileNotFoundError(
    f"Could not find rocm library '{shortname}' at path "
    f"'{relpath},' no match for pattern '{entry_pattern}'"
)

⚠️ Potential issue | 🟡 Minor

Minor: misplaced comma inside quoted path in error message.

The comma inside the closing quote ('{relpath},') makes the error read like at path '/some/path,' — likely unintended.

🔧 Proposed fix
                 raise FileNotFoundError(
                     f"Could not find rocm library '{shortname}' at path "
-                    f"'{relpath},' no match for pattern '{entry_pattern}'"
+                    f"'{relpath}', no match for pattern '{entry_pattern}'"
                 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/pyi_rth_rocm_sdk.py` around lines 67 - 70, The error message includes
a misplaced comma inside the quoted path in the FileNotFoundError f-string;
update the raise statement that references shortname, relpath, and entry_pattern
to remove the comma from inside the quoted '{relpath},' so the path prints as
'{relpath}' (or place the comma outside the quotes if you want a trailing
comma), ensuring the f-string reads cleanly and uses the existing variables
shortname, relpath, and entry_pattern.

Comment on lines +19 to +34
from build_binary import build_server


class TestRocmBuildArgs:
    """Validate PyInstaller arguments for ROCm builds."""

    @pytest.fixture
    def captured_args(self):
        """Run build_server(rocm=True) with mocked PyInstaller and return args."""
        with (
            patch("build_binary.PyInstaller.__main__.run") as mock_run,
            patch("build_binary.platform.system", return_value="Linux"),
            patch("build_binary.os.chdir"),
        ):
            build_server(rocm=True)
            return mock_run.call_args[0][0]

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import importlib.util
for name in ("build_binary", "backend.build_binary"):
    spec = importlib.util.find_spec(name)
    print(f"{name}: {spec.origin if spec else 'NOT FOUND'}")
PY

Repository: jamiepine/voicebox

Length of output: 150


🏁 Script executed:

#!/bin/bash
# Check the actual content of the test file
echo "=== Checking test_rocm_build.py content ==="
cat -n backend/tests/test_rocm_build.py | head -50

echo -e "\n=== Checking import patterns in other backend tests ==="
# Find other test files in backend/tests and check their imports
fd -t f "\.py$" backend/tests --exec head -5 {} \; 2>/dev/null | head -100

Repository: jamiepine/voicebox

Length of output: 4850


🏁 Script executed:

#!/bin/bash
# More targeted search for import patterns in backend tests
echo "=== Backend test files ==="
fd -t f "\.py$" backend/tests

echo -e "\n=== Sample imports from backend tests ==="
for f in $(fd -t f "\.py$" backend/tests | head -5); do
  echo "--- $f ---"
  head -20 "$f" | grep -E "^(from|import)" || echo "(no imports shown in first 20 lines)"
done

Repository: jamiepine/voicebox

Length of output: 1595


🏁 Script executed:

#!/bin/bash
# Get more specific import patterns from backend tests
echo "=== Imports from backend test files ==="
for f in backend/tests/test_*.py; do
  echo "--- $f ---"
  grep -E "^(from|import)" "$f" | head -10 || echo "(no imports)"
done | head -100

Repository: jamiepine/voicebox

Length of output: 3024


Use the package-qualified backend.build_binary import path.

With the documented python -m pytest backend/tests/test_rocm_build.py invocation from the repo root, from build_binary import build_server cannot resolve without injecting backend/ into PYTHONPATH. Importing backend.build_binary is consistent with other backend tests (e.g., test_amd_gpu_detect.py uses from backend.utils.platform_detect) and keeps the patch targets aligned.

🧪 Proposed fix
-from build_binary import build_server
+from backend.build_binary import build_server
...
-            patch("build_binary.PyInstaller.__main__.run") as mock_run,
-            patch("build_binary.platform.system", return_value="Linux"),
-            patch("build_binary.os.chdir"),
+            patch("backend.build_binary.PyInstaller.__main__.run") as mock_run,
+            patch("backend.build_binary.platform.system", return_value="Linux"),
+            patch("backend.build_binary.os.chdir"),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-from build_binary import build_server
+from backend.build_binary import build_server


 class TestRocmBuildArgs:
     """Validate PyInstaller arguments for ROCm builds."""

     @pytest.fixture
     def captured_args(self):
         """Run build_server(rocm=True) with mocked PyInstaller and return args."""
         with (
-            patch("build_binary.PyInstaller.__main__.run") as mock_run,
-            patch("build_binary.platform.system", return_value="Linux"),
-            patch("build_binary.os.chdir"),
+            patch("backend.build_binary.PyInstaller.__main__.run") as mock_run,
+            patch("backend.build_binary.platform.system", return_value="Linux"),
+            patch("backend.build_binary.os.chdir"),
         ):
             build_server(rocm=True)
             return mock_run.call_args[0][0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_build.py` around lines 19 - 34, The test imports and
patch targets are module-relative and fail when running pytest from the repo
root; change the top import to use the package-qualified module (import
build_server from backend.build_binary) and update all patch target strings
inside the captured_args fixture to refer to the package-qualified module (e.g.,
patch("backend.build_binary.PyInstaller.__main__.run"),
patch("backend.build_binary.platform.system", ...) and
patch("backend.build_binary.os.chdir")). Keep the same fixture name
(captured_args) and function (build_server) references so behaviour is
unchanged.

Comment on lines +81 to +90
@pytest.mark.slow()
@pytest.mark.skipif(sys.platform != "win32", reason="ROCm build E2E only runs on Windows")
class TestRocmBuildE2E:
    """
    True end-to-end build test.
    Executes build_binary.py --rocm, verifies the binary exists, and runs it
    with --help to confirm it boots without import errors.
    """

    def test_rocm_binary_compiles_and_runs(self, tmp_path):

⚠️ Potential issue | 🟠 Major

Make the slow ROCm E2E build explicitly opt-in.

@pytest.mark.slow only labels the test; it does not skip it by default. On Windows CI, this can still run a full ROCm PyInstaller build unless the test command excludes slow tests.

🧪 Proposed fix
+import os
...
 @pytest.mark.slow()
 @pytest.mark.skipif(sys.platform != "win32", reason="ROCm build E2E only runs on Windows")
+@pytest.mark.skipif(
+    os.environ.get("VOICEBOX_TEST_ROCM_BUILD") != "1",
+    reason="Set VOICEBOX_TEST_ROCM_BUILD=1 to run the ROCm E2E build test",
+)
 class TestRocmBuildE2E:
🧰 Tools
🪛 Ruff (0.15.10)

[warning] 81-81: Use @pytest.mark.slow over @pytest.mark.slow()

Remove parentheses

(PT023)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_build.py` around lines 81 - 90, The ROCm E2E class
TestRocmBuildE2E (and its test_rocm_binary_compiles_and_runs) is only marked as
`@pytest.mark.slow` but not skipped by default; make it explicitly opt-in by
adding a skip condition that checks an environment variable or pytest marker
(e.g., skip if os.environ.get("RUN_ROCM_E2E") != "1") so the test only runs when
the operator intentionally enables it; update the test file to import os and
apply `@pytest.mark.skipif`(...) to TestRocmBuildE2E (or set pytestmark at class
scope) and document the required env var/marker in the test comment.

Comment on lines +66 to +71
@pytest.mark.timeout(900)
@pytest.mark.skipif(
    not os.environ.get("VOICEBOX_TEST_ROCM_INSTALL"),
    reason="Set VOICEBOX_TEST_ROCM_INSTALL=1 to run the heavy install test",
)
def test_rocm_torch_installs_and_detects_amd(self, backend_dir):

⚠️ Potential issue | 🟡 Minor

Windows-gate the ROCm install integration test.

The requirements point at the Windows ROCm wheel index, so an opted-in run on Linux/macOS will fail during install rather than testing AMD ROCm availability.

🧪 Proposed fix
     @pytest.mark.timeout(900)
+    @pytest.mark.skipif(
+        sys.platform != "win32",
+        reason="ROCm Windows wheels are only validated on Windows",
+    )
     @pytest.mark.skipif(
         not os.environ.get("VOICEBOX_TEST_ROCM_INSTALL"),
         reason="Set VOICEBOX_TEST_ROCM_INSTALL=1 to run the heavy install test",
     )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/test_rocm_requirements.py` around lines 66 - 71, The test
test_rocm_torch_installs_and_detects_amd currently only gates by
VOICEBOX_TEST_ROCM_INSTALL but will fail on Linux/macOS because the ROCm wheel
index is Windows-specific; update the skip condition to also require running on
Windows (e.g., os.name == "nt" or platform.system() == "Windows") so the test
only runs when VOICEBOX_TEST_ROCM_INSTALL is set and the platform is Windows;
modify the `@pytest.mark.skipif` decorator around
test_rocm_torch_installs_and_detects_amd to include this platform check while
keeping the existing environment variable check.

@JuanCMPDev JuanCMPDev marked this pull request as draft April 23, 2026 03:19
@JuanCMPDev JuanCMPDev marked this pull request as ready for review April 23, 2026 15:51

Development

Successfully merging this pull request may close these issues.

[Feature Request] Windows AMD GPU support via DirectML / torch-directml
