feat(windows): Native AMD ROCm GPU Acceleration (Resolves #531) #538
JuanCMPDev wants to merge 2 commits into jamiepine:main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

This PR introduces ROCm backend support for AMD GPUs alongside existing CUDA support. It adds ROCm service management, API endpoints, frontend UI components, build configuration, and comprehensive tests spanning frontend, backend, and build layers.
Sequence Diagram(s)

sequenceDiagram
actor User
participant Frontend as React UI<br/>(GpuAcceleration)
participant Tauri as Tauri<br/>(setBackendOverride)
participant API as FastAPI<br/>(rocm routes)
participant Service as ROCm<br/>Service
participant Progress as Progress<br/>Manager
User->>Frontend: Click "Download ROCm"
Frontend->>API: POST /backend/download-rocm
API->>API: Check if download already in progress
API->>Service: Spawn background task: download_rocm_binary()
API-->>Frontend: Return {message, progress_key}
Frontend->>Progress: Subscribe to rocm-backend SSE stream
Service->>Service: Stream tarball download
Service->>Progress: Emit progress events (percent, filename)
Progress-->>Frontend: SSE stream progress updates
Service->>Service: Extract archive atomically
Service->>Service: Write rocm-libs.json manifest
Service->>Progress: Emit complete/error status
Progress-->>Frontend: Final status event
User->>Frontend: Click "Switch to ROCm"
Frontend->>API: GET /backend/rocm-status
API->>Service: get_rocm_status()
Service-->>API: {available: true, active: false}
API-->>Frontend: ROCm status
Frontend->>Tauri: setBackendOverride('rocm')
Tauri->>Tauri: Persist backend_override
Frontend->>Tauri: restart_server()
Tauri->>Tauri: start_server() with override
Tauri->>Tauri: Probe ROCm binary version
Tauri->>Tauri: Spawn voicebox-server-rocm
Tauri-->>Frontend: Server restarted
Frontend->>API: GET /health
API-->>Frontend: {gpu_type: "ROCm", backend_variant: "rocm"}
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 20
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app.py (1)
38-42: ⚠️ Potential issue | 🟠 Major

Gate AMD environment variable overrides on VOICEBOX_BACKEND_VARIANT == "rocm" and detect GPU architecture.

HSA_OVERRIDE_GFX_VERSION=10.3.0 and MIOPEN_LOG_LEVEL are applied unconditionally on all platforms and backends. They set RDNA2 overrides globally for Linux, NVIDIA, and CPU users regardless of whether ROCm is active. The hardcoded 10.3.0 is only correct for RDNA2; RX 7800 XT (RDNA3/gfx1101) users will silently mismatch with MIOpen.

Since VOICEBOX_BACKEND_VARIANT is already used to detect the active backend (set in backend/server.py based on the binary name), gate this block on os.environ.get("VOICEBOX_BACKEND_VARIANT") == "rocm" before setting these overrides. Additionally, detect the GPU architecture (or at least validate against known RDNA2 GFX codes) rather than blindly applying the RDNA2-specific override; the existing issue #469 documents this harms RDNA3/4 cards.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app.py` around lines 38 - 42, Gate the AMD-specific environment overrides on the backend being ROCm by checking os.environ.get("VOICEBOX_BACKEND_VARIANT") == "rocm" before touching HSA_OVERRIDE_GFX_VERSION and MIOPEN_LOG_LEVEL; then detect or validate the GPU architecture (e.g., by probing ROCm-provided env vars or system GPU info) and only set HSA_OVERRIDE_GFX_VERSION="10.3.0" when the detected GPU matches known RDNA2 gfx codes, otherwise skip setting the override and log a warning; update the block that sets HSA_OVERRIDE_GFX_VERSION and MIOPEN_LOG_LEVEL to perform these checks and avoid applying RDNA2-specific overrides on non-rocm backends or non-RDNA2 GPUs.

backend/build_binary.py (1)
455-481: ⚠️ Potential issue | 🟠 Major

Add ROCm torch detection to CPU build and implement ROCm restoration.

The CPU build torch detection at line 460 only checks torch.version.cuda, missing ROCm Torch. A Windows venv with only ROCm Torch installed will skip the CPU-wheel swap and bundle ROCm DLLs into the CPU binary. Additionally, the finally block (lines 541-580) lacks a restoration path for ROCm Torch.

Required fixes:
- Extend line 460 to check both torch.version.cuda and torch.version.hip
- Set restore_torch = "rocm" if only HIP Torch is detected
- Add an elif restore_torch == "rocm": handler in the finally block to reinstall ROCm packages (or set to "cpu" if the CPU swap was intended as the fallback for non-CUDA GPU variants)

Also applies to: 541-581
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/build_binary.py` around lines 455 - 481, The CPU-build detection only checks torch.version.cuda; update the detection in the subprocess.run snippet that sets has_cuda_torch to instead inspect both torch.version.cuda and torch.version.hip (e.g., treat hip non-empty as ROCm), and set restore_torch = "rocm" when only HIP is present; then extend the finally block that currently handles restore_torch == "cuda" to also handle restore_torch == "rocm" by reinstalling the ROCm-specific torch packages (or restoring to the intended CPU fallback) using the same pip subprocess pattern used for CUDA restoration so ROCm-only venvs are properly restored.
🧹 Nitpick comments (8)
backend/utils/platform_detect.py (1)
54-62: Fallback may misreport on multi-GPU/hybrid systems.

torch.cuda.get_device_name(0) only inspects device 0. On a laptop with integrated AMD Radeon graphics + discrete NVIDIA, device 0 under a CUDA-built torch will be the NVIDIA GPU and this returns False even though an AMD adapter is present; conversely, under a ROCm-built torch the integrated AMD iGPU can shadow a more capable discrete card. If the intent is "is ROCm appropriate?", consider iterating range(torch.cuda.device_count()) and matching any device, or documenting that this function reflects "the torch-visible device 0" rather than "the primary system GPU" as the docstring claims.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/utils/platform_detect.py` around lines 54 - 62, The current check uses torch.cuda.get_device_name(0) and thus only inspects device 0, which can misreport on multi-GPU/hybrid systems; update the logic in platform_detect.py (the torch import block using torch.cuda.is_available and torch.cuda.get_device_name) to iterate over all visible CUDA devices via torch.cuda.device_count() and call torch.cuda.get_device_name(i) for each i, returning True if any device name contains "Radeon" or "AMD", or alternatively adjust the docstring to explicitly state the function only reflects the torch-visible device 0 rather than the primary system GPU.

backend/routes/health.py (1)
104-117: LGTM — consistent ROCm labeling.

hasattr(torch.version, "hip") and torch.version.hip matches the detection pattern already in backend/app.py::_get_gpu_status and backend/backends/base.py::check_cuda_compatibility, so the gpu_type label stays aligned across the three call sites. Since check_cuda_compatibility() now short-circuits to (True, None) on ROCm, gpu_compatibility_warning will correctly be absent for AMD builds.

Optional nit: the ROCm-vs-CUDA label computation now exists in three places (this file, app.py, and implicitly base). A tiny shared helper (e.g., get_cuda_flavor_name() returning "ROCm" or "CUDA") would eliminate drift risk if the detection pattern ever tightens.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/health.py` around lines 104 - 117, Duplicate ROCm-vs-CUDA detection logic appears in backend/routes/health.py (gpu_type assignment), backend/app.py::_get_gpu_status, and backend/backends/base.py::check_cuda_compatibility; extract that logic into a shared helper (e.g., get_cuda_flavor_name() returning "ROCm" or "CUDA" or None) placed in a common module, then replace the inline checks in those three locations to call the helper (use its return to build the gpu_type string where needed) so labeling stays consistent and avoids drift.

backend/tests/test_amd_gpu_detect.py (1)
12-12: Unused pytest import.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_amd_gpu_detect.py` at line 12, The file imports pytest but never uses it; remove the unused import statement (the top-level "import pytest") from backend/tests/test_amd_gpu_detect.py or, if you intended to use pytest fixtures or marks, reference them in the test functions (e.g., add a pytest.mark or fixture parameter) so the import is actually used; update the import accordingly to eliminate the linter/test warning.

backend/tests/test_rocm_backends.py (2)
13-13: Unused pytest import.

pytest is imported but never referenced (no markers/fixtures used). Safe to drop.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_rocm_backends.py` at line 13, Remove the unused pytest import from the test module; locate the top-level import statement "import pytest" in backend/tests/test_rocm_backends.py and delete it since no pytest markers, fixtures, or functions reference it.
27-35: torch.version.hip patch relies on the attribute existing.

On non-ROCm torch builds torch.version.hip exists and is None, so patch(..., "6.2.41133") works. If a future CUDA-only wheel omits the attribute, this test will fail with AttributeError from mock. Consider adding create=True (or patching via patch.object(torch.version, "hip", ...)) for robustness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_rocm_backends.py` around lines 27 - 35, The test test_rocm_skips_compute_check patches torch.version.hip in a way that can raise AttributeError if the attribute is missing; update the patch to create the attribute when absent (e.g., use patch("torch.version.hip", "6.2.41133", create=True) or patch.object(torch.version, "hip", "6.2.41133")) so the test reliably exercises check_cuda_compatibility() on nonstandard torch builds without failing due to missing attributes.

backend/server.py (1)
267-276: Substring match on binary name is fragile.

"rocm" in binary_name / "cuda" in binary_name will match any install path-derived basename that incidentally contains those substrings (e.g., a user-renamed binary, or future variants like voicebox-server-rocmprof.exe). Prefer an explicit suffix/regex check against the known set of variant names, e.g.:

-    binary_name = os.path.basename(sys.executable).lower()
-    if "rocm" in binary_name:
+    binary_name = os.path.basename(sys.executable).lower()
+    # Match the documented variant suffixes only.
+    if re.search(r"voicebox-server-rocm(\.exe)?$", binary_name):
         os.environ["VOICEBOX_BACKEND_VARIANT"] = "rocm"
         logger.info("Backend variant: ROCm")
-    elif "cuda" in binary_name:
+    elif re.search(r"voicebox-server-cuda(\.exe)?$", binary_name):
         os.environ["VOICEBOX_BACKEND_VARIANT"] = "cuda"

Also worth noting: because is_rocm_active() and the 409 "Cannot delete while active" guard in backend/routes/rocm.py rely exclusively on this env var, a misdetection here turns into a user-facing deletion lock bug.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 267 - 276, The current detection uses fragile substring checks on binary_name and sets VOICEBOX_BACKEND_VARIANT, which can misdetect and break is_rocm_active()/backend/routes/rocm.py deletion guards; replace the substring logic in the block that computes binary_name to perform an explicit match (e.g., compare the basename against a whitelist of allowed variants or use a strict regex that matches whole-word/suffix like ^(rocm|cuda|cpu)$) and only set os.environ["VOICEBOX_BACKEND_VARIANT"] when the match is exact; ensure the code documents/uses the same canonical variant strings ("rocm","cuda","cpu") so is_rocm_active() and the ROCm route rely on the exact env var value.

backend/routes/rocm.py (1)
43-44: Use rocm.PROGRESS_KEY instead of duplicating the progress key.

The route already depends on services.rocm; using the constant avoids breaking download responses or SSE subscriptions if the service key changes.

Proposed refactor:

 @router.post("/backend/download-rocm")
 async def download_rocm_backend():
     """Download the ROCm backend binary."""
     from ..services import rocm
 @@
     create_background_task(_download())
-    return {"message": "ROCm backend download started", "progress_key": "rocm-backend"}
+    return {"message": "ROCm backend download started", "progress_key": rocm.PROGRESS_KEY}
 @@
 @router.get("/backend/rocm-progress")
 async def get_rocm_download_progress():
     """Get ROCm backend download progress via Server-Sent Events."""
+    from ..services import rocm
+
     progress_manager = get_progress_manager()
     async def event_generator():
-        async for event in progress_manager.subscribe("rocm-backend"):
+        async for event in progress_manager.subscribe(rocm.PROGRESS_KEY):
             yield event

Also applies to: 65-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/rocm.py` around lines 43 - 44, The route is returning the literal progress key "rocm-backend" instead of using the service constant; replace the hard-coded string with the service's constant (use rocm.PROGRESS_KEY from services.rocm) in both places where the route returns {"message": ..., "progress_key": "rocm-backend"} (the block that calls create_background_task(_download()) and the similar block around lines 65-72) so responses and SSE subscriptions use the canonical key; ensure you reference the existing rocm import/variable used by the module (e.g., services.rocm or the local rocm name) when substituting the value.

app/src/i18n/locales/en/translation.json (1)
686-753: Consider symmetry between CUDA and ROCm key groups.

The CUDA backend uses flat keys (settings.gpu.download, settings.gpu.remove) while ROCm uses suffixed keys (settings.gpu.downloadRocm, settings.gpu.removeRocm). Similarly, the switchToCpu description hardcodes "You can re-download CUDA later.", which is now inaccurate when the user is switching off ROCm.

Consider either:
- Renaming CUDA keys to downloadCuda/removeCuda for consistency, or
- Making switchToCpu.description backend-agnostic (e.g., "You can re-download the GPU backend later.") so it reads correctly regardless of whether CUDA or ROCm is active.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/i18n/locales/en/translation.json` around lines 686 - 753, The localization keys are inconsistent between CUDA and ROCm and the switchToCpu description is backend-specific; update either by renaming the CUDA keys "settings.gpu.download" and "settings.gpu.remove" to "settings.gpu.downloadCuda" and "settings.gpu.removeCuda" to match "settings.gpu.downloadRocm"/"settings.gpu.removeRocm", or (simpler) make "settings.gpu.switchToCpu.description" backend-agnostic by replacing "You can re-download CUDA later." with a neutral string like "You can re-download the GPU backend later."; after changing keys, update all code/UI references that use "download"/"remove" or the old description to the new keys/phrase so translations remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/components/ServerSettings/GpuAcceleration.test.tsx`:
- Around line 165-172: The test replaces the global EventSource via (globalThis
as any).EventSource = vi.fn(() => mockEventSource) which leaks to other tests;
restore the original global after the SSE test by using
vi.stubGlobal('EventSource', ...) or save the original EventSource and restore
it in an afterEach/afterAll hook (referencing the mockEventSource and the direct
assignment to globalThis.EventSource in the test), or call vi.unstubAllGlobals
in afterEach to ensure the EventSource mock is removed.
In `@app/src/components/ServerTab/GpuPage.tsx`:
- Around line 548-552: The new hard-coded strings in the conditional render
around isCurrentlyCuda/isCurrentlyRocm and platform.metadata.isTauri (the
SettingSection title and description) must be moved to i18n; replace the
literals with calls to the translation function used elsewhere (e.g., t('...'))
and add matching keys to the locale resource files. Update the SettingSection
usage in GpuPage (the block referencing isCurrentlyCuda, isCurrentlyRocm,
platform.metadata.isTauri) to use i18n keys like 'gpu.backendActiveTitle.cuda',
'gpu.backendActiveTitle.rocm' and 'gpu.backendActiveDescription', and add those
keys and translations to the project's localization files.
- Around line 438-448: The “Switch to CUDA” button only restarts but doesn't
explicitly set CUDA as the target backend, so when both ROCm and CUDA are
downloaded the launcher may still choose ROCm; before calling handleRestart
(from the SettingRow action), persist an explicit backend override (e.g., call
the existing preference/update function to set preferred backend to "cuda" or
set an env/launcher override) so the launcher will choose CUDA on next start,
and additionally gate or adjust the UI when ROCm is present (check
rocmDownloaded/rocmAvailable) to either show a confirmation or disable the
button unless the override is applied; update the logic around cudaAvailable,
isCurrentlyCuda and platform.metadata.isTauri to perform the override prior to
restart.
In `@backend/pyi_rth_rocm_sdk.py`:
- Around line 35-38: The KeyError raised when indexing _dist_info.ALL_LIBRARIES
with shortname should be suppressed so we raise a clean ModuleNotFoundError;
modify the except block around the lookup of lib_entry (the try/except using
_dist_info.ALL_LIBRARIES[shortname]) to re-raise ModuleNotFoundError(f"Unknown
rocm library '{shortname}'") from None, ensuring the original KeyError context
is removed.
- Around line 48-52: The current except block swallows ModuleNotFoundError for
transitive imports; change it to "except ModuleNotFoundError as e:" and only
treat the backend package as missing when the exception refers to the requested
module (check e.name == py_package_name or compare the missing module string to
py_package_name); if it matches, add package.logical_name to missing_extras and
continue, otherwise re-raise the exception so transitive dependency errors are
not misattributed.
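The e.name discrimination suggested above can be sketched with a stand-in helper (importlib-based; the function name is illustrative, not the runtime hook's actual code):

```python
import importlib

def import_optional_backend(py_package_name: str):
    """Import a backend package, treating it as missing only when the
    ModuleNotFoundError refers to the requested package itself; errors
    raised by its transitive imports are re-raised, not misattributed."""
    try:
        return importlib.import_module(py_package_name)
    except ModuleNotFoundError as e:
        if e.name == py_package_name:
            return None  # the requested package is genuinely absent
        raise  # a dependency inside the package failed to import
```

The same pattern applies to the KeyError case: re-raising with `from None` (e.g. `raise ModuleNotFoundError(...) from None`) drops the noisy original context.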
In `@backend/requirements-rocm.txt`:
- Around line 4-7: Update requirements-rocm.txt to pin ROCm-matched wheels and
use the Windows AMD wheel index: replace the loose `torch>=2.2.0`, `torchaudio`,
`torchvision` entries with exact ROCm pins `torch==2.9.1+rocm7.2.1`,
`torchaudio==2.9.1+rocm7.2.1`, `torchvision==0.24.1+rocm7.2.1` and swap the
current `--extra-index-url https://download.pytorch.org/whl/rocm7.2` with the
AMD Radeon Windows ROCm index `--extra-index-url
https://repo.radeon.com/rocm/windows/rocm-rel-7.2.1/` so Windows `win_amd64`
wheels are available for the test_rocm_requirements.py WMI hardware detection
path.
In `@backend/services/rocm.py`:
- Around line 288-290: The libs archive URL is incorrectly tied to the app
release tag via base_url and will break if ROCM_LIBS_VERSION isn't re-uploaded
under that tag; update the download logic so libs_archive (and its fetch) uses a
dedicated libs_base_url or falls back to a URL containing ROCM_LIBS_VERSION
(instead of using base_url/version), e.g. construct a separate libs_url derived
from ROCM_LIBS_VERSION and use that for downloading
rocm-libs-{ROCM_LIBS_VERSION}.tar.gz while keeping server_archive fetched from
the original base_url/version; modify the code paths that reference
base_url/version so only server_archive uses it and functions/methods that
download libs reference the new libs URL/fallback.
- Around line 292-348: Refactor the download/extract flow to use a temporary
staging directory instead of writing directly into rocm_dir: create a temp dir
(e.g., via tempfile.TemporaryDirectory or pathlib + uuid) and pass that staging
path as dest_dir to _download_and_extract_archive for both server and libs;
after both downloads, write the manifest to staging using
get_rocm_libs_manifest_path() semantics (but targeting the staging path), make
the server exe executable in staging using get_rocm_exe_name(), and finally
atomically replace the active rocm_dir with the staging dir (shutil.replace or
os.replace) so the swap only happens on complete success; ensure exceptions
trigger cleanup of the staging dir and that progress.mark_complete(PROGRESS_KEY)
and logger.info(...) run only after the atomic swap.
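The staging-then-swap flow described above can be sketched as follows (function and callable names are illustrative, not the service's actual API; a fully atomic swap of a pre-existing directory would need an extra rename dance, noted inline):

```python
import os
import shutil
import tempfile
from pathlib import Path

def install_rocm_atomically(rocm_dir: Path, download_into) -> None:
    """Populate a staging directory, then move it into place only on
    full success. `download_into` is a hypothetical callable that fills
    the given directory with the server binary, libs, and manifest."""
    staging = Path(tempfile.mkdtemp(dir=rocm_dir.parent, prefix=".rocm-staging-"))
    try:
        download_into(staging)  # may raise; staging is cleaned up below
        # Swap in the finished tree. Note: rmtree + replace leaves a
        # brief window with no rocm_dir; a rename-aside of the old dir
        # would close it if that matters.
        if rocm_dir.exists():
            shutil.rmtree(rocm_dir)
        os.replace(staging, rocm_dir)
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

Marking progress complete and logging success would then happen only after `install_rocm_atomically` returns.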
In `@backend/tests/test_amd_gpu_detect.py`:
- Around line 86-89: The test test_torch_not_installed is not reliably forcing
an ImportError because setting sys.modules["torch"]=None can be bypassed if
torch is already imported; update the test to either patch builtins.__import__
to raise ImportError when name == "torch" (e.g., patch("builtins.__import__",
side_effect=import_side_effect)) or explicitly patch the module-level reference
used by is_amd_gpu_windows (patch "backend.utils.platform_detect.torch" to raise
or be None) so that the is_amd_gpu_windows function cannot resolve a torch
module and will deterministically exercise the import-failure path.
- Around line 14-89: Update the module import and all patch targets to the
package-qualified path so tests resolve when run from repo root: change the
import "from utils.platform_detect import is_amd_gpu_windows" to "from
backend.utils.platform_detect import is_amd_gpu_windows" and update every `@patch`
target that currently begins with "utils.platform_detect." to
"backend.utils.platform_detect." (this applies to the subprocess.run patches and
any other patched attributes referencing utils.platform_detect).
In `@backend/tests/test_rocm_build.py`:
- Around line 79-92: The ROCm E2E test
TestRocmBuildE2E::test_rocm_binary_compiles_and_runs assumes a Windows-only
PyInstaller output by hard-coding "voicebox-server-rocm.exe"; update the test to
either skip unless running on Windows or compute the executable name from
sys.platform. Specifically, add a platform guard using
pytest.mark.skipif(sys.platform != "win32", reason=...) on the TestRocmBuildE2E
class or test method, or replace the hard-coded binary_exe construction in
test_rocm_binary_compiles_and_runs with logic like exe_name =
"voicebox-server-rocm.exe" if sys.platform == "win32" else
"voicebox-server-rocm" and use exe_name to build binary_exe so the test only
runs/looks for the correct artifact per platform.
In `@backend/tests/test_rocm_download.py`:
- Around line 139-146: The two dictionary keys in the responses mapping use
unnecessary f-strings (no interpolation) causing Ruff F541; replace the
f-strings with plain string literals for the keys that reference the ROCm
release URLs (the entries creating FakeResponse with content=fake_tar_gz and
content=f"{server_sha} voicebox-server-rocm.tar.gz\n".encode()). Update the
keys in the responses dict (and leave the FakeResponse, fake_tar_gz, server_sha
usages unchanged) so the URL strings are regular quoted strings rather than
f"..." templates.
- Around line 103-112: The nested async context manager class _Streamer uses
parameter name "s" for its methods which triggers N805; rename the receiver to
"self" in the methods __aenter__, __aexit__, and aiter_bytes within class
_Streamer so method signatures become async def __aenter__(self), async def
__aexit__(self, *args), and async def aiter_bytes(self, chunk_size: int = 1024)
and update any internal references accordingly to satisfy the linter.
In `@backend/tests/test_rocm_requirements.py`:
- Around line 54-60: The test_requirements_file_content currently searches the
whole file string (variable content) so "torch" can be matched inside
"torchaudio" or "torchvision"; update the assertion logic to parse non-comment
requirement lines from req_file (ignore lines starting with # and blank lines)
or use anchored regexes to match package names at the start of a requirement
line (e.g., ^torch(?:[=<>!~].*)?$) and similarly for torchaudio and torchvision,
then assert each package appears as a standalone requirement in the parsed lines
rather than as a substring.
- Around line 62-122: The test_rocm_torch_installs_and_detects_amd currently
always runs and performs heavy network/ROCm installs; gate it behind an explicit
integration opt-in by skipping unless an env var or pytest marker is set. Update
the test to check for a flag like os.environ.get("RUN_ROCM_INTEGRATION") == "1"
or use pytest.config.getoption/pytestmark for a custom --run-integration option,
and call pytest.skip("skipping ROCm integration test") when not enabled;
reference the test function name test_rocm_torch_installs_and_detects_amd, the
has_amd check, and req_file to locate where to add the guard and early skip so
the venv creation and pip installs only run when explicitly opted-in.
In `@backend/utils/platform_detect.py`:
- Around line 35-51: The PowerShell call in platform_detect.py can hang and can
briefly show a console window; update the subprocess.run invocation used to
probe Win32_VideoController to (1) add a timeout parameter (e.g., 1–3 seconds)
and handle subprocess.TimeoutExpired so the function returns False on timeout,
(2) include "-NoProfile" in the PowerShell args to speed startup, and (3) pass
Windows-specific flags to suppress a console window (use
subprocess.CREATE_NO_WINDOW or the STARTUPINFO/STARTF_USESHOWWINDOW/ SW_HIDE
pattern) so no window flashes; ensure the existing result parsing
(int(result.stdout.strip())) remains guarded by the new exception handling and
that other exceptions still return False.
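Combining the three fixes (timeout, -NoProfile, hidden window) might look like the sketch below; the PowerShell query text and function name are illustrative, not the file's actual code:

```python
import subprocess
import sys

def count_amd_video_controllers(timeout_s: float = 3.0) -> int:
    """Probe Win32_VideoController via PowerShell without hanging or
    flashing a console window; returns 0 on any failure."""
    cmd = [
        "powershell", "-NoProfile", "-Command",
        "(Get-CimInstance Win32_VideoController | "
        "Where-Object { $_.Name -match 'AMD|Radeon' }).Count",
    ]
    # CREATE_NO_WINDOW only exists (and is only valid) on Windows.
    creationflags = subprocess.CREATE_NO_WINDOW if sys.platform == "win32" else 0
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True,
            timeout=timeout_s, creationflags=creationflags,
        )
        return int(result.stdout.strip())
    except (subprocess.TimeoutExpired, OSError, ValueError):
        return 0
```

The detection function would then treat a nonzero count as "AMD adapter present" and anything else, including timeouts, as False.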
In `@tauri/src-tauri/src/main.rs`:
- Around line 260-296: The ROCm version probe currently uses Command::output()
which can block; change the probe around exe_path in main.rs to spawn the child
via
std::process::Command::new(&exe_path).arg("--version").current_dir(&rocm_dir).spawn(),
then wait for completion with a bounded timeout (e.g. use child.wait_timeout
from the wait-timeout crate or implement a thread + join with Duration). If the
wait times out, kill the child (child.kill()/child.wait()), log a timeout
message and set version_ok = false; if the child exits in time, collect its
stdout/stderr (child.wait_with_output or read pipes) and perform the same
version string parsing and comparison against app.config().version to set
version_ok. Ensure all error paths (spawn error, timeout, kill failure)
consistently fall back to CPU.
- Around line 421-435: The current logic assigns spawn_result from the ROCm
branch so an immediate spawn() error blocks fallback; refactor launching into an
ordered candidate loop (ROCm → CUDA → CPU) that checks backend_override (skip
GPU candidates when backend_override.as_deref() == Some("cpu")), attempts to
build and spawn each candidate using the same setup steps (the
app.shell().command(...) flow, args including "--data-dir", "--port",
"--parent-pid", optional "--host" when is_remote, and VOICEBOX_MODELS_DIR from
effective_models_dir), and on Err from spawn() continue to the next candidate
instead of returning early, only returning/propagating an error if all
candidates fail; use the existing symbols rocm_binary, cuda_binary,
backend_override, spawn_result, app.shell().command, and effective_models_dir to
locate and implement the change.
In `@tauri/src/platform/lifecycle.ts`:
- Around line 55-60: The setBackendOverride method currently swallows errors
from invoke('set_backend_override') so callers never see failures; modify async
setBackendOverride(backend?: string | null) to propagate the error (after
optionally logging) by re-throwing the caught error or removing the try/catch so
the rejected promise surfaces to callers, ensuring callers of setBackendOverride
can abort and surface the failure.
---
Outside diff comments:
In `@backend/app.py`:
- Around line 38-42: Gate the AMD-specific environment overrides on the backend
being ROCm by checking os.environ.get("VOICEBOX_BACKEND_VARIANT") == "rocm"
before touching HSA_OVERRIDE_GFX_VERSION and MIOPEN_LOG_LEVEL; then detect or
validate the GPU architecture (e.g., by probing ROCm-provided env vars or system
GPU info) and only set HSA_OVERRIDE_GFX_VERSION="10.3.0" when the detected GPU
matches known RDNA2 gfx codes, otherwise skip setting the override and log a
warning; update the block that sets HSA_OVERRIDE_GFX_VERSION and
MIOPEN_LOG_LEVEL to perform these checks and avoid applying RDNA2-specific
overrides on non-rocm backends or non-RDNA2 GPUs.
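A minimal sketch of the gated block, assuming the caller supplies a detected gfx code; the RDNA2 code set reflects generally documented gfx103x identifiers, and the MIOPEN_LOG_LEVEL value is illustrative:

```python
import logging
import os
from typing import Optional

logger = logging.getLogger(__name__)

# RDNA2 architectures for which the 10.3.0 override is appropriate.
RDNA2_GFX_CODES = {"gfx1030", "gfx1031", "gfx1032", "gfx1033", "gfx1034", "gfx1035"}

def apply_amd_env_overrides(detected_gfx: Optional[str]) -> None:
    """Set AMD-specific env vars only on the ROCm backend, and only for
    architectures the RDNA2 override is known to match."""
    if os.environ.get("VOICEBOX_BACKEND_VARIANT") != "rocm":
        return
    if detected_gfx in RDNA2_GFX_CODES:
        os.environ.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")
    else:
        logger.warning("Skipping HSA_OVERRIDE_GFX_VERSION for arch %r", detected_gfx)
    os.environ.setdefault("MIOPEN_LOG_LEVEL", "3")  # value illustrative
```

With this shape, Linux, NVIDIA, CPU, and RDNA3/4 users never receive the RDNA2 override.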
In `@backend/build_binary.py`:
- Around line 455-481: The CPU-build detection only checks torch.version.cuda;
update the detection in the subprocess.run snippet that sets has_cuda_torch to
instead inspect both torch.version.cuda and torch.version.hip (e.g., treat hip
non-empty as ROCm), and set restore_torch = "rocm" when only HIP is present;
then extend the finally block that currently handles restore_torch == "cuda" to
also handle restore_torch == "rocm" by reinstalling the ROCm-specific torch
packages (or restoring to the intended CPU fallback) using the same pip
subprocess pattern used for CUDA restoration so ROCm-only venvs are properly
restored.
---
Nitpick comments:
In `@app/src/i18n/locales/en/translation.json`:
- Around line 686-753: The localization keys are inconsistent between CUDA and
ROCm and the switchToCpu description is backend-specific; update either by
renaming the CUDA keys "settings.gpu.download" and "settings.gpu.remove" to
"settings.gpu.downloadCuda" and "settings.gpu.removeCuda" to match
"settings.gpu.downloadRocm"/"settings.gpu.removeRocm", or (simpler) make
"settings.gpu.switchToCpu.description" backend-agnostic by replacing "You can
re-download CUDA later." with a neutral string like "You can re-download the GPU
backend later."; after changing keys, update all code/UI references that use
"download"/"remove" or the old description to the new keys/phrase so
translations remain consistent.
In `@backend/routes/health.py`:
- Around line 104-117: Duplicate ROCm-vs-CUDA detection logic appears in
backend/routes/health.py (gpu_type assignment), backend/app.py::_get_gpu_status,
and backend/backends/base.py::check_cuda_compatibility; extract that logic into
a shared helper (e.g., get_cuda_flavor_name() returning "ROCm" or "CUDA" or
None) placed in a common module, then replace the inline checks in those three
locations to call the helper (use its return to build the gpu_type string where
needed) so labeling stays consistent and avoids drift.
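The shared helper could look like this sketch; taking the torch module as a parameter (instead of importing it) keeps the helper testable with fakes, though the real codebase may prefer a plain import:

```python
from typing import Optional

def get_cuda_flavor_name(torch_module) -> Optional[str]:
    """Return "ROCm" for HIP builds of torch, "CUDA" for CUDA builds,
    and None when neither accelerator flavor is present."""
    version = getattr(torch_module, "version", None)
    if version is None:
        return None
    if getattr(version, "hip", None):
        return "ROCm"
    if getattr(version, "cuda", None):
        return "CUDA"
    return None
```

health.py, app.py, and base.py would then build their gpu_type strings from this one return value.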
In `@backend/routes/rocm.py`:
- Around line 43-44: The route is returning the literal progress key
"rocm-backend" instead of using the service constant; replace the hard-coded
string with the service's constant (use rocm.PROGRESS_KEY from services.rocm) in
both places where the route returns {"message": ..., "progress_key":
"rocm-backend"} (the block that calls create_background_task(_download()) and
the similar block around lines 65-72) so responses and SSE subscriptions use the
canonical key; ensure you reference the existing rocm import/variable used by
the module (e.g., services.rocm or the local rocm name) when substituting the
value.
In `@backend/server.py`:
- Around line 267-276: The current detection uses fragile substring checks on
binary_name and sets VOICEBOX_BACKEND_VARIANT, which can misdetect and break
is_rocm_active()/backend/routes/rocm.py deletion guards; replace the substring
logic in the block that computes binary_name to perform an explicit match (e.g.,
compare the basename against a whitelist of allowed variants or use a strict
regex that matches whole-word/suffix like ^(rocm|cuda|cpu)$) and only set
os.environ["VOICEBOX_BACKEND_VARIANT"] when the match is exact; ensure the code
documents/uses the same canonical variant strings ("rocm","cuda","cpu") so
is_rocm_active() and the ROCm route rely on the exact env var value.
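The whole-name match the comment asks for can be expressed as an anchored regex over the basename; a sketch (function and pattern names are illustrative, not the PR's code):

```python
import os
import re

# Anchored: matches the whole basename, so "my-rocm-notes.txt" cannot misdetect.
VARIANT_PATTERN = re.compile(r"^voicebox-server(?:-(rocm|cuda))?(?:\.exe)?$")

def detect_backend_variant(executable_path: str) -> str:
    """Map a server binary path to a canonical variant: "rocm", "cuda", or "cpu"."""
    name = os.path.basename(executable_path).lower()
    m = VARIANT_PATTERN.match(name)
    if m and m.group(1):
        return m.group(1)
    return "cpu"

assert detect_backend_variant("/opt/app/voicebox-server-rocm") == "rocm"
assert detect_backend_variant("C:/app/voicebox-server-cuda.exe") == "cuda"
assert detect_backend_variant("/opt/app/voicebox-server") == "cpu"
# A substring check would wrongly flag this path as ROCm; the anchor does not.
assert detect_backend_variant("/opt/app/my-rocm-notes.txt") == "cpu"
```

`is_rocm_active()` can then compare the env var against the exact string `"rocm"` with no substring ambiguity.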
In `@backend/tests/test_amd_gpu_detect.py`:
- Line 12: The file imports pytest but never uses it; remove the unused import
statement (the top-level "import pytest") from
backend/tests/test_amd_gpu_detect.py or, if you intended to use pytest fixtures
or marks, reference them in the test functions (e.g., add a pytest.mark or
fixture parameter) so the import is actually used; update the import accordingly
to eliminate the linter/test warning.
In `@backend/tests/test_rocm_backends.py`:
- Line 13: Remove the unused pytest import from the test module; locate the
top-level import statement "import pytest" in
backend/tests/test_rocm_backends.py and delete it since no pytest markers,
fixtures, or functions reference it.
- Around line 27-35: The test test_rocm_skips_compute_check patches
torch.version.hip in a way that can raise AttributeError if the attribute is
missing; update the patch to create the attribute when absent (e.g., use
patch("torch.version.hip", "6.2.41133", create=True) or
patch.object(torch.version, "hip", "6.2.41133")) so the test reliably exercises
check_cuda_compatibility() on nonstandard torch builds without failing due to
missing attributes.
In `@backend/utils/platform_detect.py`:
- Around line 54-62: The current check uses torch.cuda.get_device_name(0) and
thus only inspects device 0, which can misreport on multi-GPU/hybrid systems;
update the logic in platform_detect.py (the torch import block using
torch.cuda.is_available and torch.cuda.get_device_name) to iterate over all
visible CUDA devices via torch.cuda.device_count() and call
torch.cuda.get_device_name(i) for each i, returning True if any device name
contains "Radeon" or "AMD", or alternatively adjust the docstring to explicitly
state the function only reflects the torch-visible device 0 rather than the
primary system GPU.
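The multi-device variant the comment asks for is a small loop; a sketch (the function name is illustrative, and it still only reflects devices torch can see):

```python
def torch_sees_amd_gpu() -> bool:
    """Return True if ANY torch-visible CUDA device reports an AMD/Radeon
    name, instead of only inspecting device 0."""
    try:
        import torch
    except ImportError:
        return False  # no torch installed: nothing to inspect
    if not torch.cuda.is_available():
        return False
    for i in range(torch.cuda.device_count()):
        name = torch.cuda.get_device_name(i).lower()
        if "amd" in name or "radeon" in name:
            return True
    return False
```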
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1ee16310-cf77-4b25-8f09-af8393f21fb0
📒 Files selected for processing (30)
app/src/components/ServerSettings/GpuAcceleration.test.tsx, app/src/components/ServerSettings/GpuAcceleration.tsx, app/src/components/ServerTab/GpuPage.tsx, app/src/i18n/locales/en/translation.json, app/src/lib/api/client.ts, app/src/lib/api/models/ModelStatus.ts, app/src/lib/api/types.ts, app/src/platform/types.ts, app/src/test/setup.ts, app/vite.config.ts, backend/app.py, backend/backends/base.py, backend/backends/hume_backend.py, backend/backends/qwen_custom_voice_backend.py, backend/build_binary.py, backend/pyi_rth_rocm_sdk.py, backend/requirements-rocm.txt, backend/routes/__init__.py, backend/routes/health.py, backend/routes/rocm.py, backend/server.py, backend/services/rocm.py, backend/tests/test_amd_gpu_detect.py, backend/tests/test_rocm_backends.py, backend/tests/test_rocm_build.py, backend/tests/test_rocm_download.py, backend/tests/test_rocm_requirements.py, backend/utils/platform_detect.py, tauri/src-tauri/src/main.rs, tauri/src/platform/lifecycle.ts
```python
try:
    result = subprocess.run(
        [
            "powershell",
            "-Command",
            "Get-CimInstance Win32_VideoController | "
            "Where-Object {$_.AdapterCompatibility -like '*AMD*'} | "
            "Measure-Object | Select-Object -ExpandProperty Count",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    if int(result.stdout.strip()) > 0:
        return True
except Exception:
    pass
```
Add a timeout and suppress the console window on the PowerShell call.

Two practical issues with the WMI probe in a GUI/Tauri context:

- No `timeout=`. If PowerShell or WMI hangs (domain environments, corrupt CIM repository, AV hooks), `subprocess.run` blocks indefinitely. The broad `except` won't help because nothing is raised.
- Console window flash on Windows. When the Tauri-launched server process invokes `powershell.exe` without `CREATE_NO_WINDOW` (or `STARTF_USESHOWWINDOW`/`SW_HIDE`), a console window can briefly pop up, visible to end users on the exact platform this PR targets.

Also worth considering: PowerShell cold start is slow (~0.5–2s). If this is ever called outside tests (e.g., on a startup/health path), prefer `wmic` or a cached result.
🛠️ Proposed fix

```diff
 # Primary method: WMI query for AMD adapters
 try:
+    creationflags = 0
+    if hasattr(subprocess, "CREATE_NO_WINDOW"):
+        creationflags = subprocess.CREATE_NO_WINDOW  # Windows: suppress console flash
     result = subprocess.run(
         [
             "powershell",
+            "-NoProfile",
             "-Command",
             "Get-CimInstance Win32_VideoController | "
             "Where-Object {$_.AdapterCompatibility -like '*AMD*'} | "
             "Measure-Object | Select-Object -ExpandProperty Count",
         ],
         capture_output=True,
         text=True,
         check=True,
+        timeout=10,
+        creationflags=creationflags,
     )
     if int(result.stdout.strip()) > 0:
         return True
 except Exception:
     pass
```

`-NoProfile` additionally shaves startup time by skipping user/system PowerShell profiles.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/utils/platform_detect.py` around lines 35 - 51, The PowerShell call
in platform_detect.py can hang and can briefly show a console window; update the
subprocess.run invocation used to probe Win32_VideoController to (1) add a
timeout parameter (e.g., 1–3 seconds) and handle subprocess.TimeoutExpired so
the function returns False on timeout, (2) include "-NoProfile" in the
PowerShell args to speed startup, and (3) pass Windows-specific flags to
suppress a console window (use subprocess.CREATE_NO_WINDOW or the
STARTUPINFO/STARTF_USESHOWWINDOW/ SW_HIDE pattern) so no window flashes; ensure
the existing result parsing (int(result.stdout.strip())) remains guarded by the
new exception handling and that other exceptions still return False.
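Taken together, the review's suggestions amount to a probe along these lines (a sketch with an illustrative function name and timeout default, not the PR's actual code):

```python
import subprocess

def has_amd_gpu_wmi(timeout_s: float = 3.0) -> bool:
    """Probe Win32_VideoController for AMD adapters; return False on any failure.

    Bounded by `timeout_s` so a hung WMI/PowerShell call cannot block startup,
    and run with CREATE_NO_WINDOW (where available) to avoid a console flash.
    """
    # CREATE_NO_WINDOW only exists on Windows builds of Python.
    creationflags = getattr(subprocess, "CREATE_NO_WINDOW", 0)
    try:
        result = subprocess.run(
            [
                "powershell",
                "-NoProfile",  # skip profiles to shave cold-start time
                "-Command",
                "Get-CimInstance Win32_VideoController | "
                "Where-Object {$_.AdapterCompatibility -like '*AMD*'} | "
                "Measure-Object | Select-Object -ExpandProperty Count",
            ],
            capture_output=True,
            text=True,
            check=True,
            timeout=timeout_s,
            creationflags=creationflags,
        )
        return int(result.stdout.strip()) > 0
    except (subprocess.SubprocessError, OSError, ValueError):
        # TimeoutExpired and CalledProcessError are SubprocessError subclasses;
        # OSError covers hosts where powershell.exe does not exist at all.
        return False
```

On a non-Windows host the `powershell` launch raises `FileNotFoundError` (an `OSError`), so the probe degrades to `False` instead of crashing.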
Force-pushed from 50e0b26 to a9ffccc
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
app/vite.config.ts (1)

4-18: ⚠️ Potential issue | 🔴 Critical

Add missing `vitest` and `jsdom` devDependencies.

The code imports `defineConfig` from `'vitest/config'` and configures the test environment as `'jsdom'`, but neither `vitest` nor `jsdom` are declared in `app/package.json`. These must be added to devDependencies for the tests to run.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/vite.config.ts` around lines 4 - 18, The project config (defineConfig in app/vite.config.ts) uses vitest and sets the test environment to 'jsdom' but package.json is missing those devDependencies; update app/package.json devDependencies to include "vitest" and "jsdom" (add compatible versions or match the repo's toolchain), run npm/yarn install, and ensure any test setup referencing setupFiles remains intact so vitest can run with the jsdom environment.

backend/server.py (1)
263-277: ⚠️ Potential issue | 🟠 Major

Set `VOICEBOX_BACKEND_VARIANT` before importing `backend.main`.

Line 263 detects the backend after `from backend.main import app` has already run, so import-time ROCm setup in `backend/app.py` sees no `"rocm"` variant and skips `HSA_OVERRIDE_GFX_VERSION`/`MIOPEN_LOG_LEVEL`.

Proposed direction

```diff
+def _set_backend_variant_from_executable():
+    import re
+
+    binary_name = os.path.basename(sys.executable).lower()
+    if re.search(r"voicebox-server-rocm(\.exe)?$", binary_name):
+        os.environ["VOICEBOX_BACKEND_VARIANT"] = "rocm"
+        return "ROCm"
+    if re.search(r"voicebox-server-cuda(\.exe)?$", binary_name):
+        os.environ["VOICEBOX_BACKEND_VARIANT"] = "cuda"
+        return "CUDA"
+    os.environ["VOICEBOX_BACKEND_VARIANT"] = "cpu"
+    return "CPU"
+
+# Must run before importing backend.main/app.
+_backend_variant_label = _set_backend_variant_from_executable()
+logger.info(f"Backend variant: {_backend_variant_label}")
+
 try:
     logger.info("Importing argparse...")
```

Then remove the later duplicate detection block in `__main__`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/server.py` around lines 263 - 277, Move the backend-variant detection so VOICEBOX_BACKEND_VARIANT is set before backend.main is imported: extract the detection logic (the code that inspects sys.executable and sets os.environ["VOICEBOX_BACKEND_VARIANT"]) into a small helper or place it at the top of backend/server.py and ensure it runs before the line "from backend.main import app" so backend/app.py can see "rocm" and apply HSA_OVERRIDE_GFX_VERSION / MIOPEN_LOG_LEVEL at import time; then remove the later duplicate detection block in __main__ to avoid redundant/late detection.
♻️ Duplicate comments (9)
backend/tests/test_rocm_download.py (2)
103-112: ⚠️ Potential issue | 🟡 Minor

Use `self` for the nested async context manager receiver.

Ruff flags these method receivers as N805; rename `s` to `self`.

Proposed fix

```diff
 class _Streamer:
-    async def __aenter__(s):
+    async def __aenter__(self):
         return resp

-    async def __aexit__(s, *args):
+    async def __aexit__(self, *args):
         return False

-    async def aiter_bytes(s, chunk_size: int = 1024):
+    async def aiter_bytes(self, chunk_size: int = 1024):
         for i in range(0, len(resp.content), chunk_size):
             yield resp.content[i : i + chunk_size]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_rocm_download.py` around lines 103 - 112, Rename the nested async context manager receiver from `s` to `self` in class `_Streamer`: update the method signatures for `__aenter__`, `__aexit__`, and `aiter_bytes` to use `self` instead of `s` so they follow the expected instance naming convention and satisfy the N805 lint rule.
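As a standalone illustration, a chunked async streamer with conventional `self` receivers looks like this (class and helper names are invented for the example, not the PR's test code):

```python
import asyncio

class FakeStreamer:
    """Minimal async context manager that yields a payload in fixed-size chunks."""

    def __init__(self, payload: bytes):
        self.payload = payload

    async def __aenter__(self):
        return self

    async def __aexit__(self, *exc_info):
        return False  # do not suppress exceptions

    async def aiter_bytes(self, chunk_size: int = 1024):
        for i in range(0, len(self.payload), chunk_size):
            yield self.payload[i : i + chunk_size]

async def collect_chunks(payload: bytes, chunk_size: int) -> list[bytes]:
    async with FakeStreamer(payload) as stream:
        return [chunk async for chunk in stream.aiter_bytes(chunk_size)]

print(asyncio.run(collect_chunks(b"abcdefgh", 3)))  # [b'abc', b'def', b'gh']
```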
139-146: ⚠️ Potential issue | 🟡 Minor

Remove the placeholder-free f-strings.

Lines 140 and 144 have no interpolation and will trigger Ruff F541.

Proposed fix

```diff
-    f"https://github.com/jamiepine/voicebox/releases/download/v0.2.3/voicebox-server-rocm.tar.gz": FakeResponse(
+    "https://github.com/jamiepine/voicebox/releases/download/v0.2.3/voicebox-server-rocm.tar.gz": FakeResponse(
         content=fake_tar_gz,
         headers={"content-length": str(len(fake_tar_gz))},
     ),
-    f"https://github.com/jamiepine/voicebox/releases/download/v0.2.3/voicebox-server-rocm.tar.gz.sha256": FakeResponse(
+    "https://github.com/jamiepine/voicebox/releases/download/v0.2.3/voicebox-server-rocm.tar.gz.sha256": FakeResponse(
         content=f"{server_sha}  voicebox-server-rocm.tar.gz\n".encode(),
     ),
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_rocm_download.py` around lines 139 - 146, The tests use unnecessary f-strings for static URLs which triggers Ruff F541; in the responses mapping (the dict assigning to responses) replace the placeholder-free f-strings for the keys pointing to the ROCm tarball and its .sha256 (the entries that currently construct URLs with f".../voicebox-server-rocm.tar.gz" and f".../voicebox-server-rocm.tar.gz.sha256") with plain string literals, keeping the same content and preserving the FakeResponse(...) values that reference fake_tar_gz and server_sha.

backend/tests/test_rocm_requirements.py (2)
54-60: ⚠️ Potential issue | 🟡 Minor

Make these assertions package-aware.

`"torch" in content` can be satisfied by `torchaudio` or `torchvision`; parse requirement names or use anchored matches.

Proposed fix

```diff
 def test_requirements_file_content(self, backend_dir):
     req_file = backend_dir / "requirements-rocm.txt"
     content = req_file.read_text()
+    requirement_lines = {
+        line.strip().split("==", 1)[0].split(">=", 1)[0].split("~=", 1)[0]
+        for line in content.splitlines()
+        if line.strip() and not line.lstrip().startswith(("#", "--"))
+    }
     assert "rocm7.2" in content, "Must point to ROCm 7.2 extra index"
-    assert "torch" in content, "Must include torch"
-    assert "torchaudio" in content, "Must include torchaudio"
-    assert "torchvision" in content, "Must include torchvision"
+    assert "torch" in requirement_lines, "Must include torch"
+    assert "torchaudio" in requirement_lines, "Must include torchaudio"
+    assert "torchvision" in requirement_lines, "Must include torchvision"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_rocm_requirements.py` around lines 54 - 60, The assertions in test_requirements_file_content are too broad because checks like "torch" in content can match "torchaudio" or "torchvision"; update the test to parse the requirements file (req_file/content) into individual package names (split by lines, strip comments/whitespace and version specifiers/extras) or use anchored regex matches (e.g., line-start/name boundary) to assert presence of exact packages "torch", "torchaudio", and "torchvision" so each assertion targets the canonical package name rather than substring matches.
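One way to make those membership checks exact is to reduce the file to bare package names first; a sketch (the helper name is invented for illustration):

```python
def requirement_names(content: str) -> set[str]:
    """Return bare package names from requirements.txt-style text,
    skipping comments, blank lines, and pip options like --extra-index-url."""
    names = set()
    for raw in content.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "--")):
            continue
        # Cut at the first specifier/extras/marker token to isolate the name.
        for sep in ("==", ">=", "<=", "~=", "!=", ">", "<", "[", ";", " "):
            idx = line.find(sep)
            if idx != -1:
                line = line[:idx]
        names.add(line)
    return names

reqs = """\
--extra-index-url https://example.invalid/whl/rocm7.2
torch==2.9.1+rocm7.2.1
torchaudio>=2.9
torchvision
"""
assert requirement_names(reqs) == {"torch", "torchaudio", "torchvision"}
# Exact membership: "torch" no longer matches inside "torchaudio".
assert "torch" not in requirement_names("torchaudio==2.9")
```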
62-122: ⚠️ Potential issue | 🟠 Major

Gate this heavyweight ROCm install test behind an explicit opt-in.

This creates a venv, upgrades pip, and installs ROCm wheels from the network during normal test runs, which makes CI platform/network dependent.

Proposed fix

```diff
 import platform
+import os
 import subprocess
 import sys
@@
     @pytest.mark.timeout(900)
     def test_rocm_torch_installs_and_detects_amd(self, backend_dir):
         """
         Create a temporary venv, install requirements-rocm.txt, and verify
         torch.cuda.is_available() returns True on AMD hardware.
         """
+        if os.environ.get("RUN_ROCM_INSTALL_TEST") != "1":
+            pytest.skip("Set RUN_ROCM_INSTALL_TEST=1 to run ROCm install integration test")
+
         req_file = backend_dir / "requirements-rocm.txt"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_rocm_requirements.py` around lines 62 - 122, Gate the heavyweight test test_rocm_torch_installs_and_detects_amd behind an explicit opt-in: at the top of the test (in backend/tests/test_rocm_requirements.py) check for an env var like RUN_ROCM_TESTS or a pytest marker and call pytest.skip with a clear message if not set; alternatively decorate the function with `@pytest.mark.skipif`(os.getenv("RUN_ROCM_TESTS") is None, reason="heavy ROCm install test; opt-in via RUN_ROCM_TESTS=1 or -m rocm") and document that CI must set RUN_ROCM_TESTS=1 or use -m rocm to run this test. Ensure you import os and pytest and update any CI configs to set the opt-in when desired.

tauri/src-tauri/src/main.rs (2)
421-435: ⚠️ Potential issue | 🟠 Major

Fall back to CUDA/CPU when ROCm spawn fails immediately.

If the ROCm binary exists but `cmd.spawn()` returns `Err`, startup fails instead of trying the available CUDA or CPU backend.

Proposed direction

Refactor launch into an ordered candidate loop: ROCm → CUDA → CPU, skipping GPU candidates when `backend_override == Some("cpu")`, and continue to the next candidate on immediate `spawn()` errors.

```diff
-    cmd.spawn()
+    match cmd.spawn() {
+        Ok(result) => Ok(result),
+        Err(e) => {
+            eprintln!("Failed to spawn ROCm backend, trying fallback: {}", e);
+            // continue with CUDA, then CPU
+            Err(e)
+        }
+    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tauri/src-tauri/src/main.rs` around lines 421 - 435, The ROCm launch block (checking backend_override, rocm_binary, building cmd and calling cmd.spawn()) treats an immediate cmd.spawn() Err as fatal and prevents trying cuda_binary or CPU; change the logic so spawn errors for a candidate backend are handled by logging/returning the Err locally and continuing to the next candidate (CUDA then CPU) rather than failing the whole startup. Refactor the launch into an ordered candidate loop (or sequential attempts) that respects backend_override == Some("cpu") to skip GPU candidates, attempts ROCm via rocm_binary then CUDA via cuda_binary then CPU, and on each candidate wraps cmd.spawn() so Err results do not short-circuit but cause moving to the next candidate while preserving/aggregating errors for a final failure; ensure environment handling (effective_models_dir, is_remote), cwd (rocm_path.parent()), and cmd construction using app.shell().command(...) remain per-candidate.
272-276: ⚠️ Potential issue | 🟠 Major

Add a timeout around the ROCm `--version` probe.

A corrupted or incompatible ROCm binary can hang this synchronous `.output()` call and block `start_server` before CUDA/CPU fallback.

Proposed direction

Spawn the probe process with piped output, wait with a small bounded timeout, kill it on expiry, and treat timeout as `version_ok = false`.

```diff
-    let version_ok = match std::process::Command::new(&exe_path)
-        .arg("--version")
-        .current_dir(&rocm_dir)
-        .output()
-    {
+    let version_ok = match run_version_probe_with_timeout(
+        &exe_path,
+        &rocm_dir,
+        std::time::Duration::from_secs(5),
+    ) {
```

🤖 Prompt for AI Agents
79-92: ⚠️ Potential issue | 🟠 Major

Gate the ROCm E2E build test to Windows.

This still hard-codes `voicebox-server-rocm.exe`, so non-Windows slow test runs will fail even before validating ROCm packaging.

🐛 Proposed fix

```diff
-@pytest.mark.slow()
+@pytest.mark.slow
+@pytest.mark.skipif(sys.platform != "win32", reason="ROCm E2E build uses Windows ROCm wheels/output")
 class TestRocmBuildE2E:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_rocm_build.py` around lines 79 - 92, The test TestRocmBuildE2E::test_rocm_binary_compiles_and_runs hard-codes a Windows executable name (binary_exe = ... "voicebox-server-rocm.exe") which will fail on non-Windows runners; update the test to gate this E2E to Windows or compute the executable name based on platform (use sys.platform or os.name) so binary_exe is "voicebox-server-rocm.exe" on Windows and "voicebox-server-rocm" elsewhere, and/or skip the test when not running on Windows (e.g., pytest.skip or pytest.mark.skipif) so the ROCm packaging check only runs where the .exe name is valid, modifying the variables binary_exe and any related assertions in test_rocm_binary_compiles_and_runs accordingly.

backend/services/rocm.py (2)
293-312: ⚠️ Potential issue | 🟠 Major

Stage libs-only updates too.

The server update path is staged, but when only `need_libs` is true this still extracts directly into the active backend. A failed libs extraction can corrupt the installed ROCm runtime while `get_rocm_binary_path()` still reports the backend as available.

🐛 Proposed fix

```diff
-    # When re-downloading the server core, extract into a staging directory and
+    # Extract updates into a staging directory and
     # atomically rename it over rocm_dir on success. This prevents a failed
     # mid-extraction from leaving rocm_dir in a partially-installed state that
     # passes the get_rocm_binary_path() existence check.
-    #
-    # If only the libs need updating (server already installed), we skip staging
-    # and extract directly so we don't have to copy the existing server files.
-    use_staging = need_server
+    use_staging = need_server or need_libs
     staging_dir = get_backends_dir() / "rocm-staging"
     if use_staging:
         if staging_dir.exists():
             shutil.rmtree(staging_dir)
         staging_dir.mkdir(parents=True, exist_ok=True)
-        # Preserve already-cached libs so we don't force a re-download
-        if not need_libs and rocm_dir.exists():
+        # Preserve the existing backend; updated archives overwrite staged files.
+        if rocm_dir.exists():
             shutil.copytree(rocm_dir, staging_dir, dirs_exist_ok=True)
         extract_dir = staging_dir
```

Also applies to: 353-383
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/rocm.py` around lines 293 - 312, The staging logic should also run when only libraries are being updated: change the use_staging assignment to use_staging = need_server or need_libs so libs-only updates extract into the staging_dir, and when preparing the staging_dir always copy the existing rocm_dir into staging (i.e., replace the current conditional shutil.copytree(rocm_dir, staging_dir, dirs_exist_ok=True) which checks not need_libs with a plain rocm_dir.exists() check) so existing server files are preserved in staging; make the same change for the second occurrence around the block that mirrors lines 353-383 (references: need_server, need_libs, use_staging, staging_dir, get_backends_dir(), rocm_dir, get_rocm_binary_path()) so a failed libs extraction cannot corrupt the active backend.
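In isolation, the staged, swap-on-success install pattern looks roughly like this (a generic sketch with invented names, not the service's code; the final rmtree-plus-rename swap is near-atomic rather than truly atomic):

```python
import os
import shutil
import tarfile
from pathlib import Path

def install_archive_staged(archive: Path, target: Path) -> None:
    """Extract into a sibling staging dir, then swap it over `target`,
    so a failed extraction never leaves `target` half-installed."""
    staging = target.with_name(target.name + "-staging")
    if staging.exists():
        shutil.rmtree(staging)
    staging.mkdir(parents=True)
    try:
        # Preserve the existing install; archive contents overwrite staged copies.
        if target.exists():
            shutil.copytree(target, staging, dirs_exist_ok=True)
        with tarfile.open(archive, "r:gz") as tf:
            tf.extractall(staging)
    except BaseException:
        # Target is untouched on failure; only the staging dir is discarded.
        shutil.rmtree(staging, ignore_errors=True)
        raise
    # Swap only after extraction fully succeeded.
    if target.exists():
        shutil.rmtree(target)
    os.replace(staging, target)

# Demo: upgrade a fake install from a one-file archive.
import tempfile
base = Path(tempfile.mkdtemp())
src = base / "payload"
src.mkdir()
(src / "server.bin").write_text("new server")
archive = base / "update.tar.gz"
with tarfile.open(archive, "w:gz") as tf:
    tf.add(src / "server.bin", arcname="server.bin")

target = base / "rocm"
target.mkdir()
(target / "rocm-libs.json").write_text("{}")  # pre-existing file must survive

install_archive_staged(archive, target)
print(sorted(p.name for p in target.iterdir()))  # ['rocm-libs.json', 'server.bin']
```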
289-291: ⚠️ Potential issue | 🔴 Critical

Define `libs_base_url` before using it to download ROCm libraries.

`libs_base_url` is never assigned, so when ROCm libs need to be downloaded, lines 356-357 raise `NameError`. This blocks any fresh ROCm installation or update requiring libraries.

🐛 Proposed fix

```diff
     base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    libs_base_url = f"{GITHUB_RELEASES_URL}/{ROCM_LIBS_VERSION}"
     server_archive = "voicebox-server-rocm.tar.gz"
     libs_archive = f"rocm-libs-{ROCM_LIBS_VERSION}.tar.gz"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/rocm.py` around lines 289 - 291, The code sets base_url, server_archive and libs_archive but never defines libs_base_url, causing a NameError when attempting to download ROCm libs; add an assignment for libs_base_url (e.g., derive it from GITHUB_RELEASES_URL and version or reuse base_url) near where base_url/server_archive/libs_archive are declared so the later download calls that reference libs_base_url succeed; update the same block that defines base_url/server_archive/libs_archive to include libs_base_url.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/components/ServerTab/GpuPage.tsx`:
- Around line 320-338: There are duplicate handler declarations for
handleSwitchToCuda and handleSwitchToRocm; remove the earlier definitions (the
versions in the first block using platform.lifecycle.setBackendOverride and
restartServerWithPolling) and keep the later versions that include
setRestartPhase('stopping') before calling restartServerWithPolling; ensure only
one declaration of handleSwitchToCuda and one of handleSwitchToRocm remains and
that they call platform.lifecycle.setBackendOverride(...) and
restartServerWithPolling(...) as in the later implementations.
- Around line 344-345: The restart is happening even when setBackendOverride
fails because setBackendOverride in tauri/src/platform/lifecycle.ts currently
swallows errors; update setBackendOverride to rethrow any caught errors (or
remove the silent catch) so callers like GpuPage.tsx (where setBackendOverride
is invoked before restartServerWithPolling for CPU/CUDA/ROCm switches) only
proceed to await restartServerWithPolling when the backend override truly
succeeded; ensure the thrown error propagates to the GpuPage handlers so the
restart is gated on success and can show the failure to the user.
In `@backend/build_binary.py`:
- Around line 528-557: Add a fail-fast preflight that checks interpreter
implementation and version before the ROCm wheel install subprocess.run blocks:
verify sys.implementation.name == "cpython" and sys.version_info[:2] == (3, 12)
(or otherwise match the cp312 tag) and exit with a clear error if the check
fails; place this check immediately before the subprocess.run calls that install
the ROCm wheel URLs (the pip install lists containing
"torch-2.9.1%2Brocm7.2.1-cp312..." and the other rocm_* wheel/tar.gz URLs) so
the script aborts before partially modifying the environment when run under
non-CPython or a different minor Python version.
In `@backend/pyi_rth_rocm_sdk.py`:
- Around line 20-24: The current broad except hides all import errors; limit
suppression to only the absent package by replacing the single try/except
Exception around "import rocm_sdk" and "from rocm_sdk import _dist_info" with a
targeted import: first try "import rocm_sdk" and catch only ModuleNotFoundError
(or check e.name == 'rocm_sdk') and return, then perform "from rocm_sdk import
_dist_info" in a separate statement so any import failures for the submodule or
other exceptions propagate; ensure the code that patches find_libraries still
runs when rocm_sdk is present.
In `@backend/routes/rocm.py`:
- Around line 32-35: The current check only treats "downloading" as in-progress
so a concurrent request can proceed during extraction; update the check around
progress_manager.get_progress(rocm.PROGRESS_KEY) (using get_progress_manager() /
progress_manager) to consider extraction as in-progress too—e.g., treat status
== "downloading" OR status == "extracting" (or generally any non-terminal
status) as a conflict and raise the HTTPException to prevent a second request
from starting.
---
Outside diff comments:
In `@app/vite.config.ts`:
- Around line 4-18: The project config (defineConfig in app/vite.config.ts) uses
vitest and sets the test environment to 'jsdom' but package.json is missing
those devDependencies; update app/package.json devDependencies to include
"vitest" and "jsdom" (add compatible versions or match the repo's toolchain),
run npm/yarn install, and ensure any test setup referencing setupFiles remains
intact so vitest can run with the jsdom environment.
In `@backend/server.py`:
- Around line 263-277: Move the backend-variant detection so
VOICEBOX_BACKEND_VARIANT is set before backend.main is imported: extract the
detection logic (the code that inspects sys.executable and sets
os.environ["VOICEBOX_BACKEND_VARIANT"]) into a small helper or place it at the
top of backend/server.py and ensure it runs before the line "from backend.main
import app" so backend/app.py can see "rocm" and apply HSA_OVERRIDE_GFX_VERSION
/ MIOPEN_LOG_LEVEL at import time; then remove the later duplicate detection
block in __main__ to avoid redundant/late detection.
---
Duplicate comments:
In `@backend/services/rocm.py`:
- Around line 293-312: The staging logic should also run when only libraries are
being updated: change the use_staging assignment to use_staging = need_server or
need_libs so libs-only updates extract into the staging_dir, and when preparing
the staging_dir always copy the existing rocm_dir into staging (i.e., replace
the current conditional shutil.copytree(rocm_dir, staging_dir,
dirs_exist_ok=True) which checks not need_libs with a plain rocm_dir.exists()
check) so existing server files are preserved in staging; make the same change
for the second occurrence around the block that mirrors lines 353-383
(references: need_server, need_libs, use_staging, staging_dir,
get_backends_dir(), rocm_dir, get_rocm_binary_path()) so a failed libs
extraction cannot corrupt the active backend.
- Around line 289-291: The code sets base_url, server_archive and libs_archive
but never defines libs_base_url, causing a NameError when attempting to download
ROCm libs; add an assignment for libs_base_url (e.g., derive it from
GITHUB_RELEASES_URL and version or reuse base_url) near where
base_url/server_archive/libs_archive are declared so the later download calls
that reference libs_base_url succeed; update the same block that defines
base_url/server_archive/libs_archive to include libs_base_url.
In `@backend/tests/test_rocm_build.py`:
- Around line 79-92: The test
TestRocmBuildE2E::test_rocm_binary_compiles_and_runs hard-codes a Windows
executable name (binary_exe = ... "voicebox-server-rocm.exe") which will fail on
non-Windows runners; update the test to gate this E2E to Windows or compute the
executable name based on platform (use sys.platform or os.name) so binary_exe is
"voicebox-server-rocm.exe" on Windows and "voicebox-server-rocm" elsewhere,
and/or skip the test when not running on Windows (e.g., pytest.skip or
pytest.mark.skipif) so the ROCm packaging check only runs where the .exe name is
valid, modifying the variables binary_exe and any related assertions in
test_rocm_binary_compiles_and_runs accordingly.
In `@backend/tests/test_rocm_download.py`:
- Around line 103-112: Rename the nested async context manager receiver from `s`
to `self` in class `_Streamer`: update the method signatures for `__aenter__`,
`__aexit__`, and `aiter_bytes` to use `self` instead of `s` so they follow the
expected instance naming convention and satisfy the N805 lint rule.
- Around line 139-146: The tests use unnecessary f-strings for static URLs which
triggers Ruff F541; in the responses mapping (the dict assigning to responses)
replace the placeholder-free f-strings for the keys pointing to the ROCm tarball
and its .sha256 (the entries that currently construct URLs with
f".../voicebox-server-rocm.tar.gz" and
f".../voicebox-server-rocm.tar.gz.sha256") with plain string literals, keeping
the same content and preserving the FakeResponse(...) values that reference
fake_tar_gz and server_sha.
In `@backend/tests/test_rocm_requirements.py`:
- Around line 54-60: The assertions in test_requirements_file_content are too
broad because checks like "torch" in content can match "torchaudio" or
"torchvision"; update the test to parse the requirements file (req_file/content)
into individual package names (split by lines, strip comments/whitespace and
version specifiers/extras) or use anchored regex matches (e.g., line-start/name
boundary) to assert presence of exact packages "torch", "torchaudio", and
"torchvision" so each assertion targets the canonical package name rather than
substring matches.
- Around line 62-122: Gate the heavyweight test
test_rocm_torch_installs_and_detects_amd behind an explicit opt-in: at the top
of the test (in backend/tests/test_rocm_requirements.py) check for an env var
like RUN_ROCM_TESTS or a pytest marker and call pytest.skip with a clear message
if not set; alternatively decorate the function with
`@pytest.mark.skipif`(os.getenv("RUN_ROCM_TESTS") is None, reason="heavy ROCm
install test; opt-in via RUN_ROCM_TESTS=1 or -m rocm") and document that CI must
set RUN_ROCM_TESTS=1 or use -m rocm to run this test. Ensure you import os and
pytest and update any CI configs to set the opt-in when desired.
In `@tauri/src-tauri/src/main.rs`:
- Around line 421-435: The ROCm launch block (checking backend_override,
rocm_binary, building cmd and calling cmd.spawn()) treats an immediate
cmd.spawn() Err as fatal and prevents trying cuda_binary or CPU; change the
logic so spawn errors for a candidate backend are handled by logging/returning
the Err locally and continuing to the next candidate (CUDA then CPU) rather than
failing the whole startup. Refactor the launch into an ordered candidate loop
(or sequential attempts) that respects backend_override == Some("cpu") to skip
GPU candidates, attempts ROCm via rocm_binary then CUDA via cuda_binary then
CPU, and on each candidate wraps cmd.spawn() so Err results do not short-circuit
but cause moving to the next candidate while preserving/aggregating errors for a
final failure; ensure environment handling (effective_models_dir, is_remote),
cwd (rocm_path.parent()), and cmd construction using app.shell().command(...)
remain per-candidate.
- Around line 272-276: The synchronous Command::output() call used to set
version_ok
(Command::new(&exe_path).arg("--version").current_dir(&rocm_dir).output()) can
hang; replace it by spawning the process with piped stdout/stderr (use .spawn()
and Stdio::piped()), wait for completion with a bounded timeout (e.g., a few
seconds) and if the timeout expires call child.kill() and treat the probe as
failed (version_ok = false); on successful completion collect the output and set
version_ok based on the exit status/output. Target the code that defines
version_ok (the probe using exe_path and rocm_dir in start_server/main.rs) and
ensure proper cleanup of the child process and handling of errors/timeouts.

ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: df166566-e448-4606-933b-dd1baa2232b1
📒 Files selected for processing (30)
- app/src/components/ServerSettings/GpuAcceleration.test.tsx
- app/src/components/ServerSettings/GpuAcceleration.tsx
- app/src/components/ServerTab/GpuPage.tsx
- app/src/i18n/locales/en/translation.json
- app/src/lib/api/client.ts
- app/src/lib/api/models/ModelStatus.ts
- app/src/lib/api/types.ts
- app/src/platform/types.ts
- app/src/test/setup.ts
- app/vite.config.ts
- backend/app.py
- backend/backends/base.py
- backend/backends/hume_backend.py
- backend/backends/qwen_custom_voice_backend.py
- backend/build_binary.py
- backend/pyi_rth_rocm_sdk.py
- backend/requirements-rocm.txt
- backend/routes/__init__.py
- backend/routes/health.py
- backend/routes/rocm.py
- backend/server.py
- backend/services/rocm.py
- backend/tests/test_amd_gpu_detect.py
- backend/tests/test_rocm_backends.py
- backend/tests/test_rocm_build.py
- backend/tests/test_rocm_download.py
- backend/tests/test_rocm_requirements.py
- backend/utils/platform_detect.py
- tauri/src-tauri/src/main.rs
- tauri/src/platform/lifecycle.ts
✅ Files skipped from review due to trivial changes (5)
- app/src/test/setup.ts
- backend/routes/__init__.py
- app/src/lib/api/models/ModelStatus.ts
- app/src/lib/api/types.ts
- app/src/lib/api/client.ts
🚧 Files skipped from review as they are similar to previous changes (10)
- app/src/platform/types.ts
- backend/requirements-rocm.txt
- backend/backends/base.py
- tauri/src/platform/lifecycle.ts
- backend/app.py
- backend/routes/health.py
- backend/utils/platform_detect.py
- app/src/components/ServerSettings/GpuAcceleration.test.tsx
- app/src/i18n/locales/en/translation.json
- app/src/components/ServerSettings/GpuAcceleration.tsx
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/backends/qwen_custom_voice_backend.py`:
- Around line 110-130: Import the helper force_offline_if_cached at the top of
the module and ensure you only enter the offline patch when the model is
actually cached: wrap the Qwen3TTSModel.from_pretrained calls with
force_offline_if_cached(is_cached, model_name) as already attempted, but first
add the missing import (from backend.utils.hf_offline_patch import
force_offline_if_cached) and adjust backend/utils/hf_offline_patch.py so its
patching logic is gated behind the context manager (i.e., do not apply global HF
offline constants on import; apply them only inside the force_offline_if_cached
context) so _get_model_path + Qwen3TTSModel.from_pretrained run online for
uncached models without raising NameError or blocking downloads.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cc776e73-0e93-4093-844a-dcf9e125781a
📒 Files selected for processing (1)
backend/backends/qwen_custom_voice_backend.py
```diff
+from huggingface_hub import constants as hf_constants
+tts_cache_dir = hf_constants.HF_HUB_CACHE

 model_path = self._get_model_path(model_size)
 logger.info("Loading Qwen CustomVoice %s on %s...", model_size, self.device)

-if self.device == "cpu":
-    self.model = Qwen3TTSModel.from_pretrained(
-        model_path,
-        torch_dtype=torch.float32,
-        low_cpu_mem_usage=False,
-    )
-else:
-    self.model = Qwen3TTSModel.from_pretrained(
-        model_path,
-        device_map=self.device,
-        torch_dtype=torch.bfloat16,
-    )
+with force_offline_if_cached(is_cached, model_name):
+    if self.device == "cpu":
+        self.model = Qwen3TTSModel.from_pretrained(
+            model_path,
+            cache_dir=tts_cache_dir,
+            torch_dtype=torch.float32,
+            low_cpu_mem_usage=False,
+        )
+    else:
+        self.model = Qwen3TTSModel.from_pretrained(
+            model_path,
+            cache_dir=tts_cache_dir,
+            device_map=self.device,
+            torch_dtype=torch.bfloat16,
+        )
```
Import and gate force_offline_if_cached before wrapping model load.
Line 116 currently raises NameError because force_offline_if_cached is not imported. Also, the current implementation in backend/utils/hf_offline_patch.py patches offline constants before checking is_cached, so entering it for uncached models can still break online downloads.
🐛 Proposed fix

```diff
 import asyncio
+from contextlib import nullcontext
 import logging
 from typing import Optional
 import numpy as np
 import torch
+from ..utils.hf_offline_patch import force_offline_if_cached
 from . import TTSBackend, LANGUAGE_CODE_TO_NAME
 from .base import (
     is_model_cached,
     get_torch_device,
     combine_voice_prompts as _combine_voice_prompts,
@@
-        with force_offline_if_cached(is_cached, model_name):
+        offline_guard = (
+            force_offline_if_cached(True, model_name)
+            if is_cached
+            else nullcontext()
+        )
+
+        with offline_guard:
             if self.device == "cpu":
                 self.model = Qwen3TTSModel.from_pretrained(
                     model_path,
```

🧰 Tools
🪛 Ruff (0.15.10)
[error] 116-116: Undefined name `force_offline_if_cached` (F821)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/backends/qwen_custom_voice_backend.py` around lines 110 - 130, Import
the helper force_offline_if_cached at the top of the module and ensure you only
enter the offline patch when the model is actually cached: wrap the
Qwen3TTSModel.from_pretrained calls with force_offline_if_cached(is_cached,
model_name) as already attempted, but first add the missing import (from
backend.utils.hf_offline_patch import force_offline_if_cached) and adjust
backend/utils/hf_offline_patch.py so its patching logic is gated behind the
context manager (i.e., do not apply global HF offline constants on import; apply
them only inside the force_offline_if_cached context) so _get_model_path +
Qwen3TTSModel.from_pretrained run online for uncached models without raising
NameError or blocking downloads.
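The gating described above — apply the HF offline environment only inside the context manager, and only when the model is already cached — can be sketched with a plain `contextmanager` that saves and restores the environment. The variable names are illustrative; the real helper lives in backend/utils/hf_offline_patch.py:

```python
import os
from contextlib import contextmanager


@contextmanager
def force_offline_if_cached(is_cached: bool, model_name: str):
    """Set HF offline env vars for the duration of the block, but only
    when the model is already cached; otherwise do nothing so uncached
    models can still be downloaded."""
    if not is_cached:
        yield
        return
    overrides = {"HF_HUB_OFFLINE": "1", "TRANSFORMERS_OFFLINE": "1"}
    saved = {key: os.environ.get(key) for key in overrides}
    os.environ.update(overrides)
    try:
        yield
    finally:
        # Restore exactly what was there before, including "unset".
        for key, previous in saved.items():
            if previous is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = previous


before = os.environ.get("HF_HUB_OFFLINE")
with force_offline_if_cached(True, "qwen3-tts"):
    assert os.environ["HF_HUB_OFFLINE"] == "1"     # offline only inside the block
assert os.environ.get("HF_HUB_OFFLINE") == before  # restored afterwards
```

Because nothing is patched at import time, `_get_model_path` and `from_pretrained` stay online for uncached models, which is the failure mode the review flags.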
Force-pushed from cc8c501 to bdf476f (Compare)
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/build_binary.py (1)
455-628: ⚠️ Potential issue | 🟠 Major — Restore torch even when the pre-build ROCm install fails.

The `finally` block starts only at Line 566, so failures during the ROCm pip installs at Lines 534 or 549 can leave the developer environment partially switched. Wrap the torch swap/install phase in the same `try`/`finally`, and add `--no-deps` to restore installs to preserve the NumPy protection this PR adds.

Proposed structure

```diff
     restore_torch = None
-    if not cuda and not rocm and platform.system() == "Windows":
-        import subprocess
+    try:
+        if not cuda and not rocm and platform.system() == "Windows":
+            import subprocess
-        cuda_result = subprocess.run(
+            cuda_result = subprocess.run(
                 [sys.executable, "-c", "import torch; print(torch.version.cuda or '')"],
                 capture_output=True, text=True
-        )
+            )
@@
-    if rocm and platform.system() == "Windows":
-        import subprocess
+        if rocm and platform.system() == "Windows":
+            import subprocess
@@
-    # Run PyInstaller
-    try:
         PyInstaller.__main__.run(args)
     finally:
         # Restore torch if we swapped it out (even on build failure)
@@
                 "--index-url", "https://download.pytorch.org/whl/cu128",
                 "--force-reinstall",
+                "--no-deps",
                 "-q",
@@
                 "--index-url", "https://download.pytorch.org/whl/cpu",
                 "--force-reinstall",
+                "--no-deps",
                 "-q",
```

Apply the same wrapping to the full CPU/ROCm swap blocks and keep `--no-deps` on all restore paths.

🤖 Prompt for AI Agents
--no-depson all restore paths.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/build_binary.py` around lines 455 - 628, The restore logic currently in the finally block (checking restore_torch and running pip to restore CPU/CUDA/ROCm) does not protect against failures during the pre-build torch swap/install steps (the subprocess.run pip installs inside the Windows CPU/ROCm blocks and the ROCm torch install sequence), so move/extend the try/finally to wrap the entire torch-swap/install phase (the blocks that set restore_torch and call subprocess.run to install CPU or ROCm wheels) so that the finally always executes even if those pip installs fail; ensure restore_torch is determined before any environment-overwriting installs (as already done in the ROCm block), and when performing the restore pip installs in the finally (the branches that install torch/torchvision/torchaudio for "cpu","cuda","rocm") include the --no-deps flag on ROCm/restore installs where applicable (and keep existing --index-url for CPU/CUDA restores) so NumPy protection is preserved; keep PyInstaller.__main__.run(args) inside the try so finally still runs after the build attempt.app/src/components/ServerTab/GpuPage.tsx (1)
320-355:⚠️ Potential issue | 🟠 Major
restartPhasestays on'stopping'whensetBackendOverriderejects.All three switch handlers call
setRestartPhase('stopping')beforeawait platform.lifecycle.setBackendOverride(...). If the override call throws (now that it rethrows per the earlier review fix), control jumps to thecatchwhich sets an error and refetches status but never resetsrestartPhase. The page is then stuck rendering the "Stopping server…" spinner with no way out until navigation/remount.restartServerWithPolling's own catch block resets the phase, so only the pre-restart failure window is affected.Affects
handleSwitchToCpu(lines 320-331),handleSwitchToCuda(333-343), andhandleSwitchToRocm(345-355).🐛 Proposed fix (apply to all three handlers)
const handleSwitchToCpu = async () => { setError(null); setRestartPhase('stopping'); try { await platform.lifecycle.setBackendOverride('cpu'); await restartServerWithPolling(t('settings.gpu.errors.switchCpu')); } catch (e: unknown) { + setRestartPhase('idle'); setError(e instanceof Error ? e.message : t('settings.gpu.errors.switchCpu')); refetchCudaStatus(); refetchRocmStatus(); } };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/ServerTab/GpuPage.tsx` around lines 320 - 355, The handlers handleSwitchToCpu, handleSwitchToCuda and handleSwitchToRocm set restartPhase('stopping') before awaiting platform.lifecycle.setBackendOverride, so if setBackendOverride throws the catch never resets restartPhase and the UI remains stuck; fix by moving setRestartPhase('stopping') to after a successful await platform.lifecycle.setBackendOverride (i.e., just before calling restartServerWithPolling) or alternately wrap the setBackendOverride call in its own try/catch that resets restartPhase on error — apply this change to handleSwitchToCpu, handleSwitchToCuda and handleSwitchToRocm and keep their existing error/refetch behavior.
♻️ Duplicate comments (4)
backend/backends/qwen_custom_voice_backend.py (1)
108-130: ⚠️ Potential issue | 🔴 Critical — Missing import: `force_offline_if_cached` will raise `NameError` at load time.

`force_offline_if_cached` is referenced on line 116 but never imported in this module, so `_load_model_sync` will fail with `NameError` the first time a model load is attempted (Ruff F821 confirms). Add the import from `..utils.hf_offline_patch` at the top of the file.

🐛 Proposed fix

```diff
 from . import TTSBackend, LANGUAGE_CODE_TO_NAME
 from .base import (
     is_model_cached,
     get_torch_device,
     combine_voice_prompts as _combine_voice_prompts,
     model_load_progress,
 )
+from ..utils.hf_offline_patch import force_offline_if_cached
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/qwen_custom_voice_backend.py` around lines 108 - 130, The module references force_offline_if_cached inside the model loading block (used in _load_model_sync / the Qwen3TTSModel loading logic) but never imports it; add an import for force_offline_if_cached from ..utils.hf_offline_patch at the top of the file so the call in the with force_offline_if_cached(is_cached, model_name): block resolves and avoids a NameError at runtime.

backend/build_binary.py (1)
509-513: ⚠️ Potential issue | 🟡 Minor — Also reject non-CPython 3.12 for the cp312 ROCm wheels.

The hard-coded wheels are `cp312-cp312-win_amd64`; checking only `sys.version_info` still allows non-CPython 3.12 interpreters to reach the pip install path.

Proposed guard

```diff
-    if sys.version_info[:2] != (3, 12):
+    if sys.implementation.name != "cpython" or sys.version_info[:2] != (3, 12):
         raise RuntimeError(
-            f"ROCm wheels are cp312-specific; got Python {sys.version_info[:2]}. "
-            "Use Python 3.12 to build the ROCm binary."
+            "ROCm wheels are cp312-cp312-specific; "
+            f"got {sys.implementation.name} {sys.version.split()[0]}. "
+            "Use CPython 3.12 to build the ROCm binary."
         )
```

Verification:

```bash
#!/bin/bash
python - <<'PY'
import sys
print("implementation:", sys.implementation.name)
print("version:", sys.version.split()[0])
print("expected:", "cpython 3.12.x")
PY
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/build_binary.py` around lines 509 - 513, The current guard only checks sys.version_info and allows non-CPython interpreters; update the check around the existing sys.version_info[:2] comparison to also verify sys.implementation.name == "cpython" (or equivalent) and raise the same RuntimeError when either the Python major/minor isn't 3.12 or the implementation is not CPython so that only CPython 3.12 interpreters proceed to build/install the cp312 ROCm wheels; locate and modify the block that currently inspects sys.version_info[:2] in build_binary.py to enforce both conditions.backend/services/rocm.py (2)
369-373:⚠️ Potential issue | 🟠 MajorAvoid deleting the working ROCm backend before the replacement is installed.
shutil.rmtree(rocm_dir)followed bystaging_dir.rename(rocm_dir)is not atomic; if the rename fails after deletion, the previous working backend is gone. Rename the current dir to a backup first, then roll back on failure.Proposed safer swap
# Atomic swap: replace rocm_dir with the fully-extracted staging dir if use_staging: + backup_dir = get_backends_dir() / "rocm-backup" + if backup_dir.exists(): + shutil.rmtree(backup_dir) if rocm_dir.exists(): - shutil.rmtree(rocm_dir) - staging_dir.rename(rocm_dir) + rocm_dir.rename(backup_dir) + try: + staging_dir.rename(rocm_dir) + except Exception: + if backup_dir.exists() and not rocm_dir.exists(): + backup_dir.rename(rocm_dir) + raise + else: + if backup_dir.exists(): + shutil.rmtree(backup_dir)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/rocm.py` around lines 369 - 373, The current non-atomic swap deletes rocm_dir then renames staging_dir, risking total loss if rename fails; instead, when use_staging is true perform an atomic-safe swap by first renaming existing rocm_dir to a temporary backup name (e.g., rocm_dir.with_suffix(".bak") or similar), then rename staging_dir to rocm_dir, and on any failure attempt to rename the backup back to rocm_dir; finally remove the backup only after successful rename; update the block referencing use_staging, rocm_dir, staging_dir to implement this rename-to-backup -> rename-staging -> cleanup-or-rollback sequence and avoid using shutil.rmtree before a successful replacement.
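The rename-to-backup sequence above can be exercised in isolation. This sketch uses temporary directories and an illustrative helper name rather than the real backends dir, but implements the same backup → swap → cleanup-or-rollback order:

```python
import shutil
from pathlib import Path
from tempfile import TemporaryDirectory


def swap_in_staging(rocm_dir: Path, staging_dir: Path) -> None:
    """Replace rocm_dir with staging_dir, keeping a backup so a failed
    rename never leaves us with no backend at all."""
    backup_dir = rocm_dir.with_name(rocm_dir.name + "-backup")
    if backup_dir.exists():
        shutil.rmtree(backup_dir)
    if rocm_dir.exists():
        rocm_dir.rename(backup_dir)          # keep the old install safe
    try:
        staging_dir.rename(rocm_dir)         # promote the new install
    except Exception:
        # Roll back: restore the previous working backend.
        if backup_dir.exists() and not rocm_dir.exists():
            backup_dir.rename(rocm_dir)
        raise
    else:
        if backup_dir.exists():
            shutil.rmtree(backup_dir)        # only delete once swap succeeded


with TemporaryDirectory() as tmp:
    root = Path(tmp)
    rocm = root / "rocm"
    staging = root / "rocm-staging"
    rocm.mkdir()
    (rocm / "old.txt").write_text("v1")
    staging.mkdir()
    (staging / "new.txt").write_text("v2")
    swap_in_staging(rocm, staging)
    assert (rocm / "new.txt").read_text() == "v2"
    assert not staging.exists()
```

The key property is that `rmtree` on the old install only runs after the new directory is in place, so any failure leaves either the old or the new backend intact.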
289-358: ⚠️ Potential issue | 🟠 Major — Keep ROCm libs downloads tied to `ROCM_LIBS_VERSION`, not the app release tag.

`rocm-libs-{ROCM_LIBS_VERSION}.tar.gz` is still fetched from `{GITHUB_RELEASES_URL}/{version}`. A new app release without re-uploading the same libs archive will break fresh ROCm installs.

Proposed fix

```diff
-    base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    server_base_url = f"{GITHUB_RELEASES_URL}/{version}"
+    libs_base_url = f"{GITHUB_RELEASES_URL}/{ROCM_LIBS_VERSION}"
     server_archive = "voicebox-server-rocm.tar.gz"
     libs_archive = f"rocm-libs-{ROCM_LIBS_VERSION}.tar.gz"
@@
-    head = await client.head(f"{base_url}/{server_archive}")
+    head = await client.head(f"{server_base_url}/{server_archive}")
@@
-    head = await client.head(f"{base_url}/{libs_archive}")
+    head = await client.head(f"{libs_base_url}/{libs_archive}")
@@
-        url=f"{base_url}/{server_archive}",
-        sha256_url=f"{base_url}/{server_archive}.sha256",
+        url=f"{server_base_url}/{server_archive}",
+        sha256_url=f"{server_base_url}/{server_archive}.sha256",
@@
-        url=f"{base_url}/{libs_archive}",
-        sha256_url=f"{base_url}/{libs_archive}.sha256",
+        url=f"{libs_base_url}/{libs_archive}",
+        sha256_url=f"{libs_base_url}/{libs_archive}.sha256",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/rocm.py` around lines 289 - 358, The code builds base_url from the app release `version` and uses it for both server and libs downloads, which can break ROCm installs when libs are versioned separately; change the logic so `rocm-libs-{ROCM_LIBS_VERSION}.tar.gz` is downloaded from a libs-specific release URL (e.g. build a separate libs_base_url using `ROCM_LIBS_VERSION`) and use that URL for the HEAD and download calls related to the libs archive while keeping server downloads on the original `base_url`; update references in the download/extract sequence (look for `base_url`, `libs_archive`, the head/download calls and the call to `_download_and_extract_archive` guarded by `need_libs`) so libs use the new `libs_base_url` and server continues to use `base_url`.
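The decoupling asked for above reduces to choosing the release tag per artifact. A minimal sketch, with illustrative constants and helper name (the real URL and pin live in backend/services/rocm.py):

```python
GITHUB_RELEASES_URL = "https://github.com/example/voicebox/releases/download"  # illustrative
ROCM_LIBS_VERSION = "6.2.4"  # illustrative pin, independent of app releases


def artifact_urls(app_version: str) -> dict:
    """Server archive follows the app release tag; the ROCm libs archive
    is pinned to ROCM_LIBS_VERSION so new app releases keep resolving it."""
    server_base = f"{GITHUB_RELEASES_URL}/{app_version}"
    libs_base = f"{GITHUB_RELEASES_URL}/{ROCM_LIBS_VERSION}"
    libs_archive = f"rocm-libs-{ROCM_LIBS_VERSION}.tar.gz"
    return {
        "server": f"{server_base}/voicebox-server-rocm.tar.gz",
        "libs": f"{libs_base}/{libs_archive}",
    }


urls = artifact_urls("v1.4.0")
assert "/v1.4.0/" in urls["server"]
assert f"/{ROCM_LIBS_VERSION}/" in urls["libs"]  # libs URL ignores the app tag
```

Because the libs URL never mentions the app version, shipping a new app release cannot orphan the libs archive.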
🧹 Nitpick comments (3)
app/src/components/ServerTab/GpuPage.tsx (1)
206-238: ROCm SSE effect may miss the initial progress events.

The effect only opens the `EventSource` when `rocmDownloading` is already `true`, which depends on the 10-second `rocm-status` poll picking up the download. When the user clicks "Download", the POST returns immediately but `rocmDownloading` won't flip to `true` until either the next poll or the explicit `refetchRocmStatus()` in `handleDownloadRocm` (line 300) completes. During that window, early progress events are lost and the progress bar can appear to start mid-download. Same pattern applies to the CUDA SSE effect.

Consider triggering the SSE subscription right after kicking off the download (e.g. via a dedicated `isStreaming` state flag set in the download handler) rather than gating it solely on the server-reported status.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/ServerTab/GpuPage.tsx` around lines 206 - 238, The SSE subscription in the useEffect for ROCm (the EventSource created when rocmDownloading && serverUrl) can miss early progress events; modify the download flow by adding a dedicated isStreaming boolean state (e.g., rocmStreaming) that you set to true in handleDownloadRocm immediately after the POST returns, and change the useEffect to open/close the EventSource based on rocmStreaming && serverUrl (and still clear it when progress is complete/error and set rocmStreaming to false); apply the same pattern to the CUDA SSE effect so streaming starts as soon as the user initiates the download rather than waiting for the server-status poll or refetchRocmStatus.backend/services/rocm.py (1)
226-230: Remove the obsolete Python-version branch and always use the safe tar filter.

Ruff (UP036) confirms this version guard is outdated for the project's minimum Python version, so the fallback can be removed and extraction can consistently use `filter="data"`.

Proposed cleanup

```diff
     )
     with tarfile.open(temp_path, "r:gz") as tar:
-        if sys.version_info >= (3, 12):
-            tar.extractall(path=dest_dir, filter="data")
-        else:
-            tar.extractall(path=dest_dir)
+        tar.extractall(path=dest_dir, filter="data")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/rocm.py` around lines 226 - 230, Remove the obsolete Python-version branch in the tar extraction block and always call tar.extractall with the safe filter: replace the if/else around sys.version_info in the with tarfile.open(temp_path, "r:gz") as tar: block so it always uses tar.extractall(path=dest_dir, filter="data"); also audit and remove the now-unused sys import if present (references: temp_path, dest_dir, tar.extractall).

backend/tests/test_rocm_build.py (1)
25-32: Remove empty parentheses from fixture decorator and consolidate nested context managers.

Ruff identifies two lint issues:
- PT001: `@pytest.fixture()` should be `@pytest.fixture` (line 25)
- SIM117: Combine three nested `with` statements into a single statement with multiple contexts (lines 28-30)

Proposed cleanup

```diff
-    @pytest.fixture()
+    @pytest.fixture
     def captured_args(self):
         """Run build_server(rocm=True) with mocked PyInstaller and return args."""
-        with patch("build_binary.PyInstaller.__main__.run") as mock_run:
-            with patch("build_binary.platform.system", return_value="Linux"):
-                with patch("build_binary.os.chdir"):
-                    build_server(rocm=True)
+        with (
+            patch("build_binary.PyInstaller.__main__.run") as mock_run,
+            patch("build_binary.platform.system", return_value="Linux"),
+            patch("build_binary.os.chdir"),
+        ):
+            build_server(rocm=True)
         return mock_run.call_args[0][0]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/test_rocm_build.py` around lines 25 - 32, Fix the fixture decorator and collapse nested context managers: change the `@pytest.fixture`() on the captured_args fixture to `@pytest.fixture`, and combine the three nested with() patches into a single with statement that patches "build_binary.PyInstaller.__main__.run" as mock_run, "build_binary.platform.system" returning "Linux", and "build_binary.os.chdir" in one line; inside that single context call build_server(rocm=True) and then return mock_run.call_args[0][0]. Ensure you reference the captured_args fixture, the build_server call, and the three patch targets when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.claude/settings.local.json:
- Around line 1-7: The .claude/settings.local.json file with the "permissions":
{ "allow": [...] } entry is a machine-specific local settings file and should
not be included in the PR; remove it from the commit and ensure it is ignored
going forward by deleting the file from the repo index (while keeping it
locally) and adding .claude/settings.local.json to .gitignore (or updating an
existing ignore rule for the .claude directory) so the "permissions.allow" entry
is not tracked.
In `@app/src/components/ServerSettings/GpuAcceleration.test.tsx`:
- Around line 64-66: The test teardown currently calls vi.restoreAllMocks() in
the afterEach block but does not undo globals stubbed with vi.stubGlobal, so
stubbed EventSource persists across tests; update the afterEach in
GpuAcceleration.test.tsx to call vi.unstubAllGlobals() (in addition to
vi.restoreAllMocks()) to ensure vi.stubGlobal('EventSource', ...) is torn down
between tests and does not leak into later tests or workers.
In `@backend/app.py`:
- Around line 38-45: Change the ROCm shim and MIOpen logging to opt-in: stop
unconditionally setting HSA_OVERRIDE_GFX_VERSION and MIOPEN_LOG_LEVEL when
VOICEBOX_BACKEND_VARIANT == "rocm"; instead only set
HSA_OVERRIDE_GFX_VERSION="10.3.0" if running on Linux and an explicit opt-in env
(e.g., VOICEBOX_ROCM_FORCE_GFX1030=="1") is present, and only set
MIOPEN_LOG_LEVEL (or any non-default value) if a separate opt-in env like
VOICEBOX_ROCM_ENABLE_MIOPEN_LOG=="1" is present; keep the existing
VOICEBOX_BACKEND_VARIANT check and allow users to override these env vars
externally.
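The opt-in behavior described above can be sketched as a pure function over an environment mapping: tweaks apply only when explicitly requested, and `setdefault()` keeps any value the user already exported. The opt-in variable names are the ones quoted in the prompt; the MIOpen level value is an illustrative assumption:

```python
import sys


def apply_rocm_env(env: dict, system: str = sys.platform) -> None:
    """Mutate env in place with opt-in ROCm tweaks.

    Nothing is set unconditionally: the gfx shim needs Linux plus an
    explicit opt-in, MIOpen logging needs its own opt-in, and
    setdefault() never clobbers a value the user exported themselves.
    """
    if env.get("VOICEBOX_BACKEND_VARIANT") != "rocm":
        return
    if system.startswith("linux") and env.get("VOICEBOX_ROCM_FORCE_GFX1030") == "1":
        env.setdefault("HSA_OVERRIDE_GFX_VERSION", "10.3.0")
    if env.get("VOICEBOX_ROCM_ENABLE_MIOPEN_LOG") == "1":
        env.setdefault("MIOPEN_LOG_LEVEL", "3")  # illustrative log level


env = {"VOICEBOX_BACKEND_VARIANT": "rocm"}
apply_rocm_env(env, system="linux")
assert "HSA_OVERRIDE_GFX_VERSION" not in env        # no opt-in, no shim

env["VOICEBOX_ROCM_FORCE_GFX1030"] = "1"
apply_rocm_env(env, system="linux")
assert env["HSA_OVERRIDE_GFX_VERSION"] == "10.3.0"  # opt-in applies it
```

Testing against a plain dict instead of `os.environ` keeps the gating logic easy to verify without mutating the process environment.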
In `@backend/routes/rocm.py`:
- Around line 29-35: The current early 409 check using
rocm.get_rocm_binary_path() prevents triggering download_rocm_binary() for
upgrades; remove that conditional branch so the route only blocks when a
progress entry exists (use progress_manager.get_progress with rocm.PROGRESS_KEY
and its "downloading"/"extracting" states) and allow download_rocm_binary() to
decide via its internal _needs_server_download / _needs_rocm_libs_download logic
whether work is required; in short, delete the get_rocm_binary_path() is not
None 409 path and rely solely on the existing progress-state check to prevent
concurrent downloads.
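The route-level decision above reduces to a single predicate on the progress entry, decoupled from FastAPI. Function and state names here are illustrative:

```python
from typing import Optional

ACTIVE_STATES = {"downloading", "extracting"}


def should_reject_download(progress: Optional[dict]) -> bool:
    """Return True (-> HTTP 409) only while a download is actually running.

    Presence of an installed binary is deliberately not checked here, so
    the route can also trigger upgrades; download_rocm_binary() decides
    internally whether any work is needed.
    """
    return progress is not None and progress.get("status") in ACTIVE_STATES


assert should_reject_download({"status": "downloading", "percent": 40}) is True
assert should_reject_download({"status": "complete"}) is False
assert should_reject_download(None) is False
```

Dropping the installed-binary short-circuit and keeping only this check is what lets the same endpoint serve both first-time installs and upgrades.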
In `@backend/server.py`:
- Around line 50-60: The current logic in server.py imports re locally and
always assigns VOICEBOX_BACKEND_VARIANT based on sys.executable, which clobbers
explicit overrides; move the re import to the module top-level (so Ruff E402 is
satisfied) and change the fallback assignment to use
os.environ.setdefault("VOICEBOX_BACKEND_VARIANT", "cpu") instead of
unconditional os.environ[...] = "cpu", keeping the existing checks for
"rocm"/"cuda" but only setting "cpu" if the env var is not already present;
remove the local del of _re/_binary_name and keep _binary_name as a transient
variable scoped to the block.
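The non-clobbering fallback above can be sketched as: derive the variant from the executable name, then let `setdefault()` yield to any value already in the environment. The helper name is illustrative:

```python
import os


def set_variant_from_binary(environ: dict, executable: str) -> str:
    """Derive VOICEBOX_BACKEND_VARIANT from the server binary's name,
    without overwriting an explicit value already in the environment."""
    binary_name = os.path.basename(executable).lower()
    if "rocm" in binary_name:
        variant = "rocm"
    elif "cuda" in binary_name:
        variant = "cuda"
    else:
        variant = "cpu"
    # setdefault(): an override exported by the launcher wins.
    return environ.setdefault("VOICEBOX_BACKEND_VARIANT", variant)


env: dict = {}
assert set_variant_from_binary(env, "/opt/voicebox-server-rocm") == "rocm"

env = {"VOICEBOX_BACKEND_VARIANT": "cuda"}
assert set_variant_from_binary(env, "/opt/voicebox-server-rocm") == "cuda"  # not clobbered
```

This is the behavioral difference between `os.environ[...] = "cpu"` and `os.environ.setdefault(...)` that the prompt asks for: the inferred value is only a fallback.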
In `@backend/services/rocm.py`:
- Around line 410-438: Add a guard in check_and_update_rocm_binary to skip
auto-update when the ROCm backend is active: after resolving rocm_path (and
before calling _needs_server_download/_needs_rocm_libs_download and before
awaiting download_rocm_binary), call an existing or new helper like
is_rocm_active() (or implement one that detects the running ROCm process/locks
on Windows) and if it returns True, log that update is deferred and return early
so download_rocm_binary() is not invoked from inside the active ROCm process;
keep the rest of the logic (need_server/need_libs checks) unchanged so updates
can be performed later from the CPU backend flow.
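The guard described above can be reduced to a small check before the download call. In this sketch, `is_rocm_active` is a placeholder for however the running variant is detected (e.g. the VOICEBOX_BACKEND_VARIANT env var the server tags itself with at startup); the real helper would live alongside check_and_update_rocm_binary:

```python
import logging
import os

logger = logging.getLogger(__name__)


def is_rocm_active(environ=None) -> bool:
    """Placeholder detection: the server records its own variant at startup."""
    environ = os.environ if environ is None else environ
    return environ.get("VOICEBOX_BACKEND_VARIANT") == "rocm"


def maybe_auto_update(need_server: bool, need_libs: bool, environ=None) -> bool:
    """Return True if an auto-update should start now.

    Skips the update when running inside the ROCm backend itself, since
    replacing the files backing the live process is unsafe (especially
    on Windows); the update runs later from the CPU backend flow.
    """
    if not (need_server or need_libs):
        return False
    if is_rocm_active(environ):
        logger.info("ROCm backend active; deferring auto-update to the CPU backend flow")
        return False
    return True


assert maybe_auto_update(True, False, {"VOICEBOX_BACKEND_VARIANT": "cpu"}) is True
assert maybe_auto_update(True, True, {"VOICEBOX_BACKEND_VARIANT": "rocm"}) is False
```

Keeping the need_server/need_libs checks ahead of the activity guard preserves the existing "nothing to do" fast path.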
In `@tauri/src-tauri/src/main.rs`:
- Around line 294-316: The current version check treats empty strings as valid
matches because app.config().version.unwrap_or_default() and
probe_binary_version(...).await.unwrap_or_default() both return "" on
missing/failure; update the logic in the ROCm branch around app_version,
binary_version and version_ok (and mirror the same change in the CUDA block) so
that you treat empty versions as invalid: compute app_version_opt =
app.config().version.clone() (Option) and binary_version_opt =
probe_binary_version(&exe_path, &rocm_dir).await (Option or Result) and only set
version_ok to true when both are present/non-empty and equal; otherwise log the
appropriate mismatch/failure (including error when available) and return None
instead of launching exe_path.
- Around line 740-750: The restart_server function currently clears the user's
backend override after a successful restart by setting
*state.backend_override.lock().unwrap() = None in restart_server; change this so
restart_server does not unconditionally clear state.backend_override—either
remove that line or only clear the override when an explicit "clear"
flag/command is passed from the UI (e.g., add a parameter to restart_server or a
separate clearBackendOverride handler), and ensure the UI (GpuAcceleration.tsx
setBackendOverride / handleRestart) is relied upon to clear the override when
the user explicitly requests it.
- Around line 104-122: probe_binary_version currently spawns a blocking
std::process::Command inside tokio::task::spawn_blocking which is not
cancel-safe and will leak threads/child processes when tokio::time::timeout
fires; replace the spawn_blocking + std::process::Command approach with
tokio::process::Command (used from within the async context of
probe_binary_version) and set .kill_on_drop(true) on the spawned child so that
when the timeout cancels/wraps the future the underlying child process is
terminated; keep the same semantics of capturing stdout, parsing the version
string, and retaining the timeout duration and error branches but move to using
tokio::process::Command and its async .output() call inside
probe_binary_version.
---
Outside diff comments:
In `@app/src/components/ServerTab/GpuPage.tsx`:
- Around line 320-355: The handlers handleSwitchToCpu, handleSwitchToCuda and
handleSwitchToRocm set restartPhase('stopping') before awaiting
platform.lifecycle.setBackendOverride, so if setBackendOverride throws the catch
never resets restartPhase and the UI remains stuck; fix by moving
setRestartPhase('stopping') to after a successful await
platform.lifecycle.setBackendOverride (i.e., just before calling
restartServerWithPolling) or alternately wrap the setBackendOverride call in its
own try/catch that resets restartPhase on error — apply this change to
handleSwitchToCpu, handleSwitchToCuda and handleSwitchToRocm and keep their
existing error/refetch behavior.
In `@backend/build_binary.py`:
- Around line 455-628: The restore logic currently in the finally block
(checking restore_torch and running pip to restore CPU/CUDA/ROCm) does not
protect against failures during the pre-build torch swap/install steps (the
subprocess.run pip installs inside the Windows CPU/ROCm blocks and the ROCm
torch install sequence), so move/extend the try/finally to wrap the entire
torch-swap/install phase (the blocks that set restore_torch and call
subprocess.run to install CPU or ROCm wheels) so that the finally always
executes even if those pip installs fail; ensure restore_torch is determined
before any environment-overwriting installs (as already done in the ROCm block),
and when performing the restore pip installs in the finally (the branches that
install torch/torchvision/torchaudio for "cpu","cuda","rocm") include the
--no-deps flag on ROCm/restore installs where applicable (and keep existing
--index-url for CPU/CUDA restores) so NumPy protection is preserved; keep
PyInstaller.__main__.run(args) inside the try so finally still runs after the
build attempt.
---
Duplicate comments:
In `@backend/backends/qwen_custom_voice_backend.py`:
- Around line 108-130: The module references force_offline_if_cached inside the
model loading block (used in _load_model_sync / the Qwen3TTSModel loading logic)
but never imports it; add an import for force_offline_if_cached from
..utils.hf_offline_patch at the top of the file so the call in the with
force_offline_if_cached(is_cached, model_name): block resolves and avoids a
NameError at runtime.
In `@backend/build_binary.py`:
- Around line 509-513: The current guard only checks sys.version_info and allows
non-CPython interpreters; update the check around the existing
sys.version_info[:2] comparison to also verify sys.implementation.name ==
"cpython" (or equivalent) and raise the same RuntimeError when either the Python
major/minor isn't 3.12 or the implementation is not CPython so that only CPython
3.12 interpreters proceed to build/install the cp312 ROCm wheels; locate and
modify the block that currently inspects sys.version_info[:2] in build_binary.py
to enforce both conditions.
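A minimal sketch of the combined interpreter guard this comment asks for. The function name `ensure_cpython_312` is illustrative, not taken from the codebase:

```python
import sys


def ensure_cpython_312() -> None:
    """Reject anything that cannot consume cp312 ROCm wheels.

    cp312 wheel tags only match CPython 3.12, so the guard checks both the
    version and the implementation (PyPy, GraalPy, etc. must be rejected).
    """
    if sys.version_info[:2] != (3, 12) or sys.implementation.name != "cpython":
        raise RuntimeError(
            "ROCm build requires CPython 3.12, found "
            f"{sys.implementation.name} {sys.version_info[0]}.{sys.version_info[1]}"
        )
```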
In `@backend/services/rocm.py`:
- Around line 369-373: The current non-atomic swap deletes rocm_dir then renames
staging_dir, risking total loss if rename fails; instead, when use_staging is
true perform an atomic-safe swap by first renaming existing rocm_dir to a
temporary backup name (e.g., rocm_dir.with_suffix(".bak") or similar), then
rename staging_dir to rocm_dir, and on any failure attempt to rename the backup
back to rocm_dir; finally remove the backup only after successful rename; update
the block referencing use_staging, rocm_dir, staging_dir to implement this
rename-to-backup -> rename-staging -> cleanup-or-rollback sequence and avoid
using shutil.rmtree before a successful replacement.
- Around line 289-358: The code builds base_url from the app release `version`
and uses it for both server and libs downloads, which can break ROCm installs
when libs are versioned separately; change the logic so
`rocm-libs-{ROCM_LIBS_VERSION}.tar.gz` is downloaded from a libs-specific
release URL (e.g. build a separate libs_base_url using `ROCM_LIBS_VERSION`) and
use that URL for the HEAD and download calls related to the libs archive while
keeping server downloads on the original `base_url`; update references in the
download/extract sequence (look for `base_url`, `libs_archive`, the
head/download calls and the call to `_download_and_extract_archive` guarded by
`need_libs`) so libs use the new `libs_base_url` and server continues to use
`base_url`.
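The rename-to-backup -> rename-staging -> cleanup-or-rollback sequence suggested for `services/rocm.py` above can be sketched as below. `promote_staging` is a hypothetical helper name, not the service's real function:

```python
import shutil
from pathlib import Path


def promote_staging(rocm_dir: Path, staging_dir: Path) -> None:
    """Swap a staged download into place without a destructive window.

    Unlike rmtree-then-rename, a failure mid-swap leaves either the old or
    the new install on disk — never neither.
    """
    backup = rocm_dir.with_name(rocm_dir.name + ".bak")
    had_existing = rocm_dir.exists()
    if had_existing:
        rocm_dir.rename(backup)        # keep the old install recoverable
    try:
        staging_dir.rename(rocm_dir)   # promote the freshly staged download
    except OSError:
        if had_existing:
            backup.rename(rocm_dir)    # rollback to the previous install
        raise
    if had_existing:
        shutil.rmtree(backup)          # drop the backup only after success
```

Note the rename calls assume `backup`, `staging_dir`, and `rocm_dir` live on the same filesystem, which holds when staging is created next to the final directory.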
---
Nitpick comments:
In `@app/src/components/ServerTab/GpuPage.tsx`:
- Around line 206-238: The SSE subscription in the useEffect for ROCm (the
EventSource created when rocmDownloading && serverUrl) can miss early progress
events; modify the download flow by adding a dedicated isStreaming boolean state
(e.g., rocmStreaming) that you set to true in handleDownloadRocm immediately
after the POST returns, and change the useEffect to open/close the EventSource
based on rocmStreaming && serverUrl (and still clear it when progress is
complete/error and set rocmStreaming to false); apply the same pattern to the
CUDA SSE effect so streaming starts as soon as the user initiates the download
rather than waiting for the server-status poll or refetchRocmStatus.
In `@backend/services/rocm.py`:
- Around line 226-230: Remove the obsolete Python-version branch in the tar
extraction block and always call tar.extractall with the safe filter: replace
the if/else around sys.version_info in the with tarfile.open(temp_path, "r:gz")
as tar: block so it always uses tar.extractall(path=dest_dir, filter="data");
also audit and remove the now-unused sys import if present (references:
temp_path, dest_dir, tar.extractall).
In `@backend/tests/test_rocm_build.py`:
- Around line 25-32: Fix the fixture decorator and collapse nested context
managers: change the `@pytest.fixture`() on the captured_args fixture to
`@pytest.fixture`, and combine the three nested with() patches into a single with
statement that patches "build_binary.PyInstaller.__main__.run" as mock_run,
"build_binary.platform.system" returning "Linux", and "build_binary.os.chdir" in
one line; inside that single context call build_server(rocm=True) and then
return mock_run.call_args[0][0]. Ensure you reference the captured_args fixture,
the build_server call, and the three patch targets when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dc0c916c-ae3f-4c1e-a8b3-b298e124d8a7
📒 Files selected for processing (32)
- .claude/settings.local.json
- app/package.json
- app/src/components/ServerSettings/GpuAcceleration.test.tsx
- app/src/components/ServerSettings/GpuAcceleration.tsx
- app/src/components/ServerTab/GpuPage.tsx
- app/src/i18n/locales/en/translation.json
- app/src/lib/api/client.ts
- app/src/lib/api/models/ModelStatus.ts
- app/src/lib/api/types.ts
- app/src/platform/types.ts
- app/src/test/setup.ts
- app/vite.config.ts
- backend/app.py
- backend/backends/base.py
- backend/backends/hume_backend.py
- backend/backends/qwen_custom_voice_backend.py
- backend/build_binary.py
- backend/pyi_rth_rocm_sdk.py
- backend/requirements-rocm.txt
- backend/routes/__init__.py
- backend/routes/health.py
- backend/routes/rocm.py
- backend/server.py
- backend/services/rocm.py
- backend/tests/test_amd_gpu_detect.py
- backend/tests/test_rocm_backends.py
- backend/tests/test_rocm_build.py
- backend/tests/test_rocm_download.py
- backend/tests/test_rocm_requirements.py
- backend/utils/platform_detect.py
- tauri/src-tauri/src/main.rs
- tauri/src/platform/lifecycle.ts
✅ Files skipped from review due to trivial changes (7)
- backend/routes/__init__.py
- app/src/test/setup.ts
- app/src/lib/api/models/ModelStatus.ts
- backend/requirements-rocm.txt
- app/src/lib/api/types.ts
- app/src/lib/api/client.ts
- app/src/components/ServerSettings/GpuAcceleration.tsx
🚧 Files skipped from review as they are similar to previous changes (5)
- backend/routes/health.py
- app/vite.config.ts
- backend/utils/platform_detect.py
- backend/pyi_rth_rocm_sdk.py
- app/src/i18n/locales/en/translation.json
{
  "permissions": {
    "allow": [
      "Bash(grep -E \"lock|yarn|pnpm\" echo \"---\" grep -n \"testing-library\\\\|vitest\\\\|jsdom\" /e/VoiceBox/voicebox/app/package.json)"
    ]
  }
}
Remove local Claude permissions from the PR.
This file is machine-specific (/e/VoiceBox/...) and grants local tool execution permissions. It should stay untracked, and .claude/settings.local.json should be ignored if contributors use it locally.
Suggested cleanup
-.claude/settings.local.json
If the repo does not already ignore local Claude settings:
+# Local AI assistant workspace settings
+.claude/settings.local.json
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.claude/settings.local.json around lines 1 - 7, The
.claude/settings.local.json file with the "permissions": { "allow": [...] }
entry is a machine-specific local settings file and should not be included in
the PR; remove it from the commit and ensure it is ignored going forward by
deleting the file from the repo index (while keeping it locally) and adding
.claude/settings.local.json to .gitignore (or updating an existing ignore rule
for the .claude directory) so the "permissions.allow" entry is not tracked.
Implements native ROCm architecture for Windows.
- Adds backend build pipeline for voicebox-server-rocm.exe
- Detects AMD GPUs dynamically and routes PyTorch allocations
- Adds automatic download and update logic for ROCm dependencies
- Refactors UI in GpuPage.tsx and GpuAcceleration.tsx to add AMD flows
- Fixes 'Switch to CPU' lock on Windows via Tauri backend_override state
- Resolves PyInstaller/rocm_sdk UnboundLocalError silent crashes
- Resolves Numba/NumPy 2.x incompatibilities during Qwen3-TTS load
- Resolves HF_HUB_OFFLINE Catch-22 for CustomVoice processor caching
Force-pushed from acb5416 to 81133ba (Compare).
Actionable comments posted: 5
♻️ Duplicate comments (2)
app/src/components/ServerSettings/GpuAcceleration.tsx (1)
202-226: ⚠️ Potential issue | 🟠 Major — Make GPU switches update the backend override before restarting.
After a user switches to CPU, setBackendOverride('cpu') persists, but both GPU buttons only call handleRestart(). That restart can keep launching CPU instead of the selected GPU backend. Also, when both GPU backends are installed, "Switch to CUDA Backend" is still ambiguous because the launcher prefers ROCm first.
Set/clear an explicit override for the selected backend before restarting, and make the Tauri launcher honor those values.
🛠️ Proposed direction
- const handleRestart = async () => {
+ const handleRestart = async (backend?: 'cuda' | 'rocm') => {
    setError(null);
    setRestartPhase('stopping');
    try {
+     if (backend) {
+       await platform.lifecycle.setBackendOverride(backend);
+     } else {
+       await platform.lifecycle.setBackendOverride(null);
+     }
      setRestartPhase('waiting');
      startHealthPolling();
      await platform.lifecycle.restartServer();
...
- <Button onClick={handleRestart} className="w-full" size="sm">
+ <Button onClick={() => handleRestart('cuda')} className="w-full" size="sm">
...
- <Button onClick={handleRestart} className="w-full" size="sm">
+ <Button onClick={() => handleRestart('rocm')} className="w-full" size="sm">
Also applies to: 483-492, 565-574
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/ServerSettings/GpuAcceleration.tsx` around lines 202 - 226, The GPU switch handlers must set an explicit backend override before restarting so the launcher uses the selected backend: update the CUDA and ROCm button callbacks to call setBackendOverride('cuda') and setBackendOverride('rocm') respectively (and ensure the CPU button calls setBackendOverride('cpu') or the "Auto" button clears the override via setBackendOverride(null or '') as appropriate) immediately before calling handleRestart(); keep the existing restart flow (handleRestart, startHealthPolling, healthPollRef cleanup, queryClient.invalidateQueries) unchanged but ensure the Tauri launcher reads and honors the stored override value when platform.lifecycle.restartServer() is invoked.
tauri/src-tauri/src/main.rs (1)
417-478: ⚠️ Potential issue | 🟠 Major — Honor explicit cuda/rocm overrides, not just cpu.
set_backend_override accepts arbitrary backend names, but start_server only special-cases "cpu"; "cuda" and "rocm" fall back to auto-detect order. That prevents the UI from reliably switching back from CPU to GPU, and still makes CUDA lose to ROCm when both are installed.
Validate and honor "cpu" | "cuda" | "rocm" in the launcher ordering.
🛠️ Proposed direction
- let spawn_result = if backend_override.as_deref() != Some("cpu") {
+ let spawn_result = match backend_override.as_deref() {
+     Some("cpu") => {
+         println!("Backend override=cpu: using bundled CPU sidecar");
+         sidecar = sidecar.args(["--data-dir", &data_dir_str, "--port", &port_str, "--parent-pid", &parent_pid_str]);
+         if is_remote { sidecar = sidecar.args(["--host", "0.0.0.0"]); }
+         if let Some(ref dir) = effective_models_dir { sidecar = sidecar.env("VOICEBOX_MODELS_DIR", dir); }
+         sidecar.spawn()
+     }
+     Some("cuda") => {
+         // Try CUDA first, then CPU fallback.
+         // Keep ROCm out of this path so "Switch to CUDA" is deterministic.
+         /* existing CUDA spawn setup, else CPU */
+     }
+     Some("rocm") => {
+         // Try ROCm first, then CPU fallback.
+         /* existing ROCm spawn setup, else CPU */
+     }
+     _ => {
+         // Existing auto order: ROCm -> CUDA -> CPU.
+     }
- } else {
-     // Override forces CPU — use bundled sidecar, GPU binary stays on disk
-     println!("Backend override=cpu: using bundled CPU sidecar");
-     ...
-     sidecar.spawn()
- };
+ };
Also applies to: 759-763
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tauri/src-tauri/src/main.rs` around lines 417 - 478, The launcher currently only treats "cpu" specially via backend_override but ignores explicit "cuda"/"rocm"; update the logic in start_server to parse backend_override (backend_override.as_deref()) and branch for three explicit values: "cpu" => force using bundled sidecar (sidecar.spawn()), "rocm" => try rocm_binary first then cuda_binary then CPU sidecar, and "cuda" => try cuda_binary first then rocm_binary then CPU sidecar. Keep existing behavior for the spawn attempts (use rocm_binary, cuda_binary, sidecar, setting current_dir, args, env, and printing on failure) but change the spawn_result selection to follow the requested override order; modify the block that references backend_override, rocm_binary, cuda_binary, sidecar, and gpu_spawn to implement this prioritized ordering.
🧹 Nitpick comments (1)
app/src/lib/api/types.ts (1)
171-189: Optional: consider deduplicating with CudaDownloadProgress.
RocmDownloadProgress is structurally identical to CudaDownloadProgress (lines 152-161). A shared GpuDownloadProgress (or a generic BackendDownloadProgress) would reduce drift risk as new fields get added to one variant but not the other. Not blocking — the current split keeps the API surface explicit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/lib/api/types.ts` around lines 171 - 189, RocmDownloadProgress duplicates CudaDownloadProgress; create a shared interface (e.g., GpuDownloadProgress or BackendDownloadProgress) and replace both RocmDownloadProgress and CudaDownloadProgress with that shared type, updating references in RocmStatus and CUDA-related types to use the new shared interface (ensure fields status, filename, error, timestamp, model_name, current, total, progress remain identical and exported under the new name).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/components/ServerSettings/GpuAcceleration.tsx`:
- Around line 519-547: The ROCm progress block should use
rocmStatus.download_progress as a fallback when rocmDownloadProgress is null;
update the conditional render and all uses of rocmDownloadProgress (filename,
total, progress, current) to read from a merged/fallback object (e.g., const
progress = rocmDownloadProgress || rocmStatus.download_progress) so the UI shows
progress if rocmDownloading is true but no SSE event has arrived yet; ensure
Progress value uses progress.progress and the bytes display uses
progress.current/ progress.total and maintain existing formatting via
formatBytes, keeping existing elements Loader2, Progress and text-muted spans.
In `@backend/pyi_rth_rocm_sdk.py`:
- Around line 67-70: The error message includes a misplaced comma inside the
quoted path in the FileNotFoundError f-string; update the raise statement that
references shortname, relpath, and entry_pattern to remove the comma from inside
the quoted '{relpath},' so the path prints as '{relpath}' (or place the comma
outside the quotes if you want a trailing comma), ensuring the f-string reads
cleanly and uses the existing variables shortname, relpath, and entry_pattern.
In `@backend/tests/test_rocm_build.py`:
- Around line 81-90: The ROCm E2E class TestRocmBuildE2E (and its
test_rocm_binary_compiles_and_runs) is only marked as `@pytest.mark.slow` but not
skipped by default; make it explicitly opt-in by adding a skip condition that
checks an environment variable or pytest marker (e.g., skip if
os.environ.get("RUN_ROCM_E2E") != "1") so the test only runs when the operator
intentionally enables it; update the test file to import os and apply
`@pytest.mark.skipif`(...) to TestRocmBuildE2E (or set pytestmark at class scope)
and document the required env var/marker in the test comment.
- Around line 19-34: The test imports and patch targets are module-relative and
fail when running pytest from the repo root; change the top import to use the
package-qualified module (import build_server from backend.build_binary) and
update all patch target strings inside the captured_args fixture to refer to the
package-qualified module (e.g.,
patch("backend.build_binary.PyInstaller.__main__.run"),
patch("backend.build_binary.platform.system", ...) and
patch("backend.build_binary.os.chdir")). Keep the same fixture name
(captured_args) and function (build_server) references so behaviour is
unchanged.
In `@backend/tests/test_rocm_requirements.py`:
- Around line 66-71: The test test_rocm_torch_installs_and_detects_amd currently
only gates by VOICEBOX_TEST_ROCM_INSTALL but will fail on Linux/macOS because
the ROCm wheel index is Windows-specific; update the skip condition to also
require running on Windows (e.g., os.name == "nt" or platform.system() ==
"Windows") so the test only runs when VOICEBOX_TEST_ROCM_INSTALL is set and the
platform is Windows; modify the `@pytest.mark.skipif` decorator around
test_rocm_torch_installs_and_detects_amd to include this platform check while
keeping the existing environment variable check.
---
Duplicate comments:
In `@app/src/components/ServerSettings/GpuAcceleration.tsx`:
- Around line 202-226: The GPU switch handlers must set an explicit backend
override before restarting so the launcher uses the selected backend: update the
CUDA and ROCm button callbacks to call setBackendOverride('cuda') and
setBackendOverride('rocm') respectively (and ensure the CPU button calls
setBackendOverride('cpu') or the "Auto" button clears the override via
setBackendOverride(null or '') as appropriate) immediately before calling
handleRestart(); keep the existing restart flow (handleRestart,
startHealthPolling, healthPollRef cleanup, queryClient.invalidateQueries)
unchanged but ensure the Tauri launcher reads and honors the stored override
value when platform.lifecycle.restartServer() is invoked.
In `@tauri/src-tauri/src/main.rs`:
- Around line 417-478: The launcher currently only treats "cpu" specially via
backend_override but ignores explicit "cuda"/"rocm"; update the logic in
start_server to parse backend_override (backend_override.as_deref()) and branch
for three explicit values: "cpu" => force using bundled sidecar
(sidecar.spawn()), "rocm" => try rocm_binary first then cuda_binary then CPU
sidecar, and "cuda" => try cuda_binary first then rocm_binary then CPU sidecar.
Keep existing behavior for the spawn attempts (use rocm_binary, cuda_binary,
sidecar, setting current_dir, args, env, and printing on failure) but change the
spawn_result selection to follow the requested override order; modify the block
that references backend_override, rocm_binary, cuda_binary, sidecar, and
gpu_spawn to implement this prioritized ordering.
---
Nitpick comments:
In `@app/src/lib/api/types.ts`:
- Around line 171-189: RocmDownloadProgress duplicates CudaDownloadProgress;
create a shared interface (e.g., GpuDownloadProgress or BackendDownloadProgress)
and replace both RocmDownloadProgress and CudaDownloadProgress with that shared
type, updating references in RocmStatus and CUDA-related types to use the new
shared interface (ensure fields status, filename, error, timestamp, model_name,
current, total, progress remain identical and exported under the new name).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 46b8d504-0212-44f8-b1e3-e0a8022f6b6a
📒 Files selected for processing (32)
- .gitignore
- app/package.json
- app/src/components/ServerSettings/GpuAcceleration.test.tsx
- app/src/components/ServerSettings/GpuAcceleration.tsx
- app/src/components/ServerTab/GpuPage.tsx
- app/src/i18n/locales/en/translation.json
- app/src/lib/api/client.ts
- app/src/lib/api/models/ModelStatus.ts
- app/src/lib/api/types.ts
- app/src/platform/types.ts
- app/src/test/setup.ts
- app/vite.config.ts
- backend/app.py
- backend/backends/base.py
- backend/backends/hume_backend.py
- backend/backends/qwen_custom_voice_backend.py
- backend/build_binary.py
- backend/pyi_rth_rocm_sdk.py
- backend/requirements-rocm.txt
- backend/routes/__init__.py
- backend/routes/health.py
- backend/routes/rocm.py
- backend/server.py
- backend/services/rocm.py
- backend/tests/test_amd_gpu_detect.py
- backend/tests/test_rocm_backends.py
- backend/tests/test_rocm_build.py
- backend/tests/test_rocm_download.py
- backend/tests/test_rocm_requirements.py
- backend/utils/platform_detect.py
- tauri/src-tauri/src/main.rs
- tauri/src/platform/lifecycle.ts
✅ Files skipped from review due to trivial changes (8)
- app/src/test/setup.ts
- backend/routes/__init__.py
- app/src/lib/api/models/ModelStatus.ts
- backend/requirements-rocm.txt
- backend/routes/rocm.py
- backend/tests/test_rocm_download.py
- backend/build_binary.py
- app/src/components/ServerTab/GpuPage.tsx
🚧 Files skipped from review as they are similar to previous changes (5)
- app/src/platform/types.ts
- backend/backends/base.py
- backend/server.py
- backend/utils/platform_detect.py
- app/src/i18n/locales/en/translation.json
{rocmDownloading && rocmDownloadProgress && (
  <div className="space-y-2">
    <div className="flex items-center justify-between text-sm">
      <div className="flex items-center gap-2">
        <Loader2 className="h-4 w-4 animate-spin" />
        <span>
          {rocmDownloadProgress.filename ||
            (rocmAvailable
              ? 'Updating ROCm backend...'
              : 'Downloading ROCm backend...')}
        </span>
      </div>
      {rocmDownloadProgress.total > 0 && (
        <span className="text-muted-foreground">
          {rocmDownloadProgress.progress.toFixed(1)}%
        </span>
      )}
    </div>
    {rocmDownloadProgress.total > 0 && (
      <>
        <Progress value={rocmDownloadProgress.progress} className="h-2" />
        <div className="text-xs text-muted-foreground">
          {formatBytes(rocmDownloadProgress.current)} /{' '}
          {formatBytes(rocmDownloadProgress.total)}
        </div>
      </>
    )}
  </div>
)}
Use rocmStatus.download_progress as the initial progress fallback.
If the page renders while a download is already in progress, rocmDownloading can be true while rocmDownloadProgress is still null, so the ROCm progress block stays hidden until a new SSE event arrives. The status response already carries download_progress; use it as the display fallback.
🧩 Proposed fix
+ const displayedRocmDownloadProgress =
+ rocmDownloadProgress ?? rocmStatus?.download_progress ?? null;
+
...
- {rocmDownloading && rocmDownloadProgress && (
+ {rocmDownloading && displayedRocmDownloadProgress && (
<div className="space-y-2">
...
- {rocmDownloadProgress.filename ||
+ {displayedRocmDownloadProgress.filename ||
(rocmAvailable
? 'Updating ROCm backend...'
: 'Downloading ROCm backend...')}
...
- {rocmDownloadProgress.total > 0 && (
+ {displayedRocmDownloadProgress.total > 0 && (
<span className="text-muted-foreground">
- {rocmDownloadProgress.progress.toFixed(1)}%
+ {displayedRocmDownloadProgress.progress.toFixed(1)}%
</span>
)}
...
- {rocmDownloadProgress.total > 0 && (
+ {displayedRocmDownloadProgress.total > 0 && (
<>
- <Progress value={rocmDownloadProgress.progress} className="h-2" />
+ <Progress value={displayedRocmDownloadProgress.progress} className="h-2" />
<div className="text-xs text-muted-foreground">
- {formatBytes(rocmDownloadProgress.current)} /{' '}
- {formatBytes(rocmDownloadProgress.total)}
+ {formatBytes(displayedRocmDownloadProgress.current)} /{' '}
+ {formatBytes(displayedRocmDownloadProgress.total)}
</div>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/ServerSettings/GpuAcceleration.tsx` around lines 519 -
547, The ROCm progress block should use rocmStatus.download_progress as a
fallback when rocmDownloadProgress is null; update the conditional render and
all uses of rocmDownloadProgress (filename, total, progress, current) to read
from a merged/fallback object (e.g., const progress = rocmDownloadProgress ||
rocmStatus.download_progress) so the UI shows progress if rocmDownloading is
true but no SSE event has arrived yet; ensure Progress value uses
progress.progress and the bytes display uses progress.current/ progress.total
and maintain existing formatting via formatBytes, keeping existing elements
Loader2, Progress and text-muted spans.
raise FileNotFoundError(
    f"Could not find rocm library '{shortname}' at path "
    f"'{relpath},' no match for pattern '{entry_pattern}'"
)
Minor: misplaced comma inside quoted path in error message.
The comma inside the closing quote ('{relpath},') makes the error read like at path '/some/path,' — likely unintended.
🔧 Proposed fix
raise FileNotFoundError(
f"Could not find rocm library '{shortname}' at path "
- f"'{relpath},' no match for pattern '{entry_pattern}'"
+ f"'{relpath}', no match for pattern '{entry_pattern}'"
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/pyi_rth_rocm_sdk.py` around lines 67 - 70, The error message includes
a misplaced comma inside the quoted path in the FileNotFoundError f-string;
update the raise statement that references shortname, relpath, and entry_pattern
to remove the comma from inside the quoted '{relpath},' so the path prints as
'{relpath}' (or place the comma outside the quotes if you want a trailing
comma), ensuring the f-string reads cleanly and uses the existing variables
shortname, relpath, and entry_pattern.
from build_binary import build_server


class TestRocmBuildArgs:
    """Validate PyInstaller arguments for ROCm builds."""

    @pytest.fixture
    def captured_args(self):
        """Run build_server(rocm=True) with mocked PyInstaller and return args."""
        with (
            patch("build_binary.PyInstaller.__main__.run") as mock_run,
            patch("build_binary.platform.system", return_value="Linux"),
            patch("build_binary.os.chdir"),
        ):
            build_server(rocm=True)
            return mock_run.call_args[0][0]
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import importlib.util
for name in ("build_binary", "backend.build_binary"):
spec = importlib.util.find_spec(name)
print(f"{name}: {spec.origin if spec else 'NOT FOUND'}")
PY
Repository: jamiepine/voicebox
Length of output: 150
🏁 Script executed:
#!/bin/bash
# Check the actual content of the test file
echo "=== Checking test_rocm_build.py content ==="
cat -n backend/tests/test_rocm_build.py | head -50
echo -e "\n=== Checking import patterns in other backend tests ==="
# Find other test files in backend/tests and check their imports
fd -t f "\.py$" backend/tests --exec head -5 {} \; 2>/dev/null | head -100
Repository: jamiepine/voicebox
Length of output: 4850
🏁 Script executed:
#!/bin/bash
# More targeted search for import patterns in backend tests
echo "=== Backend test files ==="
fd -t f "\.py$" backend/tests
echo -e "\n=== Sample imports from backend tests ==="
for f in $(fd -t f "\.py$" backend/tests | head -5); do
echo "--- $f ---"
head -20 "$f" | grep -E "^(from|import)" || echo "(no imports shown in first 20 lines)"
done
Repository: jamiepine/voicebox
Length of output: 1595
🏁 Script executed:
#!/bin/bash
# Get more specific import patterns from backend tests
echo "=== Imports from backend test files ==="
for f in backend/tests/test_*.py; do
echo "--- $f ---"
grep -E "^(from|import)" "$f" | head -10 || echo "(no imports)"
done | head -100
Repository: jamiepine/voicebox
Length of output: 3024
Use the package-qualified backend.build_binary import path.
With the documented python -m pytest backend/tests/test_rocm_build.py invocation from the repo root, from build_binary import build_server cannot resolve without injecting backend/ into PYTHONPATH. Importing backend.build_binary is consistent with other backend tests (e.g., test_amd_gpu_detect.py uses from backend.utils.platform_detect) and keeps the patch targets aligned.
🧪 Proposed fix
-from build_binary import build_server
+from backend.build_binary import build_server
...
- patch("build_binary.PyInstaller.__main__.run") as mock_run,
- patch("build_binary.platform.system", return_value="Linux"),
- patch("build_binary.os.chdir"),
+ patch("backend.build_binary.PyInstaller.__main__.run") as mock_run,
+ patch("backend.build_binary.platform.system", return_value="Linux"),
+ patch("backend.build_binary.os.chdir"),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from backend.build_binary import build_server


class TestRocmBuildArgs:
    """Validate PyInstaller arguments for ROCm builds."""

    @pytest.fixture
    def captured_args(self):
        """Run build_server(rocm=True) with mocked PyInstaller and return args."""
        with (
            patch("backend.build_binary.PyInstaller.__main__.run") as mock_run,
            patch("backend.build_binary.platform.system", return_value="Linux"),
            patch("backend.build_binary.os.chdir"),
        ):
            build_server(rocm=True)
            return mock_run.call_args[0][0]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/tests/test_rocm_build.py` around lines 19 - 34, The test imports and
patch targets are module-relative and fail when running pytest from the repo
root; change the top import to use the package-qualified module (import
build_server from backend.build_binary) and update all patch target strings
inside the captured_args fixture to refer to the package-qualified module (e.g.,
patch("backend.build_binary.PyInstaller.__main__.run"),
patch("backend.build_binary.platform.system", ...) and
patch("backend.build_binary.os.chdir")). Keep the same fixture name
(captured_args) and function (build_server) references so behaviour is
unchanged.
@pytest.mark.slow()
@pytest.mark.skipif(sys.platform != "win32", reason="ROCm build E2E only runs on Windows")
class TestRocmBuildE2E:
    """
    True end-to-end build test.
    Executes build_binary.py --rocm, verifies the binary exists, and runs it
    with --help to confirm it boots without import errors.
    """

    def test_rocm_binary_compiles_and_runs(self, tmp_path):
Make the slow ROCm E2E build explicitly opt-in.
@pytest.mark.slow only labels the test; it does not skip it by default. On Windows CI, this can still run a full ROCm PyInstaller build unless the test command excludes slow tests.
🧪 Proposed fix
+import os
 ...
 @pytest.mark.slow()
 @pytest.mark.skipif(sys.platform != "win32", reason="ROCm build E2E only runs on Windows")
+@pytest.mark.skipif(
+    os.environ.get("VOICEBOX_TEST_ROCM_BUILD") != "1",
+    reason="Set VOICEBOX_TEST_ROCM_BUILD=1 to run the ROCm E2E build test",
+)
 class TestRocmBuildE2E:
class TestRocmBuildE2E:🧰 Tools
🪛 Ruff (0.15.10)
[warning] 81-81: Use @pytest.mark.slow over @pytest.mark.slow()
Remove parentheses
(PT023)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/tests/test_rocm_build.py` around lines 81 - 90, The ROCm E2E class
TestRocmBuildE2E (and its test_rocm_binary_compiles_and_runs) is only marked as
`@pytest.mark.slow` but not skipped by default; make it explicitly opt-in by
adding a skip condition that checks an environment variable or pytest marker
(e.g., skip if os.environ.get("RUN_ROCM_E2E") != "1") so the test only runs when
the operator intentionally enables it; update the test file to import os and
apply `@pytest.mark.skipif`(...) to TestRocmBuildE2E (or set pytestmark at class
scope) and document the required env var/marker in the test comment.
@pytest.mark.timeout(900)
@pytest.mark.skipif(
    not os.environ.get("VOICEBOX_TEST_ROCM_INSTALL"),
    reason="Set VOICEBOX_TEST_ROCM_INSTALL=1 to run the heavy install test",
)
def test_rocm_torch_installs_and_detects_amd(self, backend_dir):
**Windows-gate the ROCm install integration test.**

The requirements point at the Windows ROCm wheel index, so an opted-in run on Linux/macOS will fail during install rather than testing AMD ROCm availability.

🧪 Proposed fix

```diff
 @pytest.mark.timeout(900)
+@pytest.mark.skipif(
+    sys.platform != "win32",
+    reason="ROCm Windows wheels are only validated on Windows",
+)
 @pytest.mark.skipif(
     not os.environ.get("VOICEBOX_TEST_ROCM_INSTALL"),
     reason="Set VOICEBOX_TEST_ROCM_INSTALL=1 to run the heavy install test",
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/tests/test_rocm_requirements.py` around lines 66 - 71, The test
test_rocm_torch_installs_and_detects_amd currently only gates by
VOICEBOX_TEST_ROCM_INSTALL but will fail on Linux/macOS because the ROCm wheel
index is Windows-specific; update the skip condition to also require running on
Windows (e.g., os.name == "nt" or platform.system() == "Windows") so the test
only runs when VOICEBOX_TEST_ROCM_INSTALL is set and the platform is Windows;
modify the `@pytest.mark.skipif` decorator around
test_rocm_torch_installs_and_detects_amd to include this platform check while
keeping the existing environment variable check.
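The `platform.system()` alternative that the prompt names could look roughly like this. A hedged sketch: the combined condition and the reason string are illustrative, not the PR's actual code.

```python
# Sketch: gate the heavy install test on both Windows and an explicit
# opt-in env var, using platform.system() as the prompt suggests.
import os
import platform

import pytest

IS_WINDOWS = platform.system() == "Windows"
OPTED_IN = bool(os.environ.get("VOICEBOX_TEST_ROCM_INSTALL"))


@pytest.mark.skipif(
    not (IS_WINDOWS and OPTED_IN),
    reason="Requires Windows and VOICEBOX_TEST_ROCM_INSTALL=1",
)
def test_rocm_torch_installs_and_detects_amd(backend_dir):
    ...  # heavy pip install against the Windows ROCm wheel index
```

Folding both conditions into one `skipif` keeps the skip reason self-explanatory when the test is deselected on Linux/macOS CI.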
**Description**

This PR introduces full native AMD ROCm support for Windows, allowing users with Radeon GPUs to leverage hardware acceleration via MIOpen. It mirrors the existing CUDA architecture while introducing robust fallbacks and architectural fixes.

**Note on #531:** Contrary to popular belief, ROCm is available on Windows. Instead of relying on DirectML (which suffers from performance overhead), this PR dynamically downloads and bundles the official AMD Radeon PyTorch wheels (`rocm-rel-7.2.1`), bringing near-native Linux performance to Windows AMD users.

**Key Changes**

- Added a `--rocm` flag to `build_binary.py`. It pulls official AMD Radeon wheels, bundles the server via `--onedir`, and safely restores the developer's CPU/CUDA environment post-build. Included `--no-deps` to prevent NumPy 2.x upgrades from breaking Numba/Qwen3-TTS.
- Added `is_amd_gpu_windows()` using CIM queries. Tensor routing works transparently via PyTorch's `cuda` namespace abstraction (and the UI gracefully identifies the backend as ROCm instead of CUDA).
- Added `services/rocm.py` and the respective FastAPI endpoints for SSE progress tracking and binary extraction.
- Updated `main.rs` to detect and spawn `voicebox-server-rocm.exe` with priority (ROCm > CUDA > CPU).
- Added `backend_override` state to Tauri. Switching to CPU now passes an override flag before restarting, bypassing Windows OS-level file locks on active GPU binaries (no destructive uninstallation required).
- Fixed a `rocm_sdk` crash: added a runtime hook (`pyi_rth_rocm_sdk.py`) and explicit `--collect-all` flags to prevent silent `UnboundLocalError` crashes when PyInstaller's static analyzer misses AMD's dynamic DLL imports.
- Hardened `_is_model_cached` in `qwen_custom_voice_backend.py` to ensure `AutoProcessor` caching (specifically `preprocessor_config.json` for the Base dependency) resolves correctly before forcing offline mode, preventing generation crashes.

**Testing**

- No `HSA_OVERRIDE` hacks required (Ref: ROCm HSA_OVERRIDE_GFX_VERSION unnecessary/harmful for RDNA 3 / 4 GPUs #469).

Closes #531
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes