@eschmidbauer
Contributor

Summary

CMake’s built-in select_compute_arch.cmake does not recognize the SM 8.9 and
12.0 targets used by Ada and Blackwell GPUs, so passing those values through
CUDA_ARCH_LIST causes the build to fail. This PR updates ct2rs’s build script
(build.rs) to normalize the env var tokens, keep older architectures in
CUDA_ARCH_LIST, and add explicit -gencode flags for any architecture with a
major version ≥ 10. That lets newer GPUs compile cleanly without regressing
older cards.

Usage

  • Set your desired architectures via CUDA_ARCH_LIST (e.g. export CUDA_ARCH_LIST="8.9;12.0").
  • Build as usual, e.g. cargo build --release --features "cuda,ct2rs/cuda,ct2rs/cudnn".
  • Older architectures (SM < 10) continue to work with the existing CMake path;
    newer ones automatically get direct NVCC -gencode entries.

- honor CUDA_ARCH_LIST/CT2_CUDA_ARCH_LIST in build.rs and normalize tokens
- keep legacy architectures in CUDA_ARCH_LIST for CMake, but emit -gencode flags
  for SM major >= 10 so new GPUs (Ada/Blackwell) bypass FindCUDA’s limits
- allow optional extra NVCC flags to coexist with the small-binary compression flag
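
For readers who want a concrete picture of the split described above, here is a minimal sketch of how an architecture list could be normalized and partitioned. It is not the code from this PR: the helper names (parse_arch, split_arches), the accepted separators, and the rule that a dotless token such as "120" means 12.0 are all assumptions made for illustration.

```rust
/// Hypothetical helper (not from the PR): parse one token such as "8.9" or
/// "120" into (major, minor). Dotless tokens are assumed to end with a
/// single-digit minor version. Anything else (e.g. "8.9+PTX") yields None.
fn parse_arch(token: &str) -> Option<(u32, u32)> {
    if let Some((major, minor)) = token.split_once('.') {
        return Some((major.parse().ok()?, minor.parse().ok()?));
    }
    if token.len() < 2 || !token.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Safe to split by byte index because the token is all ASCII digits.
    let (major, minor) = token.split_at(token.len() - 1);
    Some((major.parse().ok()?, minor.parse().ok()?))
}

/// Hypothetical helper (not from the PR): split a raw list on ';', ',' or
/// whitespace, then return
///   * a ';'-joined list of architectures with major < 10, intended for the
///     existing CMake path, and
///   * explicit NVCC -gencode flags for architectures with major >= 10.
/// Tokens that fail to parse are silently skipped in this sketch.
fn split_arches(raw: &str) -> (String, Vec<String>) {
    let mut legacy = Vec::new();
    let mut gencode = Vec::new();
    let tokens = raw
        .split(|c: char| c == ';' || c == ',' || c.is_whitespace())
        .filter(|t| !t.is_empty());
    for (major, minor) in tokens.filter_map(parse_arch) {
        if major >= 10 {
            let sm = format!("{major}{minor}");
            gencode.push(format!("-gencode=arch=compute_{sm},code=sm_{sm}"));
        } else {
            legacy.push(format!("{major}.{minor}"));
        }
    }
    (legacy.join(";"), gencode)
}
```

In a build script, one would presumably read the raw value with std::env::var("CUDA_ARCH_LIST"), pass the first return value on to CMake, and append the second to the NVCC flags. How the actual build.rs wires these through is not shown in this conversation.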
@jkawamoto
Owner

Thanks for the PR! I tested it locally and it works great. Merging now.

@jkawamoto jkawamoto merged commit ce9e4f1 into jkawamoto:main Nov 13, 2025
14 checks passed