Releases: ROCm/rocFFT
rocFFT 1.0.21 for ROCm 5.4.3
Fixed
- Removed source directory from rocm_install_targets call to prevent installation of rocfft.h in an unintended location.
rocFFT 1.0.20 for ROCm 5.4.2
rocFFT code for ROCm 5.4.2 did not change. The library was rebuilt for the updated ROCm 5.4.2 stack.
rocFFT 1.0.20 for ROCm 5.4.1
Fixed
- Fixed incorrect results on strided large 1D FFTs where batch size does not equal the stride.
rocFFT 1.0.19 for ROCm 5.4.0
Optimizations
- Optimized some strided large 1D plans.
Added
- Added rocfft_plan_description_set_scale_factor API to efficiently multiply each output element of an FFT by a given scaling factor.
- Created a rocfft_kernel_cache.db file next to the installed library. SBCC kernels are moved to this file when built with the library, and are runtime-compiled for new GPU architectures.
- Added gfx1100 and gfx1102 to default AMDGPU_TARGETS.
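To illustrate what a scaling factor is typically used for, here is a minimal sketch in plain Python. It is not rocFFT code: `dft` is a hypothetical naive transform and its `scale_factor` parameter only mimics the idea of multiplying each output element by a constant. It assumes unnormalized transforms, so a forward transform followed by an inverse returns the input multiplied by N unless the inverse is scaled by 1/N.

```python
import cmath


def dft(x, sign=-1, scale_factor=1.0):
    """Naive O(N^2) DFT (hypothetical helper, not a rocFFT call).

    Every output element is multiplied by scale_factor.
    """
    n = len(x)
    out = []
    for k in range(n):
        acc = 0j
        for j, value in enumerate(x):
            acc += value * cmath.exp(sign * 2j * cmath.pi * j * k / n)
        out.append(acc * scale_factor)
    return out


signal = [1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j]
n = len(signal)

forward = dft(signal, sign=-1)
# Unscaled inverse: the round trip yields n * signal.
unscaled = dft(forward, sign=+1)
# Inverse with a 1/n scale factor: the round trip yields signal.
scaled = dft(forward, sign=+1, scale_factor=1.0 / n)

unscaled_is_n_times_input = all(
    abs(u - n * s) < 1e-9 for u, s in zip(unscaled, signal)
)
scaled_is_input = all(abs(r - s) < 1e-9 for r, s in zip(scaled, signal))
print(unscaled_is_n_times_input, scaled_is_input)
```

Applying the factor inside the transform, rather than in a separate pass over the output, avoids an extra read and write of every element, which is presumably the efficiency the new API targets.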
Changed
- Moved runtime compilation cache to in-memory by default. A default on-disk cache can encounter contention problems on multi-node clusters with a shared filesystem. rocFFT can still be told to use an on-disk cache by setting the ROCFFT_RTC_CACHE_PATH environment variable.
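As a rough sketch of opting back in to an on-disk cache, the Python snippet below builds an environment for launching a rocFFT application. Only the variable name ROCFFT_RTC_CACHE_PATH comes from the note above. The helper `env_with_disk_cache`, the cache file name, and the choice of a per-host file in the temporary directory are assumptions made for illustration, as is the idea that the variable holds the path of the cache file.

```python
import os
import socket
import tempfile


def env_with_disk_cache(base_env, cache_dir=None):
    """Return a copy of base_env with ROCFFT_RTC_CACHE_PATH set.

    Hypothetical helper: the file name below is made up for this example.
    """
    cache_dir = cache_dir or tempfile.gettempdir()
    # Including the host name gives each node its own file, which is one
    # way to sidestep contention on a shared filesystem.
    cache_file = os.path.join(
        cache_dir, f"rocfft_rtc_{socket.gethostname()}.db"
    )
    env = dict(base_env)
    env["ROCFFT_RTC_CACHE_PATH"] = cache_file
    return env


env = env_with_disk_cache(os.environ)
print(env["ROCFFT_RTC_CACHE_PATH"])
```

The resulting dictionary can be passed as `env=` to `subprocess.run` when starting the application, so the setting applies to that process only.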
rocFFT 1.0.18 for ROCm 5.3.3
rocFFT code for ROCm 5.3.3 did not change. The library was rebuilt for the updated ROCm 5.3.3 stack.
rocFFT 1.0.18 for ROCm 5.3.2
rocFFT code for ROCm 5.3.2 did not change. The library was rebuilt for the updated ROCm 5.3.2 stack.
rocFFT 1.0.18 for ROCm 5.3.1
rocFFT code for ROCm 5.3.1 did not change. The library was rebuilt for the updated ROCm 5.3.1 stack.
rocFFT 1.0.18 for ROCm 5.3.0
Changed
- Runtime compilation cache now looks for environment variables XDG_CACHE_HOME (on Linux) and LOCALAPPDATA (on Windows) before falling back to HOME.
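The lookup order described above can be sketched in Python as follows. This is an illustration of the fallback logic only, not rocFFT's implementation: `resolve_cache_base` and its parameters are hypothetical, and the sketch does not attempt to reproduce whatever subdirectory or file name the library uses under the chosen directory.

```python
def resolve_cache_base(env, is_windows=False):
    """Pick the base directory for a cache (hypothetical helper).

    Prefers LOCALAPPDATA on Windows or XDG_CACHE_HOME on Linux,
    then falls back to HOME. Returns None if nothing is set.
    """
    preferred = "LOCALAPPDATA" if is_windows else "XDG_CACHE_HOME"
    for name in (preferred, "HOME"):
        value = env.get(name)
        if value:  # treat unset and empty variables the same way
            return value
    return None


linux_env = {"XDG_CACHE_HOME": "/home/user/.cache", "HOME": "/home/user"}
print(resolve_cache_base(linux_env))
print(resolve_cache_base({"HOME": "/home/user"}))
```

Passing the environment in as a dictionary, instead of reading `os.environ` directly, keeps the function easy to exercise with different combinations of variables.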
Optimizations
- Optimized 2D R2C/C2R to use 2-kernel plans where possible.
- Improved performance of the Bluestein algorithm.
- Optimized the SBCC kernels of length 168 and 100 by using half-LDS.
Fixed
- Fixed occasional failures to parallelize runtime compilation of kernels. Failures would be retried serially and ultimately succeed, but this would take extra time.
- Fixed failures of some R2C 3D transforms that use the unsupported TILE_UNALIGNED SBRC kernels. An example is 98^3 R2C out-of-place.
- Fixed bugs in SBRC_ERC type.
rocFFT 1.0.17 for ROCm 5.2.3
rocFFT code for ROCm 5.2.3 did not change. The library was rebuilt for the updated ROCm 5.2.3 stack.
rocFFT 1.0.17 for ROCm 5.2.1
rocFFT code for ROCm 5.2.1 did not change. The library was rebuilt for the updated ROCm 5.2.1 stack.