Hi folks,
I'm using KataGo v1.16.3 (OpenCL, Linux x64) on an HP ZBook Ultra G1a, which has an AMD Strix Halo 395+ CPU/GPU/NPU in it. I have 128 GB of RAM and allocated 32 GB to the GPU in the BIOS.
I'm running the OEM Ubuntu 24.04. I have installed the mesa-opencl-icd package to get KataGo to run, version 24.2.8-1ubuntu1~24.04.1.
I am able to get KataGo to launch, but it crashes and dumps core after a while, or sometimes just locks up the whole machine hard. The command I run is:
./katago benchmark -model kata1-b28c512nbt-s9584861952-d4960414494.bin.gz -config default_gtp.cfg
The output I get is:
2025-07-13 14:06:50-0400: Running with following config:
allowResignation = true
lagBuffer = 1.0
logAllGTPCommunication = true
logDir = gtp_logs
logSearchInfo = true
logSearchInfoForChosenMove = false
logToStderr = false
maxTimePondering = 60.0
maxVisits = 500
numSearchThreads = 6
ponderingEnabled = false
resignConsecTurns = 3
resignThreshold = -0.90
rules = tromp-taylor
searchFactorAfterOnePass = 0.50
searchFactorAfterTwoPass = 0.25
searchFactorWhenWinning = 0.40
searchFactorWhenWinningThreshold = 0.95
2025-07-13 14:06:50-0400: Loading model and initializing benchmark...
2025-07-13 14:06:50-0400: Testing with default positions for board size: 19
2025-07-13 14:06:50-0400: nnRandSeed0 = 5831653519054986926
2025-07-13 14:06:50-0400: After dedups: nnModelFile0 = kata1-b28c512nbt-s9584861952-d4960414494.bin.gz useFP16 auto useNHWC auto
2025-07-13 14:06:50-0400: Initializing neural net buffer to be size 19 * 19 exactly
2025-07-13 14:06:52-0400: Found OpenCL Platform 0: Clover (Mesa) (OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1)
2025-07-13 14:06:52-0400: Found 1 device(s) on platform 0 with type CPU or GPU or Accelerator
2025-07-13 14:06:52-0400: Found OpenCL Platform 1: rusticl (Mesa/X.org) (OpenCL 3.0 )
2025-07-13 14:06:52-0400: Found 0 device(s) on platform 1 with type CPU or GPU or Accelerator, skipping
2025-07-13 14:06:52-0400: Found OpenCL Device 0: AMD Radeon Graphics (radeonsi, gfx1151, LLVM 19.1.1, DRM 3.61, 6.11.0-1025-oem) (AMD) (score 11000101)
2025-07-13 14:06:52-0400: Creating context for OpenCL Platform: Clover (Mesa) (OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1)
2025-07-13 14:06:52-0400: Using OpenCL Device 0: AMD Radeon Graphics (radeonsi, gfx1151, LLVM 19.1.1, DRM 3.61, 6.11.0-1025-oem) (AMD) OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1 (Extensions: cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning)
2025-07-13 14:06:52-0400: No existing tuning parameters found or parseable or valid at: /home/dfields/.katago/opencltuning/tune11_gpuAMDRadeonGraphicsradeonsigfx1151LLVM1911DRM36161101025oem_x19_y19_c512_mv15.txt
2025-07-13 14:06:52-0400: Performing autotuning
2025-07-13 14:06:52-0400: *** On some systems, this may take several minutes, please be patient ***
2025-07-13 14:06:52-0400: Found OpenCL Platform 0: Clover (Mesa) (OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1)
2025-07-13 14:06:52-0400: Found 1 device(s) on platform 0 with type CPU or GPU or Accelerator
2025-07-13 14:06:52-0400: Found OpenCL Platform 1: rusticl (Mesa/X.org) (OpenCL 3.0 )
2025-07-13 14:06:52-0400: Found 0 device(s) on platform 1 with type CPU or GPU or Accelerator, skipping
2025-07-13 14:06:52-0400: Found OpenCL Device 0: AMD Radeon Graphics (radeonsi, gfx1151, LLVM 19.1.1, DRM 3.61, 6.11.0-1025-oem) (AMD) (score 11000101)
2025-07-13 14:06:52-0400: Creating context for OpenCL Platform: Clover (Mesa) (OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1)
2025-07-13 14:06:52-0400: Using OpenCL Device 0: AMD Radeon Graphics (radeonsi, gfx1151, LLVM 19.1.1, DRM 3.61, 6.11.0-1025-oem) (AMD) OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1 (Extensions: cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning)
Beginning GPU tuning for AMD Radeon Graphics (radeonsi, gfx1151, LLVM 19.1.1, DRM 3.61, 6.11.0-1025-oem) modelVersion 15 channels 512
2025-07-13 14:06:52-0400: Dummy tuning thread starting
2025-07-13 14:06:52-0400: Creating context for OpenCL Platform: Clover (Mesa) (OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1)
2025-07-13 14:06:52-0400: Using OpenCL Device 0: AMD Radeon Graphics (radeonsi, gfx1151, LLVM 19.1.1, DRM 3.61, 6.11.0-1025-oem) (AMD) OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1 (Extensions: cl_khr_byte_addressable_store cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_fp64 cl_khr_extended_versioning)
Setting winograd3x3TileSize = 4
------------------------------------------------------
Tuning xGemmDirect for 1x1 convolutions and matrix mult
Testing 55 different configs
amdgpu: The CS has cancelled because the context is lost. This context is innocent.
Aborted (core dumped)
At this point, sometimes I get a command prompt back, and other times the machine locks up hard. Either way, during the execution the screen goes blank for a moment once or twice, and the mouse stops responding for a few moments now and again.
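In case it helps with triage, here is a minimal sketch that summarizes which OpenCL platforms exposed devices, based only on the "Found OpenCL Platform" and "Found N device(s)" lines in the log above. The function name `summarize_platforms` and the `SAMPLE` string are my own illustration, not part of KataGo; the sample lines are copied from my output.

```python
import re

# Patterns for the two kinds of log lines KataGo printed in my run.
PLATFORM_RE = re.compile(r"Found OpenCL Platform (\d+): (.+)$")
DEVICES_RE = re.compile(r"Found (\d+) device\(s\) on platform (\d+)")


def summarize_platforms(log_text):
    """Return {platform_index: (platform_name, device_count)} from log text."""
    names = {}
    counts = {}
    for line in log_text.splitlines():
        m = PLATFORM_RE.search(line)
        if m:
            names[int(m.group(1))] = m.group(2).strip()
            continue
        m = DEVICES_RE.search(line)
        if m:
            counts[int(m.group(2))] = int(m.group(1))
    # Platforms with no device-count line are reported as 0 devices.
    return {i: (names[i], counts.get(i, 0)) for i in sorted(names)}


# Sample lines copied from the log in this report.
SAMPLE = """\
2025-07-13 14:06:52-0400: Found OpenCL Platform 0: Clover (Mesa) (OpenCL 1.1 Mesa 24.2.8-1ubuntu1~24.04.1)
2025-07-13 14:06:52-0400: Found 1 device(s) on platform 0 with type CPU or GPU or Accelerator
2025-07-13 14:06:52-0400: Found OpenCL Platform 1: rusticl (Mesa/X.org) (OpenCL 3.0 )
2025-07-13 14:06:52-0400: Found 0 device(s) on platform 1 with type CPU or GPU or Accelerator, skipping
"""

if __name__ == "__main__":
    for idx, (name, n) in summarize_platforms(SAMPLE).items():
        print(f"platform {idx}: {name} -> {n} device(s)")
```

On my log this shows that only the Clover platform (OpenCL 1.1) exposes the GPU, while the rusticl platform (OpenCL 3.0) reports zero devices, so KataGo ends up running on Clover.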
Does anyone have any thoughts on how to get KataGo to run on this AMD RYZEN AI MAX+ PRO 395 w/ Radeon 8060S processor using either the GPU or NPU under Ubuntu 24.04, please?
Thanks!