Commit 6d18f4a

main: force cudnn.benchmark to false (Comfy-Org#14390)
Some custom nodes try to set this to true globally. That interferes with dynamic VRAM by causing one-off allocation spikes that can OOM, and it is especially risky on Windows, where such allocations might be serviced by the shared-memory fallback. Trump it.
1 parent 039ed38 commit 6d18f4a

2 files changed

Lines changed: 9 additions & 2 deletions

comfy/model_management.py

Lines changed: 4 additions & 2 deletions
@@ -534,8 +534,10 @@ def aotriton_supported(gpu_arch):
 except:
     pass

-if torch.cuda.is_available() and torch.backends.cudnn.is_available() and PerformanceFeature.AutoTune in args.fast:
-    torch.backends.cudnn.benchmark = True
+
+def set_cudnn_benchmark():
+    if torch.cuda.is_available() and torch.backends.cudnn.is_available():
+        torch.backends.cudnn.benchmark = PerformanceFeature.AutoTune in args.fast

 try:
     if torch_version_numeric >= (2, 5):
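
The behavioral difference in this hunk is that the old code could only turn the flag on, while the new function assigns a boolean and so also turns it off. The following is a minimal sketch of that difference, not the project's code: `cudnn_old`, `cudnn_new`, `AUTOTUNE`, `old_policy`, and `new_policy` are hypothetical stand-ins so the example runs without PyTorch, and the availability checks from the diff are left out.

```python
from types import SimpleNamespace

# Hypothetical stand-in for PerformanceFeature.AutoTune.
AUTOTUNE = "autotune"


def old_policy(cudnn, fast_features):
    # Pre-commit shape: the flag is only ever set to True.
    if AUTOTUNE in fast_features:
        cudnn.benchmark = True


def new_policy(cudnn, fast_features):
    # Post-commit shape: the membership test is assigned directly,
    # so the flag is forced to False when autotune was not requested.
    cudnn.benchmark = AUTOTUNE in fast_features


# Hypothetical stand-ins for torch.backends.cudnn after a custom node
# has already enabled benchmark mode globally.
cudnn_old = SimpleNamespace(benchmark=True)
cudnn_new = SimpleNamespace(benchmark=True)

# The user did not request autotune.
old_policy(cudnn_old, fast_features=set())
new_policy(cudnn_new, fast_features=set())

print(cudnn_old.benchmark)  # True: the custom node's setting survives
print(cudnn_new.benchmark)  # False: the setting is overridden
```

With autotune requested, both policies leave the flag at `True`, so the change only affects runs where the user did not opt in.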

main.py

Lines changed: 5 additions & 0 deletions
@@ -490,6 +490,11 @@ def start_comfyui(asyncio_loop=None):
         init_custom_nodes=(not args.disable_all_custom_nodes) or len(args.whitelist_custom_nodes) > 0,
         init_api_nodes=not args.disable_api_nodes
     ))
+
+    # Re-apply Comfy's cuDNN benchmark policy after custom-node imports. Benchmark
+    # mode can request near-card-sized autotune workspaces, and some custom nodes set it at import time.
+    comfy.model_management.set_cudnn_benchmark()
+
     hook_breaker_ac10a0.restore_functions()

     cuda_malloc_warning()
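
The placement of the call matters: it runs after custom nodes have been imported, so it gets the last word over anything they set at import time. A rough sketch of that ordering, under stated assumptions: `cudnn`, `autotune_requested`, and `init_custom_nodes` are hypothetical stand-ins (no PyTorch required), and `set_cudnn_benchmark` here only mirrors the shape of the function in the diff.

```python
from types import SimpleNamespace

# Hypothetical stand-in for torch.backends.cudnn.
cudnn = SimpleNamespace(benchmark=False)

# Hypothetical stand-in for `PerformanceFeature.AutoTune in args.fast`.
autotune_requested = False


def set_cudnn_benchmark():
    # Mirrors the shape of the function added in the diff,
    # minus the CUDA/cuDNN availability checks.
    cudnn.benchmark = autotune_requested


def init_custom_nodes():
    # Hypothetical custom node that enables benchmark mode at import time.
    cudnn.benchmark = True


init_custom_nodes()
overridden = cudnn.benchmark  # the custom node's value is in effect here

set_cudnn_benchmark()         # re-apply the policy after the imports
final = cudnn.benchmark

print(overridden, final)  # True False
```

This ordering only covers values set during import. A node that flips the flag later, for example while executing, would not be caught by a single call at startup.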
