Commit f0694a8

fix oom
Signed-off-by: Dongfeng Yu <[email protected]>
1 parent 5ce4a8d commit f0694a8

1 file changed, +1 -1 lines changed

tests/integration/defs/accuracy/test_llm_api_pytorch.py

Lines changed: 1 addition & 1 deletion
@@ -3975,7 +3975,7 @@ def test_eagle3(self, moe_backend, one_model, overlap_scheduler, mocker):
         # https://nvbugs/5590408: 2-Model overlap scheduling has accuracy issue
         pytorch_config = dict(disable_overlap_scheduler=not overlap_scheduler,
                               cuda_graph_config=CudaGraphConfig())
-        kv_cache_config = KvCacheConfig(free_gpu_memory_fraction=0.6,
+        kv_cache_config = KvCacheConfig(free_gpu_memory_fraction=0.4,
                                         dtype="auto")
 
         eagle_model_dir = f"{llm_models_root()}/gpt_oss/gpt-oss-120b-Eagle3"
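
For context, here is a rough sketch of why lowering `free_gpu_memory_fraction` can relieve an out-of-memory failure. It assumes the KV cache budget is simply the free GPU memory multiplied by the fraction. The helper `kv_cache_budget_gib` and the 40 GiB figure are hypothetical and not taken from this commit or from the library.

```python
def kv_cache_budget_gib(free_gib: float, fraction: float) -> float:
    """Hypothetical helper: GiB reserved for the KV cache.

    Assumes budget = free memory * fraction, which is a simplification.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return free_gib * fraction


# Assumed free memory once model weights are loaded (illustrative only).
free_gib = 40.0

before = kv_cache_budget_gib(free_gib, 0.6)  # value before this commit
after = kv_cache_budget_gib(free_gib, 0.4)   # value after this commit

# Memory no longer claimed by the KV cache, left for other allocations
# such as the draft model, CUDA graphs, and activations.
headroom_gained = before - after

print(f"KV cache before: {before:.1f} GiB, after: {after:.1f} GiB")
print(f"Extra headroom: {headroom_gained:.1f} GiB")
```

Under that assumption, the smaller fraction trades KV cache capacity for headroom. The actual saving depends on how much memory is free when the cache is sized.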
