Commit 3991aa9

[https://nvbugs/5688388][fix] fix: Reducing num request in disagg test to speed up (#9598)
Signed-off-by: Patrice Castonguay <[email protected]>
1 parent a560ba5 commit 3991aa9

1 file changed: +2 -2 lines changed

tests/integration/defs/disaggregated/test_disaggregated_single_gpu.py

Lines changed: 2 additions & 2 deletions
@@ -351,8 +351,8 @@ def test_disaggregated_llama_context_capacity(model, enable_cuda_graph,
     max_tokens = 25
 
     requests = []
-    # Send 256 requests to make sure the context worker is saturated
-    for _ in range(256):
+    # Send 32 requests to make sure the context worker is saturated
+    for _ in range(32):
         requests.append(
             (prompt, SamplingParams(max_tokens=1, ignore_eos=True),
              DisaggregatedParams(request_type="context_only")))
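
For readers without the full test file at hand, the changed loop can be sketched in isolation. This is a minimal, self-contained illustration, not the test's actual code: `SamplingParams` and `DisaggregatedParams` below are hypothetical stand-in dataclasses that only mirror the keyword arguments visible in the diff, and `NUM_CONTEXT_REQUESTS`, `build_context_requests`, and the prompt string are names invented here for the example.

```python
from dataclasses import dataclass


# Hypothetical stand-ins that mirror only the keyword arguments seen in the
# diff. The real classes used by the test are not reproduced here.
@dataclass(frozen=True)
class SamplingParams:
    max_tokens: int
    ignore_eos: bool = False


@dataclass(frozen=True)
class DisaggregatedParams:
    request_type: str


# Assumed name for the request count; the commit lowers it from 256 to 32.
NUM_CONTEXT_REQUESTS = 32


def build_context_requests(prompt, num_requests=NUM_CONTEXT_REQUESTS):
    """Build a batch of context-only requests, one tuple per request."""
    requests = []
    for _ in range(num_requests):
        requests.append(
            (prompt, SamplingParams(max_tokens=1, ignore_eos=True),
             DisaggregatedParams(request_type="context_only")))
    return requests


requests = build_context_requests("placeholder prompt")
```

Keeping the count in one named constant would make a later adjustment a one-line change, though whether 32 requests still saturate the context worker depends on the test's configuration, which is not shown in this diff.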
