Commit d05079b

[None][ci] move some test cases from H100 to A10 (#8449)
Signed-off-by: junq <[email protected]>
Parent: 3c2b3bd

3 files changed: +32, -35 lines

jenkins/L0_Test.groovy

Lines changed: 7 additions & 7 deletions
@@ -2432,14 +2432,14 @@ def launchTestJobs(pipeline, testFilter)
  "DGX_H100-4_GPUs-PyTorch-Others-1": ["dgx-h100-x4", "l0_dgx_h100", 1, 1, 4],
  "DGX_H100-2_GPUs-PyTorch-Ray-1": ["dgx-h100-x2", "l0_dgx_h100", 1, 1, 2],
  "DGX_H100-4_GPUs-CPP-1": ["dgx-h100-x4", "l0_dgx_h100", 1, 1, 4],
- "A10-PyTorch-1": ["a10", "l0_a10", 1, 1],
+ "A10-PyTorch-1": ["a10", "l0_a10", 1, 2],
+ "A10-PyTorch-2": ["a10", "l0_a10", 2, 2],
  "A10-CPP-1": ["a10", "l0_a10", 1, 1],
- "A10-TensorRT-1": ["a10", "l0_a10", 1, 6],
- "A10-TensorRT-2": ["a10", "l0_a10", 2, 6],
- "A10-TensorRT-3": ["a10", "l0_a10", 3, 6],
- "A10-TensorRT-4": ["a10", "l0_a10", 4, 6],
- "A10-TensorRT-5": ["a10", "l0_a10", 5, 6],
- "A10-TensorRT-6": ["a10", "l0_a10", 6, 6],
+ "A10-TensorRT-1": ["a10", "l0_a10", 1, 5],
+ "A10-TensorRT-2": ["a10", "l0_a10", 2, 5],
+ "A10-TensorRT-3": ["a10", "l0_a10", 3, 5],
+ "A10-TensorRT-4": ["a10", "l0_a10", 4, 5],
+ "A10-TensorRT-5": ["a10", "l0_a10", 5, 5],
  "A10-Pybind": ["a10", "l0_a10_pybind", 1, 1],
  "A30-Triton-1": ["a30", "l0_a30", 1, 1],
  "A30-PyTorch-1": ["a30", "l0_a30", 1, 2],

tests/integration/test_lists/test-db/l0_a10.yml

Lines changed: 24 additions & 6 deletions
@@ -21,12 +21,30 @@ l0_a10:
  # test list either).
  - unittest/_torch/models/checkpoints/hf/test_weight_loader.py
  - unittest/others/test_time_breakdown.py
+ - unittest/disaggregated/test_disagg_utils.py
+ - unittest/disaggregated/test_router.py
+ - unittest/disaggregated/test_remoteDictionary.py
+ - unittest/disaggregated/test_disagg_cluster_manager_worker.py
+ - unittest/disaggregated/test_cluster_storage.py
  - disaggregated/test_disaggregated.py::test_disaggregated_single_gpu_with_mpirun[TinyLlama-1.1B-Chat-v1.0]
  - disaggregated/test_disaggregated.py::test_disaggregated_single_gpu_with_mpirun_trt_backend[TinyLlama-1.1B-Chat-v1.0]
  - disaggregated/test_disaggregated.py::test_disaggregated_cuda_graph[TinyLlama-1.1B-Chat-v1.0]
  - disaggregated/test_disaggregated.py::test_disaggregated_mixed[TinyLlama-1.1B-Chat-v1.0]
  - disaggregated/test_disaggregated.py::test_disaggregated_overlap[TinyLlama-1.1B-Chat-v1.0]
  - disaggregated/test_disaggregated.py::test_disaggregated_diff_max_tokens[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated.py::test_disaggregated_kv_cache_time_output[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated.py::test_disaggregated_perf_metrics[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated.py::test_disaggregated_cache_aware_balance[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated.py::test_disaggregated_conditional[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated.py::test_disaggregated_ngram[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_workers.py::test_workers_conditional_disaggregation[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_workers.py::test_workers_kv_cache_events[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_workers.py::test_workers_kv_cache_aware_router[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_workers.py::test_workers_kv_cache_aware_router_eviction[TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_llama[False-False-TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_llama[False-True-TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_llama[True-False-TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_llama[True-True-TinyLlama-1.1B-Chat-v1.0]
  - test_e2e.py::test_openai_chat_guided_decoding
  - test_e2e.py::test_openai_chat_multimodal_example
  - test_e2e.py::test_openai_perf_metrics
@@ -92,7 +110,6 @@ l0_a10:
  - examples/test_bert.py::test_llm_bert_general[compare_hf-enable_remove_input_padding-use_attention_plugin-enable_context_fmha-tp:1-pp:1-float16-BertModel-bert/bert-base-uncased]
  - unittest/trt/model/test_mistral.py
  - unittest/trt/model/test_llama.py
- - test_e2e.py::test_gpt3_175b_1layers_build_only # 6 mins
  - llmapi/test_llm_api_connector.py::test_connector_simple[True]
  - llmapi/test_llm_api_connector.py::test_connector_simple[False]
  - llmapi/test_llm_api_connector.py::test_connector_async_onboard[True]
@@ -119,7 +136,6 @@ l0_a10:
  - test_e2e.py::test_trtllm_bench_sanity[--non-streaming-FP16-meta-llama/Llama-3.1-8B-llama-3.1-model/Meta-Llama-3.1-8B]
  - test_e2e.py::test_trtllm_bench_latency_sanity[FP16-meta-llama/Llama-3.1-8B-llama-3.1-model/Meta-Llama-3.1-8B]
  - unittest/trt/quantization
- - accuracy/test_cli_flow.py::TestLlama7B::test_streamingllm # 2 mins
  - unittest/trt/functional # 37 mins
  - llmapi/test_llm_examples.py::test_llmapi_quickstart_atexit
  - unittest/api_stability
@@ -140,13 +156,9 @@ l0_a10:
  - accuracy/test_cli_flow.py::TestVicuna7B::test_eagle_2[cuda_graph=True-chunked_context=False] # 5 mins
  - accuracy/test_cli_flow.py::TestVicuna7B::test_eagle_2[cuda_graph=True-chunked_context=True] # 5 mins
  - accuracy/test_cli_flow.py::TestLlama2_7B::test_auto_dtype
- - examples/test_chatglm.py::test_llm_glm_4_9b_single_gpu_summary[glm-4-9b-disable_weight_only]
  - unittest/trt/attention/test_gpt_attention_IFB.py
  - unittest/trt/attention/test_gpt_attention_no_cache.py
- - unittest/trt/model/test_mamba.py # 3 mins
  - examples/test_whisper.py::test_llm_whisper_general[large-v3-disable_gemm_plugin-enable_attention_plugin-disable_weight_only-float16-nb:1-use_cpp_runtime]
- - examples/test_mamba.py::test_llm_mamba_1gpu[mamba2-130m-float16-enable_gemm_plugin]
- - examples/test_mamba.py::test_llm_mamba_1gpu[mamba-codestral-7B-v0.1-float16-enable_gemm_plugin] # 3 mins
  - condition:
      ranges:
        system_gpu_count:
@@ -205,6 +217,12 @@ l0_a10:
  - accuracy/test_llm_api.py::TestEagle2Vicuna_7B_v1_3::test_auto_dtype
  - stress_test/stress_test.py::test_run_stress_test[llama-v3-8b-instruct-hf_tp1-stress_time_300s_timeout_450s-MAX_UTILIZATION-trt-stress-test]
  - stress_test/stress_test.py::test_run_stress_test[llama-v3-8b-instruct-hf_tp1-stress_time_300s_timeout_450s-GUARANTEED_NO_EVICT-trt-stress-test]
+ - test_e2e.py::test_gpt3_175b_1layers_build_only # 6 mins
+ - examples/test_chatglm.py::test_llm_glm_4_9b_single_gpu_summary[glm-4-9b-disable_weight_only]
+ - unittest/trt/model/test_mamba.py # 3 mins
+ - examples/test_mamba.py::test_llm_mamba_1gpu[mamba2-130m-float16-enable_gemm_plugin]
+ - examples/test_mamba.py::test_llm_mamba_1gpu[mamba-codestral-7B-v0.1-float16-enable_gemm_plugin] # 3 mins
+ - accuracy/test_cli_flow.py::TestLlama7B::test_streamingllm # 2 mins
  - condition:
      ranges:
        system_gpu_count:
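
The condition/ranges/system_gpu_count context above is the test-db mechanism that gates each block of tests on the runner's hardware. A hedged Groovy sketch of how such a range condition could be evaluated (the gte/lte schema and the matches helper are assumptions for illustration, not the actual TensorRT-LLM test-db loader):

// Hypothetical evaluator for a test-db condition block; schema assumed.
def matches(Map condition, Map env) {
    return condition.get('ranges', [:]).every { key, bounds ->
        def v = env[key]
        v >= bounds.get('gte', v) && v <= bounds.get('lte', v)
    }
}

def cond = [ranges: [system_gpu_count: [gte: 1, lte: 1]]]
assert matches(cond, [system_gpu_count: 1])   // single-GPU A10 stage selected
assert !matches(cond, [system_gpu_count: 4])  // multi-GPU system filtered out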

tests/integration/test_lists/test-db/l0_h100.yml

Lines changed: 1 addition & 22 deletions
@@ -32,11 +32,6 @@ l0_h100:
  - unittest/_torch/modeling -k "modeling_nemotron"
  - unittest/_torch/modeling -k "modeling_gemma3"
  - unittest/_torch/modeling -k "modeling_gpt_oss"
- - unittest/disaggregated/test_disagg_utils.py
- - unittest/disaggregated/test_router.py
- - unittest/disaggregated/test_remoteDictionary.py
- - unittest/disaggregated/test_disagg_cluster_manager_worker.py
- - unittest/disaggregated/test_cluster_storage.py
  - accuracy/test_llm_api_pytorch.py::TestGemma3_1BInstruct::test_auto_dtype
  - accuracy/test_llm_api_pytorch.py::TestGemma3_1BInstruct::test_auto_dtype_vswa_without_reuse
  - accuracy/test_llm_api_pytorch.py::TestGemma3_1BInstruct::test_auto_dtype_vswa_reuse
@@ -81,15 +76,7 @@ l0_h100:
  - disaggregated/test_disaggregated.py::test_disaggregated_deepseek_v3_lite_fp8_tp1_single_gpu_mtp[DeepSeek-V3-Lite-fp8]
  - disaggregated/test_disaggregated.py::test_disaggregated_deepseek_v3_lite_fp8_tp1_two_mtp[DeepSeek-V3-Lite-fp8]
  - disaggregated/test_disaggregated.py::test_disaggregated_deepseek_v3_lite_fp8_ucx_tp1_single_gpu[DeepSeek-V3-Lite-fp8]
- - disaggregated/test_disaggregated.py::test_disaggregated_cuda_graph[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated.py::test_disaggregated_kv_cache_time_output[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated.py::test_disaggregated_mixed[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated.py::test_disaggregated_overlap[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated.py::test_disaggregated_perf_metrics[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_llama[False-False-TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_llama[False-True-TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_llama[True-False-TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_llama[True-True-TinyLlama-1.1B-Chat-v1.0]
+ - disaggregated/test_disaggregated.py::test_disaggregated_load_balance[TinyLlama-1.1B-Chat-v1.0]
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_deepseek[False-False-DeepSeek-V3-Lite-fp8/fp8]
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_deepseek[False-True-DeepSeek-V3-Lite-fp8/fp8]
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_deepseek[True-False-DeepSeek-V3-Lite-fp8/fp8]
@@ -98,14 +85,6 @@ l0_h100:
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_qwen3[False-True-Qwen3-8B-FP8]
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_qwen3[True-False-Qwen3-8B-FP8]
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_simple_qwen3[True-True-Qwen3-8B-FP8]
- - disaggregated/test_disaggregated.py::test_disaggregated_load_balance[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated.py::test_disaggregated_cache_aware_balance[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated.py::test_disaggregated_conditional[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_disaggregated.py::test_disaggregated_ngram[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_workers.py::test_workers_conditional_disaggregation[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_workers.py::test_workers_kv_cache_events[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_workers.py::test_workers_kv_cache_aware_router[TinyLlama-1.1B-Chat-v1.0]
- - disaggregated/test_workers.py::test_workers_kv_cache_aware_router_eviction[TinyLlama-1.1B-Chat-v1.0]
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_llama_context_capacity[False-False-DeepSeek-V3-Lite-fp8/fp8]
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_spec_dec_batch_slot_limit[True-False-EAGLE3-LLaMA3.1-Instruct-8B-Llama-3.1-8B-Instruct]
  - disaggregated/test_disaggregated_single_gpu.py::test_disaggregated_spec_dec_batch_slot_limit[False-False-EAGLE3-LLaMA3.1-Instruct-8B-Llama-3.1-8B-Instruct]
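
As a quick cross-check, the per-file deltas above (7/7 in L0_Test.groovy, 24/6 in l0_a10.yml, 1/22 in l0_h100.yml) sum to the commit-level totals:

// Per-file added/removed line counts, in file order as listed above.
def added   = [7, 24, 1]
def removed = [7, 6, 22]
assert added.sum() == 32    // matches "+32"
assert removed.sum() == 35  // matches "-35"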
