[Don't Merge] Update cli args qwen#946
zhentaocc wants to merge 4 commits into SemiAnalysisAI:main
Force-pushed from 7992757 to a8cf15f.
To double-check: @chunfangamd, is @zhentaocc part of AMD? Can you confirm internally? If so, please add him to the upstream repo. Creating branches in the upstream repo is a better developer experience than working from forks; for example, on upstream branches we can use the sweep-enabled label to validate PRs.
BF16 local results: conc 64, 1k1k, TPUT 501.29 -> 661.46 tokens/s/gpu, a 31.95% boost. @functionstackx
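For anyone re-checking the reported boost, the arithmetic can be reproduced with a short sketch. The function name `percent_boost` is just an illustrative helper, not part of the benchmark tooling; the two throughput figures are the ones quoted in the comment above.

```python
def percent_boost(before: float, after: float) -> float:
    """Relative throughput change, expressed in percent."""
    return (after - before) / before * 100.0


# Throughput figures quoted in the BF16 comment (tokens/s/gpu).
baseline_tput = 501.29
updated_tput = 661.46

boost = percent_boost(baseline_tput, updated_tput)
print(f"{boost:.2f}% boost")  # prints "31.95% boost"
```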
* Added CONTEXT_LENGTH and MAX_PREFILL_TOKENS variables for better configuration.
* Updated launch_server command with new options: --tokenizer-worker-num, --enable-aiter-allreduce-fusion, --cuda-graph-max-bs, --context-length, --disable-radix-cache, --max-prefill-tokens, and --scheduler-recv-interval.
… benchmark configurations for MI355X, enhancing performance with updated CLI arguments.
….yaml to v0.5.9, ensuring compatibility with recent changes.
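As a rough illustration of how the options named in the first commit message might be assembled into a launch command, here is a minimal sketch. Every numeric value, the model path, and the helper name `build_launch_args` are hypothetical placeholders; the actual settings used in this PR are not shown in the thread. The flag names are taken from the commit message, and the `python3 -m sglang.launch_server` entry point is assumed from the sglang config keys.

```python
import shlex

# Hypothetical placeholder values; the PR's real settings are not shown here.
CONTEXT_LENGTH = 8192
MAX_PREFILL_TOKENS = 16384


def build_launch_args(model_path: str) -> list[str]:
    """Assemble a launch_server argument list using the flags named in the commit."""
    return [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--tokenizer-worker-num", "4",        # placeholder value
        "--enable-aiter-allreduce-fusion",    # boolean switch, takes no value
        "--cuda-graph-max-bs", "64",          # placeholder value
        "--context-length", str(CONTEXT_LENGTH),
        "--disable-radix-cache",              # boolean switch, takes no value
        "--max-prefill-tokens", str(MAX_PREFILL_TOKENS),
        "--scheduler-recv-interval", "10",    # placeholder value
    ]


cmd = build_launch_args("/path/to/model")  # hypothetical model path
print(shlex.join(cmd))
```

Keeping CONTEXT_LENGTH and MAX_PREFILL_TOKENS as named variables, as the commit describes, means a benchmark script can change them in one place without editing the command itself.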
Force-pushed from a8cf15f to fa3b1fb.
FP8 local test results: conc 64, 1k1k, TPUT 708.75 tokens/s/gpu. @functionstackx
/sweep test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys qwen3.5-bf16-mi355x-sglang qwen3.5-fp8-mi355x-sglang
@zhentaocc Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23735484968
/sweep test-config --config-files .github/configs/amd-master.yaml --runner-config .github/configs/runners.yaml --config-keys qwen3.5-bf16-mi355x-sglang qwen3.5-fp8-mi355x-sglang
@zhentaocc Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/23735797269
No description provided.