Disable prefix caching for mi300 & mi325 qwen3.5 & glm5 AMD benchmarks #970

Merged
functionstackx merged 4 commits into main from claude/issue-968-20260328-0808
Mar 29, 2026

Conversation

@functionstackx
Contributor

Add --disable-radix-cache to SGLang server launch command in all 7 benchmark scripts listed in #968, rebased on latest main.

Closes #968

Generated with Claude Code
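As a rough illustration of the change described above, the sketch below appends `--disable-radix-cache` to a multi-line SGLang launch argument block. `MODEL_PATH`, `PORT`, `TP`, and the `SERVER_ARGS` array are placeholder names assumed for this example; the actual benchmark scripts may structure their launch command differently.

```shell
#!/usr/bin/env bash
# Sketch only: MODEL_PATH, PORT, and TP are placeholder values, not taken from the PR.
set -euo pipefail

MODEL_PATH="${MODEL_PATH:-/path/to/model}"
PORT="${PORT:-30000}"
TP="${TP:-8}"

# Collect the launch arguments in an array so each flag sits on its own line,
# mirroring a multi-line argument block.
SERVER_ARGS=(
  --model-path "$MODEL_PATH"
  --tp "$TP"
  --port "$PORT"
  --disable-radix-cache   # the one-line addition: turns off SGLang's prefix (radix) cache
)

# Print the command instead of running it, so the sketch works without SGLang installed.
echo python3 -m sglang.launch_server "${SERVER_ARGS[@]}"
```

Dropping the `echo` would launch the server for real. Keeping the flag on its own line makes the addition a one-line diff.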

Add --disable-radix-cache to SGLang server launch command in all 7
benchmark scripts listed in #968, rebased on latest main.

Closes #968

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR there first, before we merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

Contributor

@claude claude bot left a comment

LGTM — simple, mechanical addition of --disable-radix-cache to 7 benchmark scripts, consistent with the established pattern in this repo (e.g., #966, #926).

Extended reasoning...

### Overview
Adds --disable-radix-cache to the SGLang server launch command in 7 AMD benchmark shell scripts (qwen3.5 bf16/fp8 on MI300X/MI325X/MI355X, and glm5 fp8 on MI355X). Each change is a one-line addition following an existing multi-line argument block.

### Security risks
None. These are benchmark configuration scripts with no auth, crypto, or permission-sensitive logic.

### Level of scrutiny
Low. This is a straightforward config flag addition that mirrors identically structured recent merges (#966 disabling prefix caching for minimax, #926 for kimi). The change is purely additive: no logic branches, no new code paths.

### Other factors
No bugs found by the automated system. The PR is self-contained to benchmark scripts with no production code impact. The pattern is well-established in this repo.

@functionstackx
Contributor Author

@claude add the perf-changelog.yaml entry

@Klaud-Cold
Collaborator

Klaud-Cold commented Mar 28, 2026

Claude finished @functionstackx's task in 45s


Added perf-changelog.yaml entry

  • Gather context on changed files
  • Read current perf-changelog.yaml
  • Add changelog entry for --disable-radix-cache changes
  • Commit and push

Added a perf-changelog.yaml entry at the top of the file covering all 7 affected config keys (qwen3.5-bf16-mi300x-sglang, qwen3.5-bf16-mi325x-sglang, qwen3.5-bf16-mi355x-sglang, qwen3.5-fp8-mi300x-sglang, qwen3.5-fp8-mi325x-sglang, qwen3.5-fp8-mi355x-sglang, glm5-fp8-mi355x-sglang) with PR link to #970.

…glm5 AMD benchmarks

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@functionstackx
Contributor Author

@claude remove the changes to mi355 for now and keep only the other changes. Update the bash scripts & the perf-changelog.yaml

@Klaud-Cold
Collaborator

Klaud-Cold commented Mar 29, 2026

Claude finished @functionstackx's task in 48s


Removing MI355X changes and updating PR

  • Gather context on current changes
  • Revert MI355X benchmark scripts to original state
  • Update perf-changelog.yaml to exclude MI355X configs
  • Commit and push

Reverted 3 MI355X files (glm5_fp8_mi355x.sh, qwen3.5_bf16_mi355x.sh, qwen3.5_fp8_mi355x.sh) to their original state. The PR now only adds --disable-radix-cache to the 4 remaining scripts (qwen3.5 bf16/fp8 on MI300X and MI325X). Updated perf-changelog.yaml to match.
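One way to double-check which scripts still carry the flag after a revert like this is a small `grep` loop. The sketch below runs against a throwaway directory with two stand-in files. `qwen3.5_bf16_mi355x.sh` is named in this thread, while `qwen3.5_bf16_mi300x.sh`, the file contents, and the `has_radix_flag` helper are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the directory below is a throwaway stand-in, not the repo's real tree.
set -euo pipefail

# Succeeds if the given script mentions the flag anywhere.
has_radix_flag() {
  grep -q -- '--disable-radix-cache' "$1"
}

# Build a demo directory with one script that has the flag and one that does not.
demo_dir="$(mktemp -d)"
printf '%s\n' 'python3 -m sglang.launch_server --disable-radix-cache' > "$demo_dir/qwen3.5_bf16_mi300x.sh"
printf '%s\n' 'python3 -m sglang.launch_server' > "$demo_dir/qwen3.5_bf16_mi355x.sh"

# Record one status line per script.
report=""
for script in "$demo_dir"/*.sh; do
  if has_radix_flag "$script"; then
    status="disabled"
  else
    status="enabled"
  fi
  report+="$(basename "$script"): radix cache $status"$'\n'
done

printf '%s' "$report"
rm -rf "$demo_dir"
```

Pointing the loop at the real benchmark script directory instead of `demo_dir` would give the same per-file report for the actual PR state.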

… only

Reverted glm5_fp8_mi355x.sh, qwen3.5_bf16_mi355x.sh, qwen3.5_fp8_mi355x.sh
to their original state and updated perf-changelog.yaml accordingly.

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@functionstackx functionstackx changed the title from "Disable prefix caching for qwen3.5 & glm5 AMD benchmarks" to "Disable prefix caching for mi300 & mi325 qwen3.5 & glm5 AMD benchmarks" on Mar 29, 2026
@functionstackx
Contributor Author

@claude move your perf-changelog diff to the end of the file.

@Klaud-Cold
Collaborator

Klaud-Cold commented Mar 29, 2026

Claude finished @functionstackx's task in 44s


Moved perf-changelog entry to end of file

  • Read current perf-changelog.yaml
  • Move entry from top to end of file
  • Commit and push

Moved the --disable-radix-cache changelog entry from the top of perf-changelog.yaml to the end, after the last existing entry (PR #911).

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@functionstackx functionstackx merged commit 2542b74 into main Mar 29, 2026
8 of 52 checks passed
@functionstackx functionstackx deleted the claude/issue-968-20260328-0808 branch March 29, 2026 00:43
Development

Successfully merging this pull request may close these issues.

disable prefix on random for qwen3.5 & glm

2 participants