Add GLM-5 NVFP4 B200 TP8 SGLang benchmark (conc 4-64) by functionstackx · Pull Request #973 · SemiAnalysisAI/InferenceX

functionstackx · 2026-03-29T01:09:45Z

Add GLM-5 NVFP4 single-node B200 SGLang benchmark with modelopt_fp4 quantization.

Follows the recipe at https://huggingface.co/nvidia/GLM-5-NVFP4.

Closes #972

github-actions · 2026-03-29T01:09:54Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you.

functionstackx · 2026-03-29T01:18:42Z

@claude use nightly-dev-cu13-20260328-a27651d5 tag instead

Klaud-Cold · 2026-03-29T01:19:06Z

Claude finished @functionstackx's task in 50s.

Updated image tag to `nightly-dev-cu13-20260328-a27651d5`

- Find current image tag in config files
- Update `nvidia-master.yaml`: `lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732` → `lmsysorg/sglang:nightly-dev-cu13-20260328-a27651d5`
- Update `perf-changelog.yaml` to reflect new image tag
- Commit and push changes
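
The tag swap described in this comment amounts to a targeted string replacement in the config text. A minimal sketch in Python, assuming the image reference appears verbatim in the YAML; the `bump_image_tag` helper and the sample `image:` layout are hypothetical and not taken from the repository:

```python
# Image references quoted in the comment above.
OLD_IMAGE = "lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732"
NEW_IMAGE = "lmsysorg/sglang:nightly-dev-cu13-20260328-a27651d5"


def bump_image_tag(config_text: str, old: str = OLD_IMAGE, new: str = NEW_IMAGE) -> str:
    """Return config_text with every occurrence of the old image replaced by the new one."""
    if old not in config_text:
        # Fail loudly instead of silently leaving the config unchanged.
        raise ValueError(f"image {old!r} not found in config")
    return config_text.replace(old, new)


# Hypothetical config fragment; the real nvidia-master.yaml layout is not shown on this page.
sample = "glm5-nvfp4-b200-sglang:\n  image: " + OLD_IMAGE + "\n"
updated = bump_image_tag(sample)
print(updated)
```

Because this replaces every occurrence, a real update would likely need to scope the change to the `glm5-nvfp4-b200-sglang` entry if other entries pin the same older image.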

- Add `glm5-nvfp4-b200-sglang` config to `nvidia-master.yaml`
- Create benchmark script with `modelopt_fp4` quantization, `glm47` tool-call-parser, `glm45` reasoning-parser, chunked-prefill-size 131072, mem-fraction-static 0.80
- Uses image `lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732`
- Model: `nvidia/GLM-5-NVFP4`

Closes #972

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
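
For orientation, the settings listed in the commit message map onto SGLang server flags roughly as follows. This is a sketch under stated assumptions, not the contents of `glm5_nvfp4_b200.sh`: the `build_launch_command` helper is hypothetical, the tensor-parallel size of 8 comes from the PR title, and the actual script may pass additional flags that are not visible here.

```python
import shlex

# Model name taken from the commit message.
MODEL = "nvidia/GLM-5-NVFP4"


def build_launch_command(tp_size: int = 8) -> list[str]:
    """Assemble an sglang.launch_server argv from the settings named in the commit message."""
    return [
        "python3", "-m", "sglang.launch_server",
        "--model-path", MODEL,
        "--tp-size", str(tp_size),            # TP8 per the PR title
        "--quantization", "modelopt_fp4",
        "--tool-call-parser", "glm47",
        "--reasoning-parser", "glm45",
        "--chunked-prefill-size", "131072",
        "--mem-fraction-static", "0.80",
    ]


cmd = build_launch_command()
# Print a shell-quoted form that could be pasted into a script.
print(shlex.join(cmd))
```

Building the command as a list keeps each flag paired with its value and avoids quoting mistakes when the command is later passed to `subprocess.run`.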

functionstackx · 2026-03-29T03:50:24Z

@claude the perf-changelog diff should be at the bottom of the file.

Remember to have a newline at the end of the file too.

Klaud-Cold · 2026-03-29T03:50:47Z

Claude finished @functionstackx's task in 57s.

Moved perf-changelog entry to bottom of file

- Read current `perf-changelog.yaml`
- Move the new entry to the bottom of the file
- Ensure newline at end of file
- Also updated `pr-link` from `/pull/XXX` to `/pull/973`
- Commit and push changes
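
The two requests here (entry at the bottom, file ending in a newline) can be sketched as a small text transform. This is an illustration only: the `move_entry_to_bottom` helper and the sample entries are hypothetical, and the real `perf-changelog.yaml` schema is not shown on this page.

```python
def move_entry_to_bottom(text: str, entry: str) -> str:
    """Move the first occurrence of entry to the end of text, ending with exactly one newline."""
    if entry not in text:
        raise ValueError("entry not found in changelog text")
    # Remove the entry, then trim blank lines left at either end.
    remaining = text.replace(entry, "", 1).strip("\n")
    block = entry.strip("\n")
    if not remaining:
        return block + "\n"
    # One blank line between entries, one newline at end of file.
    return remaining + "\n\n" + block + "\n"


# Hypothetical changelog: the new entry sits at the top and the file lacks a final newline.
new_entry = "- config: new\n  pr-link: /pull/973\n"
changelog = new_entry + "\n- config: old-a\n\n- config: old-b"
result = move_entry_to_bottom(changelog, new_entry)
print(result, end="")
```

Working on raw text rather than re-serializing the YAML leaves the other entries byte-for-byte unchanged, which keeps the diff limited to the moved block.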

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

functionstackx requested a review from a team March 29, 2026 01:09

functionstackx requested review from jgangani and kedarpotdar-nv as code owners March 29, 2026 01:09

github-project-automation bot added this to InferenceMAX Board Mar 29, 2026

functionstackx added the sweep-enabled label Mar 29, 2026

functionstackx removed the sweep-enabled label Mar 29, 2026

github-actions bot and others added 3 commits March 28, 2026 23:48

Update glm5_nvfp4_b200.sh

dc5f0d5

Update GLM-5 NVFP4 SGLang image to nightly-dev-cu13-20260328-a27651d5

c625c7a

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

functionstackx force-pushed the claude/issue-972-20260329-0102 branch from 1ecda71 to c625c7a Compare March 29, 2026 03:48

github-actions bot and others added 2 commits March 29, 2026 03:52

Move GLM-5 NVFP4 perf-changelog entry to bottom of file

6d1870e

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>

Update glm5_nvfp4_b200.sh

9b57083

functionstackx merged commit d43805f into main Mar 29, 2026
13 checks passed

functionstackx deleted the claude/issue-972-20260329-0102 branch March 29, 2026 03:54

github-project-automation bot moved this to Done in InferenceMAX Board Mar 29, 2026

functionstackx merged 5 commits into main from claude/issue-972-20260329-0102