
Add GLM-5 NVFP4 B200 TP8 SGLang benchmark (conc 4-64) #973

Merged
functionstackx merged 5 commits into main from claude/issue-972-20260329-0102
Mar 29, 2026
@functionstackx
Contributor

@functionstackx functionstackx commented Mar 29, 2026

Add GLM-5 NVFP4 single-node B200 SGLang benchmark with modelopt_fp4 quantization.

Follows the recipe at https://huggingface.co/nvidia/GLM-5-NVFP4

Closes #972

Generated with Claude Code

@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

@functionstackx
Contributor Author

@claude use nightly-dev-cu13-20260328-a27651d5 tag instead

@Klaud-Cold
Collaborator

Klaud-Cold commented Mar 29, 2026

Claude finished @functionstackx's task in 50s


Updated image tag to nightly-dev-cu13-20260328-a27651d5

  • Find current image tag in config files
  • Update nvidia-master.yaml: lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732 → lmsysorg/sglang:nightly-dev-cu13-20260328-a27651d5
  • Update perf-changelog.yaml to reflect new image tag
  • Commit and push changes
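
The tag swap in the steps above can be sketched as a plain text substitution. This is a minimal illustration only: the `bump_image` helper and the sample config fragment are hypothetical, since the actual layout of nvidia-master.yaml is not shown in this conversation. Only the two image strings come from the PR.

```python
OLD_IMAGE = "lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732"
NEW_IMAGE = "lmsysorg/sglang:nightly-dev-cu13-20260328-a27651d5"


def bump_image(text, old, new):
    """Replace every exact occurrence of `old` with `new`.

    Returns the updated text and the number of replacements, so the caller
    can confirm that the old tag was actually present.
    """
    count = text.count(old)
    return text.replace(old, new), count


# Hypothetical config fragment (key names are assumptions, not the real file).
sample = (
    "glm5-nvfp4-b200-sglang:\n"
    "  image: lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732\n"
    "  model: nvidia/GLM-5-NVFP4\n"
)

updated, n = bump_image(sample, OLD_IMAGE, NEW_IMAGE)
print(n)
print(updated)
```

Returning the replacement count makes it easy to fail loudly when the old tag is missing, rather than silently writing back an unchanged file.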

github-actions bot and others added 3 commits March 28, 2026 23:48
- Add glm5-nvfp4-b200-sglang config to nvidia-master.yaml
- Create benchmark script with modelopt_fp4 quantization, glm47 tool-call-parser,
  glm45 reasoning-parser, chunked-prefill-size 131072, mem-fraction-static 0.80
- Uses image lmsysorg/sglang:nightly-dev-cu13-20260305-33c92732
- Model: nvidia/GLM-5-NVFP4

Closes #972

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
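
For readers who want to see how the settings listed in the commit message fit together, here is a rough sketch of an SGLang server launch command assembled in Python. The benchmark script added by this PR is not shown in the conversation, so this is an assumption about its shape rather than a copy: `build_launch_cmd` is a hypothetical helper, the TP value of 8 is taken from the PR title, and the real script may pass additional flags (host, port, and so on).

```python
import shlex


def build_launch_cmd(model="nvidia/GLM-5-NVFP4", tp=8):
    """Assemble an argv list for `sglang.launch_server`.

    Flag values mirror the commit message; anything else the real
    benchmark script sets is deliberately left out of this sketch.
    """
    return [
        "python3", "-m", "sglang.launch_server",
        "--model-path", model,
        "--tp", str(tp),                      # tensor parallelism (TP8 per the PR title)
        "--quantization", "modelopt_fp4",
        "--tool-call-parser", "glm47",
        "--reasoning-parser", "glm45",
        "--chunked-prefill-size", "131072",
        "--mem-fraction-static", "0.80",
    ]


cmd = build_launch_cmd()
print(shlex.join(cmd))
```

Building the command as a list and joining it with `shlex.join` keeps quoting correct if a model path ever contains spaces, and the list can be handed directly to `subprocess.run`.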
@functionstackx functionstackx force-pushed the claude/issue-972-20260329-0102 branch from 1ecda71 to c625c7a Compare March 29, 2026 03:48
@functionstackx
Contributor Author

@claude the perfchangelog diff should be at bottom of file...

remember to have new line at end of file too

@Klaud-Cold
Collaborator

Klaud-Cold commented Mar 29, 2026

Claude finished @functionstackx's task in 57s


Moved perf-changelog entry to bottom of file

  • Read current perf-changelog.yaml
  • Move the new entry to the bottom of the file
  • Ensure newline at end of file
  • Also updated pr-link from /pull/XXX to /pull/973
  • Commit and push changes
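
The two requests handled above (new entry last, file ending in a newline) can be illustrated with a small sketch. It assumes the changelog can be treated as a list of entry strings; the helper names, the `config` and `pr-link` layout, and the placeholder older entry are all hypothetical, because the real perf-changelog.yaml is not reproduced in this conversation.

```python
def move_entry_to_bottom(entries, marker):
    """Move the first entry containing `marker` to the end.

    The relative order of all other entries is preserved. If no entry
    matches, a copy of the original list is returned unchanged.
    """
    for i, entry in enumerate(entries):
        if marker in entry:
            return entries[:i] + entries[i + 1:] + [entry]
    return list(entries)


def render(entries):
    """Join entries and end the file with exactly one newline."""
    body = "\n".join(e.rstrip("\n") for e in entries)
    return body.rstrip("\n") + "\n"


# Hypothetical entries; only the /pull/973 link comes from the PR.
entries = [
    "- config: glm5-nvfp4-b200-sglang\n  pr-link: /pull/973",
    "- config: some-older-entry\n  pr-link: /pull/XXX",
]

text = render(move_entry_to_bottom(entries, "glm5-nvfp4-b200-sglang"))
print(text, end="")
```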

github-actions bot and others added 2 commits March 29, 2026 03:52
Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@functionstackx functionstackx merged commit d43805f into main Mar 29, 2026
13 checks passed
@functionstackx functionstackx deleted the claude/issue-972-20260329-0102 branch March 29, 2026 03:54

Successfully merging this pull request may close these issues.

glm5 nvfp4 b200