Add Kimi K2.5 INT4 vLLM benchmark for MI300X (TP8) #975

Open
functionstackx wants to merge 2 commits into main from claude/issue-974-20260329-0122

@functionstackx
Contributor

@functionstackx functionstackx commented Mar 29, 2026

Closes #974

Add single-node MI300X config for Kimi K2.5 INT4 with vLLM ROCm v0.18.0, matching the existing MI325X recipe.

Generated with Claude Code

@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

@functionstackx
Contributor Author

@claude do v0.18.0 image..

@Klaud-Cold
Collaborator

Klaud-Cold commented Mar 29, 2026

Claude finished @functionstackx's task in 1m 10s.

Updated vLLM image to v0.18.0

  • Read current files to identify image references
  • Update image tag to v0.18.0 in amd-master.yaml
  • Update perf-changelog.yaml description
  • Commit and push changes

Updated the kimik2.5-int4-mi300x-vllm config image from vllm/vllm-openai-rocm:v0.16.0 to vllm/vllm-openai-rocm:v0.18.0 and updated the perf-changelog entry to match.

@benenzhu
Collaborator

/usr/local/lib/python3.12/dist-packages/huggingface_hub/file_download.py:805: UserWarning: Not enough free disk space to download the file. The expected file size is: 9809.05 MB. The target location /mnt/hf_hub_cache/models--moonshotai--Kimi-K2.5/blobs only has 12.58 MB free disk space.

Seems the disk is full.
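
A quick way to confirm this before re-running the job is to compare the free space on the cache volume against the expected file size from the warning. The snippet below is only a sketch: the helper names `free_mb` and `has_room` are hypothetical, the cache path and size are copied from the log above, and it assumes the warning reports decimal megabytes (bytes / 1e6).

```python
import shutil
from pathlib import Path

MB = 1_000_000  # assumption: the warning reports decimal megabytes


def free_mb(path):
    """Return free space, in MB, on the filesystem that holds `path`."""
    p = Path(path)
    # Walk up to the nearest existing directory so that a cache
    # sub-directory that has not been created yet still resolves.
    while not p.exists() and p != p.parent:
        p = p.parent
    return shutil.disk_usage(p).free / MB


def has_room(path, required_mb, headroom_mb=0.0):
    """True if `path` has at least `required_mb` (+ headroom) free."""
    return free_mb(path) >= required_mb + headroom_mb


if __name__ == "__main__":
    cache_dir = "/mnt/hf_hub_cache"  # cache root taken from the warning
    expected_mb = 9809.05            # expected file size from the warning
    print(f"free: {free_mb(cache_dir):.2f} MB, need: {expected_mb:.2f} MB")
    print("enough space" if has_room(cache_dir, expected_mb) else "disk is full")
```

Running this on the benchmark node would show whether the cache volume itself is exhausted or whether only this download is too large for what remains.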

@functionstackx
Contributor Author

> /usr/local/lib/python3.12/dist-packages/huggingface_hub/file_download.py:805: UserWarning: Not enough free disk space to download the file. The expected file size is: 9809.05 MB. The target location /mnt/hf_hub_cache/models--moonshotai--Kimi-K2.5/blobs only has 12.58 MB free disk space.
>
> Seems the disk is full.

@cquil11 can u clean up the storage/ get more storage from AMD :sad:

github-actions bot and others added 2 commits March 29, 2026 23:20
Add single-node MI300X config for Kimi K2.5 INT4 with vLLM ROCm v0.16.0,
matching the existing MI325X recipe with AMD Andy Luo's optimizations.

Closes #974

Co-authored-by: functionstackx <functionstackx@users.noreply.github.com>
@functionstackx functionstackx force-pushed the claude/issue-974-20260329-0122 branch from 41ce571 to 49f1bd2 March 30, 2026 03:20
Development

Successfully merging this pull request may close these issues.

vllm 0.18 single node mi300 kimi k2.5 vllm tp8

3 participants