Conversation

@zhewenl (Collaborator) commented Nov 14, 2025

Purpose

AMD CI is running on MI325X, but the corresponding MoE config has not been added:

WARNING [fused_moe.py:886] Using default MoE config. Performance might be sub-optimal!
Config file not found at ['/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/fused_moe/configs/E=128,N=1024,device_name=AMD_Instinct_MI325X,dtype=fp8_w8a8.json']

This PR adds the Llama 4 MoE config for MI325X, which should be identical to the MI300X config added in #16847.
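
For reference, vLLM's fused-MoE configs are JSON files whose top-level keys are token batch sizes and whose values are Triton tile parameters for that size. A minimal sketch of the expected shape — the field names match the existing config files, but the numeric values below are illustrative, not the tuned values landed in this PR (ROCm-tuned files may also carry extra fields such as `waves_per_eu`):

```json
{
  "1": {
    "BLOCK_SIZE_M": 16,
    "BLOCK_SIZE_N": 32,
    "BLOCK_SIZE_K": 64,
    "GROUP_SIZE_M": 1,
    "num_warps": 4,
    "num_stages": 2
  },
  "64": {
    "BLOCK_SIZE_M": 32,
    "BLOCK_SIZE_N": 64,
    "BLOCK_SIZE_K": 128,
    "GROUP_SIZE_M": 8,
    "num_warps": 8,
    "num_stages": 2
  }
}
```

At runtime, fused_moe.py selects the entry whose batch-size key is closest to the actual number of tokens and passes those parameters to the Triton kernel.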

@mergify mergify bot added the llama Related to Llama models label Nov 14, 2025
@zhewenl zhewenl requested a review from houseroad November 14, 2025 07:05
@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a new configuration file for Fused Mixture-of-Experts (MoE) kernels, specifically for the AMD Instinct MI325X GPU using the fp8_w8a8 data type. The added file, E=128,N=1024,device_name=AMD_Instinct_MI325X,dtype=fp8_w8a8.json, is intended to resolve a CI failure caused by its absence. The configuration is stated to be identical to that of the MI300X, which is a reasonable approach for enabling support on a similar architecture. The file format and naming convention align with the existing structure for kernel configurations. The change appears correct and should resolve the reported issue.

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request adds a new configuration file for the Fused MoE kernel to support the AMD Instinct MI325X GPU with fp8 precision. The change is straightforward and aims to fix a missing configuration warning, which should provide better performance than the default settings. The new configuration file's naming and structure are consistent with the existing implementation. The kernel parameters within the file appear valid. I have not identified any high or critical severity issues in this pull request.

@yeqcharlotte (Collaborator) commented:

How was the config tuned? Is it from the autotune script, or copied from MI300X? cc: @mxz297 @bradleyhd

@zhewenl (Collaborator, Author) commented Nov 14, 2025

@yeqcharlotte Just copied from MI300X, since the two share the same architecture.
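
Copying works here because the config contents carry no device identity; the device only enters through the file name. vLLM resolves the config by an exact device-name match, so an MI300X file is never found on an MI325X host even when the tuned parameters would be identical. A rough Python sketch of that lookup — hypothetical helper names, simplified from what fused_moe.py actually does:

```python
import json
import os

import torch


def get_config_file_name(E: int, N: int, dtype: str | None) -> str:
    # The reported device name (e.g. "AMD Instinct MI325X") becomes part of
    # the file name, so each GPU model needs its own config file.
    device_name = torch.cuda.get_device_name().replace(" ", "_")
    dtype_selector = "" if dtype is None else f",dtype={dtype}"
    return f"E={E},N={N},device_name={device_name}{dtype_selector}.json"


def load_moe_config(E: int, N: int, dtype: str | None) -> dict | None:
    path = os.path.join(os.path.dirname(__file__), "configs",
                        get_config_file_name(E, N, dtype))
    if not os.path.exists(path):
        # vLLM falls back to a default config here and logs the
        # "Using default MoE config" warning quoted in the PR description.
        return None
    with open(path) as f:
        return json.load(f)
```

If the copied parameters ever prove suboptimal on MI325X, the file can be regenerated by running vLLM's MoE tuning benchmark (benchmarks/kernels/benchmark_moe.py) on the target hardware.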

@tjtanaa tjtanaa added the ready ONLY add when PR is ready to merge/full CI is needed label Nov 15, 2025
@tjtanaa tjtanaa enabled auto-merge (squash) November 15, 2025 00:20
@vllm-bot vllm-bot merged commit 1ec978c into vllm-project:main Nov 15, 2025
49 of 51 checks passed
geodavic pushed a commit to geodavic/vllm that referenced this pull request Nov 16, 2025