@Yssx-g Yssx-g commented Nov 27, 2025

Added Llama Mode matmul for the decode stage. Implemented the handwritten-style next-matmul-llama.mlir and the corresponding pass, MatMulLlamaOptimize.cpp. Also added the test case matmul-vectorization-llama.mlir. The pass is registered in buddy-opt and can be invoked directly with the -matmul-vectorization-llama option.
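For context, here is a minimal sketch of the kind of input such a matmul-vectorization pass typically targets: a plain linalg.matmul on memrefs. The function name and shapes below are illustrative only and are not taken from the actual next-matmul-llama.mlir test case.

```mlir
// Illustrative decode-stage-style matmul: a single token row (M = 1)
// multiplied against a square weight matrix. Shapes and the function
// name are made up for demonstration.
func.func @decode_matmul(%a: memref<1x4096xf32>,
                         %b: memref<4096x4096xf32>,
                         %c: memref<1x4096xf32>) {
  linalg.matmul
    ins(%a, %b : memref<1x4096xf32>, memref<4096x4096xf32>)
    outs(%c : memref<1x4096xf32>)
  return
}
```

An input like this would then be lowered with something along the lines of `buddy-opt input.mlir -matmul-vectorization-llama` (the option name is the one given in this PR; the file name is a placeholder).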
Below is the performance comparison for the DeepSeek-R1 decode-stage use case, with the upper part showing BLIS mode and the lower part showing Llama mode.
[Screenshot: decode-stage performance comparison, BLIS mode (top) vs. Llama mode (bottom)]
That's all.


Yssx-g commented Dec 24, 2025

Below are the DeepSeek end-to-end test results. Multiple tests were conducted, and two screenshots are selected here for demonstration: the left side shows the new matmul implementation, while the right side shows BLIS matmul. The average time per token was reduced by 0.05 seconds.
[Screenshots: end-to-end per-token latency, new matmul (left) vs. BLIS matmul (right)]
