【Hackathon 9th No.53】Unit test for multi_head_latent_attention #11

Description

@cloudforge1

Add unit tests for Multi-Head Latent Attention (MLA).

Source: custom_ops/gpu_ops/ — look for multi_head_latent_attention
Registration: custom_ops/gpu_ops/cpp_extensions.cc
Test file: tests/operators/test_multi_head_latent_attention.py

MLA is the attention variant used in DeepSeek-style models: keys and values are down-projected into a shared low-rank latent and re-expanded per head, so the cache stores the latent instead of the full K/V. Compare the op's output against a reference built from standard multi-head attention plus the low-rank projections. Cover different head configurations, sequence lengths, and KV compression ratios; a sketch of the reference and the test sweep follows below.
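A minimal NumPy sketch of the reference path, assuming the common MLA formulation (shared KV down-projection to a latent, per-head up-projections, then standard scaled-dot-product attention). The custom op's actual signature is registered in cpp_extensions.cc and is not shown here, so the call to it is left as a placeholder; all weight names (`w_dkv`, `w_uk`, `w_uv`) are illustrative, not taken from the op.

```python
import itertools
import numpy as np


def mla_reference(q, x_kv, w_dkv, w_uk, w_uv, num_heads):
    """Reference MLA: compress the KV input to a latent, re-expand, run standard attention.

    q:     (batch, seq_q, num_heads, head_dim)   pre-projected queries
    x_kv:  (batch, seq_kv, d_model)              KV-side hidden states
    w_dkv: (d_model, d_latent)                   shared KV down-projection
    w_uk:  (d_latent, num_heads * head_dim)      K up-projection
    w_uv:  (d_latent, num_heads * head_dim)      V up-projection
    """
    b, s_kv, _ = x_kv.shape
    head_dim = q.shape[-1]
    latent = x_kv @ w_dkv  # (b, s_kv, d_latent): this is what the MLA cache would store
    k = (latent @ w_uk).reshape(b, s_kv, num_heads, head_dim)
    v = (latent @ w_uv).reshape(b, s_kv, num_heads, head_dim)
    # Standard scaled-dot-product attention per head.
    scores = np.einsum("bqhd,bkhd->bhqk", q, k) / np.sqrt(head_dim)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=-1, keepdims=True)
    return np.einsum("bhqk,bkhd->bqhd", probs, v)


# Sweep the axes the issue asks for: head configs, sequence lengths, compression ratios.
for num_heads, seq_len, ratio in itertools.product([8, 16], [1, 128], [4, 8]):
    d_model, head_dim = 512, 64
    d_latent = d_model // ratio  # higher ratio = stronger KV compression
    rng = np.random.default_rng(0)
    q = rng.standard_normal((2, seq_len, num_heads, head_dim), dtype=np.float32)
    x_kv = rng.standard_normal((2, seq_len, d_model), dtype=np.float32)
    w_dkv = rng.standard_normal((d_model, d_latent), dtype=np.float32)
    w_uk = rng.standard_normal((d_latent, num_heads * head_dim), dtype=np.float32)
    w_uv = rng.standard_normal((d_latent, num_heads * head_dim), dtype=np.float32)
    ref = mla_reference(q, x_kv, w_dkv, w_uk, w_uv, num_heads)
    # out = multi_head_latent_attention(...)  # real signature: see cpp_extensions.cc
    # np.testing.assert_allclose(out, ref, rtol=2e-2, atol=2e-2)
    assert ref.shape == (2, seq_len, num_heads, head_dim)
```

The loose tolerances in the commented-out `assert_allclose` are deliberate: fused attention kernels typically run in fp16/bf16 while the reference runs in fp32, so a bit-exact comparison is not realistic.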

Branch: task/053-multi-head-latent-attention-test
