@divakar-amd
This fix resolves a `triton.runtime.errors.OutOfResources` error on AMD GPUs (MI300).

Here's the error log without this fix:

  File "/Projects/VLLM_DIR/vllm/vllm/v1/worker/gpu_model_runner.py", line 2934, in sample_tokens
    apply_grammar_bitmask(
  File "/Projects/VLLM_DIR/vllm/vllm/v1/structured_output/utils.py", line 126, in apply_grammar_bitmask
    xgr.apply_token_bitmask_inplace(logits, grammar_bitmask, indices=index_tensor)
  File "/usr/local/lib/python3.12/dist-packages/xgrammar/matcher.py", line 147, in apply_token_bitmask_inplace
    apply_token_bitmask_inplace_triton(logits, bitmask, vocab_size, indices)
  File "/usr/local/lib/python3.12/dist-packages/xgrammar/kernels/apply_token_bitmask_inplace_triton.py", line 106, in apply_token_bitmask_inplace_triton
    apply_token_bitmask_inplace_kernel[grid](
  File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 393, in <lambda>
    return lambda *args, **kwargs: self.run(grid=grid, warmup=False, *args, **kwargs)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/runtime/jit.py", line 623, in run
    kernel.run(grid_0, grid_1, grid_2, stream, kernel.function, kernel.packed_metadata, launch_metadata,
    ^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/triton/compiler/compiler.py", line 467, in __getattribute__
    self._init_handles()
  File "/usr/local/lib/python3.12/dist-packages/triton/compiler/compiler.py", line 461, in _init_handles
    raise OutOfResources(self.metadata.num_warps * warp_size, self.n_max_threads, "threads")
triton.runtime.errors.OutOfResources: out of resource: threads, Required: 2048, Hardware limit: 1024. Reducing block sizes or `num_stages` may help.

Copilot AI review requested due to automatic review settings November 20, 2025 20:53

Copilot AI left a comment


Pull Request Overview

This PR fixes a `triton.runtime.errors.OutOfResources` error that occurs on AMD GPUs (specifically MI300) by setting the correct warp size for AMD's architecture. AMD GPUs use a warp (wavefront) size of 64, while NVIDIA GPUs use 32. The fix detects the GPU vendor at runtime and sets the appropriate warp size, which is then used to calculate the number of warps for the Triton kernel launch.

Key changes:

  • Added conditional logic to detect AMD GPUs via torch.version.hip
  • Set WARP_SIZE to 64 for AMD GPUs and 32 for NVIDIA GPUs
  • Updated num_warps calculation to use the dynamically determined WARP_SIZE
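The logic described above can be sketched as follows. This is a minimal illustration, not the PR's actual diff: the function names (`pick_warp_size`, `num_warps_for_block`) are made up here, and where the real fix checks `torch.version.hip` directly, this sketch takes the ROCm flag as a parameter so it runs without a GPU or a ROCm build of PyTorch.

```python
def pick_warp_size(is_hip: bool) -> int:
    """Return the hardware warp (wavefront) size.

    The actual fix checks `torch.version.hip is not None` (non-None only
    on ROCm builds of PyTorch); here the flag is passed in explicitly.
    """
    return 64 if is_hip else 32


def num_warps_for_block(block_size: int, is_hip: bool) -> int:
    """Warps to request for a Triton launch of `block_size` threads per block."""
    warp_size = pick_warp_size(is_hip)
    return max(block_size // warp_size, 1)
```

With a 1024-thread block, this requests 16 warps on AMD (16 × 64 = 1024 hardware threads) and 32 on NVIDIA (32 × 32 = 1024). Hard-coding the NVIDIA value of 32 into the divisor instead yields 32 warps, which on MI300 launches 32 × 64 = 2048 hardware threads, matching the "Required: 2048, Hardware limit: 1024" figures in the error log above.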


@divakar-amd
Copy link
Author

@Ubospica @mgorny Looking for a review.

Signed-off-by: Divakar Verma <[email protected]>