[New feature] integrate causal_conv1d Triton kernel for Ascend NPU by ys2025-AI · Pull Request #228 · modelscope/twinkle

ys2025-AI · 2026-06-18T02:09:43Z

PR type

Bug Fix
New Feature
Document Updates
More Models or Datasets Support

PR information

Integrate the causal_conv1d Triton kernel for Ascend NPU.

Experiment results

Model: Qwen3.5-4B
Hardware: Atlas 900 A3 (2 x NPU)
Dataset: GSM8K_ZH
Finetuning type: LoRA
Software: CANN 9.0.0 + torch/torch_npu 2.9.0 + MindSpeed 0.12.1 + triton-ascend 3.2.1 + transformers 5.9

Metric	Baseline	causal_conv1d optimized	Difference
Speedup	1.0x	1.12x	-
Average loss	0.6449	0.6456	0.0007

related: https://gitcode.com/Ascend/MindSpeed-Ops

Add self-contained causal_conv1d kernel module (no mindspeed_ops dependency) with full Triton forward/backward implementations adapted from MindSpeed-Ops. Patch monkey_patch_npu to bind npu_causal_conv1d_fn on NPU-patched modules, remove torch fallback in linear_attention_sp, and add NPU-aware causal_conv1d wrapper in gdn_padding_free (no transpose needed, [B,T,D] native format).
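
For readers unfamiliar with the operation, here is a minimal pure-Python sketch of a causal depthwise 1D convolution over a single [T, D] sequence, i.e. one batch element of the [B,T,D] layout mentioned above. It is not the Triton kernel from this PR. The function name `causal_conv1d_ref`, the [D][W] weight layout, and the tap ordering (the last tap multiplies the current step) are assumptions chosen for illustration, and the activation that such kernels often fuse in is left out.

```python
from typing import List, Optional


def causal_conv1d_ref(
    x: List[List[float]],
    weight: List[List[float]],
    bias: Optional[List[float]] = None,
) -> List[List[float]]:
    """Hypothetical reference: causal depthwise conv over a [T][D] sequence.

    weight is [D][W]. Tap W-1 multiplies the current step, so the output at
    step t only depends on x[t-W+1 .. t], zero-padded on the left.
    """
    T = len(x)
    D = len(weight)
    W = len(weight[0])
    out = [[0.0] * D for _ in range(T)]
    for t in range(T):
        for d in range(D):
            acc = bias[d] if bias is not None else 0.0
            for k in range(W):
                src = t - (W - 1) + k  # input step read by tap k
                if src >= 0:  # steps before the sequence start count as zero
                    acc += weight[d][k] * x[src][d]
            out[t][d] = acc
    return out


# Tiny example: T=3 steps, D=2 channels, kernel width W=2.
x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
w = [[0.5, 1.0], [0.0, 2.0]]
y = causal_conv1d_ref(x, w)
```

A reference like this can serve as a slow oracle when comparing a fused kernel's forward output on small inputs, since each channel is convolved independently and no future step is ever read.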

gemini-code-assist

Code Review

This pull request introduces a self-contained, NPU-accelerated causal_conv1d Triton kernel module to support Ascend NPUs, integrating it into the monkey patching, sequence parallel, and padding-free GDN mechanisms. The code review identified several critical issues and bugs: a missing HAS_WEIGHT guard in the backward kernel when storing dw, a shape mismatch and argument-dropping bug in the NPU wrapper within gdn_padding_free.py, compatibility issues with smaller feature dimensions due to a hardcoded block size (BD = 256), and an ignored bias parameter in the forward update kernel. Additionally, an optimization was suggested for _prepare_chunk_indices to avoid host-device synchronization.
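
The hardcoded block size concern can be illustrated with a small sketch. One common way to avoid a fixed `BD = 256` is to derive the block size from the feature dimension, rounding up to a power of two and capping it at a maximum. The helpers `next_power_of_2`, `pick_block_d`, and `num_blocks` below are hypothetical names for illustration. They are not taken from the PR, and this is not necessarily how the issue was resolved there.

```python
def next_power_of_2(n: int) -> int:
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def pick_block_d(feature_dim: int, max_block: int = 256) -> int:
    """Hypothetical helper: block size along D, capped at max_block.

    When the block is larger than feature_dim, the kernel still needs a
    mask on the channel offsets so out-of-range lanes are not loaded or stored.
    """
    return min(max_block, next_power_of_2(feature_dim))


def num_blocks(feature_dim: int, block_d: int) -> int:
    """Ceiling division: how many blocks are needed to cover feature_dim."""
    return (feature_dim + block_d - 1) // block_d


small = pick_block_d(96)     # rounds 96 up to 128
large = pick_block_d(1024)   # capped at 256
```

With this scheme a 96-channel input is covered by one masked block of 128 lanes, while a 1024-channel input is split into four blocks of 256.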

opencode and others added 6 commits June 15, 2026 21:49

Update causal_conv1d.py

eb4b3c5

Update monkey_patch_npu.py

4290d2a

Update causal_conv1d.py

71bc28d

fix(causal_conv1d): remove arch35 check, fix BD=32 in backward, add d…

ad9007d

…o_not_specialize

style: format code with formatter

f4c6b60

gemini-code-assist Bot reviewed Jun 18, 2026

Comment thread src/twinkle/kernel/causal_conv1d.py Outdated

Comment thread src/twinkle/patch/gdn_padding_free.py Outdated

Comment thread src/twinkle/kernel/causal_conv1d.py Outdated

Comment thread src/twinkle/kernel/causal_conv1d.py

Comment thread src/twinkle/kernel/causal_conv1d.py Outdated

ys2025-AI added 6 commits June 22, 2026 11:00

Merge branch 'modelscope:main' into main

c2e9a3a

Update causal_conv1d.py

79396d8

Update gdn_padding_free.py

1536147

Update monkey_patch_npu.py

15410ef

Update causal_conv1d.py

deadd01

Merge branch 'modelscope:main' into main

5af0fce

tpx818 approved these changes Jun 23, 2026

tpx818 merged commit fbabfa9 into modelscope:main Jun 23, 2026
1 of 3 checks passed

tpx818 merged 12 commits into modelscope:main from ys2025-AI:main

2 participants
