Pull requests: huawei-csl/pto-kernels
Prefix sum (scan) implementation - multicore
#135
opened Apr 24, 2026 by
castigli
Collaborator
[Feat] Implement doubly-stochastic Sinkhorn normalization kernel
#134
opened Apr 21, 2026 by
Mocchibird
Contributor
•
Draft
Complete chunkwise GatedDeltaNet
#91
opened Apr 7, 2026 by
learning-chip
Collaborator
7 tasks done
Chunkwise gated linear attention reaching 60~80 TFLOP/s, with step-by-step optimization records
#88
opened Apr 5, 2026 by
learning-chip
Collaborator
9 of 17 tasks
compare host vs device-side chunk metadata computation
#84
opened Apr 1, 2026 by
learning-chip
Collaborator
•
Draft
Adds bfloat16 support on tri inv rec unroll
Under Discussion
The issue/pull request is still under discussion
c2v sync example using TSYNC or TPUSH/TPOP
#65
opened Mar 23, 2026 by
learning-chip
Collaborator
2 tasks
Code hygiene remove membase define
Under Discussion
The issue/pull request is still under discussion