For Triton memory operations (e.g. `tl.load` and `tl.store`) on AMD GPUs, `buffer_load` should be used to optimize performance whenever the address offset fits in 32 bits. This milestone requires triton-viz to detect such patterns and report the debugging information needed to fix them.
No due date • 0/5 issues closed
- No due date • 11/27 issues closed
- No due date • 5/5 issues closed
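
The 32-bit offset condition in the milestone description can be illustrated with a minimal sketch. This is not triton-viz code: the function name `offsets_fit_in_32_bits`, its parameters, and the choice between a signed and an unsigned range are all assumptions made for illustration. It only shows the kind of range check a detector might apply to the offsets recorded for a `tl.load` or `tl.store`.

```python
from typing import Iterable

INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1
UINT32_MAX = 2**32 - 1


def offsets_fit_in_32_bits(
    element_offsets: Iterable[int],
    element_size: int = 1,
    signed: bool = True,
) -> bool:
    """Hypothetical helper: check that every byte offset fits in 32 bits.

    element_offsets: offsets in elements relative to the base pointer.
    element_size:    bytes per element (assumption: offsets scale linearly).
    signed:          whether to test the signed or the unsigned 32-bit range
                     (which one the hardware path needs is an assumption
                     here, so it is left as a parameter).
    """
    lo, hi = (INT32_MIN, INT32_MAX) if signed else (0, UINT32_MAX)
    # Convert each element offset to a byte offset before the range check.
    return all(lo <= off * element_size <= hi for off in element_offsets)


# Example: 1024 float32 elements are far inside the 32-bit range.
small_access_ok = offsets_fit_in_32_bits(range(1024), element_size=4)

# Example: element offset 2**29 with 4-byte elements is a byte offset of
# 2**31, which overflows the signed range but fits the unsigned one.
large_signed_ok = offsets_fit_in_32_bits([2**29], element_size=4)
large_unsigned_ok = offsets_fit_in_32_bits([2**29], element_size=4, signed=False)
```

A real detector would likely also need the tensor's base address and dtype from the traced operation. Those inputs are left out of this sketch because the milestone description does not specify them.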