Activity

Enable autotuning and bf16 accumulation for SYCL CUTLASS (#4)

Pull request merge
sommerlukas pushed 1 commit to sycl-develop • 02d05b4…bc53ae6 • on May 16

Make use of the SYCL queue passed to CUTLASS (#3)

Pull request merge
sommerlukas pushed 1 commit to sycl-develop • e28cd77…02d05b4 • on May 5

Update CUTLASS submodule

Force push
sommerlukas force pushed to sycl-develop • a47e05c…e28cd77 • on Apr 25

Initial support of SYCL CUTLASS for XPU backend through Inductor (#2)

sommerlukas created rebase-sycl-20250425 • 05bf838 • on Apr 25

Refactor to use torch.accelerator.device_index instead of torch.cuda.…

sommerlukas pushed 34 commits to main • d743a7b…ad81eeb • on Apr 25

[invoke_subgraph] Cache fake tensor if no unbacked symint in the outp…

sommerlukas pushed 765 commits to main • 6fa1b17…d743a7b • on Apr 24

Updated branch

Pull request merge • Missing commit
sommerlukas pushed 0 commits to sycl-develop • 730119f…a47e05c • on Apr 17

ROCm: Add trailing comma for consistency in gfx architecture list (py…

sommerlukas pushed 494 commits to main • 00a2c68…6fa1b17 • on Apr 3

Implement SYCL code cache (#1)

Pull request merge
sommerlukas pushed 1 commit to sycl-develop • 00a2c68…730119f • on Mar 20

Fix a typo "trochrec" to "torchrec" (pytorch#149542)

sommerlukas created sycl-develop • 00a2c68 • on Mar 20

Fix a typo "trochrec" to "torchrec" (pytorch#149542)

sommerlukas pushed 338 commits to main • 9ad64ce…00a2c68 • on Mar 20