Add SYCL Kernels for XPU backend #1679

Open
wants to merge 38 commits into
base: main

@xiaolil1 xiaolil1 commented Jun 15, 2025

This is the pull request for the SYCL Kernels targeting the XPU backend.

  • It implements the "dequantize_blockwise", "dequantize_4bit", and fused "dequant & gemv_4bit" kernels.
  • The supported low-precision quantization datatypes are NF4, FP4, and general 8-bit.
  • This PR aims to eliminate the dependency on IPEX and improve performance.
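
For readers unfamiliar with these kernels, the arithmetic behind blockwise 4-bit dequantization and the fused dequantize-plus-GEMV can be sketched in plain Python. This is only an illustrative reference under stated assumptions, not the SYCL code in this PR: the function names mirror the kernel names in the list above, but the signatures, the high-nibble-first packing order, the row-major weight layout, and the rounded NF4 codebook values are all assumptions made for the example.

```python
from typing import List, Sequence

# Assumption: 16-entry NF4 codebook, rounded to 4 decimals for illustration.
NF4_CODEBOOK = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]


def dequantize_4bit(packed: Sequence[int], absmax: Sequence[float],
                    blocksize: int,
                    codebook: Sequence[float] = NF4_CODEBOOK) -> List[float]:
    """Unpack two 4-bit codes per byte and scale each by its block's absmax."""
    out: List[float] = []
    for byte in packed:
        # Assumption: the high nibble holds the earlier element.
        for code in (byte >> 4, byte & 0x0F):
            block = len(out) // blocksize
            out.append(codebook[code] * absmax[block])
    return out


def gemv_4bit(x: Sequence[float], packed: Sequence[int],
              absmax: Sequence[float], blocksize: int,
              rows: int, cols: int,
              codebook: Sequence[float] = NF4_CODEBOOK) -> List[float]:
    """Compute y = W @ x, dequantizing each weight on the fly.

    W is a rows x cols matrix stored row-major as packed 4-bit codes
    (assumption). No full-precision copy of W is materialized, which is
    the point of fusing dequantization with the matrix-vector product.
    """
    y = [0.0] * rows
    for r in range(rows):
        acc = 0.0
        for c in range(cols):
            flat = r * cols + c
            byte = packed[flat // 2]
            code = byte >> 4 if flat % 2 == 0 else byte & 0x0F
            acc += codebook[code] * absmax[flat // blocksize] * x[c]
        y[r] = acc
    return y


if __name__ == "__main__":
    packed = [0xF0, 0x7F]   # codes 15, 0, 7, 15
    absmax = [2.0]          # one block covering all four values
    print(dequantize_4bit(packed, absmax, blocksize=4))
    print(gemv_4bit([1.0, 3.0], packed, absmax, blocksize=4, rows=2, cols=2))
```

A real kernel would assign blocks to work-groups and process many elements per work-item. The sketch only shows that the fused path produces the same result as dequantizing first and multiplying afterwards.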

@matthewdouglas matthewdouglas added Low Priority (will be worked on after all priority issues) Intel labels Jun 17, 2025
@matthewdouglas matthewdouglas self-assigned this Jun 17, 2025
@matthewdouglas matthewdouglas self-requested a review June 17, 2025 16:19
@matthewdouglas matthewdouglas added this to the v0.48.0 milestone Jun 17, 2025
@fengyuan14

Can we use a more accurate title for the commit? Otherwise, reviewers may be confused about whether all SYCL kernels are included in the PR.

@jiqing-feng
Contributor

jiqing-feng commented Jun 24, 2025

Hi @matthewdouglas. The PR is ready for review. The SYCL kernels achieve a 0-150% speed-up over Triton on 4-bit models. Could you do a first-round review? Please let me know if you have any concerns. Thanks!

@xiaolil1 xiaolil1 marked this pull request as ready for review June 25, 2025 01:51
@xiaolil1 xiaolil1 changed the title Add SYCL Kernels for XPU backend Add SYCL Kernels for QLoRA XPU backend Jun 27, 2025
@xiaolil1 xiaolil1 changed the title Add SYCL Kernels for QLoRA XPU backend Add SYCL Kernels for XPU backend Jun 27, 2025
Labels
Intel Low Priority (will be worked on after all priority issues)
4 participants