Print SYCL build options in building #1779

Open · wants to merge 3 commits into main
Conversation

xytintel
Contributor

Print the SYCL options used during the build:

```cmake
message(STATUS "**** SYCL_EXECUTABLE **** ${SYCL_EXECUTABLE}")
message(STATUS "**** SYCL_FLAGS **** ${SYCL_FLAGS}")
message(STATUS "**** SYCL_OFFLINE_COMPILER_FLAGS **** ${SYCL_OFFLINE_COMPILER_FLAGS}")
```
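These `message(STATUS ...)` calls print at CMake configure time. As a sketch (not part of this PR), the same output could be made opt-in by gating it behind a cache option; `SYCL_PRINT_BUILD_OPTIONS` is an assumed name for illustration:

```cmake
# Hypothetical sketch: gate the SYCL diagnostics behind a cache option.
# SYCL_PRINT_BUILD_OPTIONS is an assumed name, not part of the PR.
option(SYCL_PRINT_BUILD_OPTIONS "Print resolved SYCL compiler settings at configure time" OFF)

if(SYCL_PRINT_BUILD_OPTIONS)
  message(STATUS "**** SYCL_EXECUTABLE **** ${SYCL_EXECUTABLE}")
  message(STATUS "**** SYCL_FLAGS **** ${SYCL_FLAGS}")
  message(STATUS "**** SYCL_OFFLINE_COMPILER_FLAGS **** ${SYCL_OFFLINE_COMPILER_FLAGS}")
endif()
```

With this shape, the output would be enabled per-build via `cmake -DSYCL_PRINT_BUILD_OPTIONS=ON`, keeping default configure logs quiet.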

@xytintel xytintel requested a review from chunhuanMeng June 25, 2025 03:47
@pytorchxpubot

@sys_pytorchxpubot triage result for run 15866674343. Triage bot UT analysis result, for reference only; note that each unique error message is reported only once:

1. third_party.torch-xpu-ops.test.xpu.test_linalg_xpu.TestLinalgXPU test_det_xpu_complex128 failed with error message:
   AssertionError: Scalars are not close!

Triage bot response:

```json
{
  "similar_issue_id": "570",
  "similar_issue_state": "closed",
  "issue_owner": "fengyuan14",
  "issue_description": "The reporter, ZzEeKkAa, encountered a NotImplementedError while running Intel's Triton unit tests with upstream PyTorch. The error message indicated that the operator 'aten::__lshift__.Scalar' is not implemented for the XPU device. The reporter was advised to either open a feature request or use a CPU fallback by setting the environment variable `PYTORCH_ENABLE_XPU_FALLBACK=1`.",
  "root_causes": [
    "The shift operators were not initially supported in the XPU implementation plan, but were added later based on community feedback and requirements."
  ],
  "suggested_solutions": [
    "Update torch-xpu-ops to ensure all relevant operations, including those for complex numbers, are implemented and optimized for XPU.",
    "Adjust test thresholds to account for potential precision differences in XPU computations.",
    "Implement specific optimizations for complex number determinant calculations on XPU if they are missing."
  ]
}
```
2. third_party.torch-xpu-ops.test.xpu.test_linalg_xpu.TestLinalgXPU test_tensorsolve_xpu_complex128 failed with error message:
   AssertionError: Tensor-likes are not close!

Triage bot response:

```json
{
  "similar_issue_id": 1214,
  "similar_issue_state": "open",
  "issue_owner": "daisyden",
  "issue_description": "The test test_tensorsolve_xpu_complex128 failed with an AssertionError: Tensor-likes are not close! The error suggests a discrepancy in tensor values between CPU and XPU during the linalg operation, possibly due to numerical precision or kernel behavior differences.",
  "root_causes": [
    "Discrepancies in numerical computations between CPU and XPU implementations.",
    "Potential differences in kernel behavior affecting complex128 precision."
  ],
  "suggested_solutions": [
    "Enhance numerical precision checks in the linalg operations for XPU.",
    "Review and align kernel behavior to ensure consistency across CPU and XPU.",
    "Implement additional tests to validate tensor computations across different scenarios."
  ]
}
```
3. third_party.torch-xpu-ops.test.xpu.test_ops_gradients_xpu.TestBwdGradientsXPU test_fn_grad_linalg_eigvals_xpu_float64 failed with error message:
   Exception: Caused by sample input at index 0: SampleInput(input=Tensor[size=(5, 5), device="xpu:0", dtype=torch.float64], args=(), kwargs={}, broadcasts_input=False, name='')

Triage bot response:

```json
{
  "similar_issue_id": 264,
  "similar_issue_state": "closed",
  "issue_owner": "torch-xpu-ops maintainers",
  "issue_description": "The test `test_fn_grad_linalg_eigvals_xpu_float64` failed with an error involving a tensor of size (5,5) on the XPU device with dtype float64. The failure suggests an issue with the implementation of `linalg.eigvals` for float64 tensors on XPU, potentially related to dtype handling or missing kernel support.",
  "root_causes": [
    "Inadequate support for float64 dtype in `linalg.eigvals` on XPU.",
    "Potential missing or incorrect implementation of the XPU kernel for `linalg.eigvals` with float64 tensors."
  ],
  "suggested_solutions": [
    "Implement the `linalg.eigvals` function for float64 tensors on XPU.",
    "Ensure proper dtype handling and kernel support for float64 in XPU operations.",
    "Add comprehensive tests to verify the correctness of `linalg.eigvals` across various dtypes on XPU."
  ]
}
```
4. third_party.torch-xpu-ops.test.xpu.test_ops_xpu.TestCommonXPU test_numpy_ref_linalg_tensorsolve_xpu_complex128 failed with error message:
   AssertionError: Tensor-likes are not close! ; Exception: Caused by reference input at index 0: SampleInput(input=Tensor[size=(2, 3, 6), device="xpu:0", dtype=torch.complex128], args=TensorList[Tensor[size=(2, 3), device="xpu:0", dtype=torch.complex128]], kwargs={'dims': 'None'}, broadcasts_input=False, name='')

Triage bot response:

```json
{
  "similar_issue_id": 1214,
  "similar_issue_state": "open",
  "issue_owner": "daisyden",
  "issue_description": "The test `test_numpy_ref_linalg_tensorsolve_xpu_complex128` failed with an `AssertionError: Tensor-likes are not close!` error. This indicates a discrepancy between the test's expected result and the actual output, specifically for complex128 tensors. The failure may stem from differences in how `linalg_tensorsolve` operates on XPU compared to expected behavior, potentially related to precision or implementation specifics.",
  "root_causes": [
    "Discrepancies in `linalg_tensorsolve` operation between XPU and expected results, possibly due to differences in handling complex128 tensors.",
    "Potential alignment issues in the implementation of `linalg_tensorsolve` on XPU compared to other devices."
  ],
  "suggested_solutions": [
    "Review and adjust the implementation of `linalg_tensorsolve` on XPU to align with expected behavior, focusing on complex128 dtype handling.",
    "Add additional checks or adjust precision settings to ensure consistency across different devices for complex operations."
  ]
}
```

Contributor

@dvrogozh dvrogozh left a comment


I am not against this PR, but IMHO we just need a way to print the executed command lines in the log, preferably following the same set of options used when printing C/C++ command lines, i.e. VERBOSE=1 and the like.
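For reference, CMake already exposes the verbose behavior described above for C/C++ command lines; a SYCL-aware build could follow the same conventions. The commands below are standard CMake/Make usage, shown as an illustration of that convention rather than anything this PR implements:

```
# Makefile generators: print full compiler command lines during the build
make VERBOSE=1

# Generator-agnostic equivalent via the CMake build driver
cmake --build build --verbose

# Or make verbose output the default at configure time
cmake -DCMAKE_VERBOSE_MAKEFILE=ON -S . -B build
```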
