
Conversation

@yizhuoz004
Collaborator

No description provided.

@yizhuoz004 yizhuoz004 force-pushed the dev-yizhuoz-tp-fused-loc branch 2 times, most recently from 61cd9a6 to 239a3b1 Compare August 12, 2025 21:37
@yizhuoz004 yizhuoz004 marked this pull request as ready for review August 12, 2025 21:37
@yizhuoz004
Collaborator Author

@pranavm-nvidia Need your advice on this issue. Previously, the stderr looked like this:

ITensor::getDimensions: Error Code 4: Shape Error (reshape changes volume. Reshaping [2,2] to [3,3]. In operator() at optimizer/common/shape/shapeContext.cpp:4727)
loc("%t2,%t3;;<out>;;%t4"): error: ranks of MLIR Tensor is 2 while TRT ITensor has rank -1
loc("%t2,%t3;;<out>;;%t4"): error: 'tensorrt.shuffle' op failed to encode operation
loc("%t1,%t0;;<out>;;%t2"): error: failed to encode block
loc("%t2,%t3;;<out>;;%t4"): error: failed to translate function 'tensorrt_cluster' to a TensorRT engine

But now that we have switched to fused locations, the stderr becomes:

ITensor::getDimensions: Error Code 4: Shape Error (reshape changes volume. Reshaping [2,2] to [3,3]. In operator() at optimizer/common/shape/shapeContext.cpp:4727)
loc(fused<"<TraceOp: %t4 = reshape(%t2 : tensor<?x?xf32:gpu:0>, %t3 : tensor<2xi32:gpu:0>) : tensor<?x?xf32:gpu:0>, Stack Info:   --> /tripy/test_tril.py:7 in <module>()  >, ">["%t2,%t3;;<out>;;%t4"]): error: ranks of MLIR Tensor is 2 while TRT ITensor has rank -1
loc(fused<"<TraceOp: %t4 = reshape(%t2 : tensor<?x?xf32:gpu:0>, %t3 : tensor<2xi32:gpu:0>) : tensor<?x?xf32:gpu:0>, Stack Info:   --> /tripy/test_tril.py:7 in <module>()  >, ">["%t2,%t3;;<out>;;%t4"]): error: 'tensorrt.shuffle' op failed to encode operation
loc(fused<"<TraceOp: %t2 = broadcast(%t1 : tensor<f32:gpu:0>, %t0 : tensor<2xi32:gpu:0>) : tensor<?x?xf32:gpu:0>, Stack Info:   --> /tripy/nvtripy/frontend/ops/ones.py:53 in ones() --> /tripy/test_tril.py:6 in <module>()  >, ">["%t1,%t0;;<out>;;%t2"]): error: failed to encode block
loc(fused<"<TraceOp: %t4 = reshape(%t2 : tensor<?x?xf32:gpu:0>, %t3 : tensor<2xi32:gpu:0>) : tensor<?x?xf32:gpu:0>, Stack Info:   --> /tripy/test_tril.py:7 in <module>()  >, ">["%t2,%t3;;<out>;;%t4"]): error: failed to translate function 'tensorrt_cluster' to a TensorRT engine

We can extract the original ["%t2,%t3;;<out>;;%t4"] part from the new stderr, or do you think there's a better approach now, given that there's already some stack info in the error message?

@pranavm-nvidia
Collaborator

@yizhuoz004 I think you can just update the regexes here: https://github.com/NVIDIA/TensorRT-Incubator/blob/main/tripy/nvtripy/backend/mlir/utils.py#L132. We probably don't want to include the stack info in this part of the message - we should keep our current logic of swapping out the location information with the tensor name.
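
For illustration, here's a minimal sketch of the kind of regex that could match the new fused-location format and recover the bracketed location string (the FUSED_LOC_PATTERN name and extract_location helper are made up for this example, not the existing code in utils.py):

```python
import re
from typing import Optional

# Illustrative only -- not the existing pattern in nvtripy/backend/mlir/utils.py.
# Pulls the original quoted location string (e.g. %t2,%t3;;<out>;;%t4) out of a
# fused-location diagnostic line so it can still be swapped for the tensor name.
FUSED_LOC_PATTERN = re.compile(r'loc\(fused<.*>\["([^"]+)"\]\)')


def extract_location(stderr_line: str) -> Optional[str]:
    """Return the quoted location string from a fused-loc diagnostic, if present."""
    match = FUSED_LOC_PATTERN.search(stderr_line)
    return match.group(1) if match else None


line = (
    'loc(fused<"<TraceOp: ...>, ">["%t2,%t3;;<out>;;%t4"]): '
    "error: 'tensorrt.shuffle' op failed to encode operation"
)
print(extract_location(line))  # %t2,%t3;;<out>;;%t4
```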

@yizhuoz004 yizhuoz004 force-pushed the dev-yizhuoz-tp-fused-loc branch from 239a3b1 to 528194f Compare August 13, 2025 19:21
@yizhuoz004 yizhuoz004 force-pushed the dev-yizhuoz-tp-fused-loc branch from 09c6886 to 5fff755 Compare August 14, 2025 21:11
@yizhuoz004 yizhuoz004 force-pushed the dev-yizhuoz-tp-fused-loc branch from 5fff755 to 5ff298e Compare August 14, 2025 22:50
@yizhuoz004 yizhuoz004 merged commit e25f408 into main Aug 14, 2025
1 of 2 checks passed
@yizhuoz004 yizhuoz004 deleted the dev-yizhuoz-tp-fused-loc branch August 14, 2025 23:16