[SPARK-54776][SQL] Improved the logs message regarding lambda function with SQL UDF #53542
allisonwang-db
left a comment
Thanks for improving the error message! cc @cloud-fan
    throw new AnalysisException(
      errorClass = "UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF",
      messageParameters = Map(
        "funcName" -> ("\"" + function.name.unquotedString + "(" + formattedInputs + ")\"")))
    }
Can we add this in UserDefinedFunctionErrors?
As @cloud-fan mentioned, we don't include the function inputs in the error message, so I have removed them.
    case v: NamedLambdaVariable => "lambda " + v.name
    case v: UnresolvedNamedLambdaVariable => "lambda " + v.name
    case e => e.sql
    }.mkString(", ")
We usually don't include the function inputs in the error message; the function name alone is sufficient.
Thanks for the response. I have made the required change in the code to remove the function inputs from the error message. Please review it.
    |RETURNS STRING
    |RETURN lower(s)
    |""".stripMargin)
    val exception = intercept[AnalysisException] {
Let's use checkError to test errors.
I have made the changes and now use checkError to test the error.
…functions removed the input parameter
Thanks, merging to master!
…n with SQL UDF
### What changes were proposed in this pull request?
**Changes made:**
Added new error condition in error-conditions.json:
UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF - A clear error message for SQL UDFs used in lambda functions.
### Why are the changes needed?
Currently, when a SQL UDF is used inside a higher-order function like transform, the error message is confusing:
```
CREATE FUNCTION lower_udf(s STRING) RETURNS STRING RETURN lower(s);
SELECT transform(array('A', 'B'), x -> lower_udf(x));
```
**Before (confusing error):**
[MISSING_ATTRIBUTES.RESOLVED_ATTRIBUTE_MISSING_FROM_INPUT]
Resolved attribute(s) "x" missing from in operator !Project [cast(lambda x#20395 as string) AS s#20397].
SQLSTATE: XX000
This error doesn't explain why the attribute is missing or what the user should do.
**After (clear error):**
[UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF] The feature is not supported: Lambda function with SQL UDF "spark_catalog.default.lower_udf(lambda x)" in a higher order function. SQLSTATE: 0A000
This is consistent with the existing error message for Python UDFs in the same scenario (UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_PYTHON_UDF).
### Does this PR introduce _any_ user-facing change?
Yes. Users will now see a clearer, more actionable error message when attempting to use a SQL UDF inside a higher-order function's lambda expression.
### How was this patch tested?
**Test 1:**
Added a new test case "SQL UDF in higher-order function should fail with clear error message" in SQLFunctionSuite.scala that:
- Creates a SQL UDF
- Attempts to use it in a transform higher-order function
- Verifies the error condition is UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF
- Verifies the error message contains the function name and lambda x
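
The steps of Test 1 can be sketched roughly as follows. This is not the test that was merged: the suite name and test title are hypothetical, the sketch assumes Spark's own test utilities (`QueryTest`, `SharedSparkSession`) are on the classpath, and it only checks that the message mentions the new error condition. Per the review discussion, the merged test uses `checkError`, whose exact expected parameters are not reproduced here.

```scala
import org.apache.spark.sql.{AnalysisException, QueryTest}
import org.apache.spark.sql.test.SharedSparkSession

// Hypothetical suite name; the real test lives in SQLFunctionSuite.scala.
class LambdaWithSqlUdfSketchSuite extends QueryTest with SharedSparkSession {

  test("sketch: SQL UDF inside a higher-order function lambda is rejected") {
    try {
      // Step 1: create a SQL UDF.
      sql("CREATE OR REPLACE FUNCTION lower_udf(s STRING) RETURNS STRING RETURN lower(s)")

      // Step 2: use it inside the lambda of `transform`; analysis should fail.
      val e = intercept[AnalysisException] {
        sql("SELECT transform(array('A', 'B'), x -> lower_udf(x))")
      }

      // Step 3: check that the new error condition is reported.
      assert(e.getMessage.contains("UNSUPPORTED_FEATURE.LAMBDA_FUNCTION_WITH_SQL_UDF"))
    } finally {
      // Clean up so that other tests are not affected.
      sql("DROP FUNCTION IF EXISTS lower_udf")
    }
  }
}
```

Matching on the message text is looser than `checkError`, which compares the error condition and its parameters exactly; the looser check is used here only because the exact parameter values are not shown in this conversation.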
**Test 2:**
Manual testing
spark.sql("CREATE OR REPLACE FUNCTION test_lower_udf(s STRING) RETURNS STRING RETURN lower(s)")
spark.sql("SELECT transform(array('A', 'B'), x -> test_lower_udf(x))").show()
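
For anyone who wants to repeat the manual check outside `spark-shell`, the two statements can be wrapped in a small standalone program. This is a minimal sketch under assumptions: the object name, the helper `analysisError`, and the local-mode session settings are illustrative and not part of the PR, and it assumes a Spark build that supports SQL UDFs.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

// Hypothetical helper object, not part of the PR.
object LambdaWithSqlUdfCheck {

  /** Runs `query` and returns the analysis error message if analysis fails. */
  def analysisError(spark: SparkSession, query: String): Option[String] =
    try {
      spark.sql(query).collect()
      None
    } catch {
      case e: AnalysisException => Some(e.getMessage)
    }

  def main(args: Array[String]): Unit = {
    // Local single-threaded session; these settings are an assumption for the sketch.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("lambda-with-sql-udf-check")
      .getOrCreate()
    try {
      spark.sql(
        "CREATE OR REPLACE FUNCTION test_lower_udf(s STRING) RETURNS STRING RETURN lower(s)")
      val error = analysisError(
        spark, "SELECT transform(array('A', 'B'), x -> test_lower_udf(x))")
      // Prints whatever error the running Spark version reports.
      println(error.getOrElse("query unexpectedly succeeded"))
    } finally {
      spark.sql("DROP FUNCTION IF EXISTS test_lower_udf")
      spark.stop()
    }
  }
}
```

On a build that includes this change, the printed message should start with the new error condition; on an older build it should show the MISSING_ATTRIBUTES error quoted above.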
### Was this patch authored or co-authored using generative AI tooling?
No
Closes apache#53542 from Shubhambhusate/fix/LAMBDA_FUNCTION_WITH_SQL_UDF.
Authored-by: Shubhambhusate <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>