Skip to content

fix: apply recursive CTE column-list aliases to the static term#23098

Merged
kosiew merged 3 commits into
apache:mainfrom
tomsanbear:recursive-cte-column-list-alias
Jun 30, 2026
Merged

fix: apply recursive CTE column-list aliases to the static term#23098
kosiew merged 3 commits into
apache:mainfrom
tomsanbear:recursive-cte-column-list-alias

Conversation

@tomsanbear

Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

WITH RECURSIVE t(n) AS (...) failed to plan because the CTE's declared column-list names (the t(n) part) were never applied to the recursive working relation. They were applied (via apply_table_alias) only after the whole CTE plan was built, but the working table is derived from the static term's schema before that — so the self-reference couldn't resolve the declared names and planning failed with Schema error: No field named n. Valid fields are t."Int64(1)".. PostgreSQL and DuckDB accept the query; aliasing inside the static SELECT (SELECT 1 AS n) was the only workaround.

What changes are included in this PR?

Apply the column-list aliases to the static term inside recursive_cte(), before the work table is created, so the working relation and the self-reference carry the declared names. The caller now applies only the relation-name alias on the recursive path (the columns are already applied), avoiding a redundant projection on top of the RecursiveQuery node. The non-UNION fallback applies the aliases directly; non-recursive CTEs are unchanged. A column/alias-count mismatch is now reported at the static term — a clearer error than the previous "No field named …".

Are these changes tested?

Yes, added cte.slt cases for single- and multi-column column-list recursive CTEs (asserting the recursion produces the expected rows), UNION (DISTINCT), the arity-mismatch error, and an EXPLAIN locking the plan shape (no extra projection over RecursiveQuery).

Are there any user-facing changes?

WITH RECURSIVE t(n) AS (...) and multi-column column lists now plan and execute correctly, matching PostgreSQL/DuckDB. No API changes.

Note: this pull request was created together with AI tools (claude code), the full diff was reviewed by myself in full prior to submission

`WITH RECURSIVE t(n) AS (SELECT 1 UNION ALL SELECT n + 1 FROM t WHERE n < 10)`
failed to plan with `Schema error: No field named n. Valid fields are
t."Int64(1)".`. The CTE's declared column-list names were applied (via
`apply_table_alias`) only after the whole CTE plan was built, but the recursive
working relation is derived from the schema of the static term before that, so
the self-reference could never resolve the declared names.

Apply the column-list aliases to the static term inside `recursive_cte`, before
the work table is created, so the working relation and the self-reference expose
the declared names. The relation-name alias is still added by the caller; on the
recursive path the column re-alias is skipped to avoid a redundant projection on
top of the `RecursiveQuery` node. The non-UNION fallback applies the aliases
directly. Non-recursive CTEs are unchanged.

Adds sqllogictest coverage for single- and multi-column column-list recursive
CTEs, UNION (DISTINCT), and the column/alias-count mismatch error.
kosiew
kosiew previously approved these changes Jun 25, 2026

@kosiew kosiew left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomsanbear
Thanks for the update! The changes look good to me. I just have one small suggestion that could help strengthen the test coverage.

# recursive CTE with a column-list alias (e.g. `t(n)`): the declared names must be
# applied to the static term so the recursive self-reference can resolve them
query I rowsort
WITH RECURSIVE t(n) AS (

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement! One small suggestion: could we add a regression test with a quoted recursive CTE column-list alias, for example WITH RECURSIVE t("N") AS (...) SELECT "N" FROM t? I think it would be helpful to document that quoted and case-sensitive aliases are preserved in the recursive work table as well. This is not blocking since the implementation already goes through the existing alias normalization path.

@kosiew kosiew left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you investigate and resolve the CI error?

@kosiew kosiew dismissed their stale review June 25, 2026 11:45

CI error

@tomsanbear

Copy link
Copy Markdown
Contributor Author

@tomsanbear Thanks for the update! The changes look good to me. I just have one small suggestion that could help strengthen the test coverage.

For sure, thanks for the feedback! I'll have some time a bit later today and will push a fix to the branch

@github-actions github-actions Bot added the optimizer Optimizer rules label Jun 25, 2026
@tomsanbear tomsanbear requested a review from kosiew June 25, 2026 16:52
@tomsanbear

Copy link
Copy Markdown
Contributor Author

@kosiew i've updated the snapshot along with the new test you recommended 👍

@github-actions github-actions Bot removed the auto detected api change Auto detected API change label Jun 29, 2026

@kosiew kosiew left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomsanbear
thanks for the iteration

Looks 👍 to me

@tomsanbear

Copy link
Copy Markdown
Contributor Author

Sounds good, what remains to get this merged?

@kosiew

kosiew commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

what remains to get this merged?

as per docs/source/contributor-guide/index.md, we leave PRs for some "time" for additional feedback.

Adding this to the merge queue now.
Thanks for your contribution.

@kosiew kosiew added this pull request to the merge queue Jun 30, 2026
Merged via the queue into apache:main with commit 4dadbbd Jun 30, 2026
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

optimizer Optimizer rules sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Recursive CTE column-list alias t(n) is ignored, fails to plan with "No field named n"

2 participants