Skip to content

Fix node validation error for static pivot after extend+concatenate#4688

Open
rafaelbey wants to merge 4 commits intofinos:masterfrom
rafaelbey:bugfix-isolation-node-validation
Open

Fix node validation error for static pivot after extend+concatenate#4688
rafaelbey wants to merge 4 commits intofinos:masterfrom
rafaelbey:bugfix-isolation-node-validation

Conversation

@rafaelbey
Copy link
Copy Markdown
Contributor

Summary

Fix NODE VALIDATION ERROR: currentTreeNode / root / DOESN'T CONTAIN: root raised during plan generation when a static ->pivot() is the outermost operation in a chain that advanced the tree cursor, e.g. #TDS#->extend(window)->concatenate(#TDS#)->pivot(...).

Root cause

processStaticPivot builds a fresh TdsSelectSqlQuery with a brand-new RootJoinTreeNode and passes it to isolateNonTerminalGroupByQueryWithEmptyGroupingColumns. When $state.functionExpressionStack->size() > 1 is false (pivot is the outermost function), the else branch swapped select without remapping currentTreeNode into the new tree — leaving the cursor pointing at a node that no longer belonged to select.data. validateNode then fires because allNodes (computed from the new tree) does not contain the old cursor.

Fix

Mirror the remap pattern introduced by #4683 in isolateTdsSelect: run findNode against the old and new trees, falling back to the new root when the cursor was buried by the isolation boundary.

| let newData = $select.data->toOne();
  ^$query(select = $select,
          currentTreeNode = $query.currentTreeNode->map(ctn | $ctn->findNode($query.select.data->toOne(), $newData)->orElse($newData)));

Tests

  • New PCT test testStaticPivot_AfterExtendConcatenate in composition.pure reproducing the failure on DuckDB.
  • Stores that do not support pivot at the dialect / binding layer (H2, Java binding, ClickHouse, Databricks, MemSQL, Oracle, Postgres, Spanner, SqlServer, Trino, Deephaven) inherit the same expected-failure entry as the existing testStaticPivot_AfterConcatenate case.

Test plan

  • Test_Relational_DuckDB_RelationFunctions_PCT — 346/346 passing (new test passes empirically)
  • Test_Relational_H2_RelationFunctions_PCT — 346/346 (pivot not supported; expected-failure entry added)
  • Test_JAVA_RelationFunction_PCT — 346/346 (TDS translation unsupported; expected-failure entry added)
  • Cloud-store PCTs (Postgres, Snowflake, MemSQL, SqlServer, Oracle, Databricks, Spanner, Trino, ClickHouse, Deephaven) will run in CI

When a static `->pivot()` is the outermost operation in a chain that
advanced the tree cursor (e.g. `->extend(window)->concatenate()->pivot()`),
plan generation failed with "NODE VALIDATION ERROR: currentTreeNode /
root / DOESN'T CONTAIN: root".

`processStaticPivot` builds a TdsSelectSqlQuery with a fresh
RootJoinTreeNode and hands it to
`isolateNonTerminalGroupByQueryWithEmptyGroupingColumns`. When that
function's `functionExpressionStack > 1` guard is false (pivot is the
outermost function), the else branch swapped `select` without remapping
`currentTreeNode` into the new tree -- leaving the cursor pointing at a
node that no longer belongs to `select.data`.

The fix mirrors the remap pattern used by `isolateTdsSelect`: run
`findNode` against the old and new trees, falling back to the new root
when the cursor was buried by the isolation boundary.

New PCT test `testStaticPivot_AfterExtendConcatenate` reproduces the
failure on DuckDB. Stores that do not support pivot at the dialect /
binding layer inherit the same expected-failure entry as the existing
`testStaticPivot_AfterConcatenate` case.
@rafaelbey rafaelbey requested a review from a team as a code owner April 24, 2026 13:07
…tenate

Cloud PCT runs revealed that the new test trips different error paths
on the listed stores than the existing testStaticPivot_AfterConcatenate:

- MemSQL: test actually succeeds -- remove entry.
- Spanner, SqlServer: hits "pivot is not supported" (not the CTE path).
- Trino: hits "Column ''a__|__newval'' cannot be resolved".
- Deephaven: hits "function not supported yet: pivot_...".
Explain in router-and-pure-to-sql.md and expected-failures-howto.md
that CTE-related errors (`Common table expression not supported on DB
<X>`) only fire in PCT-relational for tests whose Pure expression
contains top-level `let` statements. The adapter wraps multi-statement
bodies in `eval`, which routes through `processFunctionDefinition` and
lifts each top-level `let` into a `CommonTableExpression`. Tests
without `let` skip this path and surface store-specific errors
directly, so expected-failure messages cannot be copied blindly between
sibling tests.
@rafaelbey rafaelbey force-pushed the bugfix-isolation-node-validation branch from 706a6e6 to 56a7963 Compare April 24, 2026 16:48
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 24, 2026

Test Results

  1 099 files   -      18    1 099 suites   - 18   3h 50m 26s ⏱️ - 1h 25m 49s
13 977 tests  -      97  13 797 ✔️  -      97  180 💤 ±0  0 ±0 
36 739 runs   - 2 951  36 559 ✔️  - 2 951  180 💤 ±0  0 ±0 

Results for commit 474d1e7. ± Comparison against base commit 7b71d58.

♻️ This comment has been updated with latest results.

@rafaelbey rafaelbey enabled auto-merge (squash) April 24, 2026 19:58
@rafaelbey
Copy link
Copy Markdown
Contributor Author

Context for currentTreeNode remap in isolateNonTerminalGroupByQueryWithEmptyGroupingColumns

The helper has multiple call sites (lines 6291, 6315, 6389, 6439, 6457, 6486, 6500, 6524). They split into two shapes:

  • Shape A — caller passes a select whose data shares nodes with query.select.data (column/groupBy edits over the same tree). The pre-fix ^$query(select=$select) was correct here because the cursor still points at a node that exists in the new tree.
  • Shape B — caller passes a select whose data is a freshly-built ^RootJoinTreeNode with no node-identity overlap (e.g. processStaticPivot at 6384–6389). The pre-fix line left the cursor pointing into the old tree even though select.data had been swapped wholesale.

validateNode (line 500–513) requires currentTreeNode ∈ select.data->getAllNodes(), so Shape B trips the NODE VALIDATION ERROR: currentTreeNode / root / DOESN'T CONTAIN: root we are fixing.

Three observable options for the remap fallback (when findNode returns empty):

Option When triggered validateNode (L500) Downstream consumers (e.g. L259 currentTreeNode->toOne(), L1039 findOneNode($newSelect.data, ...)) Net behavior
A. ->orElse($newData) (this PR) Shape B only — Shape A keeps the matched node Passes ($newData is the root of select.data) Cursor resolves to the new outer root; subsequent attach operations land on the wrapping select Validation green; downstream finds a node; semantics: "next op attaches at the wrapping subselect"
B. Keep original ctn instance Shape B only Fails — ctn is not in allNodes($newData); reproduces the original NODE VALIDATION ERROR n/a — validation fires first Same observable failure as before this PR
C. ->orElse([]) Shape B only Passes (validateNode short-circuits on empty) L259 ->toOne() raises on empty; L1039's isNotEmpty branch is skipped, but other call sites assume [1] Validation green; risk of toOne() failures elsewhere if the resulting SelectWithCursor is consumed by code paths that don't tolerate an empty cursor

Precedent. The same pattern as Option A was introduced in #4683 for isolateTdsSelect at line 3060–3062:

currentTreeNode = $query.currentTreeNode->map(ctn |
    $ctn->findNode($select.data->toOne(), $newNode)->orElse($newNode))

That PR established the convention "after a tree-replacing isolation step, the cursor lands at the new root if it cannot be remapped." The change in this PR applies the same convention to the second isolation helper for consistency.

Alternative architecture. A different shape of fix would push the cursor-update responsibility back to the call sites that build fresh trees (Shape B), e.g. have processStaticPivot set currentTreeNode = $pivotRoot on the input query before calling the helper. That would make the helper's else-branch the simple ^$query(select=$select) again. Trade-offs:

  • Pro: the helper would be a "no-op on the cursor" function; cursor responsibility lives next to the tree it builds.
  • Con: every Shape B call site (and any future ones) needs the explicit remap; if a contributor forgets, the failure is the same NODE VALIDATION ERROR rather than a localized error.

Happy to refactor in either direction — flagging the trade-off rather than asserting a preference.

Per PR review feedback, push the cursor-update responsibility to the
call site that owns the freshly-built tree (processStaticPivot) rather
than handling it defensively inside the shared
isolateNonTerminalGroupByQueryWithEmptyGroupingColumns helper.

processStaticPivot now binds the new pivot RootJoinTreeNode to a let,
threads it both into the new TdsSelectSqlQuery and into the input
SelectWithCursor's currentTreeNode, then calls the helper. The
helper's else branch reverts to the original `^$query(select=$select)`
form, scoping the change's blast radius to the static pivot path.
@rafaelbey rafaelbey force-pushed the bugfix-isolation-node-validation branch from 7ff6cdb to 474d1e7 Compare April 25, 2026 00:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant