Move extend clause after summarize for calculated measures #48
This ensures that extend operations (which may reference summarized columns) are executed after the summarize clause in the generated KQL query.
Pull request overview
This PR fixes the ordering of KQL query clauses by moving the extend clause after the summarize clause. Calculated measures that reference columns created by summarize need this ordering, because in KQL an extend can only reference those columns once summarize has produced them.
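To make the ordering concrete, here is a minimal Python sketch that assembles a KQL pipeline with summarize ahead of extend. The helper `build_pipeline` and the `Sales`, `Revenue`, and `Region` names are hypothetical and chosen only for illustration; this is not the code in `visit_select`.

```python
def build_pipeline(table, aggregations, group_by, calculated):
    """Assemble a KQL query string with summarize placed before extend.

    Hypothetical helper for illustration only; not the dialect's visit_select.
    """
    summarize = ", ".join(f"{alias} = {expr}" for alias, expr in aggregations.items())
    extend = ", ".join(f"{alias} = {expr}" for alias, expr in calculated.items())
    clauses = [
        table,
        f"| summarize {summarize} by {', '.join(group_by)}",
        f"| extend {extend}",
    ]
    return "\n".join(clauses)


query = build_pipeline(
    table="Sales",
    aggregations={"TotalRevenue": "sum(Revenue)", "OrderCount": "count()"},
    group_by=["Region"],
    # AvgOrderValue references aliases that exist only after summarize runs.
    calculated={"AvgOrderValue": "TotalRevenue / OrderCount"},
)
print(query)
```

If the extend line came first, `TotalRevenue` and `OrderCount` would not yet exist when the calculated measure is evaluated.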
Changes:
- Modified query generation logic to output `summarize` before `extend`
- Updated existing tests to reflect the new query ordering
- Added a new test case to verify calculated measures work correctly
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| sqlalchemy_kusto/dialect_kql.py | Reordered query construction in visit_select to output summarize before extend; minor style improvement removing unnecessary parentheses |
| tests/unit/test_dialect_kql.py | Updated 4 existing tests to expect summarize before extend; added new test for calculated measures; minor formatting cleanup |
65f2ec2 to ae5f51b
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
ae5f51b to 5548c47
…update test expectations for summarize-before-extend ordering
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
Hello! Strictly moving the extend clause after summarize breaks standard pre-aggregation calculations (e.g., creating a calculated column and then grouping by it). In KQL, summarize drops all columns that aren't in the by clause or the aggregations. So if an extend relies on raw columns, it will fail because those columns no longer exist after summarize.

Here is a test case that passes on the main branch but fails with these changes because the extend is placed too late:

    # Needed only if the test module does not already import these names.
    from sqlalchemy import func, literal_column, select, text

    def test_pre_aggregated_calculations():
        # `engine` is assumed to be the engine already defined in the test module.
        app_col = literal_column("App")
        ns_col = literal_column("Namespace")
        id_col = literal_column("_id")
        # 1. Calculated column before aggregation
        app_namespace = func.strcat(app_col, ns_col).label("App_Namespace")
        # 2. Group by the calculated column
        query = (
            select(app_namespace, func.count(id_col).label("TotalLogs"))
            .select_from(text("Logs"))
            .group_by(app_namespace)
        )
        query_compiled = str(
            query.compile(engine, compile_kwargs={"literal_binds": True})
        ).replace("\n", "")
        # Expected KQL (fails on this PR branch):
        query_expected = (
            '["Logs"]'
            '| extend ["App_Namespace"] = strcat(App, Namespace)'
            '| summarize ["TotalLogs"] = count(["_id"]) by ["App_Namespace"]'
            '| project ["App_Namespace"], ["TotalLogs"]'
        )
        assert query_compiled == query_expected
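The two cases in this thread (an extend that feeds the group-by versus an extend that consumes aggregated columns) could in principle be told apart by checking which names each expression references. The following is only a sketch of that idea, assuming a hypothetical `split_extends` helper and a crude identifier-based check; it is not the dialect's logic and not something this PR implements. `LogsPerHost` and `HostCount` are made-up names.

```python
import re


def split_extends(extends, aggregate_aliases):
    """Partition extend expressions around a summarize clause.

    An expression that mentions any aggregate alias must run after summarize;
    everything else is assumed to depend on raw columns and runs before it.
    Hypothetical heuristic for illustration; not the dialect's actual logic.
    """
    pre, post = {}, {}
    for alias, expr in extends.items():
        # Crude tokenization: bare identifiers only, ignoring quoting and strings.
        tokens = set(re.findall(r"[A-Za-z_]\w*", expr))
        if tokens & set(aggregate_aliases):
            post[alias] = expr
        else:
            pre[alias] = expr
    return pre, post


pre, post = split_extends(
    extends={
        "App_Namespace": "strcat(App, Namespace)",  # raw columns only
        "LogsPerHost": "TotalLogs / HostCount",     # aggregate aliases
    },
    aggregate_aliases=["TotalLogs", "HostCount"],
)
print("before summarize:", pre)
print("after summarize:", post)
```

A real implementation would more likely inspect the SQLAlchemy expression tree than parse strings, since a raw column could share a name with an aggregate alias.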
Description
This PR ensures that the `| extend` clause is placed after `| summarize` in the generated KQL query. This ordering is necessary for calculated measures that reference summarized columns.
Changes
- `dialect_kql.py`: Reordered query construction in `visit_select` to output `| summarize` before `| extend`
- `tests/unit/test_dialect_kql.py`: Updated existing tests to expect `summarize` before `extend`
Why this change?
In KQL, `extend` operations that reference summarized columns must come after the `summarize` clause. Previously, the generated query placed `extend` before `summarize`, which would fail for calculated measures.
UI Changes
Before:


After:
