Skip to content

Avoid O(N^2) in VALUES with ordinals grouping #130576

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Jul 7, 2025

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Jul 3, 2025

Using the VALUES aggregator with ordinals grouping led to accidental quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ... BY keyword-field are affected by this performance issue. This change caches a sorted structure - previously used to fix a similar O(N^2) problem when emitting the output block - during the merging phase of the OrdinalGroupingOperator.

@dnhatn dnhatn force-pushed the fix-values-aggregator branch from 73349d4 to ef7d00c Compare July 3, 2025 22:29
@dnhatn dnhatn closed this Jul 3, 2025
@dnhatn dnhatn deleted the fix-values-aggregator branch July 3, 2025 22:29
@dnhatn dnhatn restored the fix-values-aggregator branch July 3, 2025 22:29
@dnhatn dnhatn reopened this Jul 3, 2025
@dnhatn dnhatn force-pushed the fix-values-aggregator branch from ef7d00c to 81326cb Compare July 3, 2025 22:39
@dnhatn dnhatn force-pushed the fix-values-aggregator branch from 81326cb to 0ee7d9e Compare July 4, 2025 03:49
@elasticsearchmachine
Copy link
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn
Copy link
Member Author

dnhatn commented Jul 4, 2025

I updated the benchmark to simulate the ordinal grouping operator. With the main branch, I could not complete the benchmark with 1,000,000 groups, but with the fix, it took 1625ms. The benchmark results are below.

Before:
Benchmark                      (dataType)  (groups)  (numOrdinalMerges)  Mode  Cnt      Score   Error  Units
ValuesAggregatorBenchmark.run    BytesRef         1                   0  avgt    2      3.680          ms/op
ValuesAggregatorBenchmark.run    BytesRef         1                   1  avgt    2      3.632          ms/op
ValuesAggregatorBenchmark.run    BytesRef      1000                   0  avgt    2      2.515          ms/op
ValuesAggregatorBenchmark.run    BytesRef      1000                   1  avgt    2      9.397          ms/op
ValuesAggregatorBenchmark.run    BytesRef    200000                   0  avgt    2    148.966          ms/op
ValuesAggregatorBenchmark.run    BytesRef    200000                   1  avgt    2  90055.908          ms/op
ValuesAggregatorBenchmark.run         int         1                   0  avgt    2      0.494          ms/op
ValuesAggregatorBenchmark.run         int         1                   1  avgt    2      0.488          ms/op
ValuesAggregatorBenchmark.run         int      1000                   0  avgt    2      2.788          ms/op
ValuesAggregatorBenchmark.run         int      1000                   1  avgt    2      8.232          ms/op
ValuesAggregatorBenchmark.run         int    200000                   0  avgt    2    198.020          ms/op
ValuesAggregatorBenchmark.run         int    200000                   1  avgt    2  70918.020          ms/op
ValuesAggregatorBenchmark.run        long         1                   0  avgt    2      0.862          ms/op
ValuesAggregatorBenchmark.run        long         1                   1  avgt    2      0.873          ms/op
ValuesAggregatorBenchmark.run        long      1000                   0  avgt    2      4.212          ms/op
ValuesAggregatorBenchmark.run        long      1000                   1  avgt    2     10.450          ms/op
ValuesAggregatorBenchmark.run        long    200000                   0  avgt    2    257.926          ms/op
ValuesAggregatorBenchmark.run        long    200000                   1  avgt    2  75686.076          ms/op



After:
Benchmark                      (dataType)  (groups)  (numOrdinalMerges)  Mode  Cnt     Score   Error  Units
ValuesAggregatorBenchmark.run    BytesRef         1                   0  avgt    2     3.909          ms/op
ValuesAggregatorBenchmark.run    BytesRef         1                   1  avgt    2     3.951          ms/op
ValuesAggregatorBenchmark.run    BytesRef      1000                   0  avgt    2     2.635          ms/op
ValuesAggregatorBenchmark.run    BytesRef      1000                   1  avgt    2     2.703          ms/op
ValuesAggregatorBenchmark.run    BytesRef   1000000                   0  avgt    2  1519.385          ms/op
ValuesAggregatorBenchmark.run    BytesRef   1000000                   1  avgt    2  1623.915          ms/op
ValuesAggregatorBenchmark.run         int         1                   0  avgt    2     0.601          ms/op
ValuesAggregatorBenchmark.run         int         1                   1  avgt    2     0.613          ms/op
ValuesAggregatorBenchmark.run         int      1000                   0  avgt    2     2.504          ms/op
ValuesAggregatorBenchmark.run         int      1000                   1  avgt    2     2.591          ms/op
ValuesAggregatorBenchmark.run         int   1000000                   0  avgt    2  1396.017          ms/op
ValuesAggregatorBenchmark.run         int   1000000                   1  avgt    2  1441.373          ms/op
ValuesAggregatorBenchmark.run        long         1                   0  avgt    2     0.598          ms/op
ValuesAggregatorBenchmark.run        long         1                   1  avgt    2     0.597          ms/op
ValuesAggregatorBenchmark.run        long      1000                   0  avgt    2     2.397          ms/op
ValuesAggregatorBenchmark.run        long      1000                   1  avgt    2     2.510          ms/op
ValuesAggregatorBenchmark.run        long   1000000                   0  avgt    2  1538.923          ms/op
ValuesAggregatorBenchmark.run        long   1000000                   1  avgt    2  1625.971          ms/op

@dnhatn dnhatn requested a review from nik9000 July 4, 2025 04:01
@dnhatn dnhatn marked this pull request as ready for review July 4, 2025 04:02
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jul 4, 2025
@@ -65,7 +65,7 @@
@Fork(1)
public class ValuesAggregatorBenchmark {
static final int MIN_BLOCK_LENGTH = 8 * 1024;
private static final int OP_COUNT = 1024;
private static final int OP_COUNT = 20;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a temporary change or permanent?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, I forgot to add a comment for this. Did we intentionally set the HashOperator to use 1000 pages? Would 100 or 50 pages be enough instead? I have reverted this change; let's discuss it in a separate PR.

// no longer need the bytes
bytes.close();
bytes = null;
int[] ids = sortedForOrdinalMerging.ids;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you remove this line? I spent like ten minutes staring at this code to try and figure out why it exists. But I was missing this line. Without it the loop's super obviously changing values in the ids array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a @param to the ids field of Sorted to say something like this is the position in my array of the values to read OR, if build from {@link buildForMerging} then it's the position into the *target* state to merge into or something like that. I haven't had enough coffee to make those words make sense. But it's interesting that this means different things depending on where you got it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ I've added a javadoc in 6401294

long both = values.get(id);
int group = (int) (both >>> Float.SIZE);
$endif$
$endif$
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the indent better? I liked having them on the left so they were easier to see as controls.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy-pasted issue. I have reverted this.

public void close() {
releasable.close();
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should be it's own little java class rather than generated once per.

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually want to block merging on my question about OP_COUNT - once we get that settled it's cool.

elasticsearchmachine pushed a commit that referenced this pull request Jul 7, 2025
Using the VALUES aggregator with ordinals grouping led to accidental 
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 7, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.
# Conflicts:
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesDoubleAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesFloatAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesIntAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesLongAggregator.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregators.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/X-ValuesAggregator.java.st
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 7, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.
# Conflicts:
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesDoubleAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesFloatAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesIntAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesLongAggregator.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregators.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/X-ValuesAggregator.java.st
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 7, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.

(cherry picked from commit 59df1bf)

# Conflicts:
#	benchmarks/src/main/java/org/elasticsearch/benchmark/compute/operator/ValuesAggregatorBenchmark.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesDoubleAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesFloatAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesIntAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesLongAggregator.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregators.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/X-ValuesAggregator.java.st
@dnhatn
Copy link
Member Author

dnhatn commented Jul 7, 2025

💚 All backports created successfully

Status Branch Result
9.0
8.19
8.18
8.17

Questions ?

Please refer to the Backport tool documentation

dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 7, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.

(cherry picked from commit 59df1bf)

# Conflicts:
#	benchmarks/src/main/java/org/elasticsearch/benchmark/compute/operator/ValuesAggregatorBenchmark.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesDoubleAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesFloatAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesIntAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesLongAggregator.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregators.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/X-ValuesAggregator.java.st

# Conflicts:
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregator.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/X-ValuesAggregator.java.st
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 8, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.
# Conflicts:
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesDoubleAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesFloatAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesIntAggregator.java
#	x-pack/plugin/esql/compute/src/main/generated-src/org/elasticsearch/compute/aggregation/ValuesLongAggregator.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/ValuesBytesRefAggregators.java
#	x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/aggregation/X-ValuesAggregator.java.st
dnhatn added a commit that referenced this pull request Jul 8, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.
dnhatn added a commit that referenced this pull request Jul 8, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.
dnhatn added a commit that referenced this pull request Jul 8, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.
dnhatn added a commit that referenced this pull request Jul 8, 2025
Using the VALUES aggregator with ordinals grouping led to accidental
quadratic complexity. Queries like FROM .. | STATS ... VALUES(field) ...
BY keyword-field are affected by this performance issue. This change
caches a sorted structure - previously used to fix a similar O(N^2)
problem when emitting the output block - during the merging phase of the
OrdinalGroupingOperator.
dnhatn added a commit that referenced this pull request Jul 8, 2025
We should not build the sorted structure for the ordinal grouping 
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates #130576
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 8, 2025
We should not build the sorted structure for the ordinal grouping 
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates elastic#130576
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 8, 2025
We should not build the sorted structure for the ordinal grouping 
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates elastic#130576
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 8, 2025
We should not build the sorted structure for the ordinal grouping
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates elastic#130576

(cherry picked from commit f58d291)
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 8, 2025
We should not build the sorted structure for the ordinal grouping
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates elastic#130576

(cherry picked from commit f58d291)
dnhatn added a commit to dnhatn/elasticsearch that referenced this pull request Jul 8, 2025
We should not build the sorted structure for the ordinal grouping
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates elastic#130576

(cherry picked from commit f58d291)

# Conflicts:
#	x-pack/plugin/esql/compute/src/test/java/org/elasticsearch/compute/OperatorTests.java
elasticsearchmachine pushed a commit that referenced this pull request Jul 8, 2025
We should not build the sorted structure for the ordinal grouping 
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates #130576
elasticsearchmachine pushed a commit that referenced this pull request Jul 9, 2025
* Fix empty VALUES with ordinals grouping (#130861)

We should not build the sorted structure for the ordinal grouping 
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates #130576

* Fix test
dnhatn added a commit that referenced this pull request Jul 9, 2025
We should not build the sorted structure for the ordinal grouping
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates #130576

(cherry picked from commit f58d291)

# Conflicts:
#	x-pack/plugin/esql/compute/src/test/java/org/elasticsearch/compute/OperatorTests.java
dnhatn added a commit that referenced this pull request Jul 9, 2025
We should not build the sorted structure for the ordinal grouping
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates #130576

(cherry picked from commit f58d291)
dnhatn added a commit that referenced this pull request Jul 9, 2025
We should not build the sorted structure for the ordinal grouping
operator if the requested position is larger than maxGroupId. This
situation occurs with nulls. We should benchmark the ordinal blocks and
consider removing the ordinal grouping operator if performance is
similar; otherwise, we need to integrate this operator with
GroupingAggregatorFunctionTestCase.

Relates #130576

(cherry picked from commit f58d291)
dnhatn added a commit that referenced this pull request Jul 15, 2025
The ordinals grouping operator was introduced to speed up aggregation 
before ordinal blocks and related optimizations in block hashes were
available. However, this operator has several issues:

1. It only supports single grouping with the `keyword` type and requires 
`doc_values`.

2. It needs a separate aggregation implementation, which currently lacks 
test coverage. We had performance issues with the `VALUES` aggregation
using this operator (see #130576).

3. It can be slower and use more memory when the target documents have 
sparse cardinality (see #98963).

4. Ad-hoc planning, although this can now be addressed with local plans.

Although the ordinals grouping operator is slightly faster than the hash 
operator with ordinal blocks, its complexity now outweighs the benefits.
This PR proposes removing the operator. Below is the NYC_taxis
benchmark.

Closes #98963
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.17.9 v8.18.4 v8.19.1 v9.0.4 v9.1.1 v9.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants