Avoid repeated `EmitTo::First` in partial hash aggregate output by hhhizzz · Pull Request #23250 · apache/datafusion

hhhizzz · 2026-06-30T05:11:44Z

Which issue does this PR close?

Closes #23249 (Investigate EmitTo::First usage in partial hash aggregate output).

Rationale for this change

The migrated partial hash aggregate output path still used EmitTo::First(batch_size) when draining grouped aggregate state in batches.

For terminal output this is unnecessary and can be expensive: EmitTo::First does not just slice the first N rows; it also shifts the remaining group indexes and maintains the GroupValues lookup state. For high-cardinality partial aggregate output, that work can be repeated for every batch drained.

The final hash aggregate path already avoids this by materializing output once with EmitTo::All and then slicing the resulting RecordBatch. This PR applies the same approach to partial hash aggregate output.
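To make the cost difference concrete, here is a minimal toy model, not DataFusion code. `ToyGroups`, `drain_with_first`, and `drain_with_all` are hypothetical names, and the model only counts how many remaining entries get shifted each time a prefix is emitted; it ignores the lookup-state maintenance mentioned above.

```rust
/// Toy stand-in for grouped state: one key per group index.
/// This is NOT DataFusion's `GroupValues`; it only models the
/// "shift the remaining groups" cost of emitting a prefix.
struct ToyGroups {
    keys: Vec<u64>,
    /// Number of elements moved while emitting prefixes.
    moved: usize,
}

impl ToyGroups {
    fn new(n_groups: usize) -> Self {
        Self { keys: (0..n_groups as u64).collect(), moved: 0 }
    }

    /// Analogue of emitting the first `n` groups: the tail is shifted
    /// to the front so the remaining group indexes start at 0 again.
    fn emit_first(&mut self, n: usize) -> Vec<u64> {
        let n = n.min(self.keys.len());
        let out: Vec<u64> = self.keys.drain(..n).collect();
        self.moved += self.keys.len(); // every remaining key was shifted
        out
    }

    /// Analogue of emitting everything at once: nothing is shifted.
    fn emit_all(&mut self) -> Vec<u64> {
        std::mem::take(&mut self.keys)
    }
}

/// Drains by repeatedly emitting a prefix; returns batches and moved count.
fn drain_with_first(n_groups: usize, batch_size: usize) -> (Vec<Vec<u64>>, usize) {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut groups = ToyGroups::new(n_groups);
    let mut batches = Vec::new();
    while !groups.keys.is_empty() {
        batches.push(groups.emit_first(batch_size));
    }
    (batches, groups.moved)
}

/// Materializes once, then slices; returns batches and moved count.
fn drain_with_all(n_groups: usize, batch_size: usize) -> (Vec<Vec<u64>>, usize) {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut groups = ToyGroups::new(n_groups);
    let all = groups.emit_all();
    let batches = all.chunks(batch_size).map(|c| c.to_vec()).collect();
    (batches, groups.moved)
}
```

Both functions produce the same batches. In this toy model the prefix-emitting variant moves a number of elements that grows roughly quadratically with the group count for a fixed batch size, while the materialize-then-slice variant moves none.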

What changes are included in this PR?

- Remove the helper that selected EmitTo::First(batch_size) for hash aggregate terminal output.
- Change migrated partial hash aggregate output to:
  - materialize grouped keys and aggregate state once with EmitTo::All
  - slice the materialized RecordBatch into batch_size chunks across output polls
- Rename the shared materialized-output state/type to mode-neutral names because it is now used by both final and partial output paths.
- Add a regression test with a custom GroupsAccumulator that fails if partial terminal output calls EmitTo::First(_).
- Strengthen the regression test to verify both batch slicing and emitted key/state values.
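The "materialize once, slice across polls" state can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `MaterializedOutput` and `next_batch` are hypothetical names, and a plain `Vec` of (key, state) pairs stands in for the Arrow RecordBatch.

```rust
/// Hypothetical stand-in for the shared materialized-output state.
/// A real implementation would hold an Arrow `RecordBatch`; a Vec of
/// (group key, aggregate state) pairs is used here to stay std-only.
struct MaterializedOutput {
    rows: Vec<(u64, i64)>,
    /// Index of the first row that has not been emitted yet.
    offset: usize,
    batch_size: usize,
}

impl MaterializedOutput {
    fn new(rows: Vec<(u64, i64)>, batch_size: usize) -> Self {
        assert!(batch_size > 0, "batch_size must be positive");
        Self { rows, offset: 0, batch_size }
    }

    /// Returns the next slice of at most `batch_size` rows, or `None`
    /// once every row has been emitted. Only the offset advances; the
    /// materialized rows are never shifted or copied.
    fn next_batch(&mut self) -> Option<&[(u64, i64)]> {
        if self.offset >= self.rows.len() {
            return None;
        }
        let start = self.offset;
        let end = (start + self.batch_size).min(self.rows.len());
        self.offset = end;
        Some(&self.rows[start..end])
    }
}
```

Each output poll would call `next_batch` once, so draining N rows takes about N / batch_size polls, and the last slice may be shorter than batch_size.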

Are these changes tested?

Yes.

Local targeted tests:

cargo test -p datafusion-physical-plan partial_grouped_aggregate_materializes_before_slicing -- --nocapture
cargo test -p datafusion-physical-plan materialized_aggregate_output_slices_batches_until_exhausted -- --nocapture
git diff --check

Additional local verification run during development:

cargo test -p datafusion-physical-plan materialized_final_output_slices_batches_until_exhausted -- --nocapture
cargo test -p datafusion-physical-plan partial_grouped_aggregate_uses_raw_partial_stream -- --nocapture

The new regression test was also applied to the pre-fix baseline and failed with the expected internal error when the partial output path used EmitTo::First.
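The guard idea behind that regression test can be illustrated with a small std-only sketch. The local `EmitTo` enum only mirrors the two variants named in this PR, and `GuardedCounts` is a hypothetical toy accumulator, not the test's actual GroupsAccumulator implementation.

```rust
/// Local mirror of the two emit modes discussed in this PR
/// (defined here so the sketch does not depend on DataFusion).
#[derive(Debug, Clone, Copy, PartialEq)]
enum EmitTo {
    All,
    First(usize),
}

/// Hypothetical toy accumulator holding one count per group.
struct GuardedCounts {
    counts: Vec<i64>,
}

impl GuardedCounts {
    /// Emits state for terminal output. Any prefix emit is treated as
    /// a bug and reported as an error, which is what lets a test fail
    /// loudly if the output path regresses to `EmitTo::First`.
    fn state(&mut self, emit_to: EmitTo) -> Result<Vec<i64>, String> {
        match emit_to {
            EmitTo::All => Ok(std::mem::take(&mut self.counts)),
            EmitTo::First(n) => Err(format!(
                "unexpected EmitTo::First({n}) in terminal output"
            )),
        }
    }
}
```

An output path that materializes with `EmitTo::All` gets the full state back, while one that drains with `EmitTo::First(_)` surfaces the error immediately.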
Local benchmark evidence was collected against the implementation commit before the final test/naming polish commit.
ClickBench full 43-query run, 5 iterations, 24 cores, skip partial aggregation probe ratio 0.8:

mode	total warm time	geomean warm time
baseline migrated aggregate	128509.47 ms	352.79 ms
patched migrated aggregate	19652.37 ms	180.65 ms
baseline old aggregate path	19774.70 ms	181.25 ms

Largest patched/current wins included:
- q33: 32961.02 ms -> 1642.08 ms
- q34: 32739.34 ms -> 1635.07 ms
- q18: 25673.25 ms -> 1767.25 ms
- q16: 5949.82 ms -> 810.17 ms
- q17: 5906.51 ms -> 807.10 ms

TPC-DS SF10 full99, 10 rounds:
- Failures: 0
- Aggregate geomean current/main: 0.982817
- Aggregate current speedup: 1.748%
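For readers checking the arithmetic, the reported percentage appears consistent with converting the current/main time ratio as 1/ratio - 1. That formula is an assumption about how the number was derived, and both helper names below are hypothetical.

```rust
/// Converts a time ratio (current / main) into a percentage speedup,
/// ASSUMING speedup = 1 / ratio - 1. The PR does not state its formula.
fn speedup_percent(time_ratio: f64) -> f64 {
    (1.0 / time_ratio - 1.0) * 100.0
}

/// Plain "times faster" factor between two timings in milliseconds.
fn times_faster(before_ms: f64, after_ms: f64) -> f64 {
    before_ms / after_ms
}
```

With the figures above, `speedup_percent(0.982817)` comes out at about 1.748, and `times_faster(128509.47, 19652.37)` gives roughly 6.5 for the ClickBench total warm time.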

Are there any user-facing changes?

No. This is an internal physical execution change for hash aggregate output draining. There are no public API or documented behavior changes.

Dandandan · 2026-06-30T06:55:42Z

run benchmarks

adriangbot · 2026-06-30T06:58:25Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4840701536-749-5nw6v 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing codex/emit-first-partial-investigation (eedbae7) to 01bf68c (merge-base) diff using: clickbench_partitioned
Results will be posted here when complete

File an issue against this benchmark runner

adriangbot · 2026-06-30T06:58:30Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4840701536-750-7vv4d 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing codex/emit-first-partial-investigation (eedbae7) to 01bf68c (merge-base) diff using: tpcds
Results will be posted here when complete

adriangbot · 2026-06-30T06:58:31Z

🤖 Benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4840701536-751-wp2nd 6.12.85+ #1 SMP Mon May 11 08:17:35 UTC 2026 aarch64 GNU/Linux

Comparing codex/emit-first-partial-investigation (eedbae7) to 01bf68c (merge-base) diff using: tpch
Results will be posted here when complete

adriangbot · 2026-06-30T07:14:17Z

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Comparing HEAD and codex_emit-first-partial-investigation
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                           HEAD ┃ codex_emit-first-partial-investigation ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 38.22 / 39.52 ±1.33 / 41.28 ms │         38.11 / 39.54 ±1.31 / 41.32 ms │     no change │
│ QQuery 2  │ 19.34 / 19.63 ±0.20 / 19.88 ms │         19.20 / 19.94 ±0.77 / 21.34 ms │     no change │
│ QQuery 3  │ 31.18 / 33.86 ±1.64 / 36.34 ms │         31.38 / 33.08 ±1.09 / 34.33 ms │     no change │
│ QQuery 4  │ 17.58 / 17.72 ±0.12 / 17.88 ms │         17.72 / 18.04 ±0.50 / 19.02 ms │     no change │
│ QQuery 5  │ 38.11 / 41.19 ±2.59 / 45.99 ms │         38.93 / 41.92 ±2.07 / 44.71 ms │     no change │
│ QQuery 6  │ 16.18 / 16.34 ±0.15 / 16.55 ms │         16.27 / 16.97 ±0.82 / 18.42 ms │     no change │
│ QQuery 7  │ 44.46 / 46.00 ±1.31 / 48.26 ms │         44.14 / 46.30 ±2.37 / 50.67 ms │     no change │
│ QQuery 8  │ 43.38 / 43.66 ±0.15 / 43.80 ms │         43.35 / 43.60 ±0.29 / 44.13 ms │     no change │
│ QQuery 9  │ 49.98 / 51.06 ±0.91 / 52.60 ms │         49.59 / 50.69 ±0.68 / 51.62 ms │     no change │
│ QQuery 10 │ 42.39 / 42.73 ±0.30 / 43.20 ms │         42.30 / 42.66 ±0.35 / 43.26 ms │     no change │
│ QQuery 11 │ 13.62 / 13.79 ±0.11 / 13.92 ms │         13.52 / 13.61 ±0.08 / 13.73 ms │     no change │
│ QQuery 12 │ 23.72 / 23.92 ±0.22 / 24.33 ms │         23.88 / 24.38 ±0.43 / 25.04 ms │     no change │
│ QQuery 13 │ 31.94 / 35.01 ±2.67 / 39.08 ms │         32.70 / 34.44 ±1.00 / 35.77 ms │     no change │
│ QQuery 14 │ 23.90 / 24.75 ±1.24 / 27.16 ms │         23.88 / 24.18 ±0.25 / 24.54 ms │     no change │
│ QQuery 15 │ 31.55 / 32.06 ±0.49 / 32.79 ms │         31.52 / 31.86 ±0.34 / 32.37 ms │     no change │
│ QQuery 16 │ 14.47 / 14.67 ±0.17 / 14.96 ms │         13.93 / 14.15 ±0.13 / 14.33 ms │     no change │
│ QQuery 17 │ 87.62 / 88.79 ±1.41 / 91.37 ms │         73.78 / 74.81 ±0.56 / 75.27 ms │ +1.19x faster │
│ QQuery 18 │ 65.72 / 68.67 ±2.55 / 73.29 ms │         59.58 / 61.46 ±1.97 / 65.25 ms │ +1.12x faster │
│ QQuery 19 │ 33.00 / 33.28 ±0.41 / 34.10 ms │         33.27 / 34.01 ±1.07 / 36.12 ms │     no change │
│ QQuery 20 │ 34.63 / 34.98 ±0.28 / 35.33 ms │         32.18 / 32.65 ±0.36 / 33.01 ms │ +1.07x faster │
│ QQuery 21 │ 55.89 / 57.75 ±1.31 / 59.45 ms │         55.72 / 57.32 ±0.89 / 58.10 ms │     no change │
│ QQuery 22 │ 14.05 / 14.46 ±0.44 / 15.30 ms │         13.94 / 14.12 ±0.17 / 14.34 ms │     no change │
└───────────┴────────────────────────────────┴────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ Benchmark Summary                                     ┃          ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩
│ Total Time (HEAD)                                     │ 793.84ms │
│ Total Time (codex_emit-first-partial-investigation)   │ 769.73ms │
│ Average Time (HEAD)                                   │  36.08ms │
│ Average Time (codex_emit-first-partial-investigation) │  34.99ms │
│ Queries Faster                                        │        3 │
│ Queries Slower                                        │        0 │
│ Queries with No Change                                │       19 │
│ Queries with Failure                                  │        0 │
└───────────────────────────────────────────────────────┴──────────┘

Resource Usage

tpch — base (merge-base)

Metric	Value
Wall time	5.0s
Peak memory	1.2 GiB
Avg memory	507.9 MiB
CPU user	23.4s
CPU sys	1.7s
Peak spill	0 B

tpch — branch

Metric	Value
Wall time	5.0s
Peak memory	1.2 GiB
Avg memory	520.7 MiB
CPU user	22.4s
CPU sys	1.8s
Peak spill	0 B

adriangbot · 2026-06-30T07:16:24Z

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Comparing HEAD and codex_emit-first-partial-investigation
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃                                  HEAD ┃ codex_emit-first-partial-investigation ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │           6.32 / 6.79 ±0.78 / 8.33 ms │            5.59 / 6.12 ±0.84 / 7.79 ms │ +1.11x faster │
│ QQuery 2  │        83.83 / 84.38 ±0.35 / 84.79 ms │         80.78 / 81.31 ±0.55 / 82.34 ms │     no change │
│ QQuery 3  │        31.15 / 31.30 ±0.16 / 31.57 ms │         29.37 / 29.71 ±0.24 / 29.94 ms │ +1.05x faster │
│ QQuery 4  │    487.16 / 513.89 ±17.29 / 533.57 ms │      490.95 / 494.04 ±2.48 / 497.77 ms │     no change │
│ QQuery 5  │        52.69 / 52.94 ±0.20 / 53.13 ms │         51.73 / 52.16 ±0.30 / 52.45 ms │     no change │
│ QQuery 6  │        37.01 / 37.33 ±0.26 / 37.67 ms │         36.85 / 37.49 ±0.49 / 38.15 ms │     no change │
│ QQuery 7  │        93.85 / 94.86 ±0.70 / 95.60 ms │         95.32 / 97.30 ±1.62 / 99.08 ms │     no change │
│ QQuery 8  │        36.98 / 37.54 ±0.73 / 38.95 ms │         38.64 / 39.69 ±1.48 / 42.45 ms │  1.06x slower │
│ QQuery 9  │        53.13 / 55.87 ±2.04 / 58.57 ms │         54.30 / 57.54 ±2.48 / 60.60 ms │     no change │
│ QQuery 10 │        63.76 / 64.40 ±0.40 / 64.91 ms │         66.42 / 66.82 ±0.36 / 67.43 ms │     no change │
│ QQuery 11 │     292.20 / 297.71 ±3.97 / 303.81 ms │      348.79 / 366.06 ±9.00 / 373.35 ms │  1.23x slower │
│ QQuery 12 │        28.74 / 28.84 ±0.15 / 29.14 ms │         29.53 / 30.05 ±0.40 / 30.62 ms │     no change │
│ QQuery 13 │     118.55 / 118.88 ±0.28 / 119.37 ms │      121.05 / 122.32 ±1.23 / 124.65 ms │     no change │
│ QQuery 14 │     417.89 / 419.44 ±1.45 / 422.07 ms │      417.18 / 427.36 ±7.09 / 437.97 ms │     no change │
│ QQuery 15 │        57.41 / 58.02 ±0.36 / 58.53 ms │         58.09 / 59.21 ±0.81 / 60.51 ms │     no change │
│ QQuery 16 │           6.92 / 7.07 ±0.24 / 7.54 ms │            6.87 / 7.00 ±0.19 / 7.37 ms │     no change │
│ QQuery 17 │        80.43 / 80.80 ±0.25 / 81.15 ms │         80.51 / 82.67 ±2.16 / 86.01 ms │     no change │
│ QQuery 18 │     123.69 / 125.94 ±1.79 / 128.80 ms │      124.37 / 125.72 ±0.82 / 126.52 ms │     no change │
│ QQuery 19 │        41.56 / 42.07 ±0.33 / 42.59 ms │         41.75 / 42.84 ±1.17 / 44.98 ms │     no change │
│ QQuery 20 │        35.25 / 36.78 ±1.71 / 39.96 ms │         36.20 / 36.68 ±0.27 / 37.00 ms │     no change │
│ QQuery 21 │        17.62 / 18.00 ±0.30 / 18.31 ms │         17.92 / 18.06 ±0.13 / 18.27 ms │     no change │
│ QQuery 22 │        62.48 / 63.15 ±0.66 / 63.96 ms │         62.37 / 63.35 ±0.94 / 65.01 ms │     no change │
│ QQuery 23 │     358.62 / 361.43 ±2.10 / 364.43 ms │      346.92 / 350.12 ±2.60 / 353.44 ms │     no change │
│ QQuery 24 │     226.94 / 229.80 ±3.69 / 236.87 ms │      226.48 / 228.76 ±2.18 / 232.52 ms │     no change │
│ QQuery 25 │     110.63 / 111.27 ±0.38 / 111.74 ms │      110.54 / 112.69 ±2.29 / 116.64 ms │     no change │
│ QQuery 26 │        58.42 / 60.98 ±4.16 / 69.26 ms │         58.34 / 58.54 ±0.14 / 58.71 ms │     no change │
│ QQuery 27 │           6.40 / 6.56 ±0.16 / 6.86 ms │            6.48 / 6.58 ±0.12 / 6.82 ms │     no change │
│ QQuery 28 │        61.14 / 61.85 ±0.60 / 62.92 ms │         60.70 / 61.18 ±0.27 / 61.47 ms │     no change │
│ QQuery 29 │       97.95 / 99.57 ±1.68 / 102.69 ms │        97.24 / 98.78 ±1.63 / 101.81 ms │     no change │
│ QQuery 30 │        35.02 / 35.93 ±0.79 / 37.26 ms │         32.42 / 32.76 ±0.35 / 33.41 ms │ +1.10x faster │
│ QQuery 31 │     119.12 / 120.09 ±0.89 / 121.56 ms │      111.57 / 113.33 ±2.86 / 119.02 ms │ +1.06x faster │
│ QQuery 32 │        22.69 / 23.09 ±0.23 / 23.35 ms │         20.41 / 21.22 ±1.15 / 23.50 ms │ +1.09x faster │
│ QQuery 33 │        41.03 / 44.24 ±4.28 / 52.54 ms │         38.01 / 38.53 ±0.75 / 40.02 ms │ +1.15x faster │
│ QQuery 34 │        10.95 / 11.29 ±0.25 / 11.57 ms │         10.04 / 10.13 ±0.15 / 10.43 ms │ +1.11x faster │
│ QQuery 35 │        83.01 / 83.84 ±0.64 / 84.80 ms │         72.58 / 73.18 ±0.48 / 73.91 ms │ +1.15x faster │
│ QQuery 36 │           6.81 / 6.86 ±0.03 / 6.89 ms │            5.83 / 5.99 ±0.21 / 6.41 ms │ +1.15x faster │
│ QQuery 37 │           7.99 / 8.05 ±0.06 / 8.17 ms │            6.93 / 7.04 ±0.07 / 7.12 ms │ +1.14x faster │
│ QQuery 38 │        71.93 / 74.78 ±4.65 / 84.05 ms │         62.36 / 63.68 ±0.97 / 65.29 ms │ +1.17x faster │
│ QQuery 39 │     102.03 / 103.69 ±1.64 / 105.92 ms │         86.45 / 86.71 ±0.32 / 87.31 ms │ +1.20x faster │
│ QQuery 40 │        25.12 / 25.54 ±0.36 / 26.11 ms │         23.50 / 23.68 ±0.14 / 23.83 ms │ +1.08x faster │
│ QQuery 41 │        12.54 / 12.60 ±0.06 / 12.69 ms │         11.55 / 11.72 ±0.17 / 12.04 ms │ +1.07x faster │
│ QQuery 42 │        25.14 / 25.46 ±0.29 / 25.98 ms │         23.60 / 23.85 ±0.15 / 24.07 ms │ +1.07x faster │
│ QQuery 43 │           5.39 / 5.45 ±0.04 / 5.48 ms │            5.02 / 5.13 ±0.11 / 5.35 ms │ +1.06x faster │
│ QQuery 44 │        10.10 / 10.18 ±0.05 / 10.25 ms │            9.46 / 9.56 ±0.08 / 9.67 ms │ +1.06x faster │
│ QQuery 45 │        42.76 / 45.45 ±3.65 / 52.58 ms │         38.85 / 41.11 ±2.30 / 45.40 ms │ +1.11x faster │
│ QQuery 46 │        12.92 / 13.59 ±0.56 / 14.35 ms │         12.06 / 12.57 ±0.39 / 13.12 ms │ +1.08x faster │
│ QQuery 47 │    233.87 / 253.10 ±10.46 / 261.82 ms │      224.80 / 228.88 ±4.37 / 234.91 ms │ +1.11x faster │
│ QQuery 48 │        96.11 / 96.81 ±0.60 / 97.74 ms │         96.08 / 96.72 ±0.61 / 97.74 ms │     no change │
│ QQuery 49 │        76.90 / 79.17 ±3.32 / 85.77 ms │         78.63 / 80.31 ±0.93 / 81.44 ms │     no change │
│ QQuery 50 │        59.42 / 59.66 ±0.21 / 60.02 ms │         61.22 / 62.16 ±0.77 / 63.32 ms │     no change │
│ QQuery 51 │      97.48 / 102.57 ±5.14 / 112.47 ms │        95.33 / 97.87 ±3.37 / 104.41 ms │     no change │
│ QQuery 52 │        23.96 / 24.44 ±0.36 / 25.04 ms │         25.76 / 26.48 ±1.01 / 28.45 ms │  1.08x slower │
│ QQuery 53 │        29.24 / 29.84 ±0.41 / 30.49 ms │         31.30 / 31.46 ±0.15 / 31.74 ms │  1.05x slower │
│ QQuery 54 │        55.52 / 55.94 ±0.29 / 56.32 ms │         58.66 / 59.11 ±0.28 / 59.50 ms │  1.06x slower │
│ QQuery 55 │        23.23 / 23.49 ±0.21 / 23.86 ms │         24.76 / 25.08 ±0.27 / 25.40 ms │  1.07x slower │
│ QQuery 56 │        39.18 / 40.86 ±2.27 / 45.35 ms │         41.42 / 43.09 ±2.48 / 48.02 ms │  1.05x slower │
│ QQuery 57 │     178.27 / 180.06 ±1.00 / 181.25 ms │      176.05 / 186.90 ±9.24 / 197.91 ms │     no change │
│ QQuery 58 │     115.42 / 117.05 ±1.49 / 119.56 ms │      119.49 / 121.19 ±0.98 / 122.22 ms │     no change │
│ QQuery 59 │     118.14 / 120.25 ±2.68 / 125.43 ms │      118.95 / 120.20 ±1.04 / 121.47 ms │     no change │
│ QQuery 60 │        40.41 / 41.24 ±0.55 / 41.88 ms │         40.26 / 40.65 ±0.33 / 41.24 ms │     no change │
│ QQuery 61 │        12.90 / 13.09 ±0.14 / 13.29 ms │         12.96 / 13.07 ±0.08 / 13.19 ms │     no change │
│ QQuery 62 │        46.65 / 47.22 ±0.67 / 48.48 ms │         46.50 / 47.10 ±0.51 / 47.75 ms │     no change │
│ QQuery 63 │        29.85 / 30.06 ±0.12 / 30.18 ms │         29.61 / 29.88 ±0.22 / 30.26 ms │     no change │
│ QQuery 64 │     408.95 / 415.70 ±5.48 / 422.78 ms │      409.19 / 413.25 ±3.44 / 419.19 ms │     no change │
│ QQuery 65 │     152.15 / 157.93 ±5.27 / 167.63 ms │      146.21 / 149.48 ±2.66 / 154.28 ms │ +1.06x faster │
│ QQuery 66 │        79.38 / 81.96 ±3.80 / 89.53 ms │         79.32 / 79.90 ±0.38 / 80.36 ms │     no change │
│ QQuery 67 │     267.11 / 276.67 ±7.07 / 285.51 ms │      237.83 / 241.61 ±3.69 / 246.28 ms │ +1.15x faster │
│ QQuery 68 │        13.19 / 13.27 ±0.08 / 13.41 ms │         12.00 / 12.15 ±0.18 / 12.50 ms │ +1.09x faster │
│ QQuery 69 │        61.14 / 61.48 ±0.24 / 61.81 ms │         58.20 / 58.65 ±0.41 / 59.31 ms │     no change │
│ QQuery 70 │     112.26 / 118.14 ±5.57 / 128.07 ms │      104.25 / 107.12 ±3.64 / 113.68 ms │ +1.10x faster │
│ QQuery 71 │        38.53 / 38.73 ±0.15 / 38.93 ms │         35.76 / 36.73 ±1.30 / 39.22 ms │ +1.05x faster │
│ QQuery 72 │ 2146.94 / 2202.81 ±56.45 / 2308.48 ms │ 2045.98 / 2243.06 ±152.98 / 2505.97 ms │     no change │
│ QQuery 73 │          9.76 / 9.89 ±0.14 / 10.15 ms │          9.97 / 10.16 ±0.13 / 10.36 ms │     no change │
│ QQuery 74 │     172.89 / 178.81 ±8.89 / 196.35 ms │      174.14 / 179.21 ±4.70 / 188.12 ms │     no change │
│ QQuery 75 │     152.80 / 157.50 ±8.68 / 174.84 ms │      149.67 / 154.40 ±6.59 / 167.40 ms │     no change │
│ QQuery 76 │        36.25 / 36.80 ±0.37 / 37.39 ms │         35.41 / 35.96 ±0.38 / 36.47 ms │     no change │
│ QQuery 77 │        62.22 / 63.00 ±0.74 / 64.38 ms │         61.20 / 61.93 ±0.79 / 63.44 ms │     no change │
│ QQuery 78 │     187.21 / 191.75 ±5.37 / 202.30 ms │      184.98 / 188.92 ±2.91 / 192.83 ms │     no change │
│ QQuery 79 │        67.77 / 72.38 ±4.62 / 80.73 ms │         67.23 / 67.98 ±0.71 / 69.27 ms │ +1.06x faster │
│ QQuery 80 │     105.88 / 108.22 ±1.79 / 111.34 ms │      100.15 / 101.64 ±2.19 / 105.97 ms │ +1.06x faster │
│ QQuery 81 │        28.57 / 28.80 ±0.35 / 29.48 ms │         25.99 / 26.20 ±0.14 / 26.41 ms │ +1.10x faster │
│ QQuery 82 │        18.04 / 18.20 ±0.14 / 18.44 ms │         16.53 / 16.69 ±0.17 / 17.02 ms │ +1.09x faster │
│ QQuery 83 │        43.75 / 47.63 ±5.33 / 58.05 ms │         40.63 / 40.81 ±0.16 / 41.06 ms │ +1.17x faster │
│ QQuery 84 │        32.64 / 33.37 ±0.43 / 33.93 ms │         30.76 / 30.92 ±0.13 / 31.06 ms │ +1.08x faster │
│ QQuery 85 │     112.26 / 113.96 ±0.96 / 115.14 ms │      108.29 / 113.15 ±6.70 / 126.42 ms │     no change │
│ QQuery 86 │        28.10 / 28.90 ±1.07 / 30.93 ms │         25.34 / 25.91 ±0.34 / 26.31 ms │ +1.12x faster │
│ QQuery 87 │        72.93 / 76.66 ±4.02 / 84.18 ms │         62.89 / 63.20 ±0.22 / 63.49 ms │ +1.21x faster │
│ QQuery 88 │        64.74 / 66.15 ±1.39 / 68.44 ms │         64.05 / 65.75 ±2.47 / 70.65 ms │     no change │
│ QQuery 89 │        36.45 / 37.30 ±0.44 / 37.65 ms │         36.25 / 36.75 ±0.60 / 37.84 ms │     no change │
│ QQuery 90 │        17.71 / 18.35 ±0.51 / 19.07 ms │         17.55 / 17.77 ±0.16 / 18.00 ms │     no change │
│ QQuery 91 │        47.87 / 50.73 ±3.57 / 57.67 ms │         46.71 / 46.93 ±0.14 / 47.12 ms │ +1.08x faster │
│ QQuery 92 │        32.38 / 32.86 ±0.47 / 33.74 ms │         29.99 / 30.43 ±0.37 / 31.08 ms │ +1.08x faster │
│ QQuery 93 │        51.65 / 52.46 ±1.09 / 54.44 ms │         50.76 / 52.49 ±2.11 / 56.44 ms │     no change │
│ QQuery 94 │        39.86 / 40.72 ±0.54 / 41.34 ms │         39.11 / 39.46 ±0.32 / 39.97 ms │     no change │
│ QQuery 95 │        85.49 / 88.04 ±3.14 / 94.12 ms │         81.97 / 82.28 ±0.27 / 82.68 ms │ +1.07x faster │
│ QQuery 96 │        25.56 / 25.71 ±0.22 / 26.14 ms │         24.43 / 24.67 ±0.25 / 25.14 ms │     no change │
│ QQuery 97 │        56.53 / 57.28 ±0.74 / 58.42 ms │         47.20 / 48.31 ±1.42 / 50.77 ms │ +1.19x faster │
│ QQuery 98 │        43.05 / 43.39 ±0.33 / 43.98 ms │         41.81 / 42.51 ±0.41 / 42.98 ms │     no change │
│ QQuery 99 │        71.12 / 73.48 ±3.12 / 79.44 ms │         70.55 / 70.97 ±0.41 / 71.64 ms │     no change │
└───────────┴───────────────────────────────────────┴────────────────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                                     ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                                     │ 10291.46ms │
│ Total Time (codex_emit-first-partial-investigation)   │ 10167.82ms │
│ Average Time (HEAD)                                   │   103.95ms │
│ Average Time (codex_emit-first-partial-investigation) │   102.71ms │
│ Queries Faster                                        │         37 │
│ Queries Slower                                        │          7 │
│ Queries with No Change                                │         55 │
│ Queries with Failure                                  │          0 │
└───────────────────────────────────────────────────────┴────────────┘

Resource Usage

tpcds — base (merge-base)

Metric	Value
Wall time	55.0s
Peak memory	2.0 GiB
Avg memory	1.4 GiB
CPU user	234.2s
CPU sys	6.0s
Peak spill	0 B

tpcds — branch

Metric	Value
Wall time	55.0s
Peak memory	2.1 GiB
Avg memory	1.4 GiB
CPU user	233.3s
CPU sys	5.6s
Peak spill	0 B


adriangbot · 2026-06-30T07:26:45Z

🤖 Benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Details

Comparing HEAD and codex_emit-first-partial-investigation
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query     ┃                                       HEAD ┃ codex_emit-first-partial-investigation ┃         Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0  │               1.23 / 4.04 ±5.34 / 14.71 ms │           1.23 / 3.91 ±5.28 / 14.47 ms │      no change │
│ QQuery 1  │             13.06 / 13.26 ±0.16 / 13.43 ms │         12.85 / 13.03 ±0.24 / 13.49 ms │      no change │
│ QQuery 2  │             36.51 / 36.92 ±0.45 / 37.78 ms │         35.57 / 35.80 ±0.18 / 36.00 ms │      no change │
│ QQuery 3  │             30.99 / 31.78 ±0.73 / 32.78 ms │         30.56 / 31.02 ±0.45 / 31.74 ms │      no change │
│ QQuery 4  │      1617.99 / 1674.96 ±34.31 / 1714.82 ms │     222.02 / 236.62 ±19.23 / 274.38 ms │  +7.08x faster │
│ QQuery 5  │     1634.10 / 1737.49 ±107.32 / 1939.01 ms │     272.50 / 294.18 ±35.58 / 365.15 ms │  +5.91x faster │
│ QQuery 6  │                1.27 / 1.43 ±0.22 / 1.87 ms │            1.29 / 1.44 ±0.22 / 1.87 ms │      no change │
│ QQuery 7  │             13.87 / 17.07 ±6.24 / 29.55 ms │         13.54 / 15.91 ±4.35 / 24.60 ms │  +1.07x faster │
│ QQuery 8  │      2020.77 / 2046.29 ±26.84 / 2093.80 ms │     319.57 / 347.75 ±37.59 / 418.91 ms │  +5.88x faster │
│ QQuery 9  │         469.74 / 493.17 ±20.84 / 524.68 ms │      453.80 / 462.19 ±8.46 / 476.91 ms │  +1.07x faster │
│ QQuery 10 │             75.28 / 76.87 ±0.89 / 77.92 ms │         72.40 / 73.16 ±0.50 / 73.72 ms │      no change │
│ QQuery 11 │            89.35 / 96.09 ±7.80 / 106.25 ms │         83.28 / 84.35 ±0.94 / 85.79 ms │  +1.14x faster │
│ QQuery 12 │     1711.13 / 1785.55 ±106.54 / 1990.49 ms │      264.34 / 273.71 ±7.75 / 287.76 ms │  +6.52x faster │
│ QQuery 13 │        483.02 / 695.74 ±133.58 / 865.56 ms │     362.44 / 385.33 ±14.75 / 404.93 ms │  +1.81x faster │
│ QQuery 14 │          539.86 / 546.14 ±4.02 / 549.96 ms │      280.75 / 288.47 ±7.05 / 300.98 ms │  +1.89x faster │
│ QQuery 15 │      1814.85 / 1910.38 ±49.65 / 1952.23 ms │      271.35 / 280.11 ±7.86 / 293.49 ms │  +6.82x faster │
│ QQuery 16 │      4161.18 / 4296.38 ±86.99 / 4433.70 ms │     616.69 / 628.60 ±13.23 / 653.82 ms │  +6.83x faster │
│ QQuery 17 │     4205.66 / 4300.25 ±103.77 / 4429.35 ms │      616.28 / 626.15 ±8.30 / 638.41 ms │  +6.87x faster │
│ QQuery 18 │  17598.87 / 17983.64 ±361.69 / 18658.51 ms │  1271.65 / 1295.06 ±20.79 / 1333.19 ms │ +13.89x faster │
│ QQuery 19 │             28.07 / 30.21 ±2.94 / 35.85 ms │         27.57 / 27.92 ±0.24 / 28.32 ms │  +1.08x faster │
│ QQuery 20 │         517.51 / 527.27 ±12.15 / 550.66 ms │     521.60 / 536.56 ±28.27 / 593.07 ms │      no change │
│ QQuery 21 │          513.92 / 523.51 ±9.03 / 539.93 ms │      524.45 / 527.53 ±2.08 / 530.24 ms │      no change │
│ QQuery 22 │        982.63 / 996.00 ±11.22 / 1010.97 ms │   988.66 / 1009.12 ±14.06 / 1027.76 ms │      no change │
│ QQuery 23 │      3027.78 / 3064.58 ±37.62 / 3131.74 ms │  3099.11 / 3137.51 ±21.18 / 3157.66 ms │      no change │
│ QQuery 24 │             41.38 / 45.36 ±5.97 / 57.23 ms │         41.44 / 44.55 ±5.09 / 54.65 ms │      no change │
│ QQuery 25 │          112.49 / 116.42 ±6.97 / 130.35 ms │      112.07 / 115.81 ±5.11 / 125.87 ms │      no change │
│ QQuery 26 │             41.86 / 42.71 ±0.68 / 43.56 ms │         42.55 / 44.23 ±2.95 / 50.13 ms │      no change │
│ QQuery 27 │          673.99 / 684.46 ±8.81 / 696.52 ms │      675.80 / 683.70 ±5.86 / 690.63 ms │      no change │
│ QQuery 28 │      3580.28 / 3695.85 ±90.67 / 3804.45 ms │  3052.60 / 3077.43 ±20.15 / 3113.32 ms │  +1.20x faster │
│ QQuery 29 │             40.62 / 40.97 ±0.23 / 41.35 ms │        41.15 / 55.73 ±22.36 / 99.46 ms │   1.36x slower │
│ QQuery 30 │         561.53 / 575.65 ±14.93 / 594.24 ms │     300.97 / 316.85 ±12.94 / 334.78 ms │  +1.82x faster │
│ QQuery 31 │         281.20 / 305.53 ±20.20 / 339.62 ms │      291.50 / 296.68 ±4.23 / 302.67 ms │      no change │
│ QQuery 32 │       990.89 / 1024.00 ±32.50 / 1085.20 ms │   966.72 / 1006.52 ±34.21 / 1068.13 ms │      no change │
│ QQuery 33 │  25834.26 / 27777.82 ±995.02 / 28630.63 ms │  1486.22 / 1507.46 ±18.26 / 1538.45 ms │ +18.43x faster │
│ QQuery 34 │ 28069.49 / 29467.92 ±1131.78 / 31001.96 ms │  1497.86 / 1540.50 ±34.09 / 1598.49 ms │ +19.13x faster │
│ QQuery 35 │        966.62 / 991.83 ±21.34 / 1023.39 ms │     284.36 / 306.65 ±24.75 / 348.11 ms │  +3.23x faster │
│ QQuery 36 │         162.34 / 182.87 ±13.55 / 200.72 ms │         65.22 / 73.73 ±7.64 / 84.31 ms │  +2.48x faster │
│ QQuery 37 │             37.32 / 40.89 ±3.07 / 46.42 ms │         37.79 / 43.13 ±3.28 / 47.67 ms │   1.05x slower │
│ QQuery 38 │             42.46 / 49.99 ±8.18 / 63.24 ms │         44.90 / 46.16 ±1.93 / 49.99 ms │  +1.08x faster │
│ QQuery 39 │          185.21 / 192.95 ±7.33 / 204.98 ms │     139.01 / 162.07 ±12.18 / 173.15 ms │  +1.19x faster │
│ QQuery 40 │             14.31 / 14.86 ±0.35 / 15.24 ms │         14.50 / 14.66 ±0.12 / 14.83 ms │      no change │
│ QQuery 41 │             13.69 / 15.69 ±3.46 / 22.59 ms │         14.09 / 14.40 ±0.18 / 14.64 ms │  +1.09x faster │
│ QQuery 42 │             13.84 / 15.87 ±3.31 / 22.47 ms │         13.64 / 13.71 ±0.10 / 13.90 ms │  +1.16x faster │
└───────────┴────────────────────────────────────────────┴────────────────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary                                     ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)                                     │ 108170.66ms │
│ Total Time (codex_emit-first-partial-investigation)   │  19979.40ms │
│ Average Time (HEAD)                                   │   2515.60ms │
│ Average Time (codex_emit-first-partial-investigation) │    464.64ms │
│ Queries Faster                                        │          24 │
│ Queries Slower                                        │           2 │
│ Queries with No Change                                │          17 │
│ Queries with Failure                                  │           0 │
└───────────────────────────────────────────────────────┴─────────────┘
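As a quick sanity check on the summary above, the ratios can be recomputed from the reported numbers. The sketch below is only arithmetic on values copied from the table; the `speedup` helper is a hypothetical name, not part of the benchmark runner. It suggests that the per-query "Change" column is the ratio of the mean (middle) timings, and that the overall total improves by roughly 5.4x.

```rust
/// Ratio of baseline time to branch time.
/// Values above 1.0 mean the branch is faster.
/// (Hypothetical helper for illustration; not part of the benchmark tooling.)
fn speedup(base_ms: f64, branch_ms: f64) -> f64 {
    base_ms / branch_ms
}

fn main() {
    // Totals copied from the clickbench_partitioned summary.
    let total = speedup(108170.66, 19979.40);
    // Mean timings for QQuery 34, copied from the per-query table.
    let q34 = speedup(29467.92, 1540.50);
    println!("total: {:.2}x, Q34: {:.2}x", total, q34);
}
```

The Q34 ratio lines up with the `+19.13x faster` entry in the table, which is why the mean appears to be the value the comparison uses.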

Resource Usage

clickbench_partitioned — base (merge-base)

Metric	Value
Wall time	545.1s
Peak memory	11.9 GiB
Avg memory	6.5 GiB
CPU user	4799.1s
CPU sys	319.6s
Peak spill	0 B

clickbench_partitioned — branch

Metric	Value
Wall time	105.0s
Peak memory	12.7 GiB
Avg memory	4.4 GiB
CPU user	1022.2s
CPU sys	73.7s
Peak spill	0 B
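
The drop in wall time and CPU time is consistent with the PR's rationale that repeatedly emitting a prefix redoes work on the remaining state. The following is a toy model only, using a plain `Vec<u64>` in place of grouped aggregate state; it does not use DataFusion's `GroupsAccumulator` or `EmitTo` APIs, and the function names are hypothetical. It contrasts draining a prefix on every poll (which shifts the remaining elements each time) with materializing once and slicing.

```rust
/// Toy stand-in for prefix emission: take the first `batch_size` elements
/// on every call and count how many remaining elements get shifted down.
fn drain_by_prefix(mut state: Vec<u64>, batch_size: usize) -> (Vec<Vec<u64>>, usize) {
    assert!(batch_size > 0, "batch_size must be positive");
    let mut batches = Vec::new();
    let mut shifted = 0usize;
    while !state.is_empty() {
        let n = batch_size.min(state.len());
        // `drain(..n)` removes the prefix; the tail is moved to the front
        // once the drain iterator is consumed.
        let batch: Vec<u64> = state.drain(..n).collect();
        shifted += state.len();
        batches.push(batch);
    }
    (batches, shifted)
}

/// Toy stand-in for "materialize once, then slice": no element of the
/// remaining state is moved between output batches.
fn materialize_then_slice(state: Vec<u64>, batch_size: usize) -> Vec<Vec<u64>> {
    assert!(batch_size > 0, "batch_size must be positive");
    state.chunks(batch_size).map(|chunk| chunk.to_vec()).collect()
}

fn main() {
    let state: Vec<u64> = (0..10).collect();
    let (prefix_batches, shifted) = drain_by_prefix(state.clone(), 4);
    let sliced_batches = materialize_then_slice(state, 4);
    // Both strategies produce the same output batches.
    assert_eq!(prefix_batches, sliced_batches);
    println!("batches: {:?}, elements shifted: {}", sliced_batches, shifted);
}
```

In this model the shifted-element count grows roughly quadratically with the number of groups for a fixed batch size, while the sliced version does no such work. The real cost in DataFusion also includes maintaining lookup state, which this sketch does not model.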


hhhizzz · 2026-06-30T08:29:17Z

@2010YOUY01 @Rachelint The result looks good. Can you help take a look when you have time?

Rachelint

LGTM, thanks @hhhizzz

Qiwei Huang added 4 commits June 30, 2026 10:55

Avoid repeated partial aggregate prefix emits

2bd1553

Test partial aggregate output materialization

8bc5f63

Polish materialized aggregate output tests

2ef96c8

Clarify materialized aggregate output state restore

545cf00

github-actions bot added the physical-plan (Changes to the physical-plan crate) label Jun 30, 2026

Fix aggregate materialization snapshot

eedbae7

hhhizzz mentioned this pull request Jun 30, 2026

Track long-term replacement for destructive EmitTo::First in aggregation #23251

Open

Rachelint mentioned this pull request Jun 30, 2026

Poc: expose group hash feedback from GroupValues #23229

Draft

Rachelint approved these changes Jun 30, 2026


Merge branch 'main' into codex/emit-first-partial-investigation

5c914ad

Rachelint added this pull request to the merge queue Jun 30, 2026

Merged via the queue into apache:main with commit 7d9f6ea Jun 30, 2026
1 check passed

Rachelint merged 6 commits into apache:main from hhhizzz:codex/emit-first-partial-investigation
