Decimal average can overflow because its inner intermediate sum state overflows its storage size #22713

@AdamGS

Describe the bug

The current average implementation assumes that the running sum held in DecimalAverager never overflows the decimal's underlying native type (i32 for Decimal32, and so on).
The count, however, is a u64, so the sum can outgrow its storage long before the count does: even an avg over i32::MAX + 1 copies of 1_i32 will overflow.
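
To make the mismatch concrete, here is a minimal sketch of an averager whose sum lives in i32 while the count is a u64. `NarrowAvg` and `feed_narrow` are hypothetical names for illustration only; this is a simplified model, not DataFusion's actual `DecimalAverager`.

```rust
/// Hypothetical, simplified model: the running sum is stored in the
/// decimal's native type (i32 for Decimal32), the count in a u64.
struct NarrowAvg {
    sum: i32,
    count: u64,
}

impl NarrowAvg {
    fn new() -> Self {
        NarrowAvg { sum: 0, count: 0 }
    }

    /// Adds one unscaled value; fails as soon as the sum leaves the i32 range.
    fn update(&mut self, v: i32) -> Result<(), String> {
        self.sum = self
            .sum
            .checked_add(v)
            .ok_or_else(|| "sum overflowed i32".to_string())?;
        self.count += 1;
        Ok(())
    }
}

/// Feeds `n` copies of `value` and returns the accumulator, or the first
/// overflow error.
fn feed_narrow(value: i32, n: u64) -> Result<NarrowAvg, String> {
    let mut acc = NarrowAvg::new();
    for _ in 0..n {
        acc.update(value)?;
    }
    Ok(acc)
}
```

In this model the overflow depends only on the magnitude of the running sum, so a large enough input fails even though the average itself would fit comfortably in i32.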

To Reproduce

select avg(d), arrow_typeof(avg(d))
from (
  select arrow_cast(99999, 'Decimal32(5, 0)') as d
  from generate_series(1, 21476)
) t;

Currently fails with:

query failed: DataFusion error: Execution error: Arithmetic Overflow in AvgAccumulator
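
The row count in the query is not arbitrary. Assuming the sum is accumulated as an unscaled i32 (with scale 0 the stored integer is 99999 itself), 21476 appears to be the smallest number of rows that pushes the sum past the i32 limit:

```latex
\begin{align*}
99999 \times 21475 &= 2\,147\,478\,525 \le 2^{31} - 1 = 2\,147\,483\,647,\\
99999 \times 21476 &= 2\,147\,578\,524 > 2^{31} - 1.
\end{align*}
```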

Expected behavior

avg shouldn't fail because its intermediate sum overflows, and it should return the correct result even when that sum is larger than the input's native type can hold.
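
One possible direction, sketched under assumptions rather than proposed as the fix: keep the intermediate sum in a wider integer and narrow only the final result. `WideAvg` and `wide_avg` are hypothetical names, and the sketch works on unscaled i32 values with truncating division, ignoring whatever scale adjustment and rounding the real avg applies.

```rust
use std::convert::TryFrom;

/// Hypothetical accumulator that widens the sum to i128.
/// With |value| <= 2^31 and count < 2^64, |sum| stays below 2^95,
/// so the i128 sum cannot overflow.
struct WideAvg {
    sum: i128,
    count: u64,
}

impl WideAvg {
    fn new() -> Self {
        WideAvg { sum: 0, count: 0 }
    }

    /// Adds one unscaled Decimal32 value to the widened sum.
    fn update(&mut self, v: i32) {
        self.sum += i128::from(v);
        self.count += 1;
    }

    /// Returns the truncated average, or None for empty input.
    /// The average of i32 values always lies within the i32 range,
    /// so the narrowing conversion is not expected to fail.
    fn evaluate(&self) -> Option<i32> {
        if self.count == 0 {
            return None;
        }
        let avg = self.sum / i128::from(self.count);
        i32::try_from(avg).ok()
    }
}

/// Averages `n` copies of `value` using the widened accumulator.
fn wide_avg(value: i32, n: u64) -> Option<i32> {
    let mut acc = WideAvg::new();
    for _ in 0..n {
        acc.update(value);
    }
    acc.evaluate()
}
```

With the inputs from the reproduction, `wide_avg(99_999, 21_476)` yields `Some(99_999)` in this sketch. The same idea would need a different strategy for the widest decimal types, where no larger native integer is available.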

Additional context

No response

Labels: bug
