[WIP] Use Kahan (Neumaier) summation for float sums by dimitarvdimitrov · Pull Request #8373 · vortex-data/vortex

dimitarvdimitrov · 2026-06-11T16:21:16Z

Stacked on #8367 so the CodSpeed run shows the cost of compensated summation against the freshly-added benchmarks.

In draft because we probably don't want to eat the perf cost

What this pr does

SumState::Float now holds a KahanSum (running sum + compensation term) implementing Neumaier's variant of Kahan compensated summation. When an addition produces a non-finite value the compensation update is skipped, preserving the existing overflow-to-infinity and inf + -inf => NaN (saturated) semantics.
Float sum partials are a struct {sum: f64, compensation: f64} instead of a plain f64, so precision survives partial flush/combine boundaries (chunked arrays, kernel partials, accumulator merges):
- Sum::combine_partials accepts both the struct partial and a plain f64.
- The legacy stats bridge in Accumulator::accumulate falls through on a partial dtype mismatch instead of erroring, and Sum::try_accumulate consumes the cached f64 Stat::Sum for floats so the stats short-circuit is preserved.

Add f64 cases to the aggregate_sum and aggregate_grouped benchmarks to establish a baseline for upcoming float summation changes. Signed-off-by: Dimitar Dimitrov <dimitar@spiraldb.com>

SumState::Float now carries a KahanSum (running sum plus compensation term) instead of a bare f64, so float sum aggregates accumulate with Neumaier's variant of Kahan summation. The compensation update is skipped when the addition produces a non-finite value, keeping the existing overflow-to-infinity and inf + -inf => NaN (saturated) semantics. Signed-off-by: Dimitar Dimitrov <dimitar@spiraldb.com>

Float sum partials are now a struct {sum: f64, compensation: f64} instead of a plain f64, so compensated-summation precision survives partial flush/combine boundaries (chunked arrays, kernel partials, accumulator merges). - Sum::partial_dtype diverges from return_dtype for floats; to_scalar emits the struct, finalize/finalize_scalar collapse it back to sum + compensation. - Sum::combine_partials accepts both the struct partial and a plain f64 (constant-multiply path). - The legacy stats bridge in Accumulator::accumulate now falls through on a partial dtype mismatch instead of erroring, and Sum::try_accumulate consumes the cached f64 Stat::Sum for floats. Signed-off-by: Dimitar Dimitrov <dimitar@spiraldb.com>

codspeed-hq · 2026-06-11T16:31:10Z

Merging this PR will degrade performance by 20.43%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 2 improved benchmarks
❌ 7 regressed benchmarks
✅ 1527 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
❌	Simulation	`sum_f64_all_valid`	294.2 µs	509.8 µs	-42.28%
❌	Simulation	`sum_f64_nulls_clustered`	629.1 µs	1,033.5 µs	-39.13%
❌	Simulation	`sum_f64`	711.9 µs	1,105.9 µs	-35.63%
❌	Simulation	`sum_f64_clustered_nulls`	553.3 µs	768.5 µs	-28%
❌	Simulation	`decompress_rd[f64, (100000, 0.0)]`	845.5 µs	1,024.6 µs	-17.48%
❌	Simulation	`decompress_rd[f32, (100000, 0.0)]`	499.1 µs	587.2 µs	-15%
❌	Simulation	`bitwise_not_vortex_buffer_mut[128]`	244.4 ns	273.6 ns	-10.66%
⚡	Simulation	`chunked_varbinview_canonical_into[(100, 100)]`	309 µs	272.8 µs	+13.24%
⚡	Simulation	`chunked_varbinview_into_canonical[(100, 100)]`	361 µs	326.3 µs	+10.63%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing mitko/kahan-float-aggregations (86623b1) with mitko/sum/f64-bench (4c8059c)}

gatesn · 2026-06-11T19:56:39Z

+        Some(match &return_dtype {
+            // Float partials carry the Kahan compensation term so that precision is preserved
+            // when partial sums are flushed and re-combined.
+            DType::Primitive(ptype, _) if ptype.is_float() => float_partial_dtype(),


CC @joseph-isaacs - but this would be a breaking change for serialized SUM statistics if we had started to serialize them in this way.

Do other engines use this?

In general, we should treat Vortex as a collection of physical kernels that some query engine may resolve to based on the engine's own semantics.

Therefore, this should be its own aggregate function. And DuckDB / DataFusion aggregate push-down can pick this one if it needs to have matching execution semantics.

Datafusion supports reporting Sum statistic, that was blocked though until 54 that released this week since sum stats were very slow

we spoke with Joe offline too. Keeping backwards compat would make this change more complicated.

the only one I could find was duckdb which has a dedicated fsum (aka sum_kahan) function. Because of that I think it's ok to keep a less accurate sum aggregation

dimitarvdimitrov added 3 commits June 11, 2026 16:29

Add f64 sum benchmarks

4c8059c

Add f64 cases to the aggregate_sum and aggregate_grouped benchmarks to establish a baseline for upcoming float summation changes. Signed-off-by: Dimitar Dimitrov <dimitar@spiraldb.com>

Base automatically changed from mitko/sum/f64-bench to develop June 11, 2026 17:32

gatesn reviewed Jun 11, 2026

View reviewed changes

dimitarvdimitrov closed this Jun 12, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Use Kahan (Neumaier) summation for float sums#8373

[WIP] Use Kahan (Neumaier) summation for float sums#8373
dimitarvdimitrov wants to merge 3 commits into
developfrom
mitko/kahan-float-aggregations

dimitarvdimitrov commented Jun 11, 2026 •

edited

Loading

Uh oh!

codspeed-hq Bot commented Jun 11, 2026

Uh oh!

gatesn Jun 11, 2026

Uh oh!

gatesn Jun 11, 2026 •

edited

Loading

Uh oh!

robert3005 Jun 11, 2026

Uh oh!

dimitarvdimitrov Jun 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

dimitarvdimitrov commented Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this pr does

Uh oh!

codspeed-hq Bot commented Jun 11, 2026

Merging this PR will degrade performance by 20.43%

Performance Changes

Uh oh!

gatesn Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

gatesn Jun 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

robert3005 Jun 11, 2026

Choose a reason for hiding this comment

Uh oh!

dimitarvdimitrov Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

dimitarvdimitrov commented Jun 11, 2026 •

edited

Loading

gatesn Jun 11, 2026 •

edited

Loading