Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Dec 23, 2025

Which issue does this PR close?

Rationale for this change

When working on improving the boolean kernels, I have seen significant and unexplained noise from run to run. For example, just adding a fast path for u64 aligned data resulted in a reported 30% regression in the speed of slice24 (code that is not affected by the change at all).

for example, from #9022

and              1.00    208.0±5.91ns        ? ?/sec    1.34   278.8±10.07ns        ? ?/sec
and_sliced_1     1.00   1100.2±6.53ns        ? ?/sec    1.12   1226.9±6.11ns        ? ?/sec
and_sliced_24    1.40    340.9±2.49ns        ? ?/sec    1.00    243.7±2.13ns        ? ?/sec

I also can't reproduce this effect locally or when I run the benchmarks individually.

Given the above, and the tiny amount of time spent in the benchmark (hundreds of nanoseconds), I believe what is happening is that changing the allocation pattern during the benchmark runs (each kernel allocates output), data for subsequent iterations is allocated subtlety differently (e.g. the exact alignment or some other factor is different).

This results in different performance characteristics even when the code has not
changed.

What changes are included in this PR?

To reduce this noise, I want to change the benchmarks to pre-allocate the input.

Are these changes tested?

I ran them manually

Are there any user-facing changes?

No, this is just a benchmark change

@github-actions github-actions bot added the arrow Changes to the arrow crate label Dec 23, 2025
@alamb alamb changed the title Allocate buffers before work in binary_kernels benchmark Allocate buffers before work in boolean_kernels benchmark Dec 23, 2025
let offset = 1;
let array1_slice = array1.slice(offset, size - offset);
let array2_slice = array2.slice(offset, size - offset);
let array1_sliced_1 = array1.slice(offset, size - offset);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the main change is to allocate all the inputs arrays before running the benchmarks

I had to rename some of the variables anyways, so I took the opportunity to rename them to reflect the names used in the benchmark descriptions as well

@alamb alamb marked this pull request as ready for review December 24, 2025 00:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant