Skip to content

Constant comparison and byte_length OnPair kernels#8371

Open
myrrc wants to merge 1 commit into
developfrom
myrrc/onpair-compare
Open

Constant comparison and byte_length OnPair kernels#8371
myrrc wants to merge 1 commit into
developfrom
myrrc/onpair-compare

Conversation

@myrrc

@myrrc myrrc commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

This PR adds 2 kernels for OnPair:

  • byte_length which uses uncompressed_lengths only
  • constant comparison to avoid decompressing for queries like URL != ''

This PR also adds validity() for FSST's kernel for byte_length so that
evaluating validity wouldn't mean decompressing whole FSST array.

@myrrc myrrc requested a review from a team June 11, 2026 16:03
@myrrc myrrc added the changelog/feature A new feature label Jun 11, 2026
@myrrc myrrc requested a review from joseph-isaacs June 11, 2026 16:03
@myrrc myrrc enabled auto-merge (squash) June 11, 2026 16:03
Signed-off-by: Mikhail Kot <mikhail@spiraldb.com>
@myrrc myrrc force-pushed the myrrc/onpair-compare branch from f834ee0 to 0fe87ba Compare June 11, 2026 16:05
@codspeed-hq

codspeed-hq Bot commented Jun 11, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 5 improved benchmarks
❌ 4 regressed benchmarks
✅ 1523 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation decompress_rd[f64, (10000, 0.0)] 111.3 µs 138.3 µs -19.53%
Simulation decompress_rd[f64, (10000, 0.1)] 111.3 µs 138.1 µs -19.41%
Simulation decompress_rd[f64, (10000, 0.01)] 111 µs 137.7 µs -19.4%
Simulation decompress_rd[f32, (10000, 0.1)] 80.7 µs 89.7 µs -10.05%
WallTime cuda/bitpacked_u8/unpack/3bw[100M] 351.3 µs 300.6 µs +16.85%
Simulation decompress_rd[f64, (100000, 0.0)] 980.4 µs 845.5 µs +15.96%
Simulation encode_varbin[(1000, 4)] 160.2 µs 142.5 µs +12.48%
Simulation encode_varbin[(1000, 8)] 160.9 µs 143.1 µs +12.45%
Simulation encode_varbin[(1000, 32)] 166.3 µs 148.1 µs +12.28%

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing myrrc/onpair-compare (0fe87ba) with develop (0dd63f0)

Open in CodSpeed

Comment on lines +21 to +54
impl CompareKernel for OnPair {
fn compare(
lhs: ArrayView<'_, Self>,
rhs: &ArrayRef,
operator: CompareOperator,
ctx: &mut ExecutionCtx,
) -> VortexResult<Option<ArrayRef>> {
let Some(constant) = rhs.as_constant() else {
return Ok(None);
};
let is_empty = match constant.dtype() {
DType::Utf8(_) => constant.as_utf8().is_empty(),
DType::Binary(_) => constant.as_binary().is_empty(),
_ => return Ok(None),
};
if is_empty != Some(true) {
return Ok(None);
}

let lengths = lhs.uncompressed_lengths();
let buffer = match operator {
// every value is greater than an empty string
CompareOperator::Gte => BitBuffer::new_set(lhs.len()),
// no value is less than an empty string
CompareOperator::Lt => BitBuffer::new_unset(lhs.len()),
_ => lengths
.binary(
ConstantArray::new(Scalar::zero_value(lengths.dtype()), lengths.len())
.into_array(),
operator.into(),
)?
.execute(ctx)?,
};
Ok(Some(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets make a trait over the two string arrays we can use to impl this?

@myrrc myrrc Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started with a trait but it's a worse approach, because the only thing we can re-implement is comparison with an empty constant and then the diff is twice as large
I don't think over-generalisation is a good thing here.

let lengths = lhs.uncompressed_lengths();
let buffer = match operator {
// every value is greater than an empty string
CompareOperator::Gte => BitBuffer::new_set(lhs.len()),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo once we can have a constant with validity

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn’t Mask scalar function that?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/feature A new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants