
Remove unsafe code from compute_vec_l2sq#1094

Merged
arrayka merged 2 commits into main from u/alrazu/l2sq_unsafe
May 21, 2026

arrayka commented May 20, 2026

Summary

Removes the unsafe { std::slice::from_raw_parts(...) } call from compute_vec_l2sq by replacing manual index arithmetic with chunks_exact / par_chunks_exact iterators zipped with the output slice.

Changes

  • compute_vec_l2sq: Now takes &[f32] directly instead of (data, index, dim). No more unsafe pointer arithmetic.
  • compute_vecs_l2sq: Uses zip(data.chunks_exact(dim)) (sequential path) and zip(data.par_chunks_exact(dim)) (parallel path) to feed slices into compute_vec_l2sq. Eliminates enumerate() and manual offset calculation.
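The sequential path described above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: the function bodies, the parameter names (`vecs_l2sq`, `data`, `dim`), and the length assertions are assumptions, since the diff itself is not shown here. The parallel path would presumably have the same shape using Rayon's `par_iter_mut()` and `par_chunks_exact(dim)`; it is omitted to keep the sketch free of external crates.

```rust
/// Squared L2 norm of one vector, taken directly as a slice.
/// (Assumed body; the real implementation may differ.)
fn compute_vec_l2sq(vec: &[f32]) -> f32 {
    vec.iter().map(|x| x * x).sum()
}

/// Writes the squared L2 norm of each `dim`-sized row of `data`
/// into the matching slot of `vecs_l2sq`.
/// (Hypothetical signature, chosen for illustration.)
fn compute_vecs_l2sq(vecs_l2sq: &mut [f32], data: &[f32], dim: usize) {
    // `chunks_exact` panics on a zero chunk size, so check up front.
    assert!(dim > 0, "dim must be non-zero");
    assert_eq!(
        data.len(),
        vecs_l2sq.len() * dim,
        "data must hold exactly one row per output element"
    );

    // Zipping the output slice with the row iterator replaces
    // `enumerate()` plus manual `index * dim` offset arithmetic,
    // so no raw-pointer slice construction is needed.
    for (out, row) in vecs_l2sq.iter_mut().zip(data.chunks_exact(dim)) {
        *out = compute_vec_l2sq(row);
    }
}

fn main() {
    // Two rows of dimension 3.
    let data = [1.0_f32, 2.0, 3.0, 4.0, 5.0, 6.0];
    let mut norms = [0.0_f32; 2];
    compute_vecs_l2sq(&mut norms, &data, 3);
    println!("{:?}", norms);
}
```

Because each `row` is an ordinary bounds-checked sub-slice, the borrow checker verifies what the `unsafe` block previously required the author to guarantee by hand.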

Benchmark Results

Benchmarks are flat or within noise threshold.

iai-callgrind (snrm2_benchmark_rust_iai, which directly benchmarks compute_vecs_l2sq) shows a small improvement: the instruction count drops by about 2%:

cargo bench -p diskann-disk --bench bench_main_iai
    Finished `bench` profile [optimized + debuginfo] target(s) in 24.13s
     Running benches/bench_main_iai.rs (target/release/deps/bench_main_iai-f51a979b5ebb65bd)
bench_main_iai::kmeans_bench_iai::snrm2_benchmark_rust_iai
  Instructions:                     2371584|2421508              (-2.06169%) [-1.02105x]
  L1 Hits:                          2925062|2975035              (-1.67974%) [-1.01708x]
  L2 Hits:                            51229|51167                (+0.12117%) [+1.00121x]
  RAM Hits:                           32950|32947                (+0.00911%) [+1.00009x]
  Total read+write:                 3009241|3059149              (-1.63143%) [-1.01658x]
  Estimated Cycles:                 4334457|4384015              (-1.13042%) [-1.01143x]

Iai-Callgrind result: Ok. 3 without regressions; 0 regressed; 3 benchmarks finished in 94.0474s
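For readers checking the numbers, the percentage and the bracketed factor on the Instructions line can be reproduced with a few lines of arithmetic. The sketch below assumes the two columns are `new|old`, which is consistent with the printed values; the helper names are made up for illustration.

```rust
/// Percent change of `new` relative to `old`
/// (negative means the new run executed fewer events).
fn percent_change(new: u64, old: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Ratio of the larger count to the smaller one, i.e. the
/// magnitude of the bracketed factor in the report.
fn ratio(new: u64, old: u64) -> f64 {
    let (hi, lo) = if new > old { (new, old) } else { (old, new) };
    hi as f64 / lo as f64
}

fn main() {
    // Instructions line from the report above, read as new|old.
    let (new, old) = (2_371_584_u64, 2_421_508_u64);
    println!("{:.5}%", percent_change(new, old));
    println!("{:.5}x", ratio(new, old));
}
```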

Criterion (Snrm2 Rust Run): the run below reports a change of about -2.5% (p = 0.00 < 0.05), which Criterion classifies as within its noise threshold:

.\bench_main-8c2599eee32d59ce_after.exe --bench Snrm2 --color=never
Gnuplot not found, using plotters backend
Benchmarking kmeans-computation/Snrm2 Rust Run
Benchmarking kmeans-computation/Snrm2 Rust Run: Warming up for 3.0000 s
Benchmarking kmeans-computation/Snrm2 Rust Run: Collecting 50 samples in estimated 5.9923 s (7650 iterations)
Benchmarking kmeans-computation/Snrm2 Rust Run: Analyzing
kmeans-computation/Snrm2 Rust Run
                        time:   [777.50 µs 787.89 µs 799.10 µs]
                        change: [-3.9665% -2.4543% -0.7324%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 50 measurements (4.00%)
  2 (4.00%) high mild

arrayka linked an issue May 20, 2026 that may be closed by this pull request
codecov-commenter commented May 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.48%. Comparing base (895a2c0) to head (5f09e10).
⚠️ Report is 7 commits behind head on main.

@@            Coverage Diff             @@
##             main    #1094      +/-   ##
==========================================
+ Coverage   89.45%   89.48%   +0.03%     
==========================================
  Files         458      474      +16     
  Lines       85398    89753    +4355     
==========================================
+ Hits        76392    80317    +3925     
- Misses       9006     9436     +430     
Flag Coverage Δ
miri 89.48% <100.00%> (+0.03%) ⬆️
unittests 89.13% <100.00%> (+0.05%) ⬆️

Files with missing lines Coverage Δ
diskann-disk/src/utils/math_util.rs 98.83% <100.00%> (+0.02%) ⬆️

... and 46 files with indirect coverage changes

arrayka marked this pull request as ready for review May 21, 2026 15:55
arrayka requested review from a team and Copilot May 21, 2026 15:55
Comment thread diskann-disk/src/utils/math_util.rs
Copilot AI left a comment

Pull request overview

This PR refactors compute_vec_l2sq/compute_vecs_l2sq in diskann-disk to remove unsafe slice construction by switching from manual pointer/index arithmetic to safe slice chunk iterators (sequential and Rayon-parallel), while preserving performance characteristics.

Changes:

  • Refactors compute_vec_l2sq to accept a &[f32] vector slice directly (removing unsafe { from_raw_parts(...) }).
  • Updates compute_vecs_l2sq to drive computation via chunks_exact(dim) and par_chunks_exact(dim) zipped with the output buffer.

Comment thread diskann-disk/src/utils/math_util.rs
arrayka merged commit 3dc4a28 into main May 21, 2026
25 of 26 checks passed
arrayka deleted the u/alrazu/l2sq_unsafe branch May 21, 2026 21:12

Development

Successfully merging this pull request may close these issues.

Remove unsafe code from compute_vec_l2sq

5 participants