Improve intersect_by_rank performance #7744

Draft
robert3005 wants to merge 1 commit into develop from rk/intersect-by-rank

@robert3005
Contributor

We never spent time optimising this function, and it is useful for merging
selections and filters in scans.
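For readers unfamiliar with the operation, here is a naive reference sketch of what `intersect_by_rank` is assumed to compute. It is not the code in this PR. The assumed semantics: `selection` has one entry per row, `filter` has one entry per selected row, and the i-th selected row survives only if `filter[i]` is set. The slice-of-`bool` signature is hypothetical; the real function operates on Vortex masks.

```rust
/// Naive reference for the assumed semantics of `intersect_by_rank`.
/// `selection`: one flag per row. `filter`: one flag per *selected* row.
/// Returns one flag per row, in the same row space as `selection`.
fn intersect_by_rank(selection: &[bool], filter: &[bool]) -> Vec<bool> {
    let selected = selection.iter().filter(|&&b| b).count();
    assert_eq!(
        selected,
        filter.len(),
        "filter must have one entry per selected row"
    );

    // Walk the filter in lockstep with the set bits of `selection`.
    // `&&` short-circuits, so a filter entry is consumed only for selected rows.
    let mut by_rank = filter.iter();
    selection
        .iter()
        .map(|&keep| keep && *by_rank.next().expect("length checked above"))
        .collect()
}
```

With `selection = [1, 0, 0, 1, 1]` and `filter = [0, 1, 0]`, only the second selected row (index 3) survives, giving `[0, 0, 0, 1, 0]`. In a scan this maps a filter evaluated on already-selected rows back to the original row positions.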

Signed-off-by: Robert Kruszewski <github@robertk.io>
@robert3005
Contributor Author

I will go over this tomorrow, @joseph-isaacs. I looked at #7098 and #7393, which both optimised slightly different cases of this function, and I tried to combine the two.

@robert3005 added the changelog/performance (A performance improvement) label on May 1, 2026
@codspeed-hq
codspeed-hq Bot commented May 1, 2026

Merging this PR will degrade performance by 37.03%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 9 improved benchmarks
❌ 8 regressed benchmarks
✅ 1181 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|------|-----------|------|------|------------|
| WallTime | for[10M_u16] | 94.8 µs | 150.6 µs | -37.03% |
| WallTime | runend[10M_i32_runlen_10] | 159.4 µs | 192.5 µs | -17.17% |
| WallTime | dict[10M_u32_values_u16_codes] | 146.5 µs | 181.6 µs | -19.3% |
| WallTime | mix[100%_in/0%_out] | 500.7 µs | 451.6 µs | +10.88% |
| WallTime | dynamic_dispatch_u32[10M] | 142.3 µs | 96.3 µs | +47.87% |
| Simulation | decompress_rd[f32, (10000, 0.0)] | 94.6 µs | 85.7 µs | +10.36% |
| Simulation | decompress_rd[f32, (100000, 0.0)] | 583.5 µs | 495.6 µs | +17.74% |
| Simulation | decompress_rd[f32, (100000, 0.01)] | 495.1 µs | 582.5 µs | -15.01% |
| Simulation | decompress_rd[f32, (100000, 0.1)] | 495.1 µs | 582.5 µs | -15.01% |
| Simulation | decompress_rd[f32, (10000, 0.1)] | 90.2 µs | 81.8 µs | +10.25% |
| Simulation | decompress_rd[f32, (10000, 0.01)] | 90.1 µs | 81.9 µs | +10.06% |
| Simulation | decompress_rd[f64, (10000, 0.0)] | 138.5 µs | 122.1 µs | +13.43% |
| Simulation | decompress_rd[f64, (10000, 0.1)] | 138.7 µs | 122.1 µs | +13.61% |
| Simulation | decompress_rd[f64, (100000, 0.1)] | 842.5 µs | 1,020.7 µs | -17.46% |
| Simulation | decompress_rd[f64, (10000, 0.01)] | 138.6 µs | 121.9 µs | +13.69% |
| Simulation | decompress_rd[f64, (100000, 0.01)] | 842.6 µs | 1,020.5 µs | -17.44% |
| Simulation | bitwise_not_vortex_buffer_mut[128] | 246.1 ns | 275.3 ns | -10.6% |
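A note on reading the Efficiency column: the figures appear to match `BASE / HEAD - 1` rather than the relative change in elapsed time. That formula is inferred from the rows in this report and not taken from CodSpeed documentation, so treat the sketch below as an assumption. The function name is hypothetical.

```rust
/// Efficiency change in percent, assuming the column is `BASE / HEAD - 1`.
/// (Inferred from the reported rows; not a documented CodSpeed formula.)
fn efficiency_pct(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}
```

Under this reading, `for[10M_u16]` going from 94.8 µs to 150.6 µs comes out at roughly -37%, even though the elapsed time itself grew by about 59%. Small differences from the reported percentages are expected because the displayed timings are rounded.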

Comparing rk/intersect-by-rank (8551137) with develop (cb9b138)

