Skip to content

[benchmark/filtered-search prep] Search Plugins#996

Open
hildebrandmw wants to merge 9 commits intomainfrom
mhildebr/benchmark-plugins
Open

[benchmark/filtered-search prep] Search Plugins#996
hildebrandmw wants to merge 9 commits intomainfrom
mhildebr/benchmark-plugins

Conversation

@hildebrandmw
Copy link
Copy Markdown
Contributor

@hildebrandmw hildebrandmw commented Apr 29, 2026

An issue that has been occurring more and more in diskann-benchmark is the inextensibility of adding new search kinds.

Currently a SearchPhase enum is used to select between topk, range, beta-filtered, and multi-hop filtered search. There are two places where this enum is matched: the giant run_search_outer method and the benchmark run for spherical quantization. Folks wishing to try new experimental post-processing or filtered style searches need to do the following:

  1. Add a new SearchPhase variant.
  2. Update the two match sites independently.

Now, here's the problem. Currently all static builds go through these sites, so new search flavors result in ~29 monomorphizations (4 full-precision only, 9 x 2 for scalar quantization, 2 x 2 for PQ, and 3 for spherical) of the search algorithm. If we ever want to implement just a subset of the search kinds, it's also difficult to update Benchmark::try_match and Benchmark::description accordingly. So adding a new search method has a disproportionate effect on compile times and is impossible or difficult to tweak on a per-index basis.

This PR fixes this by introducing a compile-time plugin-system for search kinds. The main addition is the Plugin dyn-compatible trait and the Plugins collection, defined in diskann-benchmark/src/backend/index/search/plugins.rs. The Plugin trait matches a particular input Kind (in this PR, just SearchPhase, but I left it as a generic to keep it more open-ended if needed) and returns AggregatedSearchResults. The Plugins struct is simply a collection of Plugin trait objects. Importantly, registering a new Plugin in Plugins is what (most likely) triggers the monomorphization of the corresponding method. The Plugins struct also provides convenience methods for listing, matching, and dispatching registered searches. This is used to implement Benchmark::try_match and Benchmark::description for users of this system.

Additionally, the new plugins.rs introduces 4 ZSTs for our current search kinds: Topk, Range, BetaFilter and MultihopFilter.

The benchmark types for index searches (diskann-benchmark/src/backend/index/{benchmarks.rs, scalar.rs, spherical.rs, product.rs}) are rewritten to wrap Plugins and the implementations of Benchmark are updated to query the set of registered searches in Benchmark::try_match and Benchmark::description. Thus, adding or removing a plugin will automatically be reflected.

The updates to diskann-benchmark-runner come in two flavors. First, the updates to fmt.rs introduce some utilities for making things look nice (Indent, Delimit, and Quote). The changes to any.rs and app.rs are targeting a more uniform approach to indenting benchmark related descriptions. With these changes, descriptions no longer need to track their indentation level implicitly: they may assume no indentation and the necessary formatting will happen at a higher level.

hildebrandmw added a commit that referenced this pull request May 1, 2026
A recurring problem with our current benchmark infrastructure is the
`SearchPhase` enum (selecting what kind of search is conducted) does its
job a little too well: every time a new variant is added, we need to
either update *all* users of `SearchPhase` (bloating compile times) or
explicitly opt-out of a particular search phase, which is brittle
especially with respect to `Benchmark::try_match` consistency.

For example see:
* [diversity
search](#858 (comment))
* [filtered
search](#929 (comment))

This PR takes the first step towards systematically solving this problem
by allowing benchmarks registered with
`diskann_benchmark_runner::Benchmarks` to have state rather
than being purely type-level constructs. Stateful benchmarks can have
"search plugins" dynamically registered at construction time. These
plugins participate in
`Benchmark::try_match`, `Benchmark::description`, and `Benchmark::run`,
allowing individual benchmarks to opt into new search-phase variants
without requiring changes across all
benchmarks. See #996 as a follow-up implementing this idea

## Suggested Reviewing Order

In `diskann-benchmark-runner`:
* `benchmark.rs`: This is the main change. It simply changes the
`Benchmark` and `Regression` traits to receive by `&self`.
* `registry.rs`: Change the signatures of `Benchmarks::register` and
`Benchmarks::register_regression` to receive the benchmark type
by-value.
* The rest of the changes are updates to the test infrastructure.

In `diskann-benchmark`: The main changes involve cleaning up the
[`'static`
hack](#865 (comment))
and removing the `BuildAndSearch`/`BuildAndDynamicRun` indirection
traits that are no longer necessary.

In `diskann-benchmark-simd`: Feel free to skip.

---------

Co-authored-by: Mark Hildebrand <mhildebrand@microsoft.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Base automatically changed from mhildebr/stateful-benchmarks to main May 1, 2026 17:52
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 1, 2026

Codecov Report

❌ Patch coverage is 62.82974% with 155 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.42%. Comparing base (45428af) to head (7a85df8).

Files with missing lines Patch % Lines
diskann-benchmark/src/backend/index/benchmarks.rs 51.63% 89 Missing ⚠️
diskann-benchmark/src/inputs/async_.rs 30.50% 41 Missing ⚠️
...kann-benchmark/src/backend/index/search/plugins.rs 65.95% 16 Missing ⚠️
diskann-benchmark-simd/src/lib.rs 0.00% 6 Missing ⚠️
diskann-benchmark/src/backend/filters/benchmark.rs 0.00% 2 Missing ⚠️
diskann-benchmark-runner/src/app.rs 85.71% 1 Missing ⚠️

❌ Your patch status has failed because the patch coverage (62.82%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #996      +/-   ##
==========================================
- Coverage   89.49%   89.42%   -0.07%     
==========================================
  Files         448      449       +1     
  Lines       84118    84369     +251     
==========================================
+ Hits        75282    75448     +166     
- Misses       8836     8921      +85     
Flag Coverage Δ
miri 89.42% <62.82%> (-0.07%) ⬇️
unittests 89.26% <62.82%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
diskann-benchmark-runner/src/any.rs 100.00% <ø> (ø)
diskann-benchmark-runner/src/utils/fmt.rs 98.38% <100.00%> (+0.88%) ⬆️
diskann-benchmark/src/backend/exhaustive/minmax.rs 100.00% <ø> (ø)
...iskann-benchmark/src/backend/exhaustive/product.rs 100.00% <ø> (ø)
...kann-benchmark/src/backend/exhaustive/spherical.rs 100.00% <ø> (ø)
diskann-benchmark/src/backend/index/product.rs 100.00% <ø> (ø)
diskann-benchmark/src/backend/index/result.rs 49.03% <100.00%> (-0.52%) ⬇️
diskann-benchmark/src/backend/index/scalar.rs 100.00% <ø> (ø)
diskann-benchmark/src/backend/index/spherical.rs 100.00% <ø> (ø)
diskann-benchmark/src/utils/mod.rs 72.28% <100.00%> (-0.33%) ⬇️
... and 6 more

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hildebrandmw hildebrandmw marked this pull request as ready for review May 1, 2026 21:51
@hildebrandmw hildebrandmw requested review from a team and Copilot May 1, 2026 21:51
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a dyn-compatible, compile-time registered “search plugin” system in diskann-benchmark to make it easier to add/remove search kinds per benchmark while reducing monomorphization/compile-time blowups from large match statements. It also updates diskann-benchmark-runner formatting so benchmark descriptions/mismatch reasons are indented uniformly at a higher level.

Changes:

  • Add Plugin/Plugins abstractions plus built-in ZST plugins (Topk, Range, BetaFilter, MultihopFilter) and refactor index benchmarks to dispatch searches via registered plugins.
  • Add SearchPhaseKind + typed SearchPhase::as_* accessors to support plugin dispatch and clearer internal invariants.
  • Replace describeln!-style implicit indentation with runner-level indentation utilities (Indent, Delimit, Quote) and update CLI output golden files accordingly.

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
diskann-benchmark/src/utils/mod.rs Adjust stub benchmark descriptions to align with new runner-level indentation/tag printing.
diskann-benchmark/src/inputs/async_.rs Add SearchPhaseKind, SearchPhase::kind(), and as_* accessors; add IndexSource::data_type().
diskann-benchmark/src/backend/index/spherical.rs Refactor spherical quantization benchmark to use plugin dispatch and registered search kinds.
diskann-benchmark/src/backend/index/search/plugins.rs New plugin trait + registry collection; built-in ZST plugin “kind” markers; formatting helper for kinds.
diskann-benchmark/src/backend/index/search/mod.rs Export plugins module and re-export Plugin.
diskann-benchmark/src/backend/index/scalar.rs Refactor scalar-quant benchmarks to register and dispatch search kinds via plugins.
diskann-benchmark/src/backend/index/result.rs Simplify BuildResult construction to accept an already-aggregated search result.
diskann-benchmark/src/backend/index/product.rs Refactor PQ benchmarks to register and dispatch search kinds via plugins.
diskann-benchmark/src/backend/index/benchmarks.rs Refactor full-precision benchmarks to plugin dispatch; implement core plugin logic for DP/S strategy pairs.
diskann-benchmark/src/backend/filters/benchmark.rs Remove tag printing from descriptions (runner prints tag).
diskann-benchmark/src/backend/exhaustive/spherical.rs Switch description formatting to plain writeln! (no embedded indentation).
diskann-benchmark/src/backend/exhaustive/product.rs Switch description formatting to plain writeln! (no embedded indentation).
diskann-benchmark/src/backend/exhaustive/minmax.rs Switch description formatting to plain writeln! (no embedded indentation).
diskann-benchmark-simd/src/lib.rs Switch description formatting to plain writeln! (no embedded indentation).
diskann-benchmark-runner/src/utils/fmt.rs Add Indent, Delimit, Quote helpers + tests.
diskann-benchmark-runner/src/app.rs Use Indent to uniformly indent benchmark descriptions and mismatch reasons.
diskann-benchmark-runner/src/any.rs Remove describeln! macro (indent handled at higher level now).
diskann-benchmark-runner/tests/benchmark/test-mismatch-1/stdout.txt Update expected output formatting for mismatch listing indentation/newlines.
diskann-benchmark-runner/tests/benchmark/test-mismatch-0/stdout.txt Update expected output formatting for mismatch listing indentation/newlines.
diskann-benchmark-runner/tests/benchmark/test-debug-mode-error/stdout.txt Update expected output formatting (newline/indent normalization).
diskann-benchmark-runner/tests/benchmark/test-4/stdout.txt Update expected output formatting for benchmark listing indentation/newlines.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +432 to +433
Self::TopkBetaFilter => "beta-filter",
Self::TopkMultihopFilter => "multihop-filter",
Comment on lines +340 to +343
for layout in input.query_layouts.iter() {
let search = self
.search
.run(index.clone(), &input.search_phase, layout)?;
Comment on lines +203 to +208
writeln!(
f,
"Data/Query Type: {}",
Why::<datatype::DataType, datatype::Type<T>>::new(data_type)
)?;

Comment on lines +199 to +202
let data_type = match &arg.source {
IndexSource::Load(load) => &load.data_type,
IndexSource::Build(build) => &build.data_type,
};
Comment on lines +155 to +168
writeln!(
f,
"{}",
Why::<datatype::DataType, datatype::Type<T>>::new(
arg.index_operation.source.data_type()
)
)?;

if !self
.quant_search
.is_match(&arg.index_operation.search_phase)
{
writeln!(
f,
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants