
First part of a HashSortedMap #107

Open

aneubeck wants to merge 23 commits into main from aneubeck/prefixmap

Conversation

@aneubeck
Collaborator

In a subsequent step, merging functions and sorted iterators will be added.

Copilot AI review requested due to automatic review settings April 29, 2026 13:25
@aneubeck aneubeck requested a review from a team as a code owner April 29, 2026 13:25
Contributor

Copilot AI left a comment


Pull request overview

Introduces a new hash-sorted-map crate implementing the first part of an insertion-only HashSortedMap with SIMD-accelerated group scanning and overflow chaining, plus accompanying benchmarks and design documentation.

Changes:

  • Adds core HashSortedMap implementation (insert/get/entry API) with growth and overflow chaining.
  • Adds platform-specific group scanning primitives (SSE2/NEON/scalar) and a basic test suite.
  • Adds a Criterion benchmarks crate and initial README/optimization docs.
Summary per file:

| File | Description |
| --- | --- |
| crates/hash-sorted-map/src/lib.rs | Exposes `group_ops` and `hash_sorted_map` modules. |
| crates/hash-sorted-map/src/hash_sorted_map.rs | Implements the `HashSortedMap` data structure, growth logic, and `Entry` API, plus tests. |
| crates/hash-sorted-map/src/group_ops.rs | Adds SIMD/scalar helpers for ctrl-byte matching and mask iteration. |
| crates/hash-sorted-map/benchmarks/performance.rs | Adds Criterion benchmarks comparing multiple map implementations. |
| crates/hash-sorted-map/benchmarks/lib.rs | Benchmark utilities (trigram generation, identity hasher). |
| crates/hash-sorted-map/benchmarks/Cargo.toml | Defines the benchmarks crate and its dependencies. |
| crates/hash-sorted-map/README.md | Documents motivation/design and includes benchmark results and running instructions. |
| crates/hash-sorted-map/OPTIMIZATIONS.md | Design/optimization analysis and rationale. |
| crates/hash-sorted-map/Cargo.toml | Declares the new hash-sorted-map crate metadata. |
| Cargo.toml | Adds the benchmarks crate to the workspace members. |

Copilot's findings


Comments suppressed due to low confidence (3)

crates/hash-sorted-map/src/hash_sorted_map.rs:560

  • `insert_after_grow` treats `Insertion::NeedsOverflow` as unreachable after a resize, but it can still happen for adversarial / extremely colliding hash distributions (the table may still need overflow groups, even immediately after growing). Instead of `unreachable!`, handle `NeedsOverflow` by allocating/linking an overflow group (with the same capacity check as the normal insert path) or by retrying via the normal `insert_hashed` logic.

```rust
        // After grow, the new primary group for `key` cannot be full (see
        // function docs), and the key wasn't in the table before grow.
        FindResult::Vacant(Insertion::NeedsOverflow { .. }) | FindResult::Found(_) => {
            unreachable!("post-grow walk must hit an empty slot")
        }
```

crates/hash-sorted-map/README.md:90

  • The benchmark invocation doesn't match the benchmark that's defined in benchmarks/Cargo.toml (`[[bench]] name = "performance"`). `cargo bench --bench hashmap_insert` will fail; update the README to the correct command (likely `cargo bench -p hash-sorted-map-benchmarks --bench performance`, or run from the benchmarks/ crate).

```sh
cargo bench --bench hashmap_insert
```

crates/hash-sorted-map/src/hash_sorted_map.rs:367

  • `insert_for_grow` allocates overflow groups without checking `self.num_groups` against `self.groups.len()`. Under heavy hash collisions during rehash, `new_gi` can reach `self.groups.len()` and the subsequent indexing will panic. Consider mirroring the capacity check used in `insert_hashed` (e.g., trigger another `grow()` / allocate more groups) so growth remains panic-free even for pathological hash distributions.

```rust
        } else {
            let new_gi = self.num_groups as usize;
            self.groups[gi].overflow = new_gi as u32;
            self.num_groups += 1;
            group = &mut self.groups[new_gi];
            break;
```


- **Files reviewed:** 10/10 changed files
- **Comments generated:** 3


Comment on lines +34 to +40
```
│ Vec<Group<K,V>> where each Group (AoS):                      │
│ { ctrl: [u8; 8], keys: [MaybeUninit<K>; 8],                  │
│   values: [MaybeUninit<V>; 8], overflow: u32 }               │
│                                                              │
│ • Overflow chaining (linked groups)                          │
│ • 8-byte groups with NEON/SSE2/scalar SIMD scan              │
│ • EMPTY / FULL tag states only (insertion-only, no deletion) │
```

Copilot AI Apr 29, 2026


This architecture diagram hard-codes 8-slot groups (ctrl: [u8; 8], keys: ...; 8], etc.) and says “8-byte groups with NEON/SSE2”, but the implementation uses GROUP_SIZE = 16 on x86_64. Please update the documentation to reflect the 8-or-16 group size (or describe it as GROUP_SIZE).
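For reference, the layout the diagram describes can be sketched in Rust with a platform-dependent group size, as this comment suggests. The exact `cfg` split below (16 on x86_64 with SSE2, 8 elsewhere) is an assumption for illustration, not the crate's actual code:

```rust
use std::mem::MaybeUninit;

// Assumed cfg split: 16-slot groups on x86_64 (SSE2), 8-slot otherwise.
#[cfg(target_arch = "x86_64")]
const GROUP_SIZE: usize = 16;
#[cfg(not(target_arch = "x86_64"))]
const GROUP_SIZE: usize = 8;

// AoS group layout from the diagram, parameterized over GROUP_SIZE.
pub struct Group<K, V> {
    pub ctrl: [u8; GROUP_SIZE],               // EMPTY / FULL tag bytes only
    pub keys: [MaybeUninit<K>; GROUP_SIZE],   // uninitialized until tagged FULL
    pub values: [MaybeUninit<V>; GROUP_SIZE], // ditto
    pub overflow: u32,                        // index of the linked overflow group
}
```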

Contributor

@jorendorff jorendorff left a comment


This is cool.

Can I see the merge/iteration operations you have in mind, before stamping this?

scanning the group. Gives a direct hit on most inserts at low load.
- **SIMD group scanning** — uses NEON on aarch64, SSE2 on x86\_64, and a
scalar fallback elsewhere to scan 8–16 control bytes in parallel.
- **AoS group layout** — each group stores its control bytes, keys, and values
Contributor


Suggest linking Wikipedia here: https://en.wikipedia.org/wiki/AoS_and_SoA#Array_of_structures_of_arrays

Wikipedia calls this AoSoA (since each Group is a struct of arrays) or "tiled AoS"
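The scalar fallback mentioned in the quoted README bullet is presumably a SWAR scan. A minimal sketch of that well-known technique (`match_byte` is my name, not necessarily the crate's) comparing one tag byte against 8 ctrl bytes at once:

```rust
// SWAR scan: find bytes of `ctrl` equal to `tag`, 8 bytes at a time.
fn match_byte(ctrl: u64, tag: u8) -> u64 {
    // XOR zeroes out every byte equal to `tag`...
    let cmp = ctrl ^ (u64::from(tag) * 0x0101_0101_0101_0101);
    // ...then classic zero-byte detection sets the high bit of each zero
    // byte. This can report rare false positives (e.g. a 0x01 byte directly
    // above a matching byte), which is harmless when callers verify each
    // candidate by comparing keys, as Swiss-table-style maps do.
    cmp.wrapping_sub(0x0101_0101_0101_0101) & !cmp & 0x8080_8080_8080_8080
}
```

Each set high bit marks a candidate slot; iterating the mask with `trailing_zeros` yields the slot indices.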

}

pub fn with_capacity_and_hasher(capacity: usize, hash_builder: S) -> Self {
let adjusted = capacity.checked_mul(8).unwrap_or(usize::MAX) / 7;
Contributor


I think we need to adjust more than that, because the expected load factor is not as high as hashbrown's.

I did a few simulations and we need to multiply by about 1.30 here, or 1.42 if GROUP_SIZE is 8. That is assuming alloc_groups overallocates by a further 1/8 and that we want with_capacity to be an honest effort to accommodate that many entries without growing.
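As a sketch of what that adjustment might look like (the ~1.30 / ~1.42 factors are taken from this comment's simulations, and `GROUP_SIZE` here is a local stand-in for the crate's constant, so this is not the actual code):

```rust
// Stand-in for the crate's constant; 16 on x86_64 per the discussion above.
const GROUP_SIZE: usize = 16;

// Hypothetical capacity headroom, expressed as integer ratios so the
// computation stays in overflow-checked integer arithmetic:
// ~1.30x for GROUP_SIZE == 16, ~1.42x (71/50) for GROUP_SIZE == 8.
fn adjusted_capacity(capacity: usize) -> usize {
    let (num, den) = if GROUP_SIZE == 16 { (13, 10) } else { (71, 50) };
    capacity.checked_mul(num).unwrap_or(usize::MAX) / den
}
```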

Collaborator Author


👍
(I had Copilot compute those numbers as well, or at least something similar; I just forgot to change the code.)

Co-authored-by: Jason Orendorff <jorendorff@github.com>
@aneubeck
Collaborator Author

Here is the very simple sorting of each group:
https://github.com/github/blackbird/blob/aneubeck/reindex4/crates/index/src/indexer/prefix_map.rs#L288
and here the sorted iterator:
https://github.com/github/blackbird/blob/aneubeck/reindex4/crates/index/src/indexer/prefix_map.rs#L390

In the ported version, we will need to introduce a new type, since the hash map is no longer usable as a map once its groups are sorted, but one might still want to iterate it multiple times.

Merging is implemented here:
https://github.com/github/blackbird/blob/aneubeck/reindex4/crates/index/src/indexer/memory_index.rs#L191-L204

We could provide a helper function in the rust crate for this.

Feel free to merge/make changes, since I will be OOO tomorrow.
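For readers without access to those links: a generic k-way merge over already-sorted iterators (the textbook `BinaryHeap` technique such a helper could wrap, not the linked blackbird implementation) looks like:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// K-way merge of sorted iterators via a min-heap over (value, source index).
fn merge_sorted<T: Ord, I: Iterator<Item = T>>(mut iters: Vec<I>) -> Vec<T> {
    let mut heap = BinaryHeap::new();
    // Seed the heap with the first element of each iterator.
    for (idx, it) in iters.iter_mut().enumerate() {
        if let Some(v) = it.next() {
            heap.push(Reverse((v, idx)));
        }
    }
    let mut out = Vec::new();
    // Repeatedly pop the smallest element and refill from its source.
    while let Some(Reverse((v, idx))) = heap.pop() {
        out.push(v);
        if let Some(next) = iters[idx].next() {
            heap.push(Reverse((next, idx)));
        }
    }
    out
}
```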

/// View into a vacant entry. Holds the borrow of the map plus the hash, key,
/// and pre-computed insertion slot.
pub struct VacantEntry<'a, K, V, S> {
map: &'a mut HashSortedMap<K, V, S>,
Contributor


OccupiedEntry is fine, but modifying memory directly owned by this HashSortedMap from another mut pointer (group/tail), while this mut reference exists, is UB.

The annoying workaround is like

Suggested change:

```diff
-    map: &'a mut HashSortedMap<K, V, S>,
+    phantom: PhantomData<&'a mut HashSortedMap<K, V, S>>,
+    map: *mut HashSortedMap<K, V, S>,
```

and only make mut references via one of the two pointers at a time, which means you have to alternate.
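A minimal, self-contained version of that workaround (with a toy stand-in for `HashSortedMap`, since the real one has groups, ctrl bytes, and a precomputed slot) might look like:

```rust
use std::marker::PhantomData;

// Toy stand-in map: `slot` plays the role of the precomputed insertion slot.
struct ToyMap<V> {
    slot: Option<V>,
}

// The suggested shape: a raw pointer for access, plus PhantomData to keep
// the &'a mut borrow alive in the type system.
struct VacantEntry<'a, V> {
    phantom: PhantomData<&'a mut ToyMap<V>>,
    map: *mut ToyMap<V>,
}

impl<V> ToyMap<V> {
    fn vacant_entry(&mut self) -> VacantEntry<'_, V> {
        VacantEntry { phantom: PhantomData, map: self }
    }
}

impl<'a, V> VacantEntry<'a, V> {
    fn insert(self, value: V) -> &'a mut V {
        // SAFETY: `phantom` ties this entry to the unique &mut borrow it was
        // created from; every access re-derives &mut from `self.map`, and no
        // two &mut views are live at once.
        unsafe {
            (*self.map).slot = Some(value);
            (*self.map).slot.as_mut().unwrap()
        }
    }
}
```

The "alternate" discipline described above corresponds to re-deriving `&mut` from `self.map` for each access rather than caching references.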

Collaborator Author


I have no idea whether I did the right thing now... I simply added `(*map)` dereferencing code to the insert implementation.

Collaborator Author


Should insertion also have the same lifetime (via a `PhantomData`)?


hashbrown does **not** have this optimization — it always does a full SIMD
group scan. At ~50% load, the hint hits ~58% of the time, avoiding the scan
Contributor


How is this possible? If the key being inserted has a random hash, then at X% load it should hit exactly X% of the time.
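A quick way to sanity-check the plain uniform-probe model behind this question (a self-contained simulation with a toy LCG, not the crate's benchmark): fill a random fraction of slots, then probe uniformly random slots and measure how often they are empty — it comes out at 1 − load, i.e. ~50% at 50% load.

```rust
// Tiny deterministic LCG so the sketch needs no external crates.
fn next_rand(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

// Fill `load` of the table at random, then measure how often a uniformly
// random probe lands on an empty slot.
fn empty_probe_rate(num_slots: usize, load: f64, trials: usize) -> f64 {
    let mut occupied = vec![false; num_slots];
    let mut state = 0x1234_5678u64;
    let target = (num_slots as f64 * load) as usize;
    let mut filled = 0;
    while filled < target {
        let i = next_rand(&mut state) as usize % num_slots;
        if !occupied[i] {
            occupied[i] = true;
            filled += 1;
        }
    }
    let hits = (0..trials)
        .filter(|_| !occupied[next_rand(&mut state) as usize % num_slots])
        .count();
    hits as f64 / trials as f64
}
```

If the hint really hits ~58% at 50% load, something beyond a uniformly random slot choice must be at work.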

@@ -0,0 +1,176 @@
# HashSortedMap vs. Rust Swiss Table (hashbrown): Optimization Analysis

## Executive Summary
Contributor


AI slop. :P It's honestly better than nothing, but I'd be happier if you did a read through this and just deleted everything that's not something you would say as a human.

Comment on lines +137 to +139
traversing overflow chains. The `m/8` default implicitly enforces ~62.5% max
load, which aligns with the mathematical analysis (Poisson model, 3σ
confidence).
Contributor


...What mathematical analysis? Is that load number correct? It's different from what the code says.

aneubeck and others added 3 commits May 5, 2026 15:23
Co-authored-by: Jason Orendorff <jorendorff@github.com>
Co-authored-by: Jason Orendorff <jorendorff@github.com>