Skip to content

COMPARE ONLY, NO MERGE#161

Draft
szmyd wants to merge 11 commits into
mainfrom
dev/v6.x
Draft

COMPARE ONLY, NO MERGE#161
szmyd wants to merge 11 commits into
mainfrom
dev/v6.x

Conversation

@szmyd

@szmyd szmyd commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

No description provided.

szmyd and others added 11 commits June 7, 2026 13:08
Lift HomeBlocks off Folly futures onto the C++23 stackless-coroutine substrate
(HomeStore ^8 / iomgr ^13 / sisl ^14.6) and reshape the public surface around it.

Public API -- one installed header, <homeblks/home_blocks.hpp>:
- init_homeblocks() takes a home_blocks_config value (devices, threads, app_mem_size_mb and a cold-boot
  on_svc_id identity hook) and returns result<std::shared_ptr<home_blocks>>. The HomeBlocksApplication
  consumer interface is gone.
- home_blocks is an opaque handle for the volume control plane: create_volume (hands back the new
  volume_handle), remove_volume, get_volume, get_stats, volume_ids.
- Volume I/O is byte-addressed free functions over a volume_handle -- async_read / async_write /
  async_unmap(volume_handle, addr, sg_list) -- each a lazy sisl::async::task (co_await, or fan out with
  sisl::async::when_all). The volume type is opaque; the per-IO request is no longer on the surface.
- One error type, std::expected<T, std::error_condition>: HomeBlocks-specific failures use volume_error,
  everything with a standard meaning maps to std::errc.

Implementation:
- Read/write/destroy paths are coroutines (co_await repl_dev async_write / when_all(async_read) / the forced
  index CP flush); journal completion is delivered through sisl::async::value_awaitable.
- The pre-coroutine per-IO heap request (intrusive refcount + volume shared_ptr) becomes a by-value io_req on
  the coroutine frame plus a one-line RAII vol_io_guard for the volume's in-flight count; volume::rd() returns
  by const reference.
- Adapt to the v8 renames and APIs (HomeStore repl_dev / multi_blk_id / result / async_result; iomgr io_op /
  type_of / iomgr_params; sisl MetricsGroup / TEXT_FORMAT / httplib HttpServer) and expand the removed sisl
  cast/Clock shorthands.
- An internal prelude (lib/hb_internal.hpp) keeps the logging macros, size constants and convenience aliases
  out of the public header.

Tests migrated to co_await + when_all + the config/free-function surface. Drops the Folly dependency;
README and CHANGELOG rewritten for the new API.
Add src/test/ with a ublkpp-based adapter (HomeBlkDisk) and a homeblk_ublk
CLI that brings up a HomeBlocks instance, creates/recovers a volume, and
serves it as a Linux ublk device (/dev/ublkbN), so standard block tooling
(fio, dd, mkfs, mount) can exercise the data path end to end.

HomeBlkDisk implements ublkpp's coroutine ublk_disk: each ublk I/O is
submitted onto an iomgr reactor via run_on_forget -- homestore is
reactor-affine, so driving it inline from the ublk queue threads corrupts
the index under concurrency and drops completion wakeups -- and completions
are marshaled back to the originating queue thread through a per-queue
eventfd + io_uring POLL_ADD service loop, so ublksrv_complete_io always runs
on its single-issuer ring.

Build wiring: add the ublkpp test_requires and bump gtest to ^1.17
(conanfile), add_subdirectory(test), and drop the obsolete explicit
spdk/dpdk link workaround (src/CMakeLists). Document the flow in the README.

Also remove the dead Folly-era `executor` SISL option group and its now
unused enables (homeblks_impl + the test mains) and the leftover `spdk`
test option.
The original adapter bridged HomeStore completions (which land on a reactor)
back to the ublk queue thread via a per-queue eventfd + a service-loop
coroutine + a mutex-guarded completion list. Replace that with
IORING_OP_MSG_RING (iomgr post_msg_ring): the completing reactor posts the
per-IO cqe_state + result straight onto the queue's own io_uring, and
run_queue_loop reaps it like a native completion. The per-IO cqe_state is now
only ever touched on the queue thread, so the mutex, the eventfd, and the
service loop all go away.

A/B tested fresh-volume + alternating, release + tcmalloc, under both fio
randwrite+verify and billet's PostgreSQL-shaped workload: throughput is
identical (0 drops) and the bridges trade single-digit-percent tails --
eventfd slightly lower write p99, msg_ring slightly lower read and fsync p99.
msg_ring is the lock-free, simpler path and wins on the PG-perceived tails.
* Fix tsan

* Tsan suppressions

* Fix shutdown race

* Expand TSAN suppressions with glob forms for template-instantiated names

GCC 13 fully demangles template arguments into function names, so bare
substring entries like 'exec::__any::__immovable_storage' don't match
'exec::__any::__immovable_storage<vtable, alloc, 48ul, 16ul>::__t::__t<...>'.
Add '*pattern*' glob companions for every category that has long template
names, and add two new categories found in the latest CI run:

- deadlock:homeblocks::init_homeblocks — iomgr/homestore lock-order false
  positive from test-restart: TSAN's lock-order graph persists across
  HomeBlocksImpl instances so it confuses mutex addresses across restarts.
- race:*__future_base* / *_Sp_counted_base* — covers _M_ptr(), swap, and
  shared_ptr-refcount variants of the GCC future race.
- race:*exec::__any* — covers tag_invoke / __immovable_storage / __rec::__ref
  / __storage with full template argument lists.
- race:*inplace_stop* — covers stop_token/source/callback destructors and the
  __atomic_base<unsigned char>::load inside inplace_stop_source::~.
- race:*fmt::v12* — covers detail::buffer<char>::append (different symbol from
  basic_memory_buffer that was already suppressed).
- race:*basic_string* — std::string races inside spdlog formatting path.
- race:*tsan_new_delete* / *_Function_base* — operator new/delete at frame #0
  from iomgr lambda lifetime races crossing reactor boundaries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Suppress iomgr fd-reuse false positive in restart tests

TSAN tracks file descriptor numbers as a single resource across their
lifetime.  In the restart test pattern, fd N is closed during test 1
teardown and the OS reassigns N to a different iomgr device in test 2.
TSAN conflates the two and reports a race between T12's epoll_ctl read
(test 1) and the main thread's IODevice::close write (test 2).

This is not a real race: HBTestHelper::restart() joins all iomgr threads
before reinitializing ("All IO threads have stopped" in the logs), so the
fd-N read and fd-N write are fully serialized across test boundaries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* Start a knowledge base about CRAFT design

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant