Conversation
Lift HomeBlocks off Folly futures onto the C++23 stackless-coroutine substrate (HomeStore ^8 / iomgr ^13 / sisl ^14.6) and reshape the public surface around it. Public API -- one installed header, <homeblks/home_blocks.hpp>: - init_homeblocks() takes a home_blocks_config value (devices, threads, app_mem_size_mb and a cold-boot on_svc_id identity hook) and returns result<std::shared_ptr<home_blocks>>. The HomeBlocksApplication consumer interface is gone. - home_blocks is an opaque handle for the volume control plane: create_volume (hands back the new volume_handle), remove_volume, get_volume, get_stats, volume_ids. - Volume I/O is byte-addressed free functions over a volume_handle -- async_read / async_write / async_unmap(volume_handle, addr, sg_list) -- each a lazy sisl::async::task (co_await, or fan out with sisl::async::when_all). The volume type is opaque; the per-IO request is no longer on the surface. - One error type, std::expected<T, std::error_condition>: HomeBlocks-specific failures use volume_error, everything with a standard meaning maps to std::errc. Implementation: - Read/write/destroy paths are coroutines (co_await repl_dev async_write / when_all(async_read) / the forced index CP flush); journal completion is delivered through sisl::async::value_awaitable. - The pre-coroutine per-IO heap request (intrusive refcount + volume shared_ptr) becomes a by-value io_req on the coroutine frame plus a one-line RAII vol_io_guard for the volume's in-flight count; volume::rd() returns by const reference. - Adapt to the v8 renames and APIs (HomeStore repl_dev / multi_blk_id / result / async_result; iomgr io_op / type_of / iomgr_params; sisl MetricsGroup / TEXT_FORMAT / httplib HttpServer) and expand the removed sisl cast/Clock shorthands. - An internal prelude (lib/hb_internal.hpp) keeps the logging macros, size constants and convenience aliases out of the public header. Tests migrated to co_await + when_all + the config/free-function surface. Drops the Folly dependency; README and CHANGELOG rewritten for the new API.
Add src/test/ with a ublkpp-based adapter (HomeBlkDisk) and a homeblk_ublk CLI that brings up a HomeBlocks instance, creates/recovers a volume, and serves it as a Linux ublk device (/dev/ublkbN), so standard block tooling (fio, dd, mkfs, mount) can exercise the data path end to end. HomeBlkDisk implements ublkpp's coroutine ublk_disk: each ublk I/O is submitted onto an iomgr reactor via run_on_forget -- homestore is reactor-affine, so driving it inline from the ublk queue threads corrupts the index under concurrency and drops completion wakeups -- and completions are marshaled back to the originating queue thread through a per-queue eventfd + io_uring POLL_ADD service loop, so ublksrv_complete_io always runs on its single-issuer ring. Build wiring: add the ublkpp test_requires and bump gtest to ^1.17 (conanfile), add_subdirectory(test), and drop the obsolete explicit spdk/dpdk link workaround (src/CMakeLists). Document the flow in the README. Also remove the dead Folly-era `executor` SISL option group and its now unused enables (homeblks_impl + the test mains) and the leftover `spdk` test option.
The original adapter bridged HomeStore completions (which land on a reactor) back to the ublk queue thread via a per-queue eventfd + a service-loop coroutine + a mutex-guarded completion list. Replace that with IORING_OP_MSG_RING (iomgr post_msg_ring): the completing reactor posts the per-IO cqe_state + result straight onto the queue's own io_uring, and run_queue_loop reaps it like a native completion. The per-IO cqe_state is now only ever touched on the queue thread, so the mutex, the eventfd, and the service loop all go away. A/B tested fresh-volume + alternating, release + tcmalloc, under both fio randwrite+verify and billet's PostgreSQL-shaped workload: throughput is identical (0 drops) and the bridges trade single-digit-percent tails -- eventfd slightly lower write p99, msg_ring slightly lower read and fsync p99. msg_ring is the lock-free, simpler path and wins on the PG-perceived tails.
* Fix tsan
* Tsan suppressions
* Fix shutdown race
* Expand TSAN suppressions with glob forms for template-instantiated names
GCC 13 fully demangles template arguments into function names, so bare
substring entries like 'exec::__any::__immovable_storage' don't match
'exec::__any::__immovable_storage<vtable, alloc, 48ul, 16ul>::__t::__t<...>'.
Add '*pattern*' glob companions for every category that has long template
names, and add two new categories found in the latest CI run:
- deadlock:homeblocks::init_homeblocks — iomgr/homestore lock-order false
positive from test-restart: TSAN's lock-order graph persists across
HomeBlocksImpl instances so it confuses mutex addresses across restarts.
- race:*__future_base* / *_Sp_counted_base* — covers _M_ptr(), swap, and
shared_ptr-refcount variants of the GCC future race.
- race:*exec::__any* — covers tag_invoke / __immovable_storage / __rec::__ref
/ __storage with full template argument lists.
- race:*inplace_stop* — covers stop_token/source/callback destructors and the
__atomic_base<unsigned char>::load inside inplace_stop_source::~.
- race:*fmt::v12* — covers detail::buffer<char>::append (different symbol from
basic_memory_buffer that was already suppressed).
- race:*basic_string* — std::string races inside spdlog formatting path.
- race:*tsan_new_delete* / *_Function_base* — operator new/delete at frame #0
from iomgr lambda lifetime races crossing reactor boundaries.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Suppress iomgr fd-reuse false positive in restart tests
TSAN tracks file descriptor numbers as a single resource across their
lifetime. In the restart test pattern, fd N is closed during test 1
teardown and the OS reassigns N to a different iomgr device in test 2.
TSAN conflates the two and reports a race between T12's epoll_ctl read
(test 1) and the main thread's IODevice::close write (test 2).
This is not a real race: HBTestHelper::restart() joins all iomgr threads
before reinitializing ("All IO threads have stopped" in the logs), so the
fd-N read and fd-N write are fully serialized across test boundaries.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* Start a knowledge base about CRAFT design
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.