fix(db): avoid long flush stall on restart #211
Open
EddieHouston wants to merge 1 commit into Blockstream:new-index from
enable_auto_compaction() was lowering level0_stop_writes_trigger from the bulk-load value (512) to the RocksDB default (36) on every update() call. At DB open the bulk-load triggers are applied by DB::open, so on any restart L0 can legitimately hold more than 36 files. When the first post-restart update() called enable_auto_compaction(), the trigger tightening instantly put the DB into pre-flush stall territory, and the end-of-batch db.flush() that follows parked inside WaitUntilFlushWouldNotStallWrites waiting for background compaction to bring L0 below 36.

On production testnet this reliably cost 77 minutes of indexer freeze per restart (verified by 'Manual flush start' → 'Manual flush finished' in the RocksDB LOG). The actual memtable flush took 62 ms once unblocked; the rest was wait.

Split enable_auto_compaction() into the minimal flag-flip and a new apply_steady_state_triggers() that holds the L0 trigger / pending-bytes-limit reset. Invoke the latter exactly once per DB lifetime, inside the F-sentinel gate in start_auto_compactions(), immediately after full_compaction() has drained L0. On DBs where F is already set (steady-state restart), triggers stay at bulk-load values; the comment in DB::open already argues that configuration is fine for steady-state reads given the prefix bloom filters.
d102699 to bef02e3
Summary
Fix a freeze that can occur on restart of a mature electrs DB built from 91b883a or later.

enable_auto_compaction() was tightening level0_stop_writes_trigger from the bulk-load value (512) to the RocksDB default (36) on every update() call, not just once. DB::open applies bulk-load triggers at startup, so after any restart L0 can legitimately hold more than 36 files. The reset instantly put the DB into pre-flush stall territory, and the end-of-batch db.flush() one call later parked inside WaitUntilFlushWouldNotStallWrites until background compaction brought L0 below 36.

On testnet this could freeze the indexer for over an hour on a restart.
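The stall condition described above can be illustrated with a toy model. This is a deliberately simplified sketch, not RocksDB's actual stall logic: it assumes a flush must wait whenever the L0 file count is at or above the stop trigger. The function names and the example file count of 100 are hypothetical; only the trigger values 512 and 36 come from this PR.

```rust
/// Trigger values quoted in this PR.
const BULK_LOAD_STOP_TRIGGER: u32 = 512;
const DEFAULT_STOP_TRIGGER: u32 = 36;

/// Simplified predicate (assumption): a flush has to wait while the
/// number of L0 files is at or above level0_stop_writes_trigger.
fn flush_would_stall(l0_files: u32, stop_trigger: u32) -> bool {
    l0_files >= stop_trigger
}

/// How many L0 files background compaction must remove before the
/// count drops below the trigger and the flush can proceed.
fn files_to_drain(l0_files: u32, stop_trigger: u32) -> u32 {
    (l0_files + 1).saturating_sub(stop_trigger)
}
```

Under this model, a DB restarted with a hypothetical 100 L0 files is fine under the bulk-load trigger, but the moment the trigger drops to 36 the next flush has to wait for compaction to remove 65 files first.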
Fix
Split enable_auto_compaction() into:

- enable_auto_compaction(): minimal flag flip (disable_auto_compactions=false), safe to call on every update().
- apply_steady_state_triggers(): holds the L0 trigger and pending-bytes-limit resets. Documented as unsafe to call while L0 is populated.

start_auto_compactions() in schema.rs now calls apply_steady_state_triggers() inside the F sentinel gate, immediately after full_compaction() drains L0. So the tight triggers apply exactly once per DB lifetime, and only in a DB state that won't stall under them.
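The control flow of the split can be sketched with an in-memory stand-in. This is not the electrs code: MockDb and its fields are hypothetical, the F sentinel is modelled as a plain boolean, only the L0 stop trigger is tracked (the pending-bytes limits are omitted), and calling enable_auto_compaction() after the gate is an assumption about ordering.

```rust
const BULK_LOAD_L0_STOP: u32 = 512; // bulk-load value quoted in this PR
const STEADY_STATE_L0_STOP: u32 = 36; // RocksDB default quoted in this PR

/// Hypothetical in-memory stand-in for the RocksDB handle.
#[derive(Debug)]
struct MockDb {
    auto_compaction_enabled: bool,
    l0_stop_writes_trigger: u32,
    l0_files: u32,
    /// Stands in for the persisted "F" sentinel.
    full_compaction_done: bool,
}

impl MockDb {
    /// Models DB::open: bulk-load triggers are applied on every open.
    fn open(l0_files: u32, full_compaction_done: bool) -> Self {
        MockDb {
            auto_compaction_enabled: false,
            l0_stop_writes_trigger: BULK_LOAD_L0_STOP,
            l0_files,
            full_compaction_done,
        }
    }

    /// Minimal flag flip; harmless to repeat on every update().
    fn enable_auto_compaction(&mut self) {
        self.auto_compaction_enabled = true;
    }

    /// Tightens the trigger; only safe once L0 has been drained.
    fn apply_steady_state_triggers(&mut self) {
        self.l0_stop_writes_trigger = STEADY_STATE_L0_STOP;
    }

    /// Models a full compaction draining L0.
    fn full_compaction(&mut self) {
        self.l0_files = 0;
    }

    /// The gate: tight triggers are applied once, right after L0 is empty.
    fn start_auto_compactions(&mut self) {
        if !self.full_compaction_done {
            self.full_compaction();
            self.apply_steady_state_triggers();
            self.full_compaction_done = true;
        }
        self.enable_auto_compaction();
    }

    /// Simplified stall check (assumption): L0 count at or above the trigger.
    fn would_stall(&self) -> bool {
        self.l0_files >= self.l0_stop_writes_trigger
    }
}
```

In this model a DB opened with the sentinel already set keeps the 512 trigger after start_auto_compactions(), so a populated L0 does not push it into a stall, while a fresh DB gets the 36 trigger only after L0 is empty.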
On restarts where F is already set, triggers stay at bulk-load values; the comment in DB::open already argues that configuration is fine for steady-state reads given the prefix bloom filters added in 1e7da26 and 2c33745.

Test plan

- cargo check clean
- cargo test --lib: 8/8 pass, including new_index::db::tests::*
- cargo test --test electrum: 4/4 pass
- cargo test --test rest: 22/22 pass
- flushing history_db to disk in under a second
- Manual flush start → Manual flush finished timestamps in the RocksDB LOG are within a few ms of each other on the restart
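The last check could be automated along these lines. This is a rough sketch under stated assumptions: it assumes each RocksDB LOG line begins with a timestamp of the form YYYY/MM/DD-HH:MM:SS.uuuuuu (six fractional digits), and that the start and finish lines contain the strings 'Manual flush start' and 'Manual flush finished' quoted in this PR. The helper names are hypothetical, and the date part is ignored, so a flush that crosses midnight yields None.

```rust
/// Parses the leading `YYYY/MM/DD-HH:MM:SS.uuuuuu` stamp (assumed format)
/// and returns microseconds since midnight. The date is ignored.
fn micros_since_midnight(line: &str) -> Option<u64> {
    let stamp = line.split_whitespace().next()?;
    let (_date, time) = stamp.split_once('-')?;
    let (hms, micros) = time.split_once('.')?;
    let mut parts = hms.split(':');
    let h: u64 = parts.next()?.parse().ok()?;
    let m: u64 = parts.next()?.parse().ok()?;
    let s: u64 = parts.next()?.parse().ok()?;
    let us: u64 = micros.parse().ok()?;
    Some(((h * 60 + m) * 60 + s) * 1_000_000 + us)
}

/// Elapsed microseconds between the first 'Manual flush start' line and
/// the first 'Manual flush finished' line, or None if either is missing,
/// unparsable, or the finish stamp is earlier than the start stamp.
fn manual_flush_micros(log: &str) -> Option<u64> {
    let start = log.lines().find(|l| l.contains("Manual flush start"))?;
    let end = log.lines().find(|l| l.contains("Manual flush finished"))?;
    let began = micros_since_midnight(start)?;
    let finished = micros_since_midnight(end)?;
    finished.checked_sub(began)
}
```

Feeding the contents of the LOG file to manual_flush_micros() after a restart gives a number to compare against a threshold (for example, a few thousand microseconds) rather than eyeballing the two timestamps.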