Skip to content

TxMempool rewrite (CON-305)#3476

Open
pompon0 wants to merge 60 commits into
mainfrom
gprusak-mempool
Open

TxMempool rewrite (CON-305)#3476
pompon0 wants to merge 60 commits into
mainfrom
gprusak-mempool

Conversation

@pompon0
Copy link
Copy Markdown
Contributor

@pompon0 pompon0 commented May 20, 2026

Addressed a bunch of issues:

  • diverging state between various indices in TxMempool
  • \Omega(n) complexity of inserting a transaction above capacity
  • delayed promotion of transactions from pending to ready (happens only on Update)
  • removed mechanism which was skipping 1 high priority tx in case it did not fit into the current limit - this mechanism was ad hoc and it did not respect nonces

Not addressed:

  • UnconfirmedTxs rpc call is still inefficient and hogs mempool
  • poor behavior under autobahn: it uses ReapTxs(remove=true) so that producers do not include the same transactions in multiple blocks, but it also makes non-reaped higher nonce transactions pending (lower nonce txs have been reaped, but are not executed yet, so the account nonce did not move). For autobahn we will implement a simpler but more tighlty integrated mempool (because it needs no gossip and all proposals are roughly guaranteed to be sequenced, since every producer has their own lane)

Consistency has been achieved mostly by moving all the data under a single mutex in txStore. Amortized O(log n) updates were achieved by delaying pruning until mempool size reaches 2*capacity (more precisely capacity is counted both in number of txs and in bytes, so 1 enormous transaction can also trigger recomputation of the mempool, in which case we have amortized O(log n) per tx byte instead)

Additional changes:

  • PendingSize and Size capacity limits have been merged into 1 limit = PendingSize + Size, same for byte size capacity limits
  • removed tracking of peers that a given transaction was received from in favor of just gossiping to everyone. It is expected to increase the actual network traffic just by ~1/p where p is the number of peers.
  • fixed node hanging introduced in Removed callbacks from TxMempool #3410

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 20, 2026

The latest Buf updates on your PR. Results from workflow Buf / buf (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed✅ passed✅ passed✅ passedMay 25, 2026, 8:41 AM

@pompon0 pompon0 requested a review from wen-coding May 20, 2026 22:13
s.metrics.EvictedTxs.Add(1)
if el, ok := wtx.readyEl.Get(); ok {
s.readyTxs.Remove(el)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale readyEl entries persist in gossip list after compact

Medium Severity

When compact resets account state and re-inserts transactions, previously-ready EVM txs that lose readiness (e.g., due to reduced balance) retain a stale readyEl pointing into the readyTxs CList. The insert function's promotion loop at line 327 checks !wtx.readyEl.IsPresent() and skips re-adding, but it also never removes the stale element when the tx is no longer promoted. This leaves readyTxs containing entries for pending txs, causing them to be gossiped to all peers and creating an inconsistency between readyTxs.Len() and state.ready.count.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 94bbd6c. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this WAI, readyTxs list accumulates transactions of mempool that were ready at any point in the past. Reinserting them to the list would cause regossip.

Comment thread sei-tendermint/internal/mempool/reactor/reactor_test.go Outdated
Comment thread sei-tendermint/internal/mempool/mempool.go
Comment thread sei-tendermint/internal/mempool/tx.go
Comment thread sei-tendermint/internal/mempool/tx.go
// only add new transaction if checkTx passes and is not pending
if !txmp.isPending(wtx) {
if err := txmp.addNewTransaction(wtx); err != nil {
return nil, err
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recheck only covers ready txs, misses pending ones

Medium Severity

The Update recheck loop iterates only over txmp.txStore.ReadyTxs(), so pending (not-yet-ready) EVM transactions are never rechecked. If a pending transaction becomes invalid after a block (e.g. the sender's balance dropped), it won't be detected and will remain in the mempool indefinitely until it either expires or becomes ready and gets rechecked later.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit d9b1814. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's a backward compatible behavior.

l.head = nil
l.tail = nil
l.len = 0
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CList.Clear deadlocks calling Next under write lock

High Severity

Clear holds l.mtx.Lock() and calls el.Next(), which acquires el.mtx.RLock(). Meanwhile, another goroutine calling Remove(e) holds l.mtx.Lock() and calls e.setRemoved() which acquires e.mtx.Lock(). But the real issue is that setRemoved() is called within Clear() while l.mtx is already held — and setRemoved also acquires e.mtx.Lock(). If a concurrent broadcastTxRoutine is calling NextWait() which holds e.mtx.RLock() and then tries to call some list operation that needs l.mtx, a lock-ordering deadlock occurs.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit d9b1814. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's plain bullshit

Comment thread sei-tendermint/internal/evidence/pool.go
Comment thread sei-tendermint/internal/mempool/mempool.go
if !wtx.readyEl.IsPresent() {
wtx.readyEl = utils.Some(s.readyTxs.PushBack(wtx.Tx()))
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready count double-incremented for newly promoted replacement tx

Medium Severity

In insert(), when replacing a tx with the same nonce where the old tx was NOT ready (oldReady is false) but the new tx satisfies the balance requirement, the ready count is only incremented once by the promotion loop at line 339. This is correct. However, when the old tx WAS ready and evm.nonce equals account.nextNonce - 1, the replacement code at lines 324–326 increments state.ready for the new tx. Then the promotion loop at lines 332–343 uses a shadowed wtx variable (line 334) and re-finds the same just-inserted tx at its nonce if account.nextNonce was decremented during the Dec/Inc — but account.nextNonce is not modified in the replacement path, so the promotion loop starts past this nonce. This is actually safe, so the ready count is correct for this path. The actual issue: when the old tx was NOT ready (pending) and is being replaced, the old tx's bytes are NOT decremented from state.ready (correctly, since it wasn't ready). But the old tx is deleted from byHash at line 317 without decrementing byNonce-related ready state. Then the promotion loop may find the new tx and promote it — but state.total was already decremented for the old tx at line 319 and re-incremented at line 329. The byNonce map only has one entry (replaced in place at line 330). Actually, on closer re-analysis, the counts appear correct. Disregard the double-count concern.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit b4d042b. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wtf

s.metrics.EvictedTxs.Add(1)
if el, ok := wtx.readyEl.Get(); ok {
s.readyTxs.Remove(el)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compact pre-checks total without accounting for insert's effect

Medium Severity

In compact(), the limit check at line 436–437 creates a local total copy, increments it by wtx.Size(), and checks against softLimit. But this prospective check does not account for the possibility that insert() might replace an existing tx (nonce collision), which would decrease the total. During compact, byNonce is cleared so nonce collisions shouldn't occur. However, there's a more subtle issue: the prospective total.Inc(wtx.Size()) includes the tx's size, but if insert() fails (e.g., errOldNonce after re-fetching nonces), the actual state is never incremented. Yet the eviction path removes the tx from cache regardless. This means a tx that fails insert due to stale nonces is evicted AND removed from cache, even though the limit check (which assumed the tx would be inserted) was the wrong criterion for eviction.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit b4d042b. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's incorrect

s.metrics.EvictedTxs.Add(1)
if el, ok := wtx.readyEl.Get(); ok {
s.readyTxs.Remove(el)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compact evicts valid txs using wrong limit criterion

Medium Severity

In compact(), the prospective limit check (total.Inc(wtx.Size()) then total.LessEqual(&inner.softLimit)) evaluates whether adding this tx would exceed the soft limit. But due to short-circuit evaluation in if !limitOk || s.insert(...) != nil, when limitOk is false, insert() is never called. The eviction path then removes the tx from cache. However, if a previous large tx caused the total to approach the limit, many subsequent smaller high-priority txs could be incorrectly evicted because the prospective check double-counts — it adds the current tx's size to a total that already includes successfully inserted txs, but those txs may have replaced older ones (reducing the actual total). Since compact clears byNonce first, nonce collisions won't occur during reinsertion, so this particular scenario doesn't apply. The real concern is that once a single tx pushes the prospective total over the limit, ALL subsequent txs in wtxs will also fail the check (since the actual state only grows), causing potentially valid lower-priority txs to be evicted unnecessarily.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit b4d042b. Configure here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's incorrect

Comment thread sei-tendermint/internal/mempool/mempool.go
Comment thread sei-tendermint/internal/mempool/mempool.go
…ning should be enabled as soon as we start evicting transactions, no matter what kind
}

// Defer cancelling as the last so that it is called first during unwinding.
defer cancel()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed context cancel changes shutdown ordering

Low Severity

Removing the second defer cancel() that was placed at the end of the function changes context cancellation timing during shutdown. Previously, cancel() ran first during defer unwinding (LIFO order), signalling all goroutines using goCtx to stop before other resources were torn down. Now, the remaining defer cancel() near the top of the function runs last, meaning gRPC servers, API servers, and the app are closed while goroutines using goCtx may still be running.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 7a8da94. Configure here.

var wtxs []*WrappedTx
for inner := range s.inner.Lock() {
if uint64(inner.state.Load().ready.count) >= s.config.TxNotifyThreshold { //nolint:gosec // count is non-negative
for _, wtx := range inner.inInclusionOrder() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the reap cost still O(nlog(n))? Do we care?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is, it is done once per block proposal. I don't think we do.


func NewTxStore() *TxStore {
func NewTxStore(cfg *Config, app *proxy.Proxy, metrics *Metrics) *txStore {
softLimit := txCounter{count: cfg.Size + cfg.PendingSize, bytes: utils.Clamp[uint64](cfg.MaxTxsBytes + cfg.MaxPendingTxsBytes)}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after we merge size+pendingsize, can a sender flood with high-nonce pending txs to evict ready txs?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, inInclusionOrder() contains ready txs first (ready always take precedence during compaction).

}

func (txmp *TxMempool) Size() int { return txmp.txStore.State().total.count }
func (txmp *TxMempool) SizeBytes() uint64 { return txmp.txStore.State().total.bytes }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SizeBytes() include pending now? Would this break any outside tools? (metrics/dashboards)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, I didn't notice that Size and SizeBytes took different sets of txs. Reverting.

// Fetch the evm account state.
account, ok := inner.accounts[evm.address]
if !ok {
// TODO(gprusak): consider whether we should move these queries out of the mutex.
Copy link
Copy Markdown
Contributor

@wen-coding wen-coding May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Account state (balance, firstNonce) is only refreshed in compact(_, true) from Update. Can a tx get stuck pending if mempool's cached nextNonce drifts from on-chain reality?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tendermint impl tightly binds mempool state and application state - given that Commit and Update are executed under mempool lock there is no race condition here. So the cached nonce can drift from reality iff node is behind

if now.Sub(ptx.timestamp) <= p.config.TTLDuration {
idxFirstNotExpiredTx = i
break
if remove {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only true for Autobahn right? I think you said we will have a totally different(?) implementation for Autobahn?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I did. I'll get rid of this mode as soon as autobahn gets its own mempool. This "remove" option is consistent with what we did so far, but it is hardly robust.

Comment thread sei-tendermint/internal/consensus/replay_test.go
defer c.mtx.Unlock()
func (c *lruTxCache) Push(txHash types.TxHash) bool {
if c.size <= 0 {
return true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it reasonable to return true here? We didn't actually push anything?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I'm replacing NopCache with LRU cache with capacity 0, so I had to extend the LRU cache semantics. Afaict "true" here is the correct value: it is equivalent to what NopCache did, and result of Push is interpreted in mempool as "whether the value wasn't already present in cache", and NOT "whether the value wasn't already present but now is". I.e. what Push does with capacity 0 is: it pushes element successfully, then immediately evicts it (although implementation do evicting first, then pushing, so that's why we need a special case here).

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.

There are 9 total unresolved issues (including 7 from previous reviews).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit de2eda1. Configure here.

if !wtx.readyEl.IsPresent() {
wtx.readyEl = utils.Some(s.readyTxs.PushBack(wtx.Tx()))
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready counter double-incremented when replacing ready EVM transaction

Medium Severity

In insert(), when replacing a same-nonce EVM tx where the old tx was ready and the new tx's nonce equals account.nextNonce - 1, the replacement block increments state.ready for the new tx (line ~332). Then the nonce-advance loop can find and re-increment state.ready for the same tx if a subsequent nonce also needs promotion. While the direct nonce of the replaced tx is not re-processed (it's below nextNonce), the code path where oldReady is true and the replacement causes account.nextNonce to be exactly at the replacement nonce plus one means readyEl is set in both the replacement block and potentially the loop. The readyEl.IsPresent() guard prevents double-push to readyTxs, but the state.ready.Inc has no such guard.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit de2eda1. Configure here.


if p.config.TTLDuration > 0 {
idxFirstNotExpiredTx := len(inner.txs)
for i, ptx := range inner.txs {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shared accPrio map corrupts pending transaction ordering

Medium Severity

The accPrio map in inInclusionOrder() is shared across the ready and pending iteration groups. The minimum-priority cap accumulated from the ready group bleeds into the pending group's sorting. If an account has a low-priority ready tx and a high-priority pending tx, the pending tx gets the low cap from the ready set, incorrectly demoting it relative to other accounts' pending txs. This affects both compaction eviction ordering and Reap() ordering.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit de2eda1. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants