Skip to content

fix: prevent rate limiter bypass via Lua atomic script #593

Open
YashIIT0909 wants to merge 7 commits intocameri:mainfrom
YashIIT0909:fix/rate-limiter
Open

fix: prevent rate limiter bypass via Lua atomic script #593
YashIIT0909 wants to merge 7 commits intocameri:mainfrom
YashIIT0909:fix/rate-limiter

Conversation

@YashIIT0909
Copy link
Copy Markdown
Collaborator

Description

This PR resolves a critical rate-limiter bypass vulnerability in the SlidingWindowRateLimiter by migrating the sliding-window operations from Node.js Promise.all logic into an atomic Lua script executed natively on the Redis client.

It accomplishes three crucial fixes:

  1. Eliminates a severe TOCTOU (Time-of-Check to Time-of-Use) race condition caused by interleaved reads and writes during heavy concurrent connection bursts.
  2. Dynamically prevents Redis Key Collisions by implementing an entirely self-contained incremental counter loop while redis.call('ZSCORE', key, member), entirely avoiding data loss for exact-millisecond bursts without relying on heavy external UUID libraries for uniqueness.
  3. Patches a dormant expiration calculation bug by switching from the seconds-based EXPIRE command to the precisely accurate milliseconds-based PEXPIRE command.

Related Issue

Fixes #467

Motivation and Context

Previously, when hundreds of WebSocket connections or REST payloads arrived in the exact same millisecond, Redis answered 0 to the initial grouped read checks before any writing actually completed due to Node's asynchronous event loop. This allowed massive concurrent bursts to completely bypass the configured metrics. Furthermore, because concurrent writes were uniquely formatted strictly as timestamp:step, they overwrote each other as exact-duplicate members in the ZSET, effectively erasing accurate metrics from the cache. Utilizing an atomic eval wrapper enforces exact, bottlenecked throughput serialization in Redis, guaranteeing the metrics are impenetrable to race-condition exploitation.

How Has This Been Tested?

  1. Unit Testing: Updated and successfully passed all existing Mocha unit tests for the SlidingWindowRateLimiter and its eval implementations via ICacheAdapter stubs (npm run test:unit).
  2. Security Load Testing: Deployed a strict 5 / minute environment threshold locally via Docker Compose.
  3. Fired 1,000 concurrent payloads inside the exact same millisecond using a WebSocket parallel Promise.all() PoC script.
  4. Result: Verified that the server perfectly filters and drops 995 requests returning ["NOTICE", "rate limited"], while safely accepting exactly 5 requests, proving maximum robustness against intentional bypass bursts.

Screenshots (if appropriate):

image

The screenshot is of a checking script that is being ran against a limit of 5.

Types of changes

  • Non-functional change (docs, style, minor refactor)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my code changes.
  • All new and existing tests passed.

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 1, 2026

🦋 Changeset detected

Latest commit: 4ba022b

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
nostream Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@coveralls
Copy link
Copy Markdown
Collaborator

coveralls commented May 1, 2026

Coverage Status

coverage: 64.592% (+0.01%) from 64.58% — YashIIT0909:fix/rate-limiter into cameri:main

@YashIIT0909 YashIIT0909 requested a review from cameri May 1, 2026 11:25
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the SlidingWindowRateLimiter against concurrent-burst bypasses by moving its Redis sliding-window logic into an atomic Lua script (and switching expiry to millisecond precision), addressing the race/collision behavior described in #467.

Changes:

  • Replaces multi-call Redis logic in SlidingWindowRateLimiter.hit() with a single Redis-evaluated Lua script (atomic ZSET maintenance + unique member generation).
  • Adds a raw Lua eval helper to the Redis cache adapter and updates the cache adapter typing/tests to stub eval.
  • Adds a changeset documenting the patch release.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/utils/sliding-window-rate-limiter.ts Migrates sliding-window operations into an atomic Lua script and uses PEXPIRE for ms-accurate expiry.
src/adapters/redis-adapter.ts Adds evalRaw alongside existing SHA-cached eval implementation.
src/@types/adapters.ts Updates ICacheAdapter typing context around eval (formatting/spacing).
test/unit/utils/sliding-window-rate-limiter.spec.ts Updates unit tests to stub cache.eval and assert boolean outcomes.
.changeset/metal-snails-prove.md Patch changeset for the rate-limiter bypass fix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 55 to 66
it('returns true if rate limited', async () => {
const now = Date.now()
getRangeFromSortedSetStub.resolves([`${now}:6`, `${now}:4`, `${now}:1`])
evalStub.resolves(1)

const actualResult = await rateLimiter.hit('key', 1, { period: 60000, rate: 10 })

expect(actualResult).to.be.true
})

it('returns false if not rate limited', async () => {
const now = Date.now()
getRangeFromSortedSetStub.resolves([`${now}:10`])
evalStub.resolves(0)

const actualResult = await rateLimiter.hit('key', 1, { period: 60000, rate: 10 })
Copy link

Copilot AI May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests now only assert the boolean mapping of the Redis result. Given the security impact of this change, it would be better to also assert that cache.eval is invoked with the expected keys/args (period/step/rate) and add a case where the stub resolves to a string (e.g. '1') to ensure the production code handles Redis return types robustly.

Copilot uses AI. Check for mistakes.
Comment thread src/adapters/redis-adapter.ts Outdated
Comment on lines +127 to +133

public async evalRaw(script: string, keys: string[], args: string[]): Promise<unknown> {
await this.connection
logger('eval script with keys %o and args %o', keys, args)
return this.client.eval(script, { keys, arguments: args })
}

Copy link

Copilot AI May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evalRaw is introduced as a new public method but is not referenced anywhere in the repo and isn’t part of ICacheAdapter, so it can’t be used from most call sites (which type the adapter as the interface). Either remove it to avoid dead API surface, or add it to ICacheAdapter and update call sites to use it intentionally.

Suggested change
public async evalRaw(script: string, keys: string[], args: string[]): Promise<unknown> {
await this.connection
logger('eval script with keys %o and args %o', keys, args)
return this.client.eval(script, { keys, arguments: args })
}

Copilot uses AI. Check for mistakes.
end
end

if hits >= max_rate then
Copy link

Copilot AI May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rate-limit decision ignores the incoming step value. Right now the script blocks only when hits >= max_rate, but if step can be > 1, a request could be accepted even when hits + step would exceed the limit. Update the Lua condition to account for the pending increment (and keep parity with the previous hitsAfterAdd > rate behavior).

Suggested change
if hits >= max_rate then
if hits + step > max_rate then

Copilot uses AI. Check for mistakes.
Comment on lines +30 to +32
local step_str = string.match(entries[i], "^[^:]+:(%d+)")
if step_str then
hits = hits + tonumber(step_str)
Copy link

Copilot AI May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Lua parsing only matches integer steps (%d+). Since step is typed as number, non-integer step values (e.g. 0.5) would be miscounted or skipped entirely, which can undercount hits and weaken enforcement. Consider extracting the full step token between the first and second : and passing it through tonumber, or otherwise enforce/validate that step is always an integer before calling the script.

Suggested change
local step_str = string.match(entries[i], "^[^:]+:(%d+)")
if step_str then
hits = hits + tonumber(step_str)
local step_str = string.match(entries[i], "^[^:]+:([^:]+)")
if step_str then
local entry_step = tonumber(step_str)
if entry_step then
hits = hits + entry_step
end

Copilot uses AI. Check for mistakes.
Comment on lines +16 to +18
const script = `
local key = KEYS[1]
local timestamp = tonumber(ARGV[1])
Copy link

Copilot AI May 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

script is constructed inside hit() on every call. Since this is a hot path (rate limiting), consider hoisting the Lua script string to a module-level const so it’s allocated once and also makes the caching-by-script-string intent clearer.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +30 to +42
if hits + step > max_rate then
return 1
end

local base_member = timestamp .. ':' .. step
local member = base_member
local counter = 0
while redis.call('ZSCORE', key, member) do
counter = counter + 1
member = base_member .. ':' .. counter
end

redis.call('ZADD', key, timestamp, member)
Comment on lines +34 to +40
local base_member = timestamp .. ':' .. step
local member = base_member
local counter = 0
while redis.call('ZSCORE', key, member) do
counter = counter + 1
member = base_member .. ':' .. counter
end
Comment on lines 55 to +82
it('returns true if rate limited', async () => {
const now = Date.now()
getRangeFromSortedSetStub.resolves([`${now}:6`, `${now}:4`, `${now}:1`])
evalStub.resolves(1)

const actualResult = await rateLimiter.hit('key', 1, { period: 60000, rate: 10 })

expect(actualResult).to.be.true
expect(evalStub).to.have.been.calledOnce
const args = evalStub.firstCall.args
expect(args[1]).to.deep.equal(['key'])
expect(args[2][1]).to.equal('60000') // period
expect(args[2][2]).to.equal('1') // step
expect(args[2][3]).to.equal('10') // max_rate
})

it('returns false if not rate limited', async () => {
const now = Date.now()
getRangeFromSortedSetStub.resolves([`${now}:10`])
evalStub.resolves(0)

const actualResult = await rateLimiter.hit('key', 1, { period: 60000, rate: 10 })

expect(actualResult).to.be.false
})

it('robustly handles string return types from Redis', async () => {
evalStub.resolves('1')

const actualResult = await rateLimiter.hit('key', 1, { period: 60000, rate: 10 })

expect(actualResult).to.be.true
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Sliding-window limiter bypass via non-unique Redis sorted-set members

4 participants