feat: batch event ingestion on POST /track via { type: 'batch' } envelope#377
feat: batch event ingestion on POST /track via { type: 'batch' } envelope#377mraj602-tohands wants to merge 7 commits into
Conversation
Reimplements the batch ingestion feature from PR Openpanel-dev#374 on top of current upstream/main (post buffer-perf, kafka client, ClickHouse round-robin). Adds __syncedAt property to all worker-processed events. See PR description for full architectural details. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
💤 Files with no reviewable changes (1)
📝 WalkthroughWalkthroughThis PR adds ChangesBatch event ingestion with per-event validation and bounded concurrency
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
apps/worker/src/jobs/events.incoming-events.test.ts (1)
553-596: 💤 Low valueConsider adding test coverage for historical events when lock is not acquired.
This test verifies historical event handling when the lock is acquired (session_start emitted). Consider adding a companion test where
getLockreturnsfalseto verify that:
session_startis skipped- The event itself is still created
sessionEndis still not scheduled (historical path)This would complete coverage of the historical event path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/worker/src/jobs/events.incoming-events.test.ts` around lines 553 - 596, Add a new test alongside the existing historical-event test that simulates the lock NOT being acquired by mocking getLock to return false; call incomingEvent with the same historical payload and assert that createEvent is still called for the event (but there is no 'session_start' createEvent call), sessionsQueue.add is not called (no sessionEnd scheduled), and that the event createEvent call preserves deviceId/sessionId; reference the incomingEvent handler, getLock mock, createEvent mock, and sessionsQueue.add to locate and update the test harness.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/worker/src/jobs/events.incoming-event.ts`:
- Around line 217-219: The referrer fields (enrichment.referrer,
enrichment.referrer_name, enrichment.referrer_type) use "?? undefined" which can
yield undefined instead of falling back to the base event; change them to use
the same fallback pattern as the other fields by replacing "enrichment.referrer
?? undefined" with "enrichment.referrer ?? baseEvent.referrer" and likewise
"enrichment.referrer_name ?? baseEvent.referrer_name" and
"enrichment.referrer_type ?? baseEvent.referrer_type" so string-type
expectations are preserved.
---
Nitpick comments:
In `@apps/worker/src/jobs/events.incoming-events.test.ts`:
- Around line 553-596: Add a new test alongside the existing historical-event
test that simulates the lock NOT being acquired by mocking getLock to return
false; call incomingEvent with the same historical payload and assert that
createEvent is still called for the event (but there is no 'session_start'
createEvent call), sessionsQueue.add is not called (no sessionEnd scheduled),
and that the event createEvent call preserves deviceId/sessionId; reference the
incomingEvent handler, getLock mock, createEvent mock, and sessionsQueue.add to
locate and update the test harness.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 91718804-8af2-4a6e-9663-17127061acf7
📒 Files selected for processing (9)
apps/api/src/controllers/event.controller.tsapps/api/src/controllers/track.controller.tsapps/api/src/routes/track-batch.router.test.tsapps/api/src/routes/track.router.tsapps/api/src/utils/ids.tsapps/worker/src/jobs/events.incoming-event.tsapps/worker/src/jobs/events.incoming-events.test.tspackages/queue/src/queues.tspackages/validation/src/track.validation.ts
💤 Files with no reviewable changes (1)
- packages/queue/src/queues.ts
Tests use toStrictEqual, so the new __syncedAt property in event properties needs to be included in assertions. Uses expect.any(String) since the exact ISO timestamp varies per run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hment When session enrichment has null referrer fields, fall back to baseEvent values instead of undefined. Consistent with all other enrichment fields (path, os, browser, geo, etc.). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Hi @lindesvard, just following up on this PR in case it was missed. Happy to make any changes or adjustments if needed. Would appreciate a review when you get a chance 🙂 |
|
Hey, today our session management is very time dependent, so batching events might cause issues here. I have started working on improving this locally and will probably need to merge that before I can merge this (this will avoid your changes in the incoming-events file etc) Secondly I think we should just piggy back on our existing /track endpoint. It takes a generic body This would make proxying etc easier since we dont need to add yet another endpoint to proxy. So what I would like you todo is to:
|
…changes Per review feedback on Openpanel-dev#377: - Remove the separate /track/batch endpoint; POST /track now accepts { "type": "batch", "payload": [event, ...] } alongside single events, so proxies only ever deal with one endpoint. - Revert all worker/session changes (events.incoming-event.ts, queue payload, ids.ts, event.controller.ts) back to upstream behavior — session handling improvements will land separately. - Batch items reuse the existing per-event pipeline unchanged, including the isTimestampFromThePast flag semantics; per-item validation errors are reported per-index in rejected[] without failing the whole batch. - Raise the /track body limit to 10 MB to fit max-size batches. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…tch arms Adds batch coverage for group/assign_group/increment/decrement/replay dispatch, per-event __ip geo override, non-object items, single-event alias 400, and queue-failure internal rejections. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/api/src/controllers/track.controller.ts (2)
594-606:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMap
rejected[]to the public response shape instead of returning internal state.This currently pushes the full
BatchItemResultobjects into the response, so each rejected item includesstatus: 'rejected'.apps/api/src/routes/track.router.tsdocuments the 202 body as{ index, reason, error }, so the controller should emit that exact shape instead of relying on serializer behavior.Suggested fix
let accepted = 0; - const rejected: Extract<BatchItemResult, { status: 'rejected' }>[] = []; + const rejected: Array<{ + index: number; + reason: 'validation' | 'internal'; + error: string; + }> = []; for (const result of results) { if (result.status === 'accepted') { accepted += 1; } else { - rejected.push(result); + rejected.push({ + index: result.index, + reason: result.reason, + error: result.error, + }); } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 594 - 606, The current loop pushes full BatchItemResult objects into the rejected array and returns them in the 202 response; update the controller (the loop that iterates over results and the rejected array sent in reply.status(202).send) to map each rejected item to the public shape { index, reason, error } instead of the internal object (i.e., when result.status !== 'accepted' push or collect { index: result.index, reason: result.reason, error: result.error }), then send that mapped array in the response so it matches the documented contract.
156-158:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve the “no user-agent => no identity” behavior.
apps/api/src/utils/ids.tsexplicitly returns emptydeviceId/sessionIdwhenuais missing. Replacing a missing header with'unknown/1.0'here changes that contract and can collapse unrelated server-side traffic behind the same IP into the same device/session.Suggested fix
interface SharedRequestContext { projectId: string; requestIp: string; - requestUa: string; + requestUa?: string; requestHeaders: Record<string, string | undefined>; requestGeo: GeoLocation; salts: Salts; } @@ - const requestIp = request.clientIp; - const requestUa = request.headers['user-agent'] ?? 'unknown/1.0'; - const requestHeaders = getStringHeaders(request.headers); + const requestIp = request.clientIp; + const requestHeaders = getStringHeaders(request.headers); + const requestUa = requestHeaders['user-agent'];🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 156 - 158, The code currently forces a default user-agent string by setting requestUa = request.headers['user-agent'] ?? 'unknown/1.0', which breaks the contract in ids.ts that expects a missing UA to yield empty deviceId/sessionId; change this to preserve "no UA => no identity" by leaving requestUa undefined/null when the header is absent (e.g., requestUa = request.headers['user-agent'] as string | undefined) and ensure downstream uses of requestUa (the identity generation code that calls into ids.ts) continue to treat undefined/empty as "no UA" rather than a placeholder.
🧹 Nitpick comments (2)
apps/api/src/routes/track.router.ts (1)
37-37: 💤 Low valueConsider splitting the long description for readability.
The description string is comprehensive but spans a very long single line. While this works fine for OpenAPI documentation, breaking it into a template literal with multiple lines would improve source readability.
♻️ Optional refactor for readability
- description: `Ingest a tracking event (track, identify, group, increment, decrement, replay) or a batch of events ({ "type": "batch", "payload": [event, ...] }). Batch requests accept up to ${TRACK_BATCH_MAX_EVENTS} events and 10MB uncompressed per request; each event is dispatched through the same pipeline as a single-event request. Per-event validation failures are returned in the rejected[] array — the whole batch does not fail on a single bad row.`, + description: `Ingest a tracking event (track, identify, group, increment, decrement, replay) or a batch of events. + Batch requests use { "type": "batch", "payload": [event, ...] } and accept up to ${TRACK_BATCH_MAX_EVENTS} events and 10MB uncompressed per request. + Each event is dispatched through the same pipeline as a single-event request. + Per-event validation failures are returned in the rejected[] array — the whole batch does not fail on a single bad row.`,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/track.router.ts` at line 37, The long single-line OpenAPI description on the route config should be split across multiple template-literal lines for readability: update the description property in track.router.ts (the route's description string that references TRACK_BATCH_MAX_EVENTS) to a multi-line template literal or joined string segments so the content flows across lines while preserving the embedded ${TRACK_BATCH_MAX_EVENTS} interpolation and exact wording; locate the description key in the route definition and break the string into readable lines (or an array .join('')) without changing the text or interpolation semantics.apps/api/src/routes/track-batch.router.test.ts (1)
204-359: ⚡ Quick winConsider adding test coverage for the 5-day historical floor.
The test suite covers historical timestamps (two hours ago), but the PR objectives state: "Enforce a 5-day historical acceptance window" with "older events rejected per-row." There's no test verifying that events with timestamps older than 5 days are rejected with a per-item validation error.
📋 Suggested test case
it('rejects events older than 5 days per-item', async () => { const sixDaysAgo = Date.now() - 6 * 24 * 60 * 60 * 1000; const res = await postBatch([ validTrack('recent'), { type: 'track' as const, payload: { name: 'ancient_event', properties: { __timestamp: new Date(sixDaysAgo).toISOString() }, }, }, ]); expect(res.statusCode).toBe(202); const body = res.json(); expect(body.accepted).toBe(1); expect(body.rejected).toHaveLength(1); expect(body.rejected[0]).toMatchObject({ index: 1, reason: 'validation', }); expect(body.rejected[0].error).toMatch(/timestamp|past|day/i); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/track-batch.router.test.ts` around lines 204 - 359, Add a test that verifies the 5-day historical floor by calling postBatch with one valid track (use validTrack) and one track whose payload.properties.__timestamp is older than 5 days (e.g., Date.now() - 6*24*60*60*1000), then assert the response is 202, accepted is 1, rejected length is 1, and the rejected item matches { index: 1, reason: 'validation' } and its error message matches a timestamp/past/day regex; name the test "rejects events older than 5 days per-item" and place it alongside the other batch tests so it exercises the same pipeline (postBatch, validTrack).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/api/src/controllers/track.controller.ts`:
- Around line 109-128: The change removed the 5-day cutoff and now accepts
arbitrarily old __timestamp values; reintroduce a FIVE_DAYS_MS constant (e.g., 5
* 24 * 60 * 60 * 1000) and add an early check using clientTimestampNumber and
safeTimestamp: if clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS, return
the per-item rejection path required by the batch contract (the same
rejection/validation response your controller uses elsewhere) before the
existing ONE_MINUTE_MS and FIFTEEN_MINUTES_MS logic so old events are rejected
instead of persisted.
---
Outside diff comments:
In `@apps/api/src/controllers/track.controller.ts`:
- Around line 594-606: The current loop pushes full BatchItemResult objects into
the rejected array and returns them in the 202 response; update the controller
(the loop that iterates over results and the rejected array sent in
reply.status(202).send) to map each rejected item to the public shape { index,
reason, error } instead of the internal object (i.e., when result.status !==
'accepted' push or collect { index: result.index, reason: result.reason, error:
result.error }), then send that mapped array in the response so it matches the
documented contract.
- Around line 156-158: The code currently forces a default user-agent string by
setting requestUa = request.headers['user-agent'] ?? 'unknown/1.0', which breaks
the contract in ids.ts that expects a missing UA to yield empty
deviceId/sessionId; change this to preserve "no UA => no identity" by leaving
requestUa undefined/null when the header is absent (e.g., requestUa =
request.headers['user-agent'] as string | undefined) and ensure downstream uses
of requestUa (the identity generation code that calls into ids.ts) continue to
treat undefined/empty as "no UA" rather than a placeholder.
---
Nitpick comments:
In `@apps/api/src/routes/track-batch.router.test.ts`:
- Around line 204-359: Add a test that verifies the 5-day historical floor by
calling postBatch with one valid track (use validTrack) and one track whose
payload.properties.__timestamp is older than 5 days (e.g., Date.now() -
6*24*60*60*1000), then assert the response is 202, accepted is 1, rejected
length is 1, and the rejected item matches { index: 1, reason: 'validation' }
and its error message matches a timestamp/past/day regex; name the test "rejects
events older than 5 days per-item" and place it alongside the other batch tests
so it exercises the same pipeline (postBatch, validTrack).
In `@apps/api/src/routes/track.router.ts`:
- Line 37: The long single-line OpenAPI description on the route config should
be split across multiple template-literal lines for readability: update the
description property in track.router.ts (the route's description string that
references TRACK_BATCH_MAX_EVENTS) to a multi-line template literal or joined
string segments so the content flows across lines while preserving the embedded
${TRACK_BATCH_MAX_EVENTS} interpolation and exact wording; locate the
description key in the route definition and break the string into readable lines
(or an array .join('')) without changing the text or interpolation semantics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0fd25152-7171-48d5-a4a0-65a06d2b317f
📒 Files selected for processing (4)
apps/api/src/controllers/track.controller.tsapps/api/src/routes/track-batch.router.test.tsapps/api/src/routes/track.router.tspackages/validation/src/track.validation.ts
💤 Files with no reviewable changes (1)
- packages/validation/src/track.validation.ts
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
apps/api/src/controllers/track.controller.ts (2)
594-606:⚠️ Potential issue | 🟠 Major | ⚡ Quick winMap
rejected[]to the public response shape instead of returning internal state.This currently pushes the full
BatchItemResultobjects into the response, so each rejected item includesstatus: 'rejected'.apps/api/src/routes/track.router.tsdocuments the 202 body as{ index, reason, error }, so the controller should emit that exact shape instead of relying on serializer behavior.Suggested fix
let accepted = 0; - const rejected: Extract<BatchItemResult, { status: 'rejected' }>[] = []; + const rejected: Array<{ + index: number; + reason: 'validation' | 'internal'; + error: string; + }> = []; for (const result of results) { if (result.status === 'accepted') { accepted += 1; } else { - rejected.push(result); + rejected.push({ + index: result.index, + reason: result.reason, + error: result.error, + }); } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 594 - 606, The current loop pushes full BatchItemResult objects into the rejected array and returns them in the 202 response; update the controller (the loop that iterates over results and the rejected array sent in reply.status(202).send) to map each rejected item to the public shape { index, reason, error } instead of the internal object (i.e., when result.status !== 'accepted' push or collect { index: result.index, reason: result.reason, error: result.error }), then send that mapped array in the response so it matches the documented contract.
156-158:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve the “no user-agent => no identity” behavior.
apps/api/src/utils/ids.tsexplicitly returns emptydeviceId/sessionIdwhenuais missing. Replacing a missing header with'unknown/1.0'here changes that contract and can collapse unrelated server-side traffic behind the same IP into the same device/session.Suggested fix
interface SharedRequestContext { projectId: string; requestIp: string; - requestUa: string; + requestUa?: string; requestHeaders: Record<string, string | undefined>; requestGeo: GeoLocation; salts: Salts; } @@ - const requestIp = request.clientIp; - const requestUa = request.headers['user-agent'] ?? 'unknown/1.0'; - const requestHeaders = getStringHeaders(request.headers); + const requestIp = request.clientIp; + const requestHeaders = getStringHeaders(request.headers); + const requestUa = requestHeaders['user-agent'];🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 156 - 158, The code currently forces a default user-agent string by setting requestUa = request.headers['user-agent'] ?? 'unknown/1.0', which breaks the contract in ids.ts that expects a missing UA to yield empty deviceId/sessionId; change this to preserve "no UA => no identity" by leaving requestUa undefined/null when the header is absent (e.g., requestUa = request.headers['user-agent'] as string | undefined) and ensure downstream uses of requestUa (the identity generation code that calls into ids.ts) continue to treat undefined/empty as "no UA" rather than a placeholder.
🧹 Nitpick comments (2)
apps/api/src/routes/track.router.ts (1)
37-37: 💤 Low valueConsider splitting the long description for readability.
The description string is comprehensive but spans a very long single line. While this works fine for OpenAPI documentation, breaking it into a template literal with multiple lines would improve source readability.
♻️ Optional refactor for readability
- description: `Ingest a tracking event (track, identify, group, increment, decrement, replay) or a batch of events ({ "type": "batch", "payload": [event, ...] }). Batch requests accept up to ${TRACK_BATCH_MAX_EVENTS} events and 10MB uncompressed per request; each event is dispatched through the same pipeline as a single-event request. Per-event validation failures are returned in the rejected[] array — the whole batch does not fail on a single bad row.`, + description: `Ingest a tracking event (track, identify, group, increment, decrement, replay) or a batch of events. + Batch requests use { "type": "batch", "payload": [event, ...] } and accept up to ${TRACK_BATCH_MAX_EVENTS} events and 10MB uncompressed per request. + Each event is dispatched through the same pipeline as a single-event request. + Per-event validation failures are returned in the rejected[] array — the whole batch does not fail on a single bad row.`,🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/track.router.ts` at line 37, The long single-line OpenAPI description on the route config should be split across multiple template-literal lines for readability: update the description property in track.router.ts (the route's description string that references TRACK_BATCH_MAX_EVENTS) to a multi-line template literal or joined string segments so the content flows across lines while preserving the embedded ${TRACK_BATCH_MAX_EVENTS} interpolation and exact wording; locate the description key in the route definition and break the string into readable lines (or an array .join('')) without changing the text or interpolation semantics.apps/api/src/routes/track-batch.router.test.ts (1)
204-359: ⚡ Quick winConsider adding test coverage for the 5-day historical floor.
The test suite covers historical timestamps (two hours ago), but the PR objectives state: "Enforce a 5-day historical acceptance window" with "older events rejected per-row." There's no test verifying that events with timestamps older than 5 days are rejected with a per-item validation error.
📋 Suggested test case
it('rejects events older than 5 days per-item', async () => { const sixDaysAgo = Date.now() - 6 * 24 * 60 * 60 * 1000; const res = await postBatch([ validTrack('recent'), { type: 'track' as const, payload: { name: 'ancient_event', properties: { __timestamp: new Date(sixDaysAgo).toISOString() }, }, }, ]); expect(res.statusCode).toBe(202); const body = res.json(); expect(body.accepted).toBe(1); expect(body.rejected).toHaveLength(1); expect(body.rejected[0]).toMatchObject({ index: 1, reason: 'validation', }); expect(body.rejected[0].error).toMatch(/timestamp|past|day/i); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/routes/track-batch.router.test.ts` around lines 204 - 359, Add a test that verifies the 5-day historical floor by calling postBatch with one valid track (use validTrack) and one track whose payload.properties.__timestamp is older than 5 days (e.g., Date.now() - 6*24*60*60*1000), then assert the response is 202, accepted is 1, rejected length is 1, and the rejected item matches { index: 1, reason: 'validation' } and its error message matches a timestamp/past/day regex; name the test "rejects events older than 5 days per-item" and place it alongside the other batch tests so it exercises the same pipeline (postBatch, validTrack).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@apps/api/src/controllers/track.controller.ts`:
- Around line 109-128: The change removed the 5-day cutoff and now accepts
arbitrarily old __timestamp values; reintroduce a FIVE_DAYS_MS constant (e.g., 5
* 24 * 60 * 60 * 1000) and add an early check using clientTimestampNumber and
safeTimestamp: if clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS, return
the per-item rejection path required by the batch contract (the same
rejection/validation response your controller uses elsewhere) before the
existing ONE_MINUTE_MS and FIFTEEN_MINUTES_MS logic so old events are rejected
instead of persisted.
---
Outside diff comments:
In `@apps/api/src/controllers/track.controller.ts`:
- Around line 594-606: The current loop pushes full BatchItemResult objects into
the rejected array and returns them in the 202 response; update the controller
(the loop that iterates over results and the rejected array sent in
reply.status(202).send) to map each rejected item to the public shape { index,
reason, error } instead of the internal object (i.e., when result.status !==
'accepted' push or collect { index: result.index, reason: result.reason, error:
result.error }), then send that mapped array in the response so it matches the
documented contract.
- Around line 156-158: The code currently forces a default user-agent string by
setting requestUa = request.headers['user-agent'] ?? 'unknown/1.0', which breaks
the contract in ids.ts that expects a missing UA to yield empty
deviceId/sessionId; change this to preserve "no UA => no identity" by leaving
requestUa undefined/null when the header is absent (e.g., requestUa =
request.headers['user-agent'] as string | undefined) and ensure downstream uses
of requestUa (the identity generation code that calls into ids.ts) continue to
treat undefined/empty as "no UA" rather than a placeholder.
---
Nitpick comments:
In `@apps/api/src/routes/track-batch.router.test.ts`:
- Around line 204-359: Add a test that verifies the 5-day historical floor by
calling postBatch with one valid track (use validTrack) and one track whose
payload.properties.__timestamp is older than 5 days (e.g., Date.now() -
6*24*60*60*1000), then assert the response is 202, accepted is 1, rejected
length is 1, and the rejected item matches { index: 1, reason: 'validation' }
and its error message matches a timestamp/past/day regex; name the test "rejects
events older than 5 days per-item" and place it alongside the other batch tests
so it exercises the same pipeline (postBatch, validTrack).
In `@apps/api/src/routes/track.router.ts`:
- Line 37: The long single-line OpenAPI description on the route config should
be split across multiple template-literal lines for readability: update the
description property in track.router.ts (the route's description string that
references TRACK_BATCH_MAX_EVENTS) to a multi-line template literal or joined
string segments so the content flows across lines while preserving the embedded
${TRACK_BATCH_MAX_EVENTS} interpolation and exact wording; locate the
description key in the route definition and break the string into readable lines
(or an array .join('')) without changing the text or interpolation semantics.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0fd25152-7171-48d5-a4a0-65a06d2b317f
📒 Files selected for processing (4)
apps/api/src/controllers/track.controller.tsapps/api/src/routes/track-batch.router.test.tsapps/api/src/routes/track.router.tspackages/validation/src/track.validation.ts
💤 Files with no reviewable changes (1)
- packages/validation/src/track.validation.ts
🛑 Comments failed to post (1)
apps/api/src/controllers/track.controller.ts (1)
109-128:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winRestore the 5-day rejection path for historical events.
This now accepts any past
__timestampand only marks it as historical after 15 minutes. That breaks the batch contract in this PR and will persist arbitrarily old events instead of rejecting them per item.Suggested fix
const ONE_MINUTE_MS = 60 * 1000; + const FIVE_DAYS_MS = 5 * 24 * 60 * ONE_MINUTE_MS; const FIFTEEN_MINUTES_MS = 15 * ONE_MINUTE_MS; // Use safeTimestamp if invalid or more than 1 minute in the future if ( Number.isNaN(clientTimestampNumber) || clientTimestampNumber > safeTimestamp + ONE_MINUTE_MS ) { return { timestamp: safeTimestamp, isTimestampFromThePast: false }; } + + if (clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS) { + throw new HttpError('Timestamp is older than 5 days', { status: 400 }); + } // isTimestampFromThePast is true only if timestamp is older than 15 minutes const isTimestampFromThePast = clientTimestampNumber < safeTimestamp - FIFTEEN_MINUTES_MS;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.// Constants for time validation const ONE_MINUTE_MS = 60 * 1000; const FIVE_DAYS_MS = 5 * 24 * 60 * ONE_MINUTE_MS; const FIFTEEN_MINUTES_MS = 15 * ONE_MINUTE_MS; // Use safeTimestamp if invalid or more than 1 minute in the future if ( Number.isNaN(clientTimestampNumber) || clientTimestampNumber > safeTimestamp + ONE_MINUTE_MS ) { return { timestamp: safeTimestamp, isTimestampFromThePast: false }; } if (clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS) { throw new HttpError('Timestamp is older than 5 days', { status: 400 }); } // isTimestampFromThePast is true only if timestamp is older than 15 minutes const isTimestampFromThePast = clientTimestampNumber < safeTimestamp - FIFTEEN_MINUTES_MS; return { timestamp: clientTimestampNumber, isTimestampFromThePast, };🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/api/src/controllers/track.controller.ts` around lines 109 - 128, The change removed the 5-day cutoff and now accepts arbitrarily old __timestamp values; reintroduce a FIVE_DAYS_MS constant (e.g., 5 * 24 * 60 * 60 * 1000) and add an early check using clientTimestampNumber and safeTimestamp: if clientTimestampNumber < safeTimestamp - FIVE_DAYS_MS, return the per-item rejection path required by the batch contract (the same rejection/validation response your controller uses elsewhere) before the existing ONE_MINUTE_MS and FIFTEEN_MINUTES_MS logic so old events are rejected instead of persisted.
|
@lindesvard Reworked as requested:
Happy to adjust the caps (2000 events / 10 MB) or the response shape if you'd prefer different numbers. |
Closes #373. Supersedes #374.
Summary
POST /trackaccepts a new body shape alongside the existing single-event one:{ "type": "batch", "payload": [ { "type": "track", "payload": { ... } }, ... ] }Up to 2000 events / 10 MB per request. Each item is dispatched through the exact same per-event pipeline as a single-event request. Built so offline-capable SDKs (mobile, IoT, edge) can drain a local buffer in one round-trip.
What changed per review
/track/batchendpoint, piggyback on/trackwith abatchtypezTrackBatchHandlerPayloadjoins the body union onPOST /track; no new routeapps/worker,packages/queue,apps/api/src/utils/ids.ts,event.controller.tsare all untouched vsmain.getTimestamp/isTimestampFromThePastsemantics unchanged.Diff is now:
packages/validation/src/track.validation.ts,apps/api/src/controllers/track.controller.ts,apps/api/src/routes/track.router.ts, plus a test file.Behavior
Response: single events keep their 200
{ deviceId, sessionId }. Batch envelopes return 202{ accepted, rejected[] }.Per-row rejection instead of whole-batch failure. The envelope validates loosely (
payload: z.unknown()[], 1–2000 items); each item issafeParsed throughzTrackHandlerPayloadin the controller. Failures become{ index, reason: 'validation' | 'internal', error }entries — a 2000-event batch with one malformed row doesn't fail the other 1999, and the caller can re-send just the bad indices.aliasis rejected per-row with the same message as the single-event 400.Salts + geo fetched once per request.
buildSharedRequestContextdoes the salts query and request-IP geo lookup once;buildEventContextreuses them per item. Only events overriding__iptrigger a second geo lookup. For a 2000-event batch that's 1 PG round-trip + 1 geo lookup instead of 2000 of each.Bounded concurrency. Items are processed 50 at a time — each event touches Redis (and possibly geo), so an unbounded
Promise.allover 2000 events would spike connection-pool/rate budgets on smaller self-hosted instances.Caps. 2000 events / 10 MB mirrors Mixpanel
/import. The 10 MB body cap uses Fastify's existingbodyLimiton the route — no new mechanism.Refactor note: the single-event
switchinhandlermoved into a shareddispatchEvent(with aneverexhaustiveness check) so single and batch paths can't drift apart.Out of scope (intentionally)
Historical events with
__timestampstill get their session derived at processing time, exactly as onmain— this PR doesn't touch session handling. Once the session-management rework lands, batched historical events should benefit from it automatically since they flow through the same per-event pipeline.Testing
apps/api/src/routes/track-batch.router.test.ts(449 lines): envelope happy path, per-index rejection (validation + internal), mixed valid/invalid rows, alias-in-batch, nested-batch rejection, empty/oversized payload bounds,__ipper-event geo override, non-object items, all dispatch arms (group,assign_group,increment,decrement,replay), single-event regression (200 shape unchanged), and chunk-boundary index alignment. Fullapps/apisuite passing.Summary by CodeRabbit