
feat(kilo-chat): rip out stream chat #2907

Open
iscekic wants to merge 193 commits into main from feat/kilo-chat-migration-pr1

Conversation

@iscekic
Contributor

@iscekic iscekic commented Apr 29, 2026

TBD

@iscekic iscekic self-assigned this Apr 29, 2026
Comment thread services/event-service/src/__tests__/has-context.test.ts
Comment thread services/event-service/src/__tests__/is-user-in-context.test.ts
@kilo-code-bot
Contributor

kilo-code-bot Bot commented Apr 29, 2026

Code Review Summary

Status: 7 Issues Found | Recommendation: Address before merge

Overview

| Severity | Count |
| --- | --- |
| CRITICAL | 0 |
| WARNING | 7 |
| SUGGESTION | 0 |
Issue Details

WARNING

| File | Line | Issue |
| --- | --- | --- |
| apps/mobile/src/components/kilo-chat/hooks/use-mark-read.ts | 72 | markRead depends on the unstable mutation object and can repeatedly fire focus-side effects. |
| services/event-service/src/do/connection-ticket-do.ts | 31 | Unconsumed connection tickets are stored in per-ticket Durable Objects with no expiry cleanup, leaking storage. |
| services/event-service/src/do/connection-ticket-do.ts | 49 | Successfully consumed tickets are marked consumed instead of deleted, leaking storage on the normal WebSocket connection path. |
| services/kiloclaw/src/index.ts | 1080 | deliverChatWebhook no longer validates runtime RPC payloads before deriving routing and forwarding to the controller. |
| services/notifications/src/dos/NotificationChannelDO.ts | 28 | dispatchPush no longer validates runtime payloads before presence lookup, storage mutation, DB reads, and Expo dispatch. |
| services/notifications/src/index.ts | 174 | clearBadgeBucketForUser no longer validates runtime input before selecting the user DO and clearing a bucket. |
| services/notifications/src/lib/expo-push.ts | 88 | Non-stale ticket errors include raw Expo push tokens, which can leak device tokens through logs or RPC results. |
Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:

File Line Issue
Files Reviewed (12 files)
  • packages/event-service/src/client.ts - 0 new issues
  • packages/event-service/src/schemas.ts - 0 new issues
  • packages/event-service/src/types.ts - 0 new issues
  • packages/kilo-chat/src/client.ts - 0 new issues
  • services/event-service/src/do/connection-ticket-do.ts - 0 new issues; previous ticket cleanup findings still apply at shifted lines
  • services/event-service/src/do/user-session-do.ts - 0 new issues
  • services/event-service/src/index.ts - 0 new issues
  • services/kilo-chat/src/services/bot-status-request.ts - 0 new issues
  • services/kilo-chat/src/services/conversations.ts - 0 new issues
  • services/kilo-chat/src/services/messages.ts - 0 new issues
  • services/kiloclaw/plugins/kilo-chat/src/client.ts - 0 new issues
  • services/kiloclaw/src/durable-objects/kiloclaw-instance/lifecycle-push.ts - 0 new issues



Reviewed by gpt-5.5-20260423 · 1,240,947 tokens

iscekic added 2 commits April 29, 2026 17:12
Mirrors the .min(1) constraint used on every other userId/recipientUserIds
field. Empty strings were silently passing where null is the intended
sentinel for system-sent messages.
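The rule described in this commit can be sketched without a schema library. The snippet below is an illustration only: `parseSenderUserId` and the field name are hypothetical, and the function hand-rolls the check that a `.min(1)` constraint on a nullable string field would express.

```typescript
// Hypothetical helper, not code from this PR: accepts a non-empty string or null.
type SenderUserId = string | null;

function parseSenderUserId(input: unknown): SenderUserId {
  // null is the intended sentinel for system-sent messages.
  if (input === null) return null;
  if (typeof input !== "string") {
    throw new TypeError("user id must be a string or null");
  }
  // Empty strings used to pass silently; reject them explicitly.
  if (input.length < 1) {
    throw new RangeError("user id must not be empty; use null for system-sent messages");
  }
  return input;
}
```

With this shape, an empty string fails loudly at the boundary instead of being treated as a valid sender.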
…event-service

The /presence/* contexts are queried via event-service.isUserInContext and
subscribed via the event-service WebSocket — they are an event-service
concern, not a notifications-package concern. Notifications only consumes
the resulting context strings.

Non-presence event contexts (kilo-chat conversation events, etc.) will
move into the same package in a follow-up; this PR ships only the
presence builders since that is what later phases need.
iscekic added 3 commits April 30, 2026 14:53
* refactor(db): rename channel_badge_counts to badge_counts (general purpose); update all consumers

* feat(db): migration to rename badge_counts and reset rows

* feat(notifications): add badge-bucket key builders

The badge_counts.badge_bucket column is a free-form string. To prevent
namespace collisions as more surfaces start emitting badge updates
(per-instance today, per-conversation later), centralize bucket-key
derivation in @kilocode/notifications and route NotificationChannelDO
through it. Mirrors the presence-context builders in @kilocode/event-service.

Safe to introduce now without a data migration because PR 2's migration
already wipes badge_counts.
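A minimal sketch of what centralized bucket-key builders could look like. The function names and the `instance:` / `conversation:` key format are assumptions made for illustration; the commit message does not state the real format used in @kilocode/notifications.

```typescript
// Hypothetical key shapes; the real bucket format is not given in the PR text.
type InstanceBucket = `instance:${string}`;
type ConversationBucket = `conversation:${string}:${string}`;

function assertSegment(value: string, name: string): void {
  // Reject empty ids and the separator so two surfaces cannot produce the same key.
  if (value.length === 0 || value.includes(":")) {
    throw new RangeError(`${name} must be non-empty and must not contain ":"`);
  }
}

function instanceBadgeBucket(sandboxId: string): InstanceBucket {
  assertSegment(sandboxId, "sandboxId");
  return `instance:${sandboxId}`;
}

function conversationBadgeBucket(sandboxId: string, conversationId: string): ConversationBucket {
  assertSegment(sandboxId, "sandboxId");
  assertSegment(conversationId, "conversationId");
  return `conversation:${sandboxId}:${conversationId}`;
}
```

Routing every writer through builders like these keeps the free-form column from accumulating ad hoc key shapes.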
… message.created (#2918)

* chore(notifications): add EVENT_SERVICE binding, drop STREAM_CHAT_API_SECRET

* chore(notifications): add vitest scaffold

* feat(notifications): rewrite NotificationChannelDO around dispatchPush

* chore(notifications): drop orphan badgeBucketForInstance helper

* feat(notifications): add sendPushForConversation WorkerEntrypoint RPC

* chore(notifications): delete Stream webhook route

* chore(notifications): type EVENT_SERVICE RPC and enable cloudflare:test types

* feat(event-service): add kiloclaw event-context helpers; migrate kilo-chat producer

Adds kiloclawInstanceContext and kiloclawConversationContext path
builders to @kilocode/event-service, replacing hardcoded template
literals in kilo-chat's event-push.ts and its test so all callers
share a single source of truth.

* feat(kilo-chat): add fetchSandboxLabel helper

* chore(kilo-chat): add NOTIFICATIONS service binding

* feat(kilo-chat): publish push on message.created via NOTIFICATIONS RPC

When a chat message is persisted, fire-and-forget a call to
NOTIFICATIONS.sendPushForConversation so non-sender human members of the
conversation receive a push. Runs after realtime/event-service delivery
inside postCommitFanOut, with errors swallowed so push failures cannot
fail the send.

- Skip when there are no other human recipients or no sandboxId.
- senderUserId = callerId for human senders, null for bot senders.
- title is "<sandboxLabel> · <conversationTitle>"; bodyPreview is the
  first 200 chars of the concatenated text blocks.
- Add @kilocode/notifications workspace dep and layer the RPC method
  shape into Env via bindings.d.ts.
- Add a notifications-stub worker to the vitest config so tests can
  spy on env.NOTIFICATIONS.sendPushForConversation, and globally mock
  sandbox-lookup in setup.ts (it imports pg via @kilocode/db).
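The payload rules listed above can be sketched as a pure function. Everything here is illustrative: the type names, `buildPushRequest`, and the choice to join text blocks with no separator are assumptions; only the skip conditions, the `senderUserId` rule, the title format, and the 200-character preview come from the commit message.

```typescript
// Hypothetical shapes for illustration; not the PR's actual types.
interface Block { type: string; text?: string }
interface Sender { kind: "human" | "bot"; callerId: string }
interface PushRequest {
  senderUserId: string | null;
  recipientUserIds: string[];
  title: string;
  bodyPreview: string;
}

function buildPushRequest(args: {
  sender: Sender;
  humanMemberIds: string[];
  sandboxId: string | null;
  sandboxLabel: string;
  conversationTitle: string;
  blocks: Block[];
}): PushRequest | null {
  // senderUserId = callerId for human senders, null for bot senders.
  const senderUserId = args.sender.kind === "human" ? args.sender.callerId : null;
  const recipientUserIds = args.humanMemberIds.filter((id) => id !== senderUserId);
  // Skip when there are no other human recipients or no sandboxId.
  if (recipientUserIds.length === 0 || args.sandboxId === null) return null;
  // Concatenate text blocks (separator is an assumption) and keep the first 200 chars.
  const text = args.blocks
    .filter((b) => b.type === "text" && typeof b.text === "string")
    .map((b) => b.text as string)
    .join("");
  return {
    senderUserId,
    recipientUserIds,
    title: `${args.sandboxLabel} · ${args.conversationTitle}`,
    bodyPreview: text.slice(0, 200), // counts UTF-16 code units
  };
}
```

A caller would treat a `null` result as "nothing to send" and otherwise pass the request to the notifications RPC inside a try/catch so a push failure cannot fail the send.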

* chore(notifications): drop orphan stream-chat dep, refresh worker types, fix test mock

- Remove `stream-chat` from `services/notifications/package.json`; the Stream
  webhook (its only consumer) was deleted earlier in the stack.
- Regenerate `worker-configuration.d.ts` so the workerd runtime types match the
  current toolchain (sibling services were on `1.20260312.1`; this one had
  drifted to `1.20251217.0` from a stale local cache).
- Fix the global test mock to reference the renamed `badge_counts` table; the
  setup file was authored against the pre-rename name and never matched.
- Tidy two pre-existing lint nits in the new test files (`import type` for
  type-only import, drop unused `cols` parameter).

* fix(notifications): named entrypoint export, retry-safe badge, alarm-leak

- Switch `NotificationsService` from default-only to a named class export
  with a separate default. `services/kilo-chat/wrangler.jsonc` binds via
  `entrypoint: "NotificationsService"`, which resolves named module
  exports. The default-only form (`export default class NotificationsService`)
  exports under the `default` key — kilo-chat's RPC binding would not have
  resolved at deploy. Mirrors the existing pattern in
  `services/kilo-chat/src/index.ts` (`KiloChatService`).

- `dispatchPush` now uses a two-stage idempotency record (`pending` →
  `delivered`). The badge increment was previously non-idempotent: an
  Expo failure returned `failed` without writing the idempotency key, so
  upstream retries (which the design explicitly invites) re-ran the
  increment before the next send and inflated the badge by one per
  retry. The `pending` marker is written before the increment and
  short-circuits the increment on retry; the `delivered` marker is only
  written on success.

- `setAlarm` is now gated on `getAlarm() === null`. Calling `setAlarm`
  unconditionally on each successful push — as the previous code did —
  replaces the pending alarm and pushes the cleanup forward indefinitely
  on a conversation receiving more than one push per `IDEM_TTL_MS`,
  leaking expired idempotency entries.

Adds two test cases covering the badge-retry and alarm-reset paths.
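The two-stage idempotency record can be sketched with a synchronous in-memory model. This is a simplification under stated assumptions: `PushDispatcher`, its `Map`-backed storage, and the boolean `send` callback are hypothetical stand-ins for the Durable Object storage and the Expo call, both of which are asynchronous in practice.

```typescript
type Stage = "pending" | "delivered";
type Outcome = "delivered" | "failed" | "duplicate";

// Hypothetical model of the pending -> delivered flow; not the PR's code.
class PushDispatcher {
  private readonly records = new Map<string, Stage>();
  private readonly send: () => boolean;
  badge = 0;

  constructor(send: () => boolean) {
    this.send = send;
  }

  dispatch(idempotencyKey: string): Outcome {
    const stage = this.records.get(idempotencyKey);
    if (stage === "delivered") return "duplicate";
    if (stage === undefined) {
      // First attempt: write the pending marker before incrementing the badge.
      this.records.set(idempotencyKey, "pending");
      this.badge += 1;
    }
    // On a retry the marker is already "pending", so the increment is skipped.
    if (!this.send()) return "failed";
    // The delivered marker is only written on success.
    this.records.set(idempotencyKey, "delivered");
    return "delivered";
  }
}
```

Because the marker is written before the increment, any number of retries after a failed send leaves the badge at exactly one per message.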

* fix(notifications): close two cleanup-alarm leaks

- Schedule the cleanup alarm when writing the `pending` marker, not only
  on `delivered`. Without this, an Expo failure followed by no further
  push activity for the conversation leaves the `pending` record in DO
  storage forever (no alarm was ever set to prune it).

- After the alarm fires, reschedule for the earliest remaining record's
  expiry instead of leaving the alarm slot empty. Otherwise a quiet
  conversation strands its younger entries until some unrelated future
  dispatch wakes the DO up.

Both paths go through a small `ensureCleanupAlarm` helper that gates on
`getAlarm() === null` so a busy conversation still doesn't push the
alarm forward on every call.
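A rough model of the alarm gating and rescheduling described in these two fixes. The class below is hypothetical and synchronous: a plain field stands in for the Durable Object alarm slot (the real `getAlarm()` and `setAlarm()` storage calls are asynchronous), and the `IDEM_TTL_MS` value is an assumption.

```typescript
const IDEM_TTL_MS = 60_000; // assumed value; the PR text does not state it

// Hypothetical in-memory model; `alarm` stands in for the DO alarm slot.
class IdempotencyStore {
  private alarm: number | null = null;
  private readonly expiries = new Map<string, number>();

  private ensureCleanupAlarm(at: number): void {
    // Only schedule when no alarm is pending, so a busy conversation
    // never pushes the cleanup forward.
    if (this.alarm === null) this.alarm = at;
  }

  // Called for both `pending` and `delivered` writes.
  record(key: string, now: number): void {
    this.expiries.set(key, now + IDEM_TTL_MS);
    this.ensureCleanupAlarm(now + IDEM_TTL_MS);
  }

  // Called when the alarm fires: prune expired entries, then reschedule
  // for the earliest remaining expiry instead of leaving the slot empty.
  onAlarm(now: number): void {
    this.alarm = null;
    let earliest: number | null = null;
    for (const [key, expiresAt] of this.expiries) {
      if (expiresAt <= now) this.expiries.delete(key);
      else if (earliest === null || expiresAt < earliest) earliest = expiresAt;
    }
    if (earliest !== null) this.ensureCleanupAlarm(earliest);
  }

  get pendingAlarm(): number | null { return this.alarm; }
  get size(): number { return this.expiries.size; }
}
```

The second write in a burst leaves the pending alarm untouched, and the alarm handler is the only place that moves it later.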

* refactor(event-service): compose presence contexts from kiloclaw helpers

The kiloclaw-scoped presence paths are literally `/presence` prefixed
onto the kiloclaw event-context paths. Build them by composition so the
`/kiloclaw/{sandboxId}[/{conversationId}]` segment shape is defined in
exactly one place — `kiloclaw-contexts.ts`.

Pure refactor; same string output, template-literal types still narrow
to the same shape.
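The composition can be illustrated as follows. `kiloclawInstanceContext` and `kiloclawConversationContext` are named in the commits above, but their signatures here are guesses, and the two presence builder names are hypothetical.

```typescript
type KiloclawInstanceContext = `/kiloclaw/${string}`;
type KiloclawConversationContext = `/kiloclaw/${string}/${string}`;

// Assumed signatures; the real helpers may take different arguments.
function kiloclawInstanceContext(sandboxId: string): KiloclawInstanceContext {
  return `/kiloclaw/${sandboxId}`;
}

function kiloclawConversationContext(
  sandboxId: string,
  conversationId: string,
): KiloclawConversationContext {
  return `/kiloclaw/${sandboxId}/${conversationId}`;
}

// Hypothetical presence builders: prefix `/presence` onto the kiloclaw path
// so the segment shape is defined only in the helpers above.
function presenceInstanceContext(sandboxId: string): `/presence${KiloclawInstanceContext}` {
  return `/presence${kiloclawInstanceContext(sandboxId)}`;
}

function presenceConversationContext(
  sandboxId: string,
  conversationId: string,
): `/presence${KiloclawConversationContext}` {
  return `/presence${kiloclawConversationContext(sandboxId, conversationId)}`;
}
```

The return annotations keep the template-literal types narrow, so a caller cannot pass a conversation context where an instance presence context is expected without a type error.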

* feat(web): add kiloChat.getToken tRPC procedure

* refactor(web): use kiloclaw-context helpers for event subscriptions

* feat(web): lift EventServiceClient to global provider

Introduces a single app-shell EventServiceProvider that owns the
EventServiceClient and KiloChatClient for all authenticated routes.
Mounted in (app)/layout.tsx so platform/instance/conversation presence
subscriptions and the kilo-chat UI share one WebSocket.

KiloChatLayout now consumes the global clients via useEventServiceClient()
instead of spinning up its own pair, and the getToken prop is removed from
KiloChatLayoutProps (along with both call sites). The local
useEventService(getToken) factory is dead code and has been deleted;
useInstanceContext / useConversationContext stay since they take
EventServiceClient as a parameter.

* feat(web): add usePresenceSubscription primitive

Thin hook that subscribes the global EventServiceClient to a single
context for the lifetime of the calling component, gated by an `active`
flag. Will back upcoming platform- and instance-level presence
indicators.

* refactor(web): collapse kilo-chat event subscriptions into usePresenceSubscription

- Drop dead getToken field from KiloChatContextValue (no consumers).
- Remove useInstanceContext / useConversationContext hooks; both call
  sites now use the shared usePresenceSubscription primitive directly.
- Harden usePresenceSubscription against empty-string contexts.

* feat(web): subscribe to /presence/web while tab is visible

* feat(web): subscribe to /presence/kiloclaw/{sandboxId} on instance views

* refactor(web): extract useDocumentVisible primitive

* feat(web): subscribe to conversation presence while tab visible

* style(web): reflow useDocumentVisible useState init to one line

* refactor(web): tighten presence hook + kilo-chat router contract

- usePresenceSubscription: accept 'string | null' instead of empty-string
  sentinel; update call sites (KiloChatLayout, MessageArea, useInstancePresence)
- kilo-chat router: validate expiresAt with z.iso.datetime()
- kilo-chat-router test: verify the JWT payload (kiloUserId, tokenSource,
  version) and that expiresAt lands in the expected ~1h window
- MessageArea: comment distinguishing the always-on chat-event subscription
  from the visibility-gated presence subscription

* fix(event-service): re-check destroyed after token fetch

connect() awaits getToken() before constructing the WebSocket. If
disconnect() runs in that window (provider unmount, sign-out, strict-mode
remount) the in-flight token fetch resolves and we'd construct a fresh
socket + start the ping timer with no React owner left to clean it up.

Re-check this.destroyed after the await and bail before creating the
socket.
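The race and its fix can be sketched with a stripped-down client. `SketchClient`, `SocketLike`, and the injected `openSocket` factory are hypothetical; the real EventServiceClient also starts a ping timer, which is omitted here.

```typescript
interface SocketLike { close(): void }

// Hypothetical minimal client showing only the destroyed re-check.
class SketchClient {
  private destroyed = false;
  private socket: SocketLike | null = null;
  private readonly getToken: () => Promise<string>;
  private readonly openSocket: (token: string) => SocketLike;

  constructor(getToken: () => Promise<string>, openSocket: (token: string) => SocketLike) {
    this.getToken = getToken;
    this.openSocket = openSocket;
  }

  async connect(): Promise<void> {
    if (this.destroyed) return;
    const token = await this.getToken();
    // disconnect() may have run while the token fetch was in flight;
    // bail out before creating a socket nobody will clean up.
    if (this.destroyed) return;
    this.socket = this.openSocket(token);
  }

  disconnect(): void {
    this.destroyed = true;
    this.socket?.close();
    this.socket = null;
  }

  get connected(): boolean {
    return this.socket !== null;
  }
}
```

Without the second check, a `disconnect()` that lands between the two lines of `connect()` would be followed by a freshly opened socket.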
@iscekic iscekic changed the title from "feat(notifications): PR 1 — shared schemas + presence query" to "feat(kilo-chat): rip out stream chat" Apr 30, 2026
iscekic added 8 commits April 30, 2026 15:10
* fix(event-service): refcount subscribe/unsubscribe by context

Multiple consumers can now independently hold the same context without
trampling each other. The wire context.subscribe/context.unsubscribe
messages are only sent on the 0->1 and 1->0 refcount transitions; the
intermediate churn stays client-side.

Resubscribe-on-reconnect dedupes by context key.

Tests cover: double-subscribe collapses to a single wire send, partial
unsubscribe keeps the context alive, last-consumer-out releases it,
mixed batches only send newly-active contexts, unknown-context
unsubscribes are no-ops, and reconnect resubscribes each context once.
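The refcounting behaviour can be sketched as below. The class name and the `contexts` payload field are assumptions; the `context.subscribe` / `context.unsubscribe` message names come from the commit message.

```typescript
// Payload shape is an assumption; only the message type names come from the PR.
type WireMessage = {
  type: "context.subscribe" | "context.unsubscribe";
  contexts: string[];
};

// Hypothetical refcount layer in front of the WebSocket.
class RefcountedContexts {
  private readonly counts = new Map<string, number>();
  private readonly send: (msg: WireMessage) => void;

  constructor(send: (msg: WireMessage) => void) {
    this.send = send;
  }

  subscribe(contexts: string[]): void {
    const newlyActive: string[] = [];
    for (const ctx of contexts) {
      const next = (this.counts.get(ctx) ?? 0) + 1;
      this.counts.set(ctx, next);
      if (next === 1) newlyActive.push(ctx); // 0 -> 1 transition
    }
    if (newlyActive.length > 0) this.send({ type: "context.subscribe", contexts: newlyActive });
  }

  unsubscribe(contexts: string[]): void {
    const released: string[] = [];
    for (const ctx of contexts) {
      const current = this.counts.get(ctx);
      if (current === undefined) continue; // unknown context: no-op
      if (current === 1) {
        this.counts.delete(ctx); // 1 -> 0 transition
        released.push(ctx);
      } else {
        this.counts.set(ctx, current - 1);
      }
    }
    if (released.length > 0) this.send({ type: "context.unsubscribe", contexts: released });
  }

  // On reconnect, each active context is sent once regardless of its refcount.
  resubscribeAll(): void {
    const active = Array.from(this.counts.keys());
    if (active.length > 0) this.send({ type: "context.subscribe", contexts: active });
  }
}
```

Keying the map by context string is what makes the reconnect path dedupe for free: two consumers of the same context contribute one key.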
* refactor(db): rename channel_badge_counts to badge_counts (general purpose); update all consumers

* feat(db): migration to rename badge_counts and reset rows

* feat(notifications): add badge-bucket key builders

The badge_counts.badge_bucket column is a free-form string. To prevent
namespace collisions as more surfaces start emitting badge updates
(per-instance today, per-conversation later), centralize bucket-key
derivation in @kilocode/notifications and route NotificationChannelDO
through it. Mirrors the presence-context builders in @kilocode/event-service.

Safe to introduce now without a data migration because PR 2's migration
already wipes badge_counts.

* chore(notifications): add EVENT_SERVICE binding, drop STREAM_CHAT_API_SECRET

* chore(notifications): add vitest scaffold

* feat(notifications): rewrite NotificationChannelDO around dispatchPush

* chore(notifications): drop orphan badgeBucketForInstance helper

* feat(notifications): add sendPushForConversation WorkerEntrypoint RPC

* chore(notifications): delete Stream webhook route

* chore(notifications): type EVENT_SERVICE RPC and enable cloudflare:test types

* feat(event-service): add kiloclaw event-context helpers; migrate kilo-chat producer

Adds kiloclawInstanceContext and kiloclawConversationContext path
builders to @kilocode/event-service, replacing hardcoded template
literals in kilo-chat's event-push.ts and its test so all callers
share a single source of truth.

* feat(kilo-chat): add fetchSandboxLabel helper

* chore(kilo-chat): add NOTIFICATIONS service binding

* feat(kilo-chat): publish push on message.created via NOTIFICATIONS RPC

When a chat message is persisted, fire-and-forget a call to
NOTIFICATIONS.sendPushForConversation so non-sender human members of the
conversation receive a push. Runs after realtime/event-service delivery
inside postCommitFanOut, with errors swallowed so push failures cannot
fail the send.

- Skip when there are no other human recipients or no sandboxId.
- senderUserId = callerId for human senders, null for bot senders.
- title is "<sandboxLabel> · <conversationTitle>"; bodyPreview is the
  first 200 chars of the concatenated text blocks.
- Add @kilocode/notifications workspace dep and layer the RPC method
  shape into Env via bindings.d.ts.
- Add a notifications-stub worker to the vitest config so tests can
  spy on env.NOTIFICATIONS.sendPushForConversation, and globally mock
  sandbox-lookup in setup.ts (it imports pg via @kilocode/db).

* chore(notifications): drop orphan stream-chat dep, refresh worker types, fix test mock

- Remove `stream-chat` from `services/notifications/package.json`; the Stream
  webhook (its only consumer) was deleted earlier in the stack.
- Regenerate `worker-configuration.d.ts` so the workerd runtime types match the
  current toolchain (sibling services were on `1.20260312.1`; this one had
  drifted to `1.20251217.0` from a stale local cache).
- Fix the global test mock to reference the renamed `badge_counts` table; the
  setup file was authored against the pre-rename name and never matched.
- Tidy two pre-existing lint nits in the new test files (`import type` for
  type-only import, drop unused `cols` parameter).

* fix(notifications): named entrypoint export, retry-safe badge, alarm-leak

- Switch `NotificationsService` from default-only to a named class export
  with a separate default. `services/kilo-chat/wrangler.jsonc` binds via
  `entrypoint: "NotificationsService"`, which resolves named module
  exports. The default-only form (`export default class NotificationsService`)
  exports under the `default` key — kilo-chat's RPC binding would not have
  resolved at deploy. Mirrors the existing pattern in
  `services/kilo-chat/src/index.ts` (`KiloChatService`).

- `dispatchPush` now uses a two-stage idempotency record (`pending` →
  `delivered`). The badge increment was previously non-idempotent: an
  Expo failure returned `failed` without writing the idempotency key, so
  upstream retries (which the design explicitly invites) re-ran the
  increment before the next send and inflated the badge by one per
  retry. The `pending` marker is written before the increment and
  short-circuits the increment on retry; the `delivered` marker is only
  written on success.

- `setAlarm` is now gated on `getAlarm() === null`. Calling `setAlarm`
  unconditionally on each successful push — as the previous code did —
  replaces the pending alarm and pushes the cleanup forward indefinitely
  on a conversation receiving more than one push per `IDEM_TTL_MS`,
  leaking expired idempotency entries.

Adds two test cases covering the badge-retry and alarm-reset paths.
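
The `pending` → `delivered` flow can be sketched roughly as follows. This is an in-memory illustration, not the Durable Object code: the `PushDispatcher` class, the `Map`-backed store, and the injected `send` callback are all hypothetical stand-ins.

```typescript
type IdemState = "pending" | "delivered";
type Outcome = "delivered" | "failed" | "duplicate";

class PushDispatcher {
  // Stand-in for DO storage: idempotency key -> state.
  private readonly idem = new Map<string, IdemState>();
  private readonly send: (badge: number) => Promise<boolean>;
  badge = 0;

  constructor(send: (badge: number) => Promise<boolean>) {
    this.send = send;
  }

  async dispatchPush(idempotencyKey: string): Promise<Outcome> {
    const state = this.idem.get(idempotencyKey);
    if (state === "delivered") return "duplicate";
    if (state === undefined) {
      // First attempt: write the marker before incrementing the badge,
      // so a retry of the same key skips the increment.
      this.idem.set(idempotencyKey, "pending");
      this.badge += 1;
    }
    const ok = await this.send(this.badge);
    if (!ok) return "failed"; // stays `pending`; a retry re-sends only
    this.idem.set(idempotencyKey, "delivered");
    return "delivered";
  }
}
```

With this shape, a failed send followed by a successful retry leaves the badge at one increment rather than two.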

* fix(notifications): close two cleanup-alarm leaks

- Schedule the cleanup alarm when writing the `pending` marker, not only
  on `delivered`. Without this, an Expo failure followed by no further
  push activity for the conversation leaves the `pending` record in DO
  storage forever (no alarm was ever set to prune it).

- After the alarm fires, reschedule for the earliest remaining record's
  expiry instead of leaving the alarm slot empty. Otherwise a quiet
  conversation strands its younger entries until some unrelated future
  dispatch wakes the DO up.

Both paths go through a small `ensureCleanupAlarm` helper that gates on
`getAlarm() === null` so a busy conversation still doesn't push the
alarm forward on every call.
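
A rough sketch of the gating described above. `ensureCleanupAlarm` is the name used in the commit message, but its real signature is not shown there; the `AlarmStorage` interface below models only the `getAlarm`/`setAlarm` pair from Durable Object storage, and `nextCleanupTime` is a hypothetical helper for the reschedule step.

```typescript
// Minimal slice of the Durable Object storage alarm API used by this sketch.
interface AlarmStorage {
  getAlarm(): Promise<number | null>;
  setAlarm(scheduledTime: number): Promise<void>;
}

// Schedule a cleanup only when no alarm is pending, so a busy conversation
// cannot keep pushing the existing alarm into the future.
async function ensureCleanupAlarm(storage: AlarmStorage, at: number): Promise<boolean> {
  if ((await storage.getAlarm()) !== null) return false;
  await storage.setAlarm(at);
  return true;
}

// After the alarm fires and expired records are pruned, pick the earliest
// remaining expiry (or null when nothing is left to clean up).
function nextCleanupTime(expiries: number[], now: number): number | null {
  const remaining = expiries.filter((e) => e > now);
  return remaining.length === 0 ? null : Math.min(...remaining);
}
```

In an alarm handler, the result of `nextCleanupTime` would be passed to `ensureCleanupAlarm` whenever it is not `null`.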

* refactor(event-service): compose presence contexts from kiloclaw helpers

The kiloclaw-scoped presence paths are literally `/presence` prefixed
onto the kiloclaw event-context paths. Build them by composition so the
`/kiloclaw/{sandboxId}[/{conversationId}]` segment shape is defined in
exactly one place — `kiloclaw-contexts.ts`.

Pure refactor; same string output, template-literal types still narrow
to the same shape.
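
As an illustration of the composition, assuming helper signatures that are not shown in the commit: `kiloclawInstanceContext` and `kiloclawConversationContext` are named in the stack, while `presenceContext` and the exact type aliases are hypothetical.

```typescript
type KiloclawInstanceContext = `/kiloclaw/${string}`;
type KiloclawConversationContext = `/kiloclaw/${string}/${string}`;

// The `/kiloclaw/...` segment shape is defined only in these two builders.
function kiloclawInstanceContext(sandboxId: string): KiloclawInstanceContext {
  return `/kiloclaw/${sandboxId}`;
}

function kiloclawConversationContext(
  sandboxId: string,
  conversationId: string,
): KiloclawConversationContext {
  return `/kiloclaw/${sandboxId}/${conversationId}`;
}

// Presence paths are the event-context paths with a `/presence` prefix;
// the generic keeps the template-literal type narrow.
function presenceContext<C extends string>(context: C): `/presence${C}` {
  return `/presence${context}` as `/presence${C}`;
}

const conversationPresence = presenceContext(kiloclawConversationContext("sb1", "c1"));
```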

* feat(web): add kiloChat.getToken tRPC procedure

* refactor(web): use kiloclaw-context helpers for event subscriptions

* feat(web): lift EventServiceClient to global provider

Introduces a single app-shell EventServiceProvider that owns the
EventServiceClient and KiloChatClient for all authenticated routes.
Mounted in (app)/layout.tsx so platform/instance/conversation presence
subscriptions and the kilo-chat UI share one WebSocket.

KiloChatLayout now consumes the global clients via useEventServiceClient()
instead of spinning up its own pair, and the getToken prop is removed from
KiloChatLayoutProps (along with both call sites). The local
useEventService(getToken) factory is dead code and has been deleted;
useInstanceContext / useConversationContext stay since they take
EventServiceClient as a parameter.

* feat(web): add usePresenceSubscription primitive

Thin hook that subscribes the global EventServiceClient to a single
context for the lifetime of the calling component, gated by an `active`
flag. Will back upcoming platform- and instance-level presence
indicators.

* refactor(web): collapse kilo-chat event subscriptions into usePresenceSubscription

- Drop dead getToken field from KiloChatContextValue (no consumers).
- Remove useInstanceContext / useConversationContext hooks; both call
  sites now use the shared usePresenceSubscription primitive directly.
- Harden usePresenceSubscription against empty-string contexts.

* feat(web): subscribe to /presence/web while tab is visible

* feat(web): subscribe to /presence/kiloclaw/{sandboxId} on instance views

* refactor(web): extract useDocumentVisible primitive

* feat(web): subscribe to conversation presence while tab visible

* style(web): reflow useDocumentVisible useState init to one line

* refactor(web): tighten presence hook + kilo-chat router contract

- usePresenceSubscription: accept 'string | null' instead of empty-string
  sentinel; update call sites (KiloChatLayout, MessageArea, useInstancePresence)
- kilo-chat router: validate expiresAt with z.iso.datetime()
- kilo-chat-router test: verify the JWT payload (kiloUserId, tokenSource,
  version) and that expiresAt lands in the expected ~1h window
- MessageArea: comment distinguishing the always-on chat-event subscription
  from the visibility-gated presence subscription

* fix(event-service): refcount subscribe/unsubscribe by context

Multiple consumers can now independently hold the same context without
trampling each other. The wire context.subscribe/context.unsubscribe
messages are only sent on the 0->1 and 1->0 refcount transitions; the
intermediate churn stays client-side.

Resubscribe-on-reconnect dedupes by context key.

Tests cover: double-subscribe collapses to a single wire send, partial
unsubscribe keeps the context alive, last-consumer-out releases it,
mixed batches only send newly-active contexts, unknown-context
unsubscribes are no-ops, and reconnect resubscribes each context once.
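
The refcounting behavior can be sketched along these lines. The `RefcountedSubscriptions` class and the `sendWire` callback are hypothetical; only the `context.subscribe`/`context.unsubscribe` message names and the 0->1 / 1->0 rule come from the commit message.

```typescript
type WireType = "context.subscribe" | "context.unsubscribe";
type WireSender = (type: WireType, contexts: string[]) => void;

class RefcountedSubscriptions {
  private readonly counts = new Map<string, number>();
  private readonly sendWire: WireSender;

  constructor(sendWire: WireSender) {
    this.sendWire = sendWire;
  }

  subscribe(contexts: string[]): void {
    const newlyActive: string[] = [];
    for (const ctx of contexts) {
      const n = this.counts.get(ctx) ?? 0;
      this.counts.set(ctx, n + 1);
      if (n === 0) newlyActive.push(ctx); // 0 -> 1 transition
    }
    if (newlyActive.length > 0) this.sendWire("context.subscribe", newlyActive);
  }

  unsubscribe(contexts: string[]): void {
    const released: string[] = [];
    for (const ctx of contexts) {
      const n = this.counts.get(ctx);
      if (n === undefined) continue; // unknown context: no-op
      if (n <= 1) {
        this.counts.delete(ctx); // 1 -> 0 transition
        released.push(ctx);
      } else {
        this.counts.set(ctx, n - 1);
      }
    }
    if (released.length > 0) this.sendWire("context.unsubscribe", released);
  }

  // On reconnect, each still-held context is sent once regardless of refcount.
  resubscribeAll(): void {
    const active = [...this.counts.keys()];
    if (active.length > 0) this.sendWire("context.subscribe", active);
  }
}
```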

* chore(mobile): add EXPO_PUBLIC_KILO_CHAT_URL and EXPO_PUBLIC_EVENT_SERVICE_URL

* chore(mobile): add kilo-chat workspace deps

* feat(mobile): add kilo-chat token getter with caching

* feat(mobile): add useCurrentUserId from JWT sub

* feat(mobile): add KiloChatProvider

* feat(mobile): add useKiloChatClient and useEventServiceClient hooks

* fix(mobile): fix lint errors in kilo-chat token getter

* fix(mobile): fix lint errors in useCurrentUserId hook

* fix(mobile): fix lint errors in useKiloChatClient hook

* feat(mobile): mount KiloChatProvider in (app) layout

* fix(kilo-chat): assert non-null in base64urlEncode loop

* fix(mobile): share kilo-chat token cache + handle fetch errors

Hoist cache and in-flight promise refs to module scope so all
useKiloChatTokenGetter() instances (provider + useCurrentUserId) share
one cache instead of each maintaining an independent one.

Wrap the fetch in try/catch/finally: on error rejectShared() is called
so concurrent waiters fail fast instead of hanging forever, and
inFlight is always cleared in finally regardless of outcome.

* fix(mobile): tie kilo-chat token cache to auth token, decode kiloUserId

- Key the module-level kilo-chat JWT cache and in-flight ref on the
  current auth token, so signing out and back in as a different user
  within the 1h token window no longer returns the previous user's
  cached JWT.
- Restructure dedup so the first caller awaits the same shared promise
  via a slot reference, eliminating the unhandled rejection that the
  prior resolve/reject-pair pattern produced when the only caller's
  fetch failed.
- Decode kiloUserId from the JWT payload instead of the standard `sub`
  claim — generateApiToken writes the user id as kiloUserId, so the
  sub-based version always returned null.
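
One way the keyed cache and shared in-flight slot might look. This is a sketch under assumptions: `getKiloChatToken`, the injected `readAuthToken`/`fetchToken` callbacks, and the response shape are illustrative and not the actual hook.

```typescript
type CachedToken = { authToken: string; jwt: string; expiresAt: number };
type InFlight = { authToken: string; promise: Promise<string> };

// Module scope, so every caller shares one cache and one in-flight slot.
let cache: CachedToken | null = null;
let inFlight: InFlight | null = null;

async function getKiloChatToken(
  readAuthToken: () => Promise<string | null>,
  fetchToken: (authToken: string) => Promise<{ token: string; expiresAt: number }>,
  now: () => number = Date.now,
): Promise<string> {
  // Read the auth token at call time rather than capturing it earlier.
  const authToken = await readAuthToken();
  if (authToken === null) throw new Error("not signed in");

  // Both the cache and the in-flight slot only match for the same auth token.
  if (cache !== null && cache.authToken === authToken && cache.expiresAt > now()) {
    return cache.jwt;
  }
  if (inFlight !== null && inFlight.authToken === authToken) return inFlight.promise;

  const slot: InFlight = {
    authToken,
    promise: fetchToken(authToken)
      .then((res) => {
        cache = { authToken, jwt: res.token, expiresAt: res.expiresAt };
        return res.token;
      })
      .finally(() => {
        if (inFlight === slot) inFlight = null;
      }),
  };
  inFlight = slot;
  // The first caller awaits the same promise as later callers, so a
  // rejection always has a handler attached.
  return slot.promise;
}
```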

* fix(mobile): read auth token at call time, not at hook render

KiloChatProvider builds its EventService and KiloChat clients exactly
once via useState initializer, so it captures whatever getter exists at
first mount. Closing the previous getter over a render-time `authToken`
meant a cold start where the (app) layout mounted before SecureStore
finished loading would freeze the clients with an undefined token,
trapping them in a permanent reconnect loop.

Read the auth token from SecureStore inside the getter, the same pattern
trpcClient uses. The hook returns a stable callback with no React deps,
and the cache stays keyed on the auth token so user-switch safety is
preserved.

* feat(mobile): add usePresenceSubscription primitive

* feat(mobile): subscribe to /presence/app while app is active

* feat(mobile): add useInstancePresence hook

* feat(mobile): add useConversationPresence hook

* fix(mobile): fix lint errors in presence hooks

* feat(mobile): add useEventSubscription primitive

* feat(mobile): add useInstanceEventSubscription

* fix(mobile): apply curly/switch-case-braces lint rules to event hooks

* feat(kilo-chat-hooks): create shared package; extract useConversations

* feat(kilo-chat-hooks): extract useMessages — base query + optimistic send

Move PAGE_SIZE, helper functions (applyReactionAdded/Removed, restoreMessageInCache,
removeMessageFromCache, findMessageInCache), useMessages infinite-query hook, and
useSendMessage mutation into @kilocode/kilo-chat-hooks. Web's useMessages.ts re-exports
the moved hooks and retains local helper copies for remaining mutations (37b will collapse).

* feat(kilo-chat-hooks): useMessages adds edit/delete/react mutations

* feat(kilo-chat-hooks): extract useMessageCacheUpdater into shared package

Moves the live event-stream cache patcher from the web-only useMessages
file into @kilocode/kilo-chat-hooks. Adds an optional onActionFailed
callback so platform wrappers inject toasts; web passes toast.error.

* feat(mobile): wire shared kilo-chat-hooks + platform adapters

* fix(kilo-chat-hooks): centralize query keys; tighten event-subscription API

- Add packages/kilo-chat-hooks/src/query-keys.ts with conversations/
  conversation/messages/bot-status helpers; route every hook + invalidator
  through it. Fixes the mobile useInstanceEventSubscription bug where
  invalidations used ['conversations', sandboxId] but the queries register
  under ['kilo-chat', 'conversations', sandboxId], so list previews and
  unread counts never refreshed on incoming events.
- useEventSubscription now takes a single event name; callers register one
  hook per event. Drops the events.join('|') dependency hack and the
  eslint-disable. useInstanceEventSubscription becomes six explicit
  registrations.
- Drop the hardcoded English toast string from useMessageCacheUpdater;
  onActionFailed is () => void and the message lives at each call site.
- Extract useAppActiveAndFocused to deduplicate AppState+focus boilerplate
  shared by useInstancePresence and useConversationPresence.
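
To illustrate why the mismatched key never invalidated anything, here is a small sketch of a centralized key module. Only the `['kilo-chat', 'conversations', sandboxId]` shape is taken from the commit message; the other key shapes, the `kiloChatKeys` name, and the `isKeyPrefix` helper (a simplified stand-in for prefix-based query-key matching) are assumptions.

```typescript
// Single source of truth for query keys; hooks and invalidators both import it.
const kiloChatKeys = {
  conversations: (sandboxId: string) => ["kilo-chat", "conversations", sandboxId] as const,
  conversation: (conversationId: string) => ["kilo-chat", "conversation", conversationId] as const,
  messages: (conversationId: string) => ["kilo-chat", "messages", conversationId] as const,
  botStatus: (sandboxId: string) => ["kilo-chat", "bot-status", sandboxId] as const,
};

// Simplified stand-in for partial key matching: an invalidation filter
// matches a registered key when it is a prefix of that key.
function isKeyPrefix(filter: readonly unknown[], key: readonly unknown[]): boolean {
  return filter.length <= key.length && filter.every((part, i) => part === key[i]);
}

const registered = kiloChatKeys.conversations("sb1");
// The old mobile invalidation key lacked the "kilo-chat" namespace segment.
const staleFilter = ["conversations", "sb1"] as const;

const oldMatches = isKeyPrefix(staleFilter, registered); // false: never refreshed
const newMatches = isKeyPrefix(kiloChatKeys.conversations("sb1"), registered); // true
```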

* fix(mobile): subscribe to conversation.* events on instance context

The instance-level subscription was listening for message.created/updated/
deleted, which are published on conversation contexts and never fire here.
Replace them with conversation.renamed, conversation.read, and
conversation.activity — the events kilo-chat actually pushes to the
instance context — so list updates (title, unread, last-activity)
invalidate the conversations query as intended.

* fix(kilo-chat-hooks): centralize query keys; tighten event-subscription API

- Add packages/kilo-chat-hooks/src/query-keys.ts with conversations/
  conversation/messages/bot-status helpers; route every hook + invalidator
  through it. Fixes the mobile useInstanceEventSubscription bug where
  invalidations used ['conversations', sandboxId] but the queries register
  under ['kilo-chat', 'conversations', sandboxId], so list previews and
  unread counts never refreshed on incoming events.
- useEventSubscription now takes a single event name; callers register one
  hook per event. Drops the events.join('|') dependency hack and the
  eslint-disable. useInstanceEventSubscription becomes six explicit
  registrations.
- Drop the hardcoded English toast string from useMessageCacheUpdater;
  onActionFailed is () => void and the message lives at each call site.
- Extract useAppActiveAndFocused to deduplicate AppState+focus boilerplate
  shared by useInstancePresence and useConversationPresence.
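
To illustrate why the mismatched key never refreshed the list, here is a rough sketch of a centralized key module. Only the `['kilo-chat', 'conversations', sandboxId]` shape comes from the commit message; the other key shapes, the `kiloChatKeys` name, and `matchesPrefix` (a simplified stand-in for prefix-based query invalidation) are assumptions.

```typescript
// Hypothetical key helpers; only the `conversations` shape is taken from the commit message.
const kiloChatKeys = {
  conversations: (sandboxId: string) => ["kilo-chat", "conversations", sandboxId] as const,
  conversation: (conversationId: string) => ["kilo-chat", "conversation", conversationId] as const,
  messages: (conversationId: string) => ["kilo-chat", "messages", conversationId] as const,
  botStatus: (sandboxId: string) => ["kilo-chat", "bot-status", sandboxId] as const,
};

// Simplified stand-in for prefix matching: a filter key matches a registered
// query key when every element of the filter equals the element at the same index.
function matchesPrefix(queryKey: readonly string[], filterKey: readonly string[]): boolean {
  if (filterKey.length > queryKey.length) return false;
  return filterKey.every((part, i) => part === queryKey[i]);
}

const registered = kiloChatKeys.conversations("sb1");

// The old invalidation key lacks the 'kilo-chat' prefix, so it never matches.
const staleFilter = ["conversations", "sb1"];
const oldKeyMatches = matchesPrefix(registered, staleFilter);

// Routing the invalidator through the same helper guarantees a match.
const newKeyMatches = matchesPrefix(registered, kiloChatKeys.conversations("sb1"));
```

Building both the query and the invalidation from one helper means the two sides cannot drift apart again.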

* fix(mobile): subscribe to conversation.* events on instance context

The instance-level subscription was listening for message.created/updated/
deleted, which are published on conversation contexts and never fire here.
Replace them with conversation.renamed, conversation.read, and
conversation.activity — the events kilo-chat actually pushes to the
instance context — so list updates (title, unread, last-activity)
invalidate the conversations query as intended.

* chore(mobile): add @shopify/flash-list dependency

Required by the kilo-chat MessageList and ConversationListScreen components.

* chore(mobile): add EXPO_PUBLIC_KILO_CHAT_URL and EXPO_PUBLIC_EVENT_SERVICE_URL

These were declared in env-keys.js by PR 5a but never added to apps/mobile/.env,
which broke the dev build.

* feat(mobile): add EmptyConversationList

* feat(mobile): add ConversationHeader

* feat(mobile): add TypingIndicator placeholder

* feat(mobile): add MessageInput

* feat(mobile): add MessageBubble

* feat(mobile): add MessageList

Implement MessageList using FlashList v2 with maintainVisibleContentPosition
and startRenderingFromBottom for chat layout; wire fetchOlder via onStartReached.

* feat(mobile): add ConversationScreen

* feat(mobile): add ConversationListScreen

* fix(mobile): address review feedback on kilo-chat components

- Drop double-cast `as unknown as Href` in favor of `as Href`
- Use themed `Text` from `@/components/ui/text` and local `useKiloChatClient`
  re-export in `MessageBubble`
- Switch `crypto.randomUUID()` to `expo-crypto`'s `Crypto.randomUUID` to
  match existing usage in `cloud-agent-runtime.ts`

* refactor(db): rename channel_badge_counts to badge_counts (general purpose); update all consumers

* feat(db): migration to rename badge_counts and reset rows

* feat(notifications): add badge-bucket key builders

The badge_counts.badge_bucket column is a free-form string. To prevent
namespace collisions as more surfaces start emitting badge updates
(per-instance today, per-conversation later), centralize bucket-key
derivation in @kilocode/notifications and route NotificationChannelDO
through it. Mirrors the presence-context builders in @kilocode/event-service.

Safe to introduce now without a data migration because PR 2's migration
already wipes badge_counts.
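
A centralized bucket-key builder might look something like the sketch below. The function names, the `kiloclaw:` prefix, and the `:` separator are all assumptions made for illustration; the commit message only says that derivation is centralized to avoid namespace collisions.

```typescript
// Hypothetical key format: "kiloclaw:<sandboxId>[:<conversationId>]".
const SEPARATOR = ":";

// Reject empty segments and segments containing the separator, so two
// different inputs can never produce the same bucket string.
function bucketSegment(value: string): string {
  if (value.length === 0 || value.includes(SEPARATOR)) {
    throw new Error(`invalid badge-bucket segment: ${JSON.stringify(value)}`);
  }
  return value;
}

// Per-instance bucket (hypothetical name).
function instanceBadgeBucket(sandboxId: string): string {
  return ["kiloclaw", bucketSegment(sandboxId)].join(SEPARATOR);
}

// Per-conversation bucket (hypothetical name).
function conversationBadgeBucket(sandboxId: string, conversationId: string): string {
  return [instanceBadgeBucket(sandboxId), bucketSegment(conversationId)].join(SEPARATOR);
}
```

Keeping the format in one module means a later change to the key shape touches a single file rather than every writer of `badge_counts.badge_bucket`.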

* chore(notifications): add EVENT_SERVICE binding, drop STREAM_CHAT_API_SECRET

* chore(notifications): add vitest scaffold

* feat(notifications): rewrite NotificationChannelDO around dispatchPush

* chore(notifications): drop orphan badgeBucketForInstance helper

* feat(notifications): add sendPushForConversation WorkerEntrypoint RPC

* chore(notifications): delete Stream webhook route

* chore(notifications): type EVENT_SERVICE RPC and enable cloudflare:test types

* feat(event-service): add kiloclaw event-context helpers; migrate kilo-chat producer

Adds kiloclawInstanceContext and kiloclawConversationContext path
builders to @kilocode/event-service, replacing hardcoded template
literals in kilo-chat's event-push.ts and its test so all callers
share a single source of truth.

* feat(kilo-chat): add fetchSandboxLabel helper

* chore(kilo-chat): add NOTIFICATIONS service binding

* feat(kilo-chat): publish push on message.created via NOTIFICATIONS RPC

When a chat message is persisted, fire-and-forget a call to
NOTIFICATIONS.sendPushForConversation so non-sender human members of the
conversation receive a push. Runs after realtime/event-service delivery
inside postCommitFanOut, with errors swallowed so push failures cannot
fail the send.

- Skip when there are no other human recipients or no sandboxId.
- senderUserId = callerId for human senders, null for bot senders.
- title is "<sandboxLabel> · <conversationTitle>"; bodyPreview is the
  first 200 chars of the concatenated text blocks.
- Add @kilocode/notifications workspace dep and layer the RPC method
  shape into Env via bindings.d.ts.
- Add a notifications-stub worker to the vitest config so tests can
  spy on env.NOTIFICATIONS.sendPushForConversation, and globally mock
  sandbox-lookup in setup.ts (it imports pg via @kilocode/db).
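
The payload rules listed above could be assembled roughly as follows. This is a sketch under assumptions: `buildPushRequest`, the `ContentBlock` and `Sender` shapes, and joining text blocks with an empty string are illustrative, and "200 chars" is taken here to mean UTF-16 code units.

```typescript
// Hypothetical shapes for illustration.
type ContentBlock = { type: string; text?: string };
type Sender = { kind: "human"; userId: string } | { kind: "bot" };
type PushRequest = {
  senderUserId: string | null;
  recipientUserIds: string[];
  title: string;
  bodyPreview: string;
};

const BODY_PREVIEW_MAX_CHARS = 200;

function buildPushRequest(input: {
  sandboxId: string | null;
  sandboxLabel: string;
  conversationTitle: string;
  blocks: ContentBlock[];
  sender: Sender;
  humanMemberIds: string[];
}): PushRequest | null {
  // No sandboxId: skip the push.
  if (input.sandboxId === null) return null;

  // Human senders are identified by their user id; bot senders use null.
  const sender = input.sender;
  const senderUserId = sender.kind === "human" ? sender.userId : null;

  // Only human members other than the sender receive a push.
  const recipientUserIds = input.humanMemberIds.filter((id) => id !== senderUserId);
  if (recipientUserIds.length === 0) return null;

  // Concatenate text blocks only, then truncate to the preview limit.
  const text = input.blocks
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("");

  return {
    senderUserId,
    recipientUserIds,
    title: `${input.sandboxLabel} · ${input.conversationTitle}`,
    bodyPreview: text.slice(0, BODY_PREVIEW_MAX_CHARS),
  };
}
```

Returning `null` for the skip cases keeps the caller's fire-and-forget path to a single check before the RPC call.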

* chore(notifications): drop orphan stream-chat dep, refresh worker types, fix test mock

- Remove `stream-chat` from `services/notifications/package.json`; the Stream
  webhook (its only consumer) was deleted earlier in the stack.
- Regenerate `worker-configuration.d.ts` so the workerd runtime types match the
  current toolchain (sibling services were on `1.20260312.1`; this one had
  drifted to `1.20251217.0` from a stale local cache).
- Fix the global test mock to reference the renamed `badge_counts` table; the
  setup file was authored against the pre-rename name and never matched.
- Tidy two pre-existing lint nits in the new test files (`import type` for
  type-only import, drop unused `cols` parameter).

* fix(notifications): named entrypoint export, retry-safe badge, alarm-leak

- Switch `NotificationsService` from default-only to a named class export
  with a separate default. `services/kilo-chat/wrangler.jsonc` binds via
  `entrypoint: "NotificationsService"`, which resolves named module
  exports. The default-only form (`export default class NotificationsService`)
  exports under the `default` key — kilo-chat's RPC binding would not have
  resolved at deploy. Mirrors the existing pattern in
  `services/kilo-chat/src/index.ts` (`KiloChatService`).

- `dispatchPush` now uses a two-stage idempotency record (`pending` →
  `delivered`). The badge increment was previously non-idempotent: an
  Expo failure returned `failed` without writing the idempotency key, so
  upstream retries (which the design explicitly invites) re-ran the
  increment before the next send and inflated the badge by one per
  retry. The `pending` marker is written before the increment and
  short-circuits the increment on retry; the `delivered` marker is only
  written on success.

- `setAlarm` is now gated on `getAlarm() === null`. Calling `setAlarm`
  unconditionally on each successful push — as the previous code did —
  replaces the pending alarm and pushes the cleanup forward indefinitely
  on a conversation receiving more than one push per `IDEM_TTL_MS`,
  leaking expired idempotency entries.

Adds two test cases covering the badge-retry and alarm-reset paths.
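
The two-stage idempotency record can be illustrated with a small in-memory sketch. It is synchronous and uses a `Map` in place of Durable Object storage; `createDispatcher`, the `send` callback, and the `duplicate` result are assumptions for illustration, not the actual `dispatchPush` implementation.

```typescript
type IdemState = "pending" | "delivered";
type DispatchResult = "delivered" | "failed" | "duplicate";

// `send` stands in for the Expo call and returns true on success.
function createDispatcher(send: (badge: number) => boolean) {
  const idem = new Map<string, IdemState>();
  let badge = 0;

  return {
    dispatchPush(idemKey: string): DispatchResult {
      const state = idem.get(idemKey);
      if (state === "delivered") return "duplicate";

      if (state === undefined) {
        // First attempt: write the pending marker before incrementing,
        // so a retry can tell that the increment already happened.
        idem.set(idemKey, "pending");
        badge += 1;
      }
      // state === "pending" means this is a retry: skip the increment.

      if (!send(badge)) return "failed";

      // The delivered marker is written only after a successful send.
      idem.set(idemKey, "delivered");
      return "delivered";
    },
    getBadge: (): number => badge,
  };
}
```

With this ordering a failed send followed by any number of retries raises the badge exactly once.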

* fix(notifications): close two cleanup-alarm leaks

- Schedule the cleanup alarm when writing the `pending` marker, not only
  on `delivered`. Without this, an Expo failure followed by no further
  push activity for the conversation leaves the `pending` record in DO
  storage forever (no alarm was ever set to prune it).

- After the alarm fires, reschedule for the earliest remaining record's
  expiry instead of leaving the alarm slot empty. Otherwise a quiet
  conversation strands its younger entries until some unrelated future
  dispatch wakes the DO up.

Both paths go through a small `ensureCleanupAlarm` helper that gates on
`getAlarm() === null` so a busy conversation still doesn't push the
alarm forward on every call.
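
A simplified model of the alarm gating and rescheduling is sketched below. It is synchronous and keeps the alarm slot in a plain object, whereas the Durable Object storage API (`getAlarm`/`setAlarm`) is promise-based; the `Store` shape, `recordPush`, `onAlarm`, and the TTL value are assumptions for illustration.

```typescript
// Assumed TTL; the real IDEM_TTL_MS value is not given in the commit message.
const IDEM_TTL_MS = 60 * 1000;

// In-memory stand-in for DO storage: one alarm slot plus key -> expiry records.
type Store = { alarmAt: number | null; records: Map<string, number> };

// Only schedule when no alarm is pending, so a busy conversation
// cannot keep pushing the cleanup forward.
function ensureCleanupAlarm(store: Store, at: number): void {
  if (store.alarmAt === null) store.alarmAt = at;
}

// Called whenever an idempotency record (pending or delivered) is written.
function recordPush(store: Store, key: string, nowMs: number): void {
  const expiresAt = nowMs + IDEM_TTL_MS;
  store.records.set(key, expiresAt);
  ensureCleanupAlarm(store, expiresAt);
}

// Alarm handler: prune expired records, then reschedule for the earliest survivor.
function onAlarm(store: Store, nowMs: number): void {
  store.alarmAt = null; // the slot is empty once the alarm has fired
  let earliest: number | null = null;
  for (const [key, expiresAt] of Array.from(store.records)) {
    if (expiresAt <= nowMs) {
      store.records.delete(key);
    } else if (earliest === null || expiresAt < earliest) {
      earliest = expiresAt;
    }
  }
  if (earliest !== null) ensureCleanupAlarm(store, earliest);
}
```

The handler leaves the slot empty only when no records remain, so younger entries in a quiet conversation are still pruned on time.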

* refactor(event-service): compose presence contexts from kiloclaw helpers

The kiloclaw-scoped presence paths are literally `/presence` prefixed
onto the kiloclaw event-context paths. Build them by composition so the
`/kiloclaw/{sandboxId}[/{conversationId}]` segment shape is defined in
exactly one place — `kiloclaw-contexts.ts`.

Pure refactor; same string output, template-literal types still narrow
to the same shape.
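
The composition can be sketched as below. `kiloclawInstanceContext` and `kiloclawConversationContext` are named in the commit log; the `presence` helper and the type aliases are hypothetical names used only for this illustration.

```typescript
type KiloclawInstanceContext = `/kiloclaw/${string}`;
type KiloclawConversationContext = `/kiloclaw/${string}/${string}`;

// The segment shape is defined once, here.
function kiloclawInstanceContext(sandboxId: string): KiloclawInstanceContext {
  return `/kiloclaw/${sandboxId}`;
}

function kiloclawConversationContext(
  sandboxId: string,
  conversationId: string,
): KiloclawConversationContext {
  return `/kiloclaw/${sandboxId}/${conversationId}`;
}

// Presence paths are the event-context paths with a `/presence` prefix.
type Presence<C extends string> = `/presence${C}`;

function presence<C extends string>(context: C): Presence<C> {
  return `/presence${context}` as Presence<C>;
}

// Both narrow to `/presence/kiloclaw/...` at the type level.
const instancePresence = presence(kiloclawInstanceContext("sb1"));
const conversationPresence = presence(kiloclawConversationContext("sb1", "c9"));
```

Because `presence` is generic over its input, the resulting type keeps the `/kiloclaw/...` shape without restating it.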

* feat(web): add kiloChat.getToken tRPC procedure

* refactor(web): use kiloclaw-context helpers for event subscriptions

* feat(web): lift EventServiceClient to global provider

Introduces a single app-shell EventServiceProvider that owns the
EventServiceClient and KiloChatClient for all authenticated routes.
Mounted in (app)/layout.tsx so platform/instance/conversation presence
subscriptions and the kilo-chat UI share one WebSocket.

KiloChatLayout now consumes the global clients via useEventServiceClient()
instead of spinning up its own pair, and the getToken prop is removed from
KiloChatLayoutProps (along with both call sites). The local
useEventService(getToken) factory is dead code and has been deleted;
useInstanceContext / useConversationContext stay since they take
EventServiceClient as a parameter.

* feat(web): add usePresenceSubscription primitive

Thin hook that subscribes the global EventServiceClient to a single
context for the lifetime of the calling component, gated by an `active`
flag. Will back upcoming platform- and instance-level presence
indicators.

* refactor(web): collapse kilo-chat event subscriptions into usePresenceSubscription

- Drop dead getToken field from KiloChatContextValue (no consumers).
- Remove useInstanceContext / useConversationContext hooks; both call
  sites now use the shared usePresenceSubscription primitive directly.
- Harden usePresenceSubscription against empty-string contexts.

* feat(web): subscribe to /presence/web while tab is visible

* feat(web): subscribe to /presence/kiloclaw/{sandboxId} on instance views

* refactor(web): extract useDocumentVisible primitive

* feat(web): subscribe to conversation presence while tab visible

* style(web): reflow useDocumentVisible useState init to one line

* refactor(web): tighten presence hook + kilo-chat router contract

- usePresenceSubscription: accept 'string | null' instead of empty-string
  sentinel; update call sites (KiloChatLayout, MessageArea, useInstancePresence)
- kilo-chat router: validate expiresAt with z.iso.datetime()
- kilo-chat-router test: verify the JWT payload (kiloUserId, tokenSource,
  version) and that expiresAt lands in the expected ~1h window
- MessageArea: comment distinguishing the always-on chat-event subscription
  from the visibility-gated presence subscription

* fix(event-service): refcount subscribe/unsubscribe by context

Multiple consumers can now independently hold the same context without
trampling each other. The wire context.subscribe/context.unsubscribe
messages are only sent on the 0->1 and 1->0 refcount transitions; the
intermediate churn stays client-side.

Resubscribe-on-reconnect dedupes by context key.

Tests cover: double-subscribe collapses to a single wire send, partial
unsubscribe keeps the context alive, last-consumer-out releases it,
mixed batches only send newly-active contexts, unknown-context
unsubscribes are no-ops, and reconnect resubscribes each context once.
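
The refcounting behaviour can be sketched as a small tracker that sits in front of the socket. The `WireMessage` shape, `createSubscriptionTracker`, and `resubscribeAll` are assumptions for illustration; only the `context.subscribe` / `context.unsubscribe` message names come from the commit message.

```typescript
// Assumed wire shape: one message carrying a batch of contexts.
type WireMessage = {
  type: "context.subscribe" | "context.unsubscribe";
  contexts: string[];
};

function createSubscriptionTracker(send: (message: WireMessage) => void) {
  const refcounts = new Map<string, number>();

  return {
    subscribe(contexts: string[]): void {
      const newlyActive: string[] = [];
      for (const context of contexts) {
        const count = refcounts.get(context) ?? 0;
        refcounts.set(context, count + 1);
        if (count === 0) newlyActive.push(context); // 0 -> 1 transition
      }
      if (newlyActive.length > 0) send({ type: "context.subscribe", contexts: newlyActive });
    },

    unsubscribe(contexts: string[]): void {
      const released: string[] = [];
      for (const context of contexts) {
        const count = refcounts.get(context);
        if (count === undefined) continue; // unknown context: no-op
        if (count === 1) {
          refcounts.delete(context); // 1 -> 0 transition
          released.push(context);
        } else {
          refcounts.set(context, count - 1);
        }
      }
      if (released.length > 0) send({ type: "context.unsubscribe", contexts: released });
    },

    // On reconnect, each active context is sent once regardless of its refcount.
    resubscribeAll(): void {
      const active = Array.from(refcounts.keys());
      if (active.length > 0) send({ type: "context.subscribe", contexts: active });
    },
  };
}
```

Keying the map by context string is what makes the reconnect path naturally deduplicated.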

* feat(mobile): add chat sandbox stack layout

* feat(mobile): add conversation list route

* feat(mobile): add conversation message route

* feat(mobile): wire chat deep links and active-conversation suppression

* fix(mobile): clear correct badge bucket on legacy chat foreground push

* chore(mobile): add EXPO_PUBLIC_KILO_CHAT_URL and EXPO_PUBLIC_EVENT_SERVICE_URL

* chore(mobile): add kilo-chat workspace deps

* feat(mobile): add kilo-chat token getter with caching

* feat(mobile): add useCurrentUserId from JWT sub

* feat(mobile): add KiloChatProvider

* feat(mobile): add useKiloChatClient and useEventServiceClient hooks

* fix(mobile): fix lint errors in kilo-chat token getter

* fix(mobile): fix lint errors in useCurrentUserId hook

* fix(mobile): fix lint errors in useKiloChatClient hook

* feat(mobile): mount KiloChatProvider in (app) layout

* fix(kilo-chat): assert non-null in base64urlEncode loop

* fix(mobile): share kilo-chat token cache + handle fetch errors

Hoist cache and in-flight promise refs to module scope so all
useKiloChatTokenGetter() instances (provider + useCurrentUserId) share
one cache instead of each maintaining an independent one.

Wrap the fetch in try/catch/finally: on error rejectShared() is called
so concurrent waiters fail fast instead of hanging forever, and
inFlight is always cleared in finally regardless of outcome.

* fix(mobile): tie kilo-chat token cache to auth token, decode kiloUserId

- Key the module-level kilo-chat JWT cache and in-flight ref on the
  current auth token, so signing out and back in as a different user
  within the 1h token window no longer returns the previous user's
  cached JWT.
- Restructure dedup so the first caller awaits the same shared promise
  via a slot reference, eliminating the unhandled rejection that the
  prior resolve/reject-pair pattern produced when the only caller's
  fetch failed.
- Decode kiloUserId from the JWT payload instead of the standard `sub`
  claim — generateApiToken writes the user id as kiloUserId, so the
  sub-based version always returned null.

* fix(mobile): read auth token at call time, not at hook render

KiloChatProvider builds its EventService and KiloChat clients exactly
once via useState initializer, so it captures whatever getter exists at
first mount. Closing the previous getter over a render-time `authToken`
meant a cold start where the (app) layout mounted before SecureStore
finished loading would freeze the clients with an undefined token,
trapping them in a permanent reconnect loop.

Read the auth token from SecureStore inside the getter, the same pattern
trpcClient uses. The hook returns a stable callback with no React deps,
and the cache stays keyed on the auth token so user-switch safety is
preserved.

* feat(mobile): add usePresenceSubscription primitive

* feat(mobile): subscribe to /presence/app while app is active

* feat(mobile): add useInstancePresence hook

* feat(mobile): add useConversationPresence hook

* fix(mobile): fix lint errors in presence hooks

* feat(mobile): add useEventSubscription primitive

* feat(mobile): add useInstanceEventSubscription

* fix(mobile): apply curly/switch-case-braces lint rules to event hooks

* feat(kilo-chat-hooks): create shared package; extract useConversations

* feat(kilo-chat-hooks): extract useMessages — base query + optimistic send

Move PAGE_SIZE, helper functions (applyReactionAdded/Removed, restoreMessageInCache,
removeMessageFromCache, findMessageInCache), useMessages infinite-query hook, and
useSendMessage mutation into @kilocode/kilo-chat-hooks. Web's useMessages.ts re-exports
the moved hooks and retains local helper copies for remaining mutations (37b will collapse).

* feat(kilo-chat-hooks): useMessages adds edit/delete/react mutations

* feat(kilo-chat-hooks): extract useMessageCacheUpdater into shared package

Moves the live event-stream cache patcher from the web-only useMessages
file into @kilocode/kilo-chat-hooks. Adds an optional onActionFailed
callback so platform wrappers inject toasts; web passes toast.error.

* feat(mobile): wire shared kilo-chat-hooks + platform adapters

* fix(kilo-chat-hooks): centralize query keys; tighten event-subscription API

- Add packages/kilo-chat-hooks/src/query-keys.ts with conversations/
  conversation/messages/bot-status helpers; route every hook + invalidator
  through it. Fixes the mobile useInstanceEventSubscription bug where
  invalidations used ['conversations', sandboxId] but the queries register
  under ['kilo-chat', 'conversations', sandboxId], so list previews and
  unread counts never refreshed on incoming events.
- useEventSubscription now takes a single event name; callers register one
  hook per event. Drops the events.join('|') dependency hack and the
  eslint-disable. useInstanceEventSubscription becomes six explicit
  registrations.
- Drop the hardcoded English toast string from useMessageCacheUpdater;
  onActionFailed is () => void and the message lives at each call site.
- Extract useAppActiveAndFocused to deduplicate AppState+focus boilerplate
  shared by useInstancePresence and useConversationPresence.
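
A rough sketch of what centralized key helpers could look like. Only the `['kilo-chat', 'conversations', sandboxId]` shape comes from the description above; the other key shapes, the `kiloChatKeys` name, and the `isPrefixOf` helper are assumptions for illustration.

```typescript
// Hypothetical key factory; only the `conversations` shape is taken from the commit message.
const kiloChatKeys = {
  conversations: (sandboxId: string) => ["kilo-chat", "conversations", sandboxId] as const,
  conversation: (conversationId: string) => ["kilo-chat", "conversation", conversationId] as const,
  messages: (conversationId: string) => ["kilo-chat", "messages", conversationId] as const,
  botStatus: (sandboxId: string) => ["kilo-chat", "bot-status", sandboxId] as const,
};

// Models a query cache that matches invalidations against registered keys by prefix.
function isPrefixOf(prefix: readonly unknown[], key: readonly unknown[]): boolean {
  return prefix.length <= key.length && prefix.every((part, i) => part === key[i]);
}

const registered = kiloChatKeys.conversations("sb1");

// The hand-written key from the bug never matches the registered query.
const buggyMatch = isPrefixOf(["conversations", "sb1"], registered);

// Routing the invalidator through the same helper always matches.
const fixedMatch = isPrefixOf(kiloChatKeys.conversations("sb1"), registered);
```

With a single factory, a query and its invalidator cannot drift apart, because neither spells the key out by hand.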

* fix(mobile): subscribe to conversation.* events on instance context

The instance-level subscription was listening for message.created/updated/
deleted, which are published on conversation contexts and never fire here.
Replace them with conversation.renamed, conversation.read, and
conversation.activity — the events kilo-chat actually pushes to the
instance context — so list updates (title, unread, last-activity)
invalidate the conversations query as intended.

* chore(mobile): add @shopify/flash-list dependency

Required by the kilo-chat MessageList and ConversationListScreen components.

* chore(mobile): add EXPO_PUBLIC_KILO_CHAT_URL and EXPO_PUBLIC_EVENT_SERVICE_URL

These were declared in env-keys.js by PR 5a but never added to apps/mobile/.env,
which broke the dev build.

* feat(mobile): add EmptyConversationList

* feat(mobile): add ConversationHeader

* feat(mobile): add TypingIndicator placeholder

* feat(mobile): add MessageInput

* feat(mobile): add MessageBubble

* feat(mobile): add MessageList

Implement MessageList using FlashList v2 with maintainVisibleContentPosition
and startRenderingFromBottom for chat layout; wire fetchOlder via onStartReached.

* feat(mobile): add ConversationScreen

* feat(mobile): add ConversationListScreen

* fix(mobile): address review feedback on kilo-chat components

- Drop double-cast `as unknown as Href` in favor of `as Href`
- Use themed `Text` from `@/components/ui/text` and local `useKiloChatClient`
  re-export in `MessageBubble`
- Switch `crypto.randomUUID()` to `expo-crypto`'s `Crypto.randomUUID` to
  match existing usage in `cloud-agent-runtime.ts`

* feat(mobile): add chat sandbox stack layout

* feat(mobile): add conversation list route

* feat(mobile): add conversation message route

* feat(mobile): wire chat deep links and active-conversation suppression

* fix(mobile): clear correct badge bucket on legacy chat foreground push

* chore(mobile): delete Stream-based chat components and routes

* chore(mobile): remove useStreamChatCredentials hook

* chore: remove stream-chat deps and RN patch

* chore(web): remove Stream tRPC procedures

* chore(web): delete Stream chat-credentials API route

* chore(web): strip Stream methods from kiloclaw clients

* chore(web): replace ChatTab with redirect, drop Stream hooks

* chore(kiloclaw): delete src/stream-chat directory

* chore(kiloclaw): remove Stream injections from instance DO and routes

* chore(kiloclaw): remove Stream from controller config-writer

* chore(kiloclaw): drop STREAM_CHAT_* secret bindings

* chore(web): remove residual Stream CSS and npm deps

* chore(mobile): drop unused exports and deps flagged by knip
… badge endpoints (#2961)

* refactor(db): rename channel_badge_counts to badge_counts (general purpose); update all consumers

* feat(db): migration to rename badge_counts and reset rows

* feat(notifications): add badge-bucket key builders

The badge_counts.badge_bucket column is a free-form string. To prevent
namespace collisions as more surfaces start emitting badge updates
(per-instance today, per-conversation later), centralize bucket-key
derivation in @kilocode/notifications and route NotificationChannelDO
through it. Mirrors the presence-context builders in @kilocode/event-service.

Safe to introduce now without a data migration because PR 2's migration
already wipes badge_counts.

* chore(notifications): add EVENT_SERVICE binding, drop STREAM_CHAT_API_SECRET

* chore(notifications): add vitest scaffold

* feat(notifications): rewrite NotificationChannelDO around dispatchPush

* chore(notifications): drop orphan badgeBucketForInstance helper

* feat(notifications): add sendPushForConversation WorkerEntrypoint RPC

* chore(notifications): delete Stream webhook route

* chore(notifications): type EVENT_SERVICE RPC and enable cloudflare:test types

* feat(event-service): add kiloclaw event-context helpers; migrate kilo-chat producer

Adds kiloclawInstanceContext and kiloclawConversationContext path
builders to @kilocode/event-service, replacing hardcoded template
literals in kilo-chat's event-push.ts and its test so all callers
share a single source of truth.

* feat(kilo-chat): add fetchSandboxLabel helper

* chore(kilo-chat): add NOTIFICATIONS service binding

* feat(kilo-chat): publish push on message.created via NOTIFICATIONS RPC

When a chat message is persisted, fire-and-forget a call to
NOTIFICATIONS.sendPushForConversation so non-sender human members of the
conversation receive a push. Runs after realtime/event-service delivery
inside postCommitFanOut, with errors swallowed so push failures cannot
fail the send.

- Skip when there are no other human recipients or no sandboxId.
- senderUserId = callerId for human senders, null for bot senders.
- title is "<sandboxLabel> · <conversationTitle>"; bodyPreview is the
  first 200 chars of the concatenated text blocks.
- Add @kilocode/notifications workspace dep and layer the RPC method
  shape into Env via bindings.d.ts.
- Add a notifications-stub worker to the vitest config so tests can
  spy on env.NOTIFICATIONS.sendPushForConversation, and globally mock
  sandbox-lookup in setup.ts (it imports pg via @kilocode/db).
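
The title and preview rules above can be sketched as follows. The `ContentBlock` shape and the `buildPushContent` name are assumptions, as is joining text blocks with no separator; the real message schema is not shown here.

```typescript
// Hypothetical block shape: only blocks with type "text" contribute to the preview.
type ContentBlock = { type: string; text?: string };

const BODY_PREVIEW_MAX = 200;

function buildPushContent(
  sandboxLabel: string,
  conversationTitle: string,
  blocks: ContentBlock[]
): { title: string; bodyPreview: string } {
  const text = blocks
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join(""); // assumption: plain concatenation, no separator
  return {
    title: `${sandboxLabel} · ${conversationTitle}`,
    bodyPreview: text.slice(0, BODY_PREVIEW_MAX),
  };
}

const example = buildPushContent("My Sandbox", "Deploy help", [
  { type: "text", text: "Hello " },
  { type: "image" },
  { type: "text", text: "world" },
]);
```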

* chore(notifications): drop orphan stream-chat dep, refresh worker types, fix test mock

- Remove `stream-chat` from `services/notifications/package.json`; the Stream
  webhook (its only consumer) was deleted earlier in the stack.
- Regenerate `worker-configuration.d.ts` so the workerd runtime types match the
  current toolchain (sibling services were on `1.20260312.1`; this one had
  drifted to `1.20251217.0` from a stale local cache).
- Fix the global test mock to reference the renamed `badge_counts` table; the
  setup file was authored against the pre-rename name and never matched.
- Tidy two pre-existing lint nits in the new test files (`import type` for
  type-only import, drop unused `cols` parameter).

* fix(notifications): named entrypoint export, retry-safe badge, alarm-leak

- Switch `NotificationsService` from default-only to a named class export
  with a separate default. `services/kilo-chat/wrangler.jsonc` binds via
  `entrypoint: "NotificationsService"`, which resolves named module
  exports. The default-only form (`export default class NotificationsService`)
  exports under the `default` key — kilo-chat's RPC binding would not have
  resolved at deploy. Mirrors the existing pattern in
  `services/kilo-chat/src/index.ts` (`KiloChatService`).

- `dispatchPush` now uses a two-stage idempotency record (`pending` →
  `delivered`). The badge increment was previously non-idempotent: an
  Expo failure returned `failed` without writing the idempotency key, so
  upstream retries (which the design explicitly invites) re-ran the
  increment before the next send and inflated the badge by one per
  retry. The `pending` marker is written before the increment and
  short-circuits the increment on retry; the `delivered` marker is only
  written on success.

- `setAlarm` is now gated on `getAlarm() === null`. Calling `setAlarm`
  unconditionally on each successful push — as the previous code did —
  replaces the pending alarm and pushes the cleanup forward indefinitely
  on a conversation receiving more than one push per `IDEM_TTL_MS`,
  leaking expired idempotency entries.

Adds two test cases covering the badge-retry and alarm-reset paths.
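
A simplified, in-memory sketch of the two-stage record, assuming a hypothetical `PushDispatcherSketch` class with a `Map` in place of DO storage and a boolean-returning `send` callback in place of the Expo call:

```typescript
type IdemState = "pending" | "delivered";
type Outcome = "delivered" | "duplicate" | "failed";

// Hypothetical stand-in for the DO; a Map replaces Durable Object storage.
class PushDispatcherSketch {
  badge = 0;
  private idem = new Map<string, IdemState>();
  private send: (badge: number) => Promise<boolean>;

  constructor(send: (badge: number) => Promise<boolean>) {
    this.send = send;
  }

  async dispatch(idempotencyKey: string): Promise<Outcome> {
    const state = this.idem.get(idempotencyKey);
    if (state === "delivered") return "duplicate";

    if (state === undefined) {
      // First attempt: write `pending` BEFORE incrementing, so a retry
      // finds the marker and skips the increment.
      this.idem.set(idempotencyKey, "pending");
      this.badge += 1;
    }

    const ok = await this.send(this.badge);
    if (!ok) return "failed"; // record stays `pending`; retries resend only

    // `delivered` is written only on success.
    this.idem.set(idempotencyKey, "delivered");
    return "delivered";
  }
}
```

A failed send followed by a successful retry leaves the badge at one increment, which is the property the retry test case is meant to pin down.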

* fix(notifications): close two cleanup-alarm leaks

- Schedule the cleanup alarm when writing the `pending` marker, not only
  on `delivered`. Without this, an Expo failure followed by no further
  push activity for the conversation leaves the `pending` record in DO
  storage forever (no alarm was ever set to prune it).

- After the alarm fires, reschedule for the earliest remaining record's
  expiry instead of leaving the alarm slot empty. Otherwise a quiet
  conversation strands its younger entries until some unrelated future
  dispatch wakes the DO up.

Both paths go through a small `ensureCleanupAlarm` helper that gates on
`getAlarm() === null` so a busy conversation still doesn't push the
alarm forward on every call.
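
A minimal sketch of the gating helper and the reschedule-after-fire step. `AlarmStore` is a hypothetical interface that mirrors only the `getAlarm`/`setAlarm` calls named above, and `pruneAndReschedule` plus the `expiries` map are illustrative stand-ins for the real alarm handler and stored records.

```typescript
// Hypothetical subset of the storage API used by the helper.
interface AlarmStore {
  getAlarm(): Promise<number | null>;
  setAlarm(scheduledTime: number): Promise<void>;
}

// Only schedules when no alarm is pending, so a busy conversation
// cannot keep pushing the cleanup forward.
async function ensureCleanupAlarm(storage: AlarmStore, at: number): Promise<boolean> {
  if ((await storage.getAlarm()) !== null) return false;
  await storage.setAlarm(at);
  return true;
}

// Illustrative alarm handler body: prune expired records, then reschedule
// for the earliest remaining expiry instead of leaving the slot empty.
async function pruneAndReschedule(
  storage: AlarmStore,
  expiries: Map<string, number>,
  now: number
): Promise<void> {
  for (const [key, expiresAt] of expiries) {
    if (expiresAt <= now) expiries.delete(key);
  }
  if (expiries.size > 0) {
    await ensureCleanupAlarm(storage, Math.min(...expiries.values()));
  }
}
```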

* refactor(event-service): compose presence contexts from kiloclaw helpers

The kiloclaw-scoped presence paths are literally `/presence` prefixed
onto the kiloclaw event-context paths. Build them by composition so the
`/kiloclaw/{sandboxId}[/{conversationId}]` segment shape is defined in
exactly one place — `kiloclaw-contexts.ts`.

Pure refactor; same string output, template-literal types still narrow
to the same shape.
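
A sketch of the composition, assuming the helper names from the commits above for the event contexts; the two `presence*` function names are hypothetical, and the real module layout may differ.

```typescript
type KiloclawInstanceContext = `/kiloclaw/${string}`;
type KiloclawConversationContext = `/kiloclaw/${string}/${string}`;

// The `/kiloclaw/...` segment shape is defined only in these two builders.
function kiloclawInstanceContext(sandboxId: string): KiloclawInstanceContext {
  return `/kiloclaw/${sandboxId}`;
}

function kiloclawConversationContext(
  sandboxId: string,
  conversationId: string
): KiloclawConversationContext {
  return `/kiloclaw/${sandboxId}/${conversationId}`;
}

// Hypothetical presence builders: they prefix `/presence` onto the event
// context rather than restating the path segments.
function presenceInstanceContext(sandboxId: string): `/presence${KiloclawInstanceContext}` {
  return `/presence${kiloclawInstanceContext(sandboxId)}`;
}

function presenceConversationContext(
  sandboxId: string,
  conversationId: string
): `/presence${KiloclawConversationContext}` {
  return `/presence${kiloclawConversationContext(sandboxId, conversationId)}`;
}
```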

* feat(web): add kiloChat.getToken tRPC procedure

* refactor(web): use kiloclaw-context helpers for event subscriptions

* feat(web): lift EventServiceClient to global provider

Introduces a single app-shell EventServiceProvider that owns the
EventServiceClient and KiloChatClient for all authenticated routes.
Mounted in (app)/layout.tsx so platform/instance/conversation presence
subscriptions and the kilo-chat UI share one WebSocket.

KiloChatLayout now consumes the global clients via useEventServiceClient()
instead of spinning up its own pair, and the getToken prop is removed from
KiloChatLayoutProps (along with both call sites). The local
useEventService(getToken) factory is dead code and has been deleted;
useInstanceContext / useConversationContext stay since they take
EventServiceClient as a parameter.

* feat(web): add usePresenceSubscription primitive

Thin hook that subscribes the global EventServiceClient to a single
context for the lifetime of the calling component, gated by an `active`
flag. Will back upcoming platform- and instance-level presence
indicators.

* refactor(web): collapse kilo-chat event subscriptions into usePresenceSubscription

- Drop dead getToken field from KiloChatContextValue (no consumers).
- Remove useInstanceContext / useConversationContext hooks; both call
  sites now use the shared usePresenceSubscription primitive directly.
- Harden usePresenceSubscription against empty-string contexts.

* feat(web): subscribe to /presence/web while tab is visible

* feat(web): subscribe to /presence/kiloclaw/{sandboxId} on instance views

* refactor(web): extract useDocumentVisible primitive

* feat(web): subscribe to conversation presence while tab visible

* style(web): reflow useDocumentVisible useState init to one line

* refactor(web): tighten presence hook + kilo-chat router contract

- usePresenceSubscription: accept 'string | null' instead of empty-string
  sentinel; update call sites (KiloChatLayout, MessageArea, useInstancePresence)
- kilo-chat router: validate expiresAt with z.iso.datetime()
- kilo-chat-router test: verify the JWT payload (kiloUserId, tokenSource,
  version) and that expiresAt lands in the expected ~1h window
- MessageArea: comment distinguishing the always-on chat-event subscription
  from the visibility-gated presence subscription

* fix(event-service): refcount subscribe/unsubscribe by context

Multiple consumers can now independently hold the same context without
trampling each other. The wire context.subscribe/context.unsubscribe
messages are only sent on the 0->1 and 1->0 refcount transitions; the
intermediate churn stays client-side.

Resubscribe-on-reconnect dedupes by context key.

Tests cover: double-subscribe collapses to a single wire send, partial
unsubscribe keeps the context alive, last-consumer-out releases it,
mixed batches only send newly-active contexts, unknown-context
unsubscribes are no-ops, and reconnect resubscribes each context once.
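
The refcounting behavior can be sketched roughly as below. `ContextRefCounter` and its `send` callback are hypothetical; the actual client and wire message shape are not shown here, so the callback only models "one wire message with these contexts".

```typescript
type WireType = "context.subscribe" | "context.unsubscribe";

// Hypothetical helper: tracks how many consumers hold each context.
class ContextRefCounter {
  private counts = new Map<string, number>();
  private send: (type: WireType, contexts: string[]) => void;

  constructor(send: (type: WireType, contexts: string[]) => void) {
    this.send = send;
  }

  subscribe(contexts: string[]): void {
    const newlyActive: string[] = [];
    for (const context of contexts) {
      const count = this.counts.get(context) ?? 0;
      this.counts.set(context, count + 1);
      if (count === 0) newlyActive.push(context); // 0 -> 1 transition
    }
    if (newlyActive.length > 0) this.send("context.subscribe", newlyActive);
  }

  unsubscribe(contexts: string[]): void {
    const released: string[] = [];
    for (const context of contexts) {
      const count = this.counts.get(context);
      if (count === undefined) continue; // unknown context: no-op
      if (count <= 1) {
        this.counts.delete(context);
        released.push(context); // 1 -> 0 transition
      } else {
        this.counts.set(context, count - 1);
      }
    }
    if (released.length > 0) this.send("context.unsubscribe", released);
  }

  // On reconnect, each active context is sent once regardless of its refcount.
  resubscribeAll(): void {
    const active = [...this.counts.keys()];
    if (active.length > 0) this.send("context.subscribe", active);
  }
}
```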

* chore(mobile): add EXPO_PUBLIC_KILO_CHAT_URL and EXPO_PUBLIC_EVENT_SERVICE_URL

* chore(mobile): add kilo-chat workspace deps

* feat(mobile): add kilo-chat token getter with caching

* feat(mobile): add useCurrentUserId from JWT sub

* feat(mobile): add KiloChatProvider

* feat(mobile): add useKiloChatClient and useEventServiceClient hooks

* fix(mobile): fix lint errors in kilo-chat token getter

* fix(mobile): fix lint errors in useCurrentUserId hook

* fix(mobile): fix lint errors in useKiloChatClient hook

* feat(mobile): mount KiloChatProvider in (app) layout

* fix(kilo-chat): assert non-null in base64urlEncode loop

* fix(mobile): share kilo-chat token cache + handle fetch errors

Hoist cache and in-flight promise refs to module scope so all
useKiloChatTokenGetter() instances (provider + useCurrentUserId) share
one cache instead of each maintaining an independent one.

Wrap the fetch in try/catch/finally: on error rejectShared() is called
so concurrent waiters fail fast instead of hanging forever, and
inFlight is always cleared in finally regardless of outcome.

* fix(mobile): tie kilo-chat token cache to auth token, decode kiloUserId

- Key the module-level kilo-chat JWT cache and in-flight ref on the
  current auth token, so signing out and back in as a different user
  within the 1h token window no longer returns the previous user's
  cached JWT.
- Restructure dedup so the first caller awaits the same shared promise
  via a slot reference, eliminating the unhandled rejection that the
  prior resolve/reject-pair pattern produced when the only caller's
  fetch failed.
- Decode kiloUserId from the JWT payload instead of the standard `sub`
  claim — generateApiToken writes the user id as kiloUserId, so the
  sub-based version always returned null.

* refactor(notifications): re-key DO per-user, move badge state to DO storage

Key NotificationChannelDO by recipient userId instead of conversationId,
and store per-bucket badge counts directly in DO storage under
`bucket:${badgeBucket}` keys. The Drizzle `badge_counts` insert/sum paths
are gone from the DO; sendPushForConversationCore now fans out to one DO
per recipient via idFromName(userId). Adds private incrementBucket /
getTotal helpers and public markBucketRead / listNonZeroBuckets RPC for
the upcoming HTTP routes.
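
An in-memory sketch of the per-bucket bookkeeping, assuming a hypothetical `BadgeBucketsSketch` class. The method names and the `bucket:` key prefix follow the description above; the signatures, the `Map` standing in for DO storage, and deleting the key on read are assumptions.

```typescript
const BUCKET_PREFIX = "bucket:";

// Hypothetical stand-in for one recipient's DO; a Map replaces DO storage.
class BadgeBucketsSketch {
  private storage = new Map<string, number>();

  incrementBucket(badgeBucket: string, delta = 1): number {
    const key = `${BUCKET_PREFIX}${badgeBucket}`;
    const next = (this.storage.get(key) ?? 0) + delta;
    this.storage.set(key, next);
    return next;
  }

  // Total across all buckets, e.g. for the OS badge number.
  getTotal(): number {
    let total = 0;
    for (const [key, count] of this.storage) {
      if (key.startsWith(BUCKET_PREFIX)) total += count;
    }
    return total;
  }

  // Assumption: marking a bucket read removes its key entirely.
  markBucketRead(badgeBucket: string): void {
    this.storage.delete(`${BUCKET_PREFIX}${badgeBucket}`);
  }

  listNonZeroBuckets(): Record<string, number> {
    const result: Record<string, number> = {};
    for (const [key, count] of this.storage) {
      if (key.startsWith(BUCKET_PREFIX) && count > 0) {
        result[key.slice(BUCKET_PREFIX.length)] = count;
      }
    }
    return result;
  }
}
```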

* feat(notifications): JWT auth + badge HTTP routes

Mirrors kilo-chat's auth middleware: bearer Kilo JWT verified against
NEXTAUTH_SECRET, callerId/callerKind set on context. Mounts CORS + auth
on /v1/* and adds GET /v1/badges + POST /v1/badges/mark-read backed by
NotificationChannelDO RPC methods.

* fix(notifications): mount useWorkersLogger so auth setTags is effective

Without the middleware, logger.setTags in authMiddleware writes to no
AsyncLocalStorage frame. Mirrors the kilo-chat setup. Also tightens the
mark-read missing-bucket test to lock the JSON error contract for mobile.

* refactor(web): drop badge_counts tRPC procedures

Remove markChatRead and getUnreadCounts from user router; mobile now
calls the notifications worker HTTP routes (GET /v1/badges,
POST /v1/badges/mark-read) added in tasks 63-64. The badge_counts
table itself is dropped in a follow-up.

* feat(mobile): call notifications worker for badge counts

Replace tRPC `user.getUnreadCounts` and `user.markChatRead` (deleted in
prior commit) with direct fetches to the notifications worker
(`GET /v1/badges`, `POST /v1/badges/mark-read`) authed with the existing
kilo-chat JWT. Adds `EXPO_PUBLIC_NOTIFICATIONS_URL` config and rekeys the
unread-counts query to `['badges', userId]`.

* refactor(db): drop badge_counts table

Badge state now lives in the notifications DO storage; the postgres
table is no longer read or written by any service.

* chore(db): revert incidental NewSecurityAdvisorScan reorder

The previous commit also moved NewSecurityAdvisorScan up next to
SecurityAdvisorScan as a cosmetic cleanup; that's out of scope for
the badge_counts removal. Restore the orphan to its original spot
at end-of-file so the badge_counts diff is minimal.

* docs(notifications): update badge-bucket comment after table drop

* chore: update env vars

* chore(mobile): drop expo public env prefix

* chore(kilo-chat): remove redundant non-null assertion

* fix(mobile): clear badge cache on mark read
Comment thread apps/mobile/src/components/kilo-chat/conversation-screen.tsx
Comment thread apps/mobile/src/components/kilo-chat/conversation-screen.tsx Outdated
Comment thread apps/mobile/src/components/kilo-chat/hooks/use-current-user-id.ts Outdated
Comment thread apps/mobile/src/components/kilo-chat/hooks/use-mark-read.ts
Comment thread apps/web/src/app/(app)/claw/components/ChatTab.tsx Outdated
@@ -0,0 +1 @@
+DROP TABLE "badge_counts" CASCADE;
\ No newline at end of file

WARNING: Dropping badge counts loses existing unread state

This permanently removes all existing badge_counts rows while the new unread store lives in NotificationChannelDO, but there is no backfill from the table into DO storage. Users with non-zero unread counts will have their dashboard/OS badge state reset at migration time. CASCADE also makes this riskier by silently dropping any unexpected dependent objects instead of failing loudly.

Comment thread services/notifications/src/dos/NotificationChannelDO.ts Outdated
Comment thread services/notifications/src/dos/NotificationChannelDO.ts Outdated
export async function sendPushNotifications(
messages: ExpoPushMessage[],
accessToken: string
): Promise<SendResult> {
-if (messages.length === 0) return { ticketTokenPairs: [], staleTokens: [] };
+if (messages.length === 0) return { ticketTokenPairs: [], staleTokens: [], ticketErrors: [] };

const expo = new Expo({ accessToken });
const chunks = expo.chunkPushNotifications(messages);

const ticketTokenPairs: TicketTokenPair[] = [];
const staleTokens: string[] = [];

WARNING: Ticket-error logs include Expo push tokens

PushTicketError stores the raw Expo token for non-stale ticket errors, and callers now surface these errors in results/log-derived summaries. Expo push tokens are credentials for addressing a device and should not be propagated beyond the stale-token cleanup path. Return only counts or a redacted token fingerprint here so non-stale Expo failures cannot leak device tokens into logs or RPC responses.
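
One possible shape for the "counts only" option, as a sketch: `TicketErrorSketch` and `summarizeTicketErrors` are hypothetical, and the real `PushTicketError` fields may differ.

```typescript
// Hypothetical error record; the real PushTicketError shape is not shown here.
type TicketErrorSketch = { token: string; code: string; message: string };

// Collapses ticket errors into per-code counts so that no push token
// is carried into logs or RPC results.
function summarizeTicketErrors(errors: TicketErrorSketch[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const error of errors) {
    counts[error.code] = (counts[error.code] ?? 0) + 1;
  }
  return counts;
}

const summary = summarizeTicketErrors([
  { token: "ExponentPushToken[aaa]", code: "MessageRateExceeded", message: "rate limited" },
  { token: "ExponentPushToken[bbb]", code: "MessageRateExceeded", message: "rate limited" },
  { token: "ExponentPushToken[ccc]", code: "MessageTooBig", message: "payload too large" },
]);
```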

Comment thread apps/web/src/app/(app)/claw/kilo-chat/[conversationId]/page.tsx
Comment thread apps/web/src/app/(app)/claw/kilo-chat/components/MessageInput.tsx
const existing = await this.ctx.storage.get<number>(`${DEDUP_PREFIX}${webhookId}`);
if (existing) {
return Response.json({ ok: true, deduplicated: true });
async dispatchPush(input: DispatchPushInput): Promise<DispatchPushOutcome> {

WARNING: Runtime validation was removed from the DO boundary

DispatchPushInput is only a TypeScript type, so service-binding/RPC callers can still pass malformed payloads at runtime. Without dispatchPushInputSchema.parse(input), invalid userId, empty idempotency keys, negative badge deltas, or malformed push data now reach presence lookup, storage mutation, DB reads, and Expo dispatch. This reopens the side effects that the removed malformed-payload test covered; keep schema validation at this boundary before touching DO storage or external services.
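
A dependency-free sketch of validating at the boundary. The review refers to a schema (`dispatchPushInputSchema`); the hand-written `parseDispatchPushInput` below is a stand-in, and its three fields are assumptions drawn from the failure cases listed in the comment, not the full input type.

```typescript
// Hypothetical subset of the input; the real DispatchPushInput has more fields.
type DispatchPushInputSketch = {
  userId: string;
  idempotencyKey: string;
  badgeDelta: number;
};

// Rejects malformed payloads before any storage mutation or external call.
function parseDispatchPushInput(input: unknown): DispatchPushInputSketch {
  if (typeof input !== "object" || input === null) {
    throw new TypeError("input must be an object");
  }
  const { userId, idempotencyKey, badgeDelta } = input as Record<string, unknown>;
  if (typeof userId !== "string" || userId.length === 0) {
    throw new TypeError("userId must be a non-empty string");
  }
  if (typeof idempotencyKey !== "string" || idempotencyKey.length === 0) {
    throw new TypeError("idempotencyKey must be a non-empty string");
  }
  if (typeof badgeDelta !== "number" || !Number.isInteger(badgeDelta) || badgeDelta < 0) {
    throw new TypeError("badgeDelta must be a non-negative integer");
  }
  return { userId, idempotencyKey, badgeDelta };
}
```

Calling the parser as the first statement of the RPC method keeps every later step (presence lookup, storage writes, Expo dispatch) working on a value whose shape has been checked at runtime.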

deps: {
getRecipientDOStub: (userId: string) => RecipientDOStub;
}
): Promise<SendPushForConversationOutput> {

WARNING: RPC input validation was removed before recipient dispatch

SendPushForConversationInput is erased at runtime, and this Worker RPC can be called with arbitrary JSON through the service binding. Removing sendPushForConversationInputSchema.parse(input) lets malformed recipient IDs or empty sandbox/conversation/message IDs route to DO stubs and create invalid idempotency/badge/presence contexts instead of failing before side effects. Restore runtime schema parsing before deriving recipients or dispatching to recipient DOs.


async clearBadgeBucketForUser(
input: ClearBadgeBucketForUserInput
): Promise<ClearBadgeBucketForUserOutput> {

WARNING: Badge-clearing RPC now trusts unvalidated user IDs

ClearBadgeBucketForUserInput is only a compile-time type. Without clearBadgeBucketForUserInputSchema.parse(input), a malformed runtime payload can call idFromName(input.userId) with an empty or non-string value and clear the wrong/invalid bucket key instead of rejecting. Since this RPC mutates per-user badge state, validate the input before selecting the Durable Object and calling markBucketRead.

-const parsed = ParamsSchema.parse(params);

-const tokens = await deps.getTokens(parsed.userId);
+const tokens = await deps.getTokens(params.userId);

WARNING: Lifecycle push dispatch no longer validates RPC params

SendInstanceLifecycleNotificationParams does not exist at runtime, so malformed RPC payloads can now reach token lookup and Expo message construction. For example an empty userId or sandboxId will query/delete/send using invalid identifiers and enqueue push payloads clients cannot route. Restore sendInstanceLifecycleNotificationInputSchema.parse(params) before any IO, as the removed tests were asserting.

@@ -1078,12 +1077,12 @@ export default class extends WorkerEntrypoint<KiloClawEnv> {
* stale-online until staleness inference catches up, ~poll interval).
*/
async deliverChatWebhook(payload: ChatWebhookPayload): Promise<void> {
-const parsed = chatWebhookRpcSchema.parse(payload);
+const { targetBotId, ...webhookPayload } = payload;

WARNING: Webhook RPC payload is no longer validated

ChatWebhookPayload is inferred from Zod but not enforced at runtime across the Worker RPC boundary. Removing chatWebhookRpcSchema.parse(payload) means malformed or partial payloads can be destructured and forwarded to the controller as-is after only a targetBotId.startsWith check; payloads with an invalid type, missing message fields, or non-string targetBotId can now throw unexpected errors or bypass the intended schema contract. Parse the payload before deriving sandboxId and forwarding the webhook body.
