Merged
4 changes: 2 additions & 2 deletions docs/prd.md
@@ -3999,10 +3999,10 @@ Notation: `[x]` shipped, `[ ]` planned. Mirrors the README's roadmap so contribu
- [ ] Typed hook payloads: `onSkillStart<T>`, `onToolCall<T>`, `onToolResult<T>` (§8.4)
- [ ] Typed memory strategies: `sliding<T>`, `tokenBudget<T>`, `summarized<T>` namespaces (§8.5)
- [ ] Human-in-the-loop: `confirm()` with message templates, timeouts, fallback behavior (§9.2.1)
- [ ] Session model — multi-turn `AgentSession`, automatic compaction (`SUMMARIZE`, `SLIDING_WINDOW`, `CUSTOM`) (§5.7)
- [x] Session model — multi-turn `AgentSession` **shipped** (#1736; `events` Flow + `await()` + snapshot/resume). _Remaining:_ automatic compaction (`SUMMARIZE` / `SLIDING_WINDOW` / `CUSTOM`) (§5.7)
- [ ] Reactive context hooks: `beforeInference`, `afterToolCall` — context-mutating hooks that inject system reminders (§8.4)
- [ ] `.spawn {}` — independent sub-agent lifecycle, `AgentHandle<OUT>`, parent-managed join
- [ ] `Flow<PipelineEvent>` for reactive UIs + Pipeline-level events (`StageStarted`, `PipelineCompleted`, etc) — depends on streaming, sub-agents, sessions (§10.2)
- [x] Reactive event stream for UIs **shipped** — `AgentSession.events: Flow<AgentEvent>` (#1736) + `agent.observe { }` (#965). _Remaining:_ composition-stage event types (`StageStarted`, `PipelineCompleted`) at the Pipeline level (§10.2)
- [ ] Serialization — `agent.json`, A2A AgentCard
- [ ] JAR bundles and folder-based assembly
- [ ] Gradle plugin
10 changes: 5 additions & 5 deletions docs/roadmap.md
@@ -70,7 +70,7 @@ The 0.6.0 epic ([#1911](../../issues/1911)) tracks the full acceptance criteria.
- [ ] jlink minimal JRE bundle for runtime (~35MB)

*Secondary:*
- [ ] Session model — multi-turn `AgentSession`, automatic compaction (`SUMMARIZE`, `SLIDING_WINDOW`, `CUSTOM`)
- [x] Session model — multi-turn `AgentSession` **shipped** (cold `events: Flow<AgentEvent>` + `await()` + snapshot/resume; #1736 — see "Streaming session surface" below). _Remaining:_ automatic compaction (`SUMMARIZE` / `SLIDING_WINDOW` / `CUSTOM`).
- [x] **`onBefore*` interceptor family** — Rails-style `onBeforeSkill` / `onBeforeToolCall` / `onBeforeTurn` returning a sealed `Decision { Proceed | ProceedWith(args) | Deny(reason) | Substitute(result) }`. Sibling to today's post-hoc observer hooks (`onToolUse` / `onSkillChosen` / `onError`). Unifies per-client tool policy (McpServer), action confirmation, prompt-injection filtering (one-liner: `onBeforeTurn { msgs -> if (filter.flag(msgs)) Decision.Deny(...) else Decision.Proceed }`), and uniform `perToolTimeout` wrapping. Chain semantics: registration order, all run, first non-`Proceed` wins. ([#1907](../../issues/1907), feeds [#1908](../../issues/1908))
- [x] Agent memory — `MemoryBank`, `memory_read`/`memory_write`/`memory_search` auto-injected tools
- [ ] `.spawn {}` — independent sub-agent lifecycle, `AgentHandle<OUT>`, parent-managed join
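
The chain semantics stated for the `onBefore*` interceptor family (registration order, all run, first non-`Proceed` wins) can be illustrated with a minimal sketch. Only `Decision` and its four variants come from the roadmap text; `InterceptorChain`, `ToolInterceptor`, `evaluate`, and the payload types are hypothetical names chosen for illustration, and the sketch assumes every interceptor sees the original arguments rather than ones rewritten by an earlier `ProceedWith`.

```kotlin
// Hypothetical sketch, not the library's API: only the Decision variants
// are taken from the roadmap entry; everything else is illustrative.
sealed interface Decision {
    object Proceed : Decision
    data class ProceedWith(val args: Map<String, Any?>) : Decision
    data class Deny(val reason: String) : Decision
    data class Substitute(val result: String) : Decision
}

// Assumed interceptor shape: tool name plus arguments in, Decision out.
typealias ToolInterceptor = (toolName: String, args: Map<String, Any?>) -> Decision

class InterceptorChain {
    private val interceptors = mutableListOf<ToolInterceptor>()

    // Interceptors are stored, and later run, in registration order.
    fun onBeforeToolCall(interceptor: ToolInterceptor) {
        interceptors += interceptor
    }

    fun evaluate(toolName: String, args: Map<String, Any?>): Decision {
        // List.map is eager, so every interceptor runs even after a Deny.
        val decisions = interceptors.map { it(toolName, args) }
        // The first non-Proceed decision wins; otherwise the call proceeds.
        return decisions.firstOrNull { it !is Decision.Proceed } ?: Decision.Proceed
    }
}

fun main() {
    val chain = InterceptorChain()
    val calls = mutableListOf<String>()

    chain.onBeforeToolCall { _, _ -> calls += "audit"; Decision.Proceed }
    chain.onBeforeToolCall { name, _ ->
        calls += "policy"
        if (name == "shell") Decision.Deny("shell is disabled") else Decision.Proceed
    }
    chain.onBeforeToolCall { _, args ->
        calls += "rewrite"
        Decision.ProceedWith(args + ("dryRun" to true))
    }

    println(chain.evaluate("shell", emptyMap())) // Deny(reason=shell is disabled)
    println(calls)                               // [audit, policy, rewrite]
}
```

Running all interceptors before picking a winner keeps audit-style interceptors informed even when an earlier one denies the call; whether the real implementation threads `ProceedWith` arguments through later interceptors is not stated in the roadmap and is left out here.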
@@ -80,10 +80,10 @@ The 0.6.0 epic ([#1911](../../issues/1911)) tracks the full acceptance criteria.
- [x] **Enforce `perToolTimeout` on session-aware tool path** — `sessionExecutor` calls now respect `budget.perToolTimeout`, emit failed `ToolCallFinished` events on timeout, and surface `BudgetExceededException(PER_TOOL_TIMEOUT)`. ([#1903](../../issues/1903))
- [x] **Streaming docs reconcile** — README Limitations / Roadmap bullets are tagged as shipped / experimental / planned; the stale "no per-adapter native streaming yet" wording is gone, and DeepSeek is called out as using the OpenAI-compatible SSE path. ([#1901](../../issues/1901))
- [x] Per-adapter native streaming overrides — Anthropic SSE (`ClaudeClient.chatStream`), OpenAI SSE (`OpenAiClient.chatStream`), Ollama NDJSON `stream: true` (`OllamaClient.chatStream`) all emit real partial chunks at the wire. Live integration tests measure 19 / 2 / 19 chunks per response respectively. See [v0.5.0 streaming premortem](premortem-0.5.0-streaming.md)
- [ ] `Flow<PipelineEvent>` for reactive UIs + Pipeline-level events (`StageStarted`, `PipelineCompleted`, etc) — built on top of `LlmChunk`; depends on sub-agents and sessions
- [ ] **Multimodal input** — vision and audio content blocks on LLM messages.
- **Image input:** vision-capable adapters accept image bytes + media type as a content block alongside text. Targets: Anthropic (`image` content blocks), OpenAI (`image_url` / base64 in content), Ollama (`llava` / `bakllava` via `images` field), Google Gemini.
- **Audio input:** true audio input (Gemini, GPT-4o-audio) — `LlmContent.Audio` block. Optional STT-only helper `audio.transcribe(file)` for the Whisper-style use case.
- [x] Reactive event stream for UIs — **shipped**: `AgentSession.events: Flow<AgentEvent>` (#1736) + `agent.observe { (PipelineEvent) -> }` (#965); a UI consumes the typed agent stream today (`Token` / `ToolCall*` / `SkillStarted` / `SkillCompleted` / `Completed` / `Failed`). _Remaining:_ composition-stage event types (`StageStarted`, `PipelineCompleted`) at the Pipeline level.
- [x] **Multimodal input — image/document (vision): SHIPPED** end-to-end across **Anthropic / OpenAI / Ollama** via `Content.Image` → `ImagePart` → provider wire and `agent.invokeWithAttachments` (#2466–#2470). The `Content` sealed type (`Text / Image / Audio / Video / Document`, each with a typed `ContentRef` + closed mime) is in place. _Remaining:_ audio/video input (below) + a Gemini provider to extend vision to.
- **Image/document input — [x] shipped** for Anthropic / OpenAI / Ollama (image bytes + media type as a content block alongside text; Gemini pending — no provider yet).
- **Audio/video input — [ ] remaining:** `Content.Audio` / `Content.Video` variants are typed but not yet sent to providers (Gemini, GPT-4o-audio). Optional STT-only helper `audio.transcribe(file)` for the Whisper-style use case.
- **Architectural change:** `LlmMessage.content: String` needs to evolve into a `List<LlmContent>` sealed type (Text / Image / Audio blocks). Binary-compat risk: add a sibling `contentBlocks: List<LlmContent>?` field first with the existing String form auto-coerced into a single Text block; deprecate the String form once the API surface settles. Typed boundaries are unaffected — `Agent<Image, String>` (image classifier) and `Agent<AudioClip, String>` (transcriber) become coherent agent shapes.
- [ ] Serialization — `agent.json`, A2A AgentCard
- [ ] JAR bundles and folder-based assembly
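
The binary-compatible migration described in the "Architectural change" bullet (a sibling `contentBlocks` field, with the existing `String` form coerced into a single Text block) might look roughly like the sketch below. The `role` field, the `effectiveBlocks` accessor, and the `bytes` / `mediaType` properties are assumptions made for illustration; only `LlmMessage.content: String`, `contentBlocks: List<LlmContent>?`, and the Text / Image / Audio block names come from the roadmap text.

```kotlin
// Hypothetical sketch of the migration path, not the project's actual types.
sealed interface LlmContent {
    data class Text(val text: String) : LlmContent
    // bytes + mediaType are assumed field names for binary blocks.
    data class Image(val bytes: ByteArray, val mediaType: String) : LlmContent
    data class Audio(val bytes: ByteArray, val mediaType: String) : LlmContent
}

data class LlmMessage(
    val role: String,                              // assumed field, for illustration
    val content: String,                           // existing String form, kept for compatibility
    val contentBlocks: List<LlmContent>? = null    // new sibling field, optional
) {
    // Explicit blocks win; otherwise the legacy String is coerced
    // into a single Text block, as the roadmap bullet describes.
    val effectiveBlocks: List<LlmContent>
        get() = contentBlocks ?: listOf(LlmContent.Text(content))
}

fun main() {
    val legacy = LlmMessage(role = "user", content = "Describe this diagram")
    val multimodal = LlmMessage(
        role = "user",
        content = "Describe this diagram",
        contentBlocks = listOf(
            LlmContent.Text("Describe this diagram"),
            LlmContent.Image(byteArrayOf(1, 2, 3), "image/png")
        )
    )

    println(legacy.effectiveBlocks)          // [Text(text=Describe this diagram)]
    println(multimodal.effectiveBlocks.size) // 2
}
```

Because the new field defaults to `null`, existing call sites that construct a message from a plain `String` keep compiling, while adapters can read a single block list regardless of which form the caller used.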