Skip to content

feat(api): accept model overrides on session creation and add runtime model switching endpoints#2791

Open
dgageot wants to merge 10 commits into
docker:mainfrom
dgageot:board/54bbc2d7029b6f59
Open

feat(api): accept model overrides on session creation and add runtime model switching endpoints#2791
dgageot wants to merge 10 commits into
docker:mainfrom
dgageot:board/54bbc2d7029b6f59

Conversation

@dgageot
Copy link
Copy Markdown
Member

@dgageot dgageot commented May 13, 2026

Fixes #2780 and #2781.

What changes

POST /api/sessions honours model selection (#2780)

The handler binds the request body to a session.Session so the existing
agent_model_overrides and custom_models_used JSON fields are now picked
up and copied onto the new session in SessionManager.CreateSession. The
runtime applies them the first time it is built for the session (see
runtimeForSession), so a freshly created session can be pinned to a
specific model without an extra round-trip.

Runtime model switching over HTTP (#2781)

Two new endpoints:

  • GET /api/sessions/:id/models → list of runtime.ModelChoice for the
    session's current agent, with IsCurrent set on the active selection
    (override or default), and any inline provider/model ref from the
    session's history (or override) appended as a synthetic choice.
  • PATCH /api/sessions/:id/model → applies a per-agent override and
    persists it. Empty model clears the override and reverts to the
    agent's configured default.

POST /api/sessions/:id/model is also accepted because the existing
pkg/runtime Client.SetAgentModel (used by RemoteRuntime) was
historically a POST — so no client upgrade is required.

Status code mapping:

outcome code
ok 200
invalid JSON body 400
no runtime attached for session 404
runtime doesn't support model switching 422
invalid model ref 400

Implementation notes

  • Reuses runtime.ModelChoice directly as the wire type (added JSON tags).
    No duplicate pkg/api.ModelChoice.
  • The TUI App.AvailableModels now delegates to a shared
    runtime.DecorateModelChoices helper, which is also used by
    SessionManager.AvailableSessionModels. HTTP and TUI clients see the
    same picker state.
  • SessionManager.SetSessionAgentModel snapshots state before mutation
    and rolls back both the in-memory session and the runtime override
    if sessionStore.UpdateSession fails.
  • New sentinel ErrSessionNotRunning distinguishes "session not found
    or not running" from generic errors so the HTTP layer can map it to
    404 deterministically.
  • runtimeForSession re-applies persisted overrides via a small
    applyStoredOverrides helper. Failures are logged at WARN — a stored
    override that no longer resolves must not prevent session resumption.

Tests

  • pkg/server/session_models_test.go (new) — end-to-end coverage for
    GET/PATCH/POST, IsCurrent decoration, custom models appended, empty
    body clears override, store-write rollback, runtime-failure rollback,
    404 when no runtime attached, 422 when runtime doesn't support
    switching, POST verb compat.
  • pkg/runtime/model_switcher_test.go — table test for
    DecorateModelChoices corner cases.
  • pkg/server/session_manager_test.goSupportsModelSwitching on
    fakeRuntime.

Validation

  • task lint — clean.
  • go test -race -count=3 ./pkg/server/... ./pkg/runtime/... ./pkg/api/... — green.
  • Full repo go test — green except pre-existing environmental failures
    in pkg/teamloader (require Docker Model Runner running locally,
    unrelated to this PR).

Commits

  • feat: support model overrides when creating sessions
  • feat: add GET/PATCH/POST endpoints for runtime model switching
  • fix: add backward compat note for POST /sessions/:id/model
  • fix: thread-safe model reads and persistence rollback
  • test: enhance model selection test coverage and fix goroutine leaks
  • refactor(runtime): extract DecorateModelChoices and remove duplication
  • fix: distinguish "session not running" from other errors

dgageot added 7 commits May 13, 2026 14:37
Allow callers to specify agent_model_overrides and custom_models_used
when creating sessions via POST /api/sessions, and wire the runtime
with model switcher configuration to apply those overrides when the
session is first run.
Add GET /api/sessions/:id/models to list available models and the
current override (if any), and PATCH/POST /api/sessions/:id/model to
apply a model override for the session's current agent.

Extend ModelChoice with JSON tags for wire format and add
SessionModelsResponse type. Both endpoints return 422 if the runtime
doesn't support model switching.
PATCH is the canonical verb for updating the agent's model. POST is also
accepted because pkg/runtime Client.SetAgentModel (used by RemoteRuntime)
was historically a POST; keep both so a client upgrade is not required.
Fix a data race in AvailableSessionModels where rs.session is read without
holding sm.mux (the lock protecting reads/writes of AgentModelOverrides and
CustomModelsUsed by SetSessionAgentModel).

In SetSessionAgentModel, snapshot the in-memory override state before
mutating the runtime and session. If sessionStore.UpdateSession fails after
the runtime mutation, roll back both the in-memory session state and the
runtime override so callers don't observe a runtime/store mismatch on the
next request.

Fixes issues identified in review and validated with go test -race.
Extract a startAttachedServer helper that wires up a SessionManager + HTTP
server with t.Cleanup(ln.Close) so the Serve goroutine exits cleanly when
the test finishes. Replaces boilerplate in all model-switching tests and
prevents goroutine leaks.

Add tests for key picker scenarios:
- TestAttachedServer_GetSessionModels_DefaultMarkedCurrent: default marked
  current when no override is active
- TestAttachedServer_GetSessionModels_AppendsCustomModels: session's custom
  model history appears in the picker
- TestSessionManager_SetSessionAgentModel_RollsBackOnStoreFailure: store
  failure rolls back both in-memory session and runtime state
- TestDecorateModelChoices (table): corner cases for the decorate helper

Update TestAttachedServer_GetSessionModels to verify the IsCurrent
decoration flow end-to-end and to remove the manual IsCurrent: true from
the fixture (now correctly set by the manager).
Map ErrSessionNotRunning to 404 so callers can tell apart a stale session ID
from a server-side problem. Introduce the sentinel error and replace four
ad-hoc errors.New() calls with it: SteerSession, FollowUpSession,
AvailableSessionModels, SetSessionAgentModel. Update the HTTP handlers to
check for this sentinel explicitly and return 404 accordingly.

Also add TestAttachedServer_ModelEndpoints_404WhenNotRunning to verify both
GET /sessions/:id/models and PATCH /sessions/:id/model return 404 when no
runtime is attached.
@dgageot dgageot requested a review from a team as a code owner May 13, 2026 14:01
trungutt
trungutt previously approved these changes May 13, 2026
Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

This PR introduces runtime model switching (GET/PATCH/POST endpoints) and session-creation model overrides. The implementation is well-structured with a rollback path and a new error sentinel. Three findings were identified in the changed code.

customRefs = rs.session.CustomModelsUsed
}

choices := runtime.DecorateModelChoices(rs.runtime.AvailableModels(ctx), current, customRefs)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] Lock held across blocking runtime I/O in AvailableSessionModels

sm.mux.Lock() / defer sm.mux.Unlock() wraps the entire function body, including the call to rs.runtime.AvailableModels(ctx) — a context-accepting call that can perform network I/O (e.g. models.dev catalog lookups, provider discovery). Any concurrent caller of RunSession, DeleteSession, SetSessionAgentModel, UpdateSessionTitle, or any other method that acquires sm.mux will block for the full duration of that network call.

The same pattern occurs in SetSessionAgentModel, which calls rs.runtime.SetAgentModel(ctx, agentName, modelRef) under sm.mux.

Fix: snapshot the rs pointer (and any session fields needed) under the lock, release the lock, then invoke the runtime method without holding sm.mux. The runtime is already concurrency-safe via its own internal locking.

Comment thread pkg/server/server.go Outdated
case errors.Is(err, ErrSessionNotRunning):
return echo.NewHTTPError(http.StatusNotFound, err.Error())
default:
return echo.NewHTTPError(http.StatusBadRequest, err.Error())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] Unknown errors from SetSessionAgentModel mapped to HTTP 400 instead of 500

The default branch here returns http.StatusBadRequest (400) for any error that is not ErrModelSwitchingNotSupported or ErrSessionNotRunning. HTTP 400 means the client sent an invalid request — but unexpected errors from the session store or from the runtime (e.g. a network failure, an internal validation error wrapped in fmt.Errorf) are server-side faults, not client mistakes.

By contrast, the companion getSessionModels handler (a few lines above) correctly uses http.StatusInternalServerError (500) in its default branch.

Fix: Change the default case to return http.StatusInternalServerError.

default:
    return echo.NewHTTPError(http.StatusInternalServerError, err.Error())

Comment thread pkg/runtime/model_switcher.go Outdated
for i := range models {
if currentRef != "" {
if models[i].Ref == currentRef {
models[i].IsCurrent = true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MEDIUM] DecorateModelChoices mutates the caller-supplied models slice in-place

models[i].IsCurrent = true and models[i].IsCurrent = models[i].IsDefault modify elements of the slice that was passed in by the caller. In Go, a slice header is copied at the call boundary but the underlying array is shared. If AvailableModels() returns a slice backed by an internal cache (rather than a freshly allocated copy), these index-assignment mutations will silently update the cached structs.

A subsequent call to AvailableModels() could then return entries already marked IsCurrent = true (or false) from a previous decoration call, producing incorrect picker state for a different session or agent.

Fix: Make a defensive copy of the input slice at the top of DecorateModelChoices:

result := make([]ModelChoice, len(models))
copy(result, models)
// ... operate on result, not models

and return result instead of the mutated input.

@dgageot
Copy link
Copy Markdown
Member Author

dgageot commented May 13, 2026

Thanks for the review! Addressed all three findings in 3 separate commits:

[MEDIUM] DecorateModelChoices mutates caller-supplied slice — fixed in 96ea8bb

The function now allocates a result slice (sized len(models)+len(customRefs)+1 to fit the synthetic-override case), copies the input, and operates on the copy. Doc-comment updated to spell out the non-mutation guarantee. Added TestDecorateModelChoices/does_not_mutate_the_input_slice that snapshots the input before calling and asserts deep-equality after.

[MEDIUM] Unknown errors mapped to 400 — fixed in ad1366a

The handler's default branch now returns 500 (mirroring the companion getSessionModels handler). 400 is reserved for the upstream Bind failure on the request body. Added TestAttachedServer_SetSessionModel_StoreFailureReturns500 driving a synthetic UpdateSession failure through the HTTP layer and asserting 500.

[HIGH] Lock held across blocking runtime I/O — fixed in be9e8ae

  • Added a per-session modelSwitch sync.Mutex on activeRuntimes to serialise concurrent SetSessionAgentModel calls on the same session without blocking other sessions.
  • AvailableSessionModels now snapshots current and customRefs under sm.mux briefly, releases the lock, then calls runtime.AvailableModels(ctx).
  • SetSessionAgentModel is restructured so sm.mux is taken in three short critical sections (snapshot prev → mutate session → rollback in-memory) but released around the slow runtime calls (SetAgentModel apply + rollback) and the store write. Atomicity wrt other model-switches on the same session is preserved by the new modelSwitch lock; the rollback contract is unchanged.
  • Added two synchronisation-style tests that park a deliberately-slow runtime call open and verify that an unrelated SetSessionStarred (sm.mux-acquiring) completes within 2s while the runtime call is still in flight:
    • TestSessionManager_AvailableSessionModels_DoesNotHoldMuxDuringRuntimeIO
    • TestSessionManager_SetSessionAgentModel_DoesNotHoldMuxDuringRuntimeIO

Validation

  • task lint — clean.
  • go test -race -count=5 ./pkg/server/... — green.
  • go test -race -count=3 ./pkg/server/... ./pkg/runtime/... — green.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

API: allow choosing a model when creating a session

3 participants