feat(api): accept model overrides on session creation and add runtime model switching endpoints#2791
feat(api): accept model overrides on session creation and add runtime model switching endpoints#2791dgageot wants to merge 10 commits into
Conversation
Allow callers to specify agent_model_overrides and custom_models_used when creating sessions via POST /api/sessions, and wire the runtime with model switcher configuration to apply those overrides when the session is first run.
Add GET /api/sessions/:id/models to list available models and the current override (if any), and PATCH/POST /api/sessions/:id/model to apply a model override for the session's current agent. Extend ModelChoice with JSON tags for wire format and add SessionModelsResponse type. Both endpoints return 422 if the runtime doesn't support model switching.
PATCH is the canonical verb for updating the agent's model. POST is also accepted because pkg/runtime Client.SetAgentModel (used by RemoteRuntime) was historically a POST; keep both so a client upgrade is not required.
Fix a data race in AvailableSessionModels where rs.session is read without holding sm.mux (the lock protecting reads/writes of AgentModelOverrides and CustomModelsUsed by SetSessionAgentModel). In SetSessionAgentModel, snapshot the in-memory override state before mutating the runtime and session. If sessionStore.UpdateSession fails after the runtime mutation, roll back both the in-memory session state and the runtime override so callers don't observe a runtime/store mismatch on the next request. Fixes issues identified in review and validated with go test -race.
Extract a startAttachedServer helper that wires up a SessionManager + HTTP server with t.Cleanup(ln.Close) so the Serve goroutine exits cleanly when the test finishes. Replaces boilerplate in all model-switching tests and prevents goroutine leaks. Add tests for key picker scenarios: - TestAttachedServer_GetSessionModels_DefaultMarkedCurrent: default marked current when no override is active - TestAttachedServer_GetSessionModels_AppendsCustomModels: session's custom model history appears in the picker - TestSessionManager_SetSessionAgentModel_RollsBackOnStoreFailure: store failure rolls back both in-memory session and runtime state - TestDecorateModelChoices (table): corner cases for the decorate helper Update TestAttachedServer_GetSessionModels to verify the IsCurrent decoration flow end-to-end and to remove the manual IsCurrent: true from the fixture (now correctly set by the manager).
Map ErrSessionNotRunning to 404 so callers can tell apart a stale session ID from a server-side problem. Introduce the sentinel error and replace four ad-hoc errors.New() calls with it: SteerSession, FollowUpSession, AvailableSessionModels, SetSessionAgentModel. Update the HTTP handlers to check for this sentinel explicitly and return 404 accordingly. Also add TestAttachedServer_ModelEndpoints_404WhenNotRunning to verify both GET /sessions/:id/models and PATCH /sessions/:id/model return 404 when no runtime is attached.
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
This PR introduces runtime model switching (GET/PATCH/POST endpoints) and session-creation model overrides. The implementation is well-structured with a rollback path and a new error sentinel. Three findings were identified in the changed code.
| customRefs = rs.session.CustomModelsUsed | ||
| } | ||
|
|
||
| choices := runtime.DecorateModelChoices(rs.runtime.AvailableModels(ctx), current, customRefs) |
There was a problem hiding this comment.
[HIGH] Lock held across blocking runtime I/O in AvailableSessionModels
sm.mux.Lock() / defer sm.mux.Unlock() wraps the entire function body, including the call to rs.runtime.AvailableModels(ctx) — a context-accepting call that can perform network I/O (e.g. models.dev catalog lookups, provider discovery). Any concurrent caller of RunSession, DeleteSession, SetSessionAgentModel, UpdateSessionTitle, or any other method that acquires sm.mux will block for the full duration of that network call.
The same pattern occurs in SetSessionAgentModel, which calls rs.runtime.SetAgentModel(ctx, agentName, modelRef) under sm.mux.
Fix: snapshot the rs pointer (and any session fields needed) under the lock, release the lock, then invoke the runtime method without holding sm.mux. The runtime is already concurrency-safe via its own internal locking.
| case errors.Is(err, ErrSessionNotRunning): | ||
| return echo.NewHTTPError(http.StatusNotFound, err.Error()) | ||
| default: | ||
| return echo.NewHTTPError(http.StatusBadRequest, err.Error()) |
There was a problem hiding this comment.
[MEDIUM] Unknown errors from SetSessionAgentModel mapped to HTTP 400 instead of 500
The default branch here returns http.StatusBadRequest (400) for any error that is not ErrModelSwitchingNotSupported or ErrSessionNotRunning. HTTP 400 means the client sent an invalid request — but unexpected errors from the session store or from the runtime (e.g. a network failure, an internal validation error wrapped in fmt.Errorf) are server-side faults, not client mistakes.
By contrast, the companion getSessionModels handler (a few lines above) correctly uses http.StatusInternalServerError (500) in its default branch.
Fix: Change the default case to return http.StatusInternalServerError.
default:
return echo.NewHTTPError(http.StatusInternalServerError, err.Error())| for i := range models { | ||
| if currentRef != "" { | ||
| if models[i].Ref == currentRef { | ||
| models[i].IsCurrent = true |
There was a problem hiding this comment.
[MEDIUM] DecorateModelChoices mutates the caller-supplied models slice in-place
models[i].IsCurrent = true and models[i].IsCurrent = models[i].IsDefault modify elements of the slice that was passed in by the caller. In Go, a slice header is copied at the call boundary but the underlying array is shared. If AvailableModels() returns a slice backed by an internal cache (rather than a freshly allocated copy), these index-assignment mutations will silently update the cached structs.
A subsequent call to AvailableModels() could then return entries already marked IsCurrent = true (or false) from a previous decoration call, producing incorrect picker state for a different session or agent.
Fix: Make a defensive copy of the input slice at the top of DecorateModelChoices:
result := make([]ModelChoice, len(models))
copy(result, models)
// ... operate on result, not modelsand return result instead of the mutated input.
|
Thanks for the review! Addressed all three findings in 3 separate commits: [MEDIUM]
|
Fixes #2780 and #2781.
What changes
POST /api/sessionshonours model selection (#2780)The handler binds the request body to a
session.Sessionso the existingagent_model_overridesandcustom_models_usedJSON fields are now pickedup and copied onto the new session in
SessionManager.CreateSession. Theruntime applies them the first time it is built for the session (see
runtimeForSession), so a freshly created session can be pinned to aspecific model without an extra round-trip.
Runtime model switching over HTTP (#2781)
Two new endpoints:
GET /api/sessions/:id/models→ list ofruntime.ModelChoicefor thesession's current agent, with
IsCurrentset on the active selection(override or default), and any inline
provider/modelref from thesession's history (or override) appended as a synthetic choice.
PATCH /api/sessions/:id/model→ applies a per-agent override andpersists it. Empty
modelclears the override and reverts to theagent's configured default.
POST /api/sessions/:id/modelis also accepted because the existingpkg/runtimeClient.SetAgentModel(used byRemoteRuntime) washistorically a POST — so no client upgrade is required.
Status code mapping:
Implementation notes
runtime.ModelChoicedirectly as the wire type (added JSON tags).No duplicate
pkg/api.ModelChoice.App.AvailableModelsnow delegates to a sharedruntime.DecorateModelChoiceshelper, which is also used bySessionManager.AvailableSessionModels. HTTP and TUI clients see thesame picker state.
SessionManager.SetSessionAgentModelsnapshots state before mutationand rolls back both the in-memory session and the runtime override
if
sessionStore.UpdateSessionfails.ErrSessionNotRunningdistinguishes "session not foundor not running" from generic errors so the HTTP layer can map it to
404 deterministically.
runtimeForSessionre-applies persisted overrides via a smallapplyStoredOverrideshelper. Failures are logged at WARN — a storedoverride that no longer resolves must not prevent session resumption.
Tests
pkg/server/session_models_test.go(new) — end-to-end coverage forGET/PATCH/POST, IsCurrent decoration, custom models appended, empty
body clears override, store-write rollback, runtime-failure rollback,
404 when no runtime attached, 422 when runtime doesn't support
switching, POST verb compat.
pkg/runtime/model_switcher_test.go— table test forDecorateModelChoicescorner cases.pkg/server/session_manager_test.go—SupportsModelSwitchingonfakeRuntime.Validation
task lint— clean.go test -race -count=3 ./pkg/server/... ./pkg/runtime/... ./pkg/api/...— green.go test— green except pre-existing environmental failuresin
pkg/teamloader(require Docker Model Runner running locally,unrelated to this PR).
Commits