feat: add mcp_catalog toolset for on-demand MCP server discovery#2794
feat: add mcp_catalog toolset for on-demand MCP server discovery#2794dgageot wants to merge 7 commits into
Conversation
persist managedOAuth flag so it applies to servers enabled after SetManagedOAuth was called, honor ctx.Err() when iterating enabled toolsets in Tools() so cancelled contexts return early, fix package doc comment that said three meta-tools instead of four, and add three tests for these fixes.
Adds a fifth tool to the mcp_catalog toolset for clearing persisted OAuth credentials. Useful when a token is revoked, scopes change, or the user signs in to the wrong account. For oauth servers: stops the active MCP session, removes the token from the keyring, and fires the tools-changed handler if enabled. For api_key/ none servers: no-op. Keyring errors surface as tool errors, not panics. Includes 5 new tests validating the auth reset behavior.
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟡 NEEDS ATTENTION
Two medium-severity findings in pkg/tools/builtin/mcpcatalog/mcpcatalog.go — see inline comments for details.
| // Pre-flight: warn (don't block) if an api_key server is missing its env var. | ||
| // We do not block because the user may set the variable later, or rely on | ||
| // the model to surface the error from the first tool call. | ||
| missing := t.missingAPIKeyEnv(ctx, server) |
There was a problem hiding this comment.
[MEDIUM] Blocking external calls held under write lock in handleEnable
missingAPIKeyEnv (line 475) and t.expander.ExpandMap (line 477) are both called while t.mu is write-locked (acquired at line 466, released at line 499). missingAPIKeyEnv calls t.env.Get(ctx, sec.Env) which is an external provider that may hit the OS keychain or a network service. ExpandMap similarly invokes an external expander interface.
Holding a write lock during these calls blocks all concurrent readers — including Tools() (which every agent turn calls), handleSearch, and handleList — for the full duration of the external I/O. For servers with multiple api_key secrets (e.g. Apify with APIFY_API_KEY), each secret triggers a separate env.Get round-trip.
Both functions operate exclusively on server, which comes from t.byID — a map that is read-only after construction and requires no mutex protection. They should be moved before the t.mu.Lock() call:
// Perform slow external calls BEFORE acquiring the lock —
// server data is immutable, no protection needed here.
missing := t.missingAPIKeyEnv(ctx, server)
headers := t.expander.ExpandMap(ctx, server.Headers)
t.mu.Lock()
if _, exists := t.enabled[id]; exists {
t.mu.Unlock()
return tools.ResultSuccess(...), nil
}
// ... rest of the enable logic| } | ||
|
|
||
| if !e.ts.IsStarted() { | ||
| if err := e.ts.Start(ctx); err != nil { |
There was a problem hiding this comment.
[MEDIUM] Concurrent Disable during Tools() iteration may restart a just-disabled server
Tools() snapshots t.enabled under RLock (lines 326–331), then iterates the snapshot outside the lock. If handleDisable runs concurrently it:
- Removes the server from
t.enabledand releases the lock - Calls
wrapped.Stop(ctx)— marking theStartableToolSetas stopped
Meanwhile, Tools() still holds a reference to the same *tools.StartableToolSet and, on line 339, calls e.ts.Start(ctx) if IsStarted() is false — which it is, because Stop just cleared it.
The result is that a server the user explicitly disabled can be silently reconnected within the same Tools() invocation, restoring its tools to the active set until the next invocation. Whether StartableToolSet.Start() honours a post-Stop() call as a no-op is the key question; if it doesn't reject the call, this is a real reconnection.
A simple mitigation is to re-check membership in t.enabled before starting:
for _, e := range enabled {
if err := ctx.Err(); err != nil {
return nil, err
}
// Skip servers that were disabled after we snapshotted.
t.mu.RLock()
_, stillEnabled := t.enabled[e.id]
t.mu.RUnlock()
if !stillEnabled {
continue
}
if !e.ts.IsStarted() {
// ... existing start logic
}
// ...
}The new pkg/tools/builtin/mcpcatalog package embeds servers.json via //go:embed. The .dockerignore default-denies everything and only whitelists specific paths, so the embedded JSON would be missing from the Docker build context, causing the build to fail.
Summary
Adds a new built-in toolset,
mcp_catalog, that lets agents discover and on-demand activate remote (streamable-http) MCP servers from the Docker MCP Catalog without declaring each one ahead of time in YAML.The catalog (45 servers) is embedded at build time from
https://desktop.docker.com/mcp/catalog/v3/catalog.json(filtered totype=remote AND remote.transport_type=streamable-http). The toolset surfaces five meta-tools to the model:search_remote_mcp_servers— case-insensitive fuzzy search over id / title / description / category / tags.list_remote_mcp_servers— show currently enabled servers.enable_remote_mcp_server— instantiate an*mcp.Toolsetfor the server (deferred — no network until tools are next enumerated).disable_remote_mcp_server— stop the toolset and remove its tools.reset_remote_mcp_server_auth— wipe persisted OAuth credentials so the next enable triggers a fresh authorization flow.How it works
*mcp.Toolsetis wrapped in atools.NewStartableso it gets the same lazy single-flight Start semantics as YAML-declared toolsets.Tools()lazily Start()s each enabled wrapped toolset on first enumeration. Treatsmcp.IsAuthorizationRequiredas expected deferral (silent on the non-interactive startup probe; user-facing OAuth dialog on the next interactive turn). Real Start failures get the StartableToolSet de-duplicated warning streak.mcp.remotetoolset.${ENV_VAR}placeholders in catalog headers (e.g. Apify'sAuthorization: Bearer ${APIFY_API_KEY}) resolve at enable time viajs.Expander— same code path the YAML toolsets use.reset_remote_mcp_server_authstops the inner toolset (so the live MCP session drops its now-stale token) before notifying the runtime and clearing the keyring entry. The notify fires even when the keyring removal fails, so the runtime's tool list never desyncs from the actualenabledstate.Configuration
See
examples/mcp_catalog.yamlfor a full example.What's in this branch
pkg/tools/builtin/mcpcatalog/{mcpcatalog.go, servers.go, servers.json}— toolset, embedded catalog (45 servers).pkg/teamloader/registry.go— registers the newmcp_catalogtoolset type.agent-schema.json— addsmcp_catalogto the toolset-type enums.examples/mcp_catalog.yaml— example agent usinganthropic/claude-sonnet-4-6.pkg/tools/builtin/mcpcatalog/mcpcatalog_test.go— 19 tests (load, search, lifecycle, race-friendly concurrent enable/disable, ctx cancellation, lazy-start regression test against a fake MCP server, auth-required deferral, reset_auth happy path / unknown id / no-op for non-oauth / disable-on-reset / error surfacing / notify-on-keyring-failure).Known limitation
The runtime's MCP-prompt discovery uses
tools.As[*mcp.Toolset]to find prompt-capable toolsets. Becausemcp_catalogis a container, prompts exposed by servers activated through this catalog are not surfaced via/prompts. Tools (the primary interface) work fine. Plumbing prompt discovery through container toolsets would require a new interface and changes topkg/runtime/runtime.go— out of scope for this PR.Validation
task build✅go vet ./...✅golangci-lint run ./...→ 0 issues ✅go test ./pkg/tools/builtin/mcpcatalog/... -race -count=1→ all 19 tests pass ✅