Skip to content

fix(docker): direct-exec resolved docker binary for isolated spawn (#696 follow-up)#699

Merged
Dumbris merged 1 commit into
mainfrom
fix/mcp-2744-docker-direct-exec
Jun 17, 2026
Merged

fix(docker): direct-exec resolved docker binary for isolated spawn (#696 follow-up)#699
Dumbris merged 1 commit into
mainfrom
fix/mcp-2744-docker-direct-exec

Conversation

@Dumbris

@Dumbris Dumbris commented Jun 17, 2026

Copy link
Copy Markdown
Member

Summary

Docker-isolated upstream servers still failed to start with command not found: docker when the docker CLI is not on the spawn PATH (Docker Desktop installed without the optional CLI-tools step), even after PR #697. #697 resolved the absolute path but still routed the spawn through $SHELL -l -c "<docker> run …", where the login shell re-derives $PATH from rc files and can drop the bundle dir. The absolute path was only a token inside the -c string, so the bug persisted while upstream_servers list reported a healthy docker_path (same resolver, different time/path → "status available, spawn not found").

Related #696. Branched from origin/main (contains #697 / 79849e0c).

Changes

A — direct-exec the resolved docker binary (primary). On successful resolution, setupDockerIsolation now returns the resolved absolute docker binary as the exec command with the raw docker argv (no login-shell wrap), mirroring the management path's newDockerCmd (exec.CommandContext(dockerBin, …)). It returns a shellWrapped flag so callers select the matching cidfile helper. Login-shell wrap of bare docker is kept only on the resolution-failure fallback. The spawn env already carries the enhanced PATH (BuildSecureEnvironment, EnhancePath=true), and exec'ing an absolute path needs no PATH at all.

  • New insertCidfileIntoDockerArgs inserts --cidfile <file> right after the run token in the argv slice — platform-agnostic, sidestepping the -c vs /c shell quirk. Both call sites (connection_stdio.go, connection_launcher.go) branch on shellWrapped.

B — negative-cache cannot shadow a live os.Stat probe. ResolveDockerPath re-runs the cheap, sandbox-immune well-known os.Stat probe before honoring a still-live cached negative, so a failure cached when only the restricted login-shell leg failed can never permanently shadow a docker binary sitting at a well-known path.

Tests (TDD — written first, watched fail, then implemented)

  • TestSetupDockerIsolationExecsResolvedBinaryDirectly (new, the regression that failed today): asserts finalCommand == <resolved>, args[0] == "run", no shell -c indirection.
  • Updated TestSetupDockerIsolationUsesResolvedAbsolutePath, …CidfileInsertionWithAbsolutePath, …FallsBackToBareDocker, TestInsertCidfileWindowsCmdFormat for the args-based cidfile + direct-exec shape; added TestInsertCidfileIntoDockerArgs.
  • TestResolveDockerPath_StatProbeOverridesLiveNegative (new) — a still-live negative must not shadow a now-present well-known docker.

Verification

go test ./internal/upstream/... ./internal/shellwrap/... -race   # ok
go build ./cmd/mcpproxy                                          # ok
golangci-lint run --config .github/.golangci.yml ./internal/upstream/core/... ./internal/shellwrap/...  # 0 issues (v2)

Docs: docs/docker-isolation.md troubleshooting section updated to describe direct-exec (ENG-9).

Generated with mcpproxy cockpit — engineer agent. Human merges at the pre-merge gate.

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 17, 2026

Copy link
Copy Markdown

Deploying mcpproxy-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: e7ac964
Status: ✅  Deploy successful!
Preview URL: https://2321a134.mcpproxy-docs.pages.dev
Branch Preview URL: https://fix-mcp-2744-docker-direct-e.mcpproxy-docs.pages.dev

View logs

@codecov-commenter

codecov-commenter commented Jun 17, 2026

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/shellwrap/shellwrap.go 90.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown

📦 Build Artifacts

Workflow Run: View Run
Branch: fix/mcp-2744-docker-direct-exec

Available Artifacts

  • archive-darwin-amd64 (28 MB)
  • archive-darwin-arm64 (25 MB)
  • archive-linux-amd64 (16 MB)
  • archive-linux-arm64 (14 MB)
  • archive-windows-amd64 (28 MB)
  • archive-windows-arm64 (25 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (21 MB)
  • installer-dmg-darwin-arm64 (19 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 27673404798 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.

@Dumbris

Dumbris commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

🤖 Codex review (model gpt-5.5, grounded in fix/mcp-2744-docker-direct-exec vs origin/main)

Verdict: REQUEST_CHANGES — the binary-resolution fix is correct, but the direct-exec path introduces a daemon-connectivity regression for shell-derived Docker environments (P2).

[P2] Preserve shell-derived Docker environment for direct exec — internal/upstream/core/connection_docker.go:89

When Docker daemon access depends on variables initialized by the user's login shell — e.g. DOCKER_HOST / DOCKER_CONTEXT for Colima / rootless Docker — the new direct-exec path skips wrapWithUserShell and runs docker with only BuildSecureEnvironment(), whose default allowlist does not include Docker-specific variables. Previously the isolated launch sourced the login shell before docker run, so these setups could start containers; now binary resolution succeeds but docker run cannot connect to the daemon — a silent regression for Colima/rootless users.

Fix options: add the Docker env vars (DOCKER_HOST, DOCKER_CONTEXT, DOCKER_CONFIG, DOCKER_TLS_VERIFY, DOCKER_CERT_PATH) to the secure-env allowlist for the docker-spawn path, capture them from the login shell once at resolve time, or fall back to shell-wrapping when any of these are set. Add a regression test for the Colima case (DOCKER_HOST set, must survive to the spawn env).

Reviewed via codex exec review --base origin/main in a checkout of the PR branch.

Dumbris added a commit that referenced this pull request Jun 17, 2026
The MCP-2744 direct-exec docker-isolation spawn replaced the $SHELL -l -c
login-shell wrap, so the child inherits only the secure-env allowlist. That
allowlist excluded DOCKER_HOST/DOCKER_CONTEXT/DOCKER_CONFIG/DOCKER_TLS_VERIFY/
DOCKER_CERT_PATH, so for Colima/rootless/remote daemons the docker binary
resolved but 'docker run' could not reach the daemon (silent regression).

Add DockerDaemonEnvVars() to the secure-env allowlist (DefaultEnvConfig) and
thread them through for command/stdio servers even when an operator supplies a
custom Environment allowlist. These are connection pointers, not secrets.

Related #699
Related #696
@Dumbris

Dumbris commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

Addressed Codex P2 — Docker daemon env preserved on direct-exec path (b54d0b0)

Good catch — confirmed and fixed.

Root cause: the direct-exec spawn (connection_docker.go:88) replaced the $SHELL -l -c login-shell wrap, so the child now inherits only BuildSecureEnvironment(). The allowlist in secureenv.DefaultEnvConfig excluded the Docker daemon-access vars, so for Colima/rootless/remote daemons the binary resolved but docker run could not reach the daemon.

Fix:

  • Added secureenv.DockerDaemonEnvVars() = {DOCKER_HOST, DOCKER_CONTEXT, DOCKER_CONFIG, DOCKER_TLS_VERIFY, DOCKER_CERT_PATH} and included it in DefaultEnvConfig's allowlist (cross-platform). These are connection pointers, not secrets.
  • client.go also threads them through for command/stdio servers even when an operator supplied a custom Environment allowlist that omits them (the slice is rebuilt, not appended in place, so the shared source config is never mutated).

Tests (TDD, all offline):

  • secureenv.TestBuildSecureEnvironmentPreservesDockerHost — the requested Colima regression: DOCKER_HOST/DOCKER_CONTEXT set in env must survive into the spawn env.
  • secureenv.TestDefaultEnvConfigAllowsDockerDaemonVars — allowlist guard for all five vars.
  • core.TestDockerIsolationSpawnEnvPreservesDockerHost — end-to-end via NewClient with a restrictive custom allowlist; DOCKER_HOST still threads through.

Verified: go build ./cmd/mcpproxy; go test ./internal/secureenv/... ./internal/upstream/core/... ./internal/shellwrap/... -race green; golangci-lint v2.5.0 (CI config) 0 issues.

@Dumbris

Dumbris commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

🤖 Codex re-review (round 2) — still REQUEST_CHANGES (1×P2)

The env-var preservation fix doesn't actually cover the documented Colima/rootless case.

[P2] Allowlist only preserves inherited vars — capture from the login shell — internal/upstream/core/client.go:194-197

Adding DOCKER_HOST/DOCKER_CONTEXT to AllowedSystemVars only keeps values already present in os.Environ(). When mcpproxy is launched from the tray/LaunchAgent and a Colima/rootless setup exports DOCKER_HOST/DOCKER_CONTEXT only from login-shell startup files, those keys are absent from os.Environ(), so the allowlist is a no-op and the direct-exec path still hits the default socket → fails for exactly those users.

Fix: capture these specific Docker vars from the login shell (reuse the same login-shell resolution mechanism that already discovers PATH/docker), or fall back to shell-wrapping the docker spawn when DOCKER_HOST/DOCKER_CONTEXT are unset in os.Environ(). Add a regression test simulating "var present only via login shell, absent from os.Environ()".

Reviewed via codex exec review --base origin/main on b54d0b07.

Dumbris added a commit that referenced this pull request Jun 17, 2026
…-exec spawn

Codex round-2 (P2) on PR #699: the round-1 fix only added DOCKER_HOST/
DOCKER_CONTEXT/… to the secure-env allowlist, which preserves vars already in
os.Environ(). When mcpproxy is launched from the macOS tray / LaunchAgent those
vars exist ONLY in the user's login shell rc files, so the allowlist was a
no-op and the direct-exec docker spawn still hit the default socket
(Colima/rootless/remote daemons unreachable).

Fix: capture the docker daemon-access vars FROM the login shell, mirroring the
existing LoginShellPATH machinery.

- internal/shellwrap/shellwrap.go — new LoginShellEnvVars(logger, keys): one
  `<shell> -l -c` invocation prints the requested vars bracketed by markers and
  separated by 0x1F (banner-safe), parsed and cached per-key process-wide.
  Returns only non-empty values; no-op on Windows / on capture failure.
- internal/secureenv/manager.go — BuildSecureEnvironment, gated on EnhancePath
  (the upstream-spawn signal), injects any docker daemon var still missing from
  the env via the login shell. Vars already present (allowlist + os.Environ)
  are left untouched, so it only forks when something is actually missing, and
  the underlying capture is cached. Overridable loginShellEnvFn for tests.

The round-1 allowlist append (client.go) is kept: it still covers vars that ARE
in os.Environ() under a custom operator allowlist. The two paths are
complementary.

TDD: TestLoginShellEnvVars_CapturesFromShellOnly (var exported only inside the
fake login shell, absent from os.Environ) + TestBuildSecureEnvironment_Injects
DockerVarFromLoginShell (restrictive allowlist proves the login-shell path, not
the allowlist, supplied DOCKER_HOST) + EnhancePath-gating guard. All offline.

Related #696
@Dumbris

Dumbris commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

Round-2 (P2) addressed — capture DOCKER_* from the login shell (c9eac4fc)

You're right: round-1 only allowlisted the vars, which preserves what's in os.Environ(). Under tray/LaunchAgent those vars live only in rc files, so the allowlist was a no-op and direct-exec still hit the default socket.

Fix: capture the daemon-access vars from the login shell, mirroring the existing LoginShellPATH machinery rather than re-introducing the command shell-wrap.

  • shellwrap.LoginShellEnvVars(logger, keys) — one <shell> -l -c run prints the requested vars bracketed by unique markers, separated by 0x1F (banner-safe, same approach as LoginShellPATH), parsed and cached per-key process-wide. Non-empty only; no-op on Windows/failure.
  • secureenv.BuildSecureEnvironment (gated on EnhancePath) injects any docker daemon var still missing from the env via the login shell. Only forks when something is actually missing; cached. Vars already present (allowlist + os.Environ) are untouched — so the round-1 allowlist append still covers the os.Environ-present + custom-allowlist case. Complementary, not redundant.

Regression test you asked for (var only in login shell, absent from os.Environ()):

  • TestLoginShellEnvVars_CapturesFromShellOnly — fake login shell exports DOCKER_HOST/DOCKER_CONTEXT internally (never in the parent env); asserts they're captured and an unset var is omitted.
  • TestBuildSecureEnvironment_InjectsDockerVarFromLoginShell — restrictive allowlist {PATH,HOME} guarantees os.Environ() cannot supply DOCKER_HOST, so the pass proves the login-shell path did it.
  • TestBuildSecureEnvironment_NoDockerLoginShellWhenEnhanceDisabled — gating guard.

All offline. go test ./internal/shellwrap/... ./internal/secureenv/... ./internal/upstream/core/... -race green; golangci-lint v2 (CI config) 0 issues; go build ./cmd/mcpproxy ok.

@Dumbris

Dumbris commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

🤖 Codex re-review (round 3) — still REQUEST_CHANGES (2×P2)

The login-shell DOCKER_* capture works, but it introduced two new issues — one robustness, one isolation/security.

[P2] Verify the resolved path before direct-exec — internal/upstream/core/connection_docker.go:89

ResolveDockerPath's last-resort lookup is command -v docker in the login shell, which can resolve to a shell function/alias or a bare name rather than an executable file. Direct-exec.Command on that fails before the isolated server starts. Only direct-exec when the resolved value is a verified absolute path to an executable (filepath.IsAbs + stat + executable bit); otherwise keep the shell-wrapped fallback.

[P2] Scope DOCKER_* passthrough to Docker launches only — internal/upstream/core/client.go:194-197

This block runs for every configured command, so a non-Docker upstream whose Environment.allowed_system_vars intentionally excludes DOCKER_* still receives them from the parent env — weakening configured env filtering and exposing Docker context/config paths to arbitrary MCP subprocesses. Only capture/append these vars when the server actually uses docker run / Docker isolation.

Reviewed via codex exec review --base origin/main on c9eac4fc.

Note: the recurring "fix one var, break another coupling" pattern across rounds 1–3 is being addressed at the root with a separate systematic launch-context env-hydration design (in progress). Recommend this PR be reduced to the safe core (verified-absolute direct-exec + shell fallback, docker-scoped env) and leave broad env hydration to that follow-up.

…rPath (#696)

Minimal #696 fix per board decision (MCP-2744): drop the direct-exec spawn
migration and all DOCKER_* env experiments; keep only the negative-cache fix.

Root cause of 'status shows docker_path but spawn used bare docker': under a
LaunchAgent/sandbox the first resolution can fail because only the restricted
login-shell leg failed, caching a negative for dockerPathNegativeTTL=60s. Early
upstream spawns hit that window and fall back to bare 'docker'. The well-known
-path probe is a pure os.Stat (never sandbox- or login-shell-restricted), so a
live cached negative must not shadow a docker binary present at a well-known
path right now. ResolveDockerPath now re-runs the cheap os.Stat probe before
honoring a still-live negative and, on success, upgrades the cache to a
permanent success.

With the cache fixed, the existing post-#697 shell wrap runs the resolved
ABSOLUTE docker path as the command token, so login-shell PATH is irrelevant
and #696 is fixed with no Colima regression and no env handling change. The
direct-exec migration + login-shell env hydration are deferred to a separate
systematic task.

TDD: TestResolveDockerPath_StatProbeOverridesLiveNegative asserts a successful
os.Stat well-known probe wins over a still-live cached negative (no TTL expiry).

Related #696

Co-Authored-By: Paperclip <noreply@paperclip.ing>
@Dumbris Dumbris force-pushed the fix/mcp-2744-docker-direct-exec branch from c9eac4f to e7ac964 Compare June 17, 2026 07:58
@Dumbris

Dumbris commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

Revised to the minimal #696 fix per board decision (e7ac964d, force-pushed)

Per Algis's decision, dropped the direct-exec migration and all DOCKER_* env experiments (which caused the 3 rounds of Codex P2s). PR #699 is now just the negative-cache fix + test.

Net diff (vs origin/main): 2 files, +65/-8

  • internal/shellwrap/shellwrap.goResolveDockerPath re-runs the pure-os.Stat well-known-path probe before honoring a still-live cached negative; on success it upgrades the cache to a permanent success.
  • internal/shellwrap/shellwrap_test.goTestResolveDockerPath_StatProbeOverridesLiveNegative: a successful os.Stat probe wins over a still-live cached negative (no TTL expiry).

Reverted (deferred to a new systematic task):

Why this fixes #696 with no regression: the bug was the negative-cache poisoning making spawn (early, sandboxed) and status (later) diverge. With the cache fixed, the existing shell wrap runs /Applications/Docker.app/.../docker run … as an absolute token → no command not found, no Colima env regression, no env hack.

Verification:

  • go build ./cmd/mcpproxy
  • go test ./internal/shellwrap/... ./internal/upstream/core/... ./internal/secureenv/... -race
  • golangci-lint v2.5.0 (CI config) → 0 issues ✓

Follow-up systematic task (direct-exec migration + login-shell env hydration) being filed separately. Re-requesting Codex review on this minimal diff.

@Dumbris

Dumbris commented Jun 17, 2026

Copy link
Copy Markdown
Member Author

🤖 Codex review — ACCEPT ✅

Re-reviewed after re-scoping to the minimal #696 fix (negative-cache stat-probe override; direct-exec + DOCKER_* experiments reverted).

The change is narrowly scoped and preserves existing cache behavior while allowing well-known Docker paths to override a live negative cache. The added regression test passes.

No P0–P2 findings. Fixes the "status shows docker_path but spawn used bare docker" symptom at the source (the 60s negative-cache no longer shadows a stat-resolvable absolute path), with the $SHELL -l -c wrap retained so the login env (incl. DOCKER_*) is preserved — no Colima regression. Broader login-shell env hydration tracked separately.

Reviewed via codex exec review --base origin/main on e7ac964d.

@mcpproxy-gatekeeper mcpproxy-gatekeeper Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved via Claude Code review (Codex quota exhausted).

@Dumbris Dumbris enabled auto-merge (squash) June 17, 2026 08:02
@Dumbris Dumbris merged commit b99af57 into main Jun 17, 2026
34 checks passed
Dumbris added a commit that referenced this pull request Jun 17, 2026
…aunchd) — MCP-2751 (#701)

* feat(shellwrap): hydrate login-shell env once at startup (macOS GUI/launchd)

When mcpproxy is launched from a macOS GUI/launchd context (Launchpad, the
SMAppService login item, or the tray spawning the core), the Go core inherits a
launchd-minimal environment: it never sources the user's login shell, so it
lacks Homebrew/Docker PATH entries and exported vars like DOCKER_HOST. This is
the root-cause class behind the recent docker/PATH point-fixes (#697, #699).

Add a one-time, allow-listed login-shell environment hydration at startup so
every spawn path (docker lifecycle, stdio servers, uvx/npx, ResolveDockerPath,
secureenv.BuildSecureEnvironment) inherits a correct environment with no
call-site changes:

- internal/shellwrap/hydrate.go: captureLoginShellEnv() sources `$SHELL -l -c
  'env -0'` once (NUL-delimited, marker-bracketed, 5s timeout, sync.Once) and
  caches the parsed env. LoginShellPATH now reads PATH from this shared capture
  — one shell fork, not two.
- HydrateFromLoginShell() gates on macOS + launchd-minimal PATH (no-op on
  terminal launches and non-macOS), merges PATH login-first, and applies a
  curated allow-list (DOCKER_*, HTTP(S)/NO_PROXY, NVM_DIR/ASDF_DIR/PYENV_ROOT/
  VOLTA_HOME/HOMEBREW_PREFIX/COLIMA_HOME) set-if-empty via os.Setenv. Secrets
  are never imported; HOME/USER/SHELL are never touched; values are never
  logged (key names + lengths only).
- secureenv: extend the default allow-list with the curated DOCKER_*/proxy/
  tool-home keys so the hydrated vars survive the filter into upstream spawns.
  Scanner MinimalEnv stays narrow (credential-free).
- cmd/mcpproxy: call HydrateFromLoginShell after logger setup, before any
  manager reads os.Environ().

Supersedes the per-spawn DOCKER_* env capture; the absolute-path docker probe
(#697) and negative-cache fix (#699) remain as complementary fallbacks.

Related #699

* fix(shellwrap): preserve intentionally-empty operator vars during hydration

Codex review (PR #701, P2): the curated-key set-if-empty guard used
`os.Getenv(key) != ""`, which treats an explicitly set-empty value the same
as unset. An operator who sets `HTTPS_PROXY=` or `DOCKER_HOST=` to DISABLE an
inherited value would have it overwritten from the login shell — violating the
never-clobber-operator-values contract.

Use os.LookupEnv to distinguish unset from intentionally-empty: any present
key (even empty) is operator intent and is preserved. launchd never sets these
to empty, so present-but-empty is always deliberate. Adds
TestHydrate_PreservesIntentionallyEmptyVar.

Co-Authored-By: Paperclip <noreply@paperclip.ing>

* fix(shellwrap): alias-aware set-if-unset for proxy var spellings

Codex round-2 (PR #701, P2): proxy vars (HTTP_PROXY/HTTPS_PROXY/NO_PROXY) are
case-insensitive to HTTP clients, so the per-key LookupEnv guard could still
clobber operator intent — e.g. operator sets `https_proxy=` (lowercase, to
disable) but the login shell exports `HTTPS_PROXY=...`; the exact-key check on
the uppercase spelling missed the lowercase presence and imported it anyway.

Split the curated allow-list into single-spelling keys (DOCKER_*, tool-home)
and an alias-aware proxy trio: for each logical proxy var, if EITHER spelling
is already present in the process env (including intentionally empty), skip
hydrating BOTH spellings. Adds TestHydrate_ProxyCaseAliasingPreservesOperatorIntent
(daemon https_proxy='' + login HTTPS_PROXY=http://x → neither imported).

Co-Authored-By: Paperclip <noreply@paperclip.ing>

* fix(shellwrap): drop proxy vars, fix launch gate for pre-seeded PATH

Codex round-3 scope reduction:
- Remove HTTP_PROXY/HTTPS_PROXY/NO_PROXY from hydration and from
  secureenv DefaultEnvConfig allow-list. Proxy forwarding to every
  stdio upstream is out of scope; filed as a separate opt-in follow-up.
- Fix the launch gate: hydration now triggers when PATH looks launchd-
  minimal OR any DOCKER_* curated var is absent. A GUI launcher can
  pre-seed PATH via /etc/paths yet still not export DOCKER_HOST from
  rc files, so PATH-comprehensiveness alone is insufficient.
- PATH-merge stays gated on pathIsMinimal only — a comprehensive PATH
  is not widened even when the DOCKER_* gate fires.

Tests:
- Rename GateNoOpOnComprehensivePath → GateNoOpOnComprehensivePathAndAllDockerPresent
  (now requires both conditions, so all 5 DOCKER_* vars are pre-set)
- Add ComprehensivePathStillHydratesDockerIfMissing — verifies the new
  secondary gate: comprehensive PATH, missing DOCKER_HOST → hydrates
  curated vars but leaves PATH unmodified
- Drop PreservesIntentionallyEmptyVar and ProxyCaseAliasingPreservesOperatorIntent
  (proxy vars are no longer hydrated)
- Update MinimalPathHydratesCuratedSet to assert proxy vars are NOT imported

Co-Authored-By: Paperclip <noreply@paperclip.ing>

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Dumbris added a commit that referenced this pull request Jun 17, 2026
…otless regression)

Codex round-4 (PR #703 REQUEST_CHANGES, P2): the direct-exec branch regressed
non-Darwin rootless/remote Docker. Dropping the `$SHELL -l` wrap also drops
rc-file env inheritance, and HydrateFromLoginShell (MCP-2751) is Darwin-only
while BuildSecureEnvironment only forwards DOCKER_* already in os.Environ(). So
on Linux a daemon whose DOCKER_HOST/DOCKER_CONTEXT lives only in .profile would
be lost by direct-exec — the same DOCKER_* loss that made #699 keep the shell
wrap.

Add a second precondition (dockerDaemonEnvGuaranteed) alongside the verified-
absolute-executable check: direct-exec only when the daemon env is guaranteed
without the login shell — macOS (startup hydration captured DOCKER_* into
os.Environ()) OR DOCKER_HOST/DOCKER_CONTEXT already present in the process env on
any platform. Otherwise keep the `$SHELL -l` wrap so `docker run` inherits the
rc-file daemon config; the wrap now prefers the resolved absolute path (still
sidestepping the login shell's PATH re-derivation) and drops to bare "docker"
only when resolution failed.

dockerDaemonEnvGOOS is a package var so the Darwin branch is testable on a
non-Darwin CI host. New tests: non-Darwin + DOCKER_HOST-only-in-rc stays
shell-wrapped with the absolute path; non-Darwin + DOCKER_HOST in env direct-
execs; and a table test for the gate.

Related #703
Related #699

Co-Authored-By: Paperclip <noreply@paperclip.ing>
Dumbris added a commit that referenced this pull request Jun 17, 2026
…CP-2753) (#703)

* fix(docker): direct-exec resolved docker binary for isolated spawn (MCP-2753)

PR #697 resolved the docker CLI to an absolute path but still routed the
isolated spawn through `$SHELL -l -c "<docker> run …"`, where the login shell
re-derives $PATH from rc files and can drop the Docker Desktop bundle dir. The
absolute path was only a token in the -c string, so docker-isolated servers
still failed with `command not found: docker` when the CLI was not on the
LaunchAgent/tray spawn PATH — even though `upstream_servers list` reported a
good docker_path.

On successful resolution to a VERIFIED absolute executable, setupDockerIsolation
now returns the resolved binary as the exec command with raw docker argv (no
shell wrap), mirroring the management path's newDockerCmd. It returns a
shellWrapped flag so callers pick the matching cidfile helper: the new
args-based insertCidfileIntoDockerArgs inserts `--cidfile` after the `run`
token (platform-agnostic, sidestepping the -c vs /c quirk), while the
login-shell fallback keeps the string-based insertCidfileIntoShellDockerCommand.

The direct-exec branch is gated by isDirectExecutable (filepath.IsAbs + os.Stat
+ exec-bit). ResolveDockerPath's last resort runs `command -v docker` in the
login shell, which can emit a shell function/alias/bare name; such a value is
rejected and shell-wrapped instead of failing a direct exec. Login-shell wrap of
bare `docker` is also kept on the resolution-failure fallback.

The DOCKER_* daemon-env half of MCP-2753 is already satisfied by the startup
login-shell hydration shipped in MCP-2751 (#701): DOCKER_* are hydrated into
os.Environ() and carried through BuildSecureEnvironment's default allow-list, so
the direct-exec'd docker process still reaches Colima/rootless/remote daemons
without a per-spawn shell capture.

Related #699
Related #696

* fix(docker): gate direct-exec on guaranteed daemon env (non-Darwin rootless regression)

Codex round-4 (PR #703 REQUEST_CHANGES, P2): the direct-exec branch regressed
non-Darwin rootless/remote Docker. Dropping the `$SHELL -l` wrap also drops
rc-file env inheritance, and HydrateFromLoginShell (MCP-2751) is Darwin-only
while BuildSecureEnvironment only forwards DOCKER_* already in os.Environ(). So
on Linux a daemon whose DOCKER_HOST/DOCKER_CONTEXT lives only in .profile would
be lost by direct-exec — the same DOCKER_* loss that made #699 keep the shell
wrap.

Add a second precondition (dockerDaemonEnvGuaranteed) alongside the verified-
absolute-executable check: direct-exec only when the daemon env is guaranteed
without the login shell — macOS (startup hydration captured DOCKER_* into
os.Environ()) OR DOCKER_HOST/DOCKER_CONTEXT already present in the process env on
any platform. Otherwise keep the `$SHELL -l` wrap so `docker run` inherits the
rc-file daemon config; the wrap now prefers the resolved absolute path (still
sidestepping the login shell's PATH re-derivation) and drops to bare "docker"
only when resolution failed.

dockerDaemonEnvGOOS is a package var so the Darwin branch is testable on a
non-Darwin CI host. New tests: non-Darwin + DOCKER_HOST-only-in-rc stays
shell-wrapped with the absolute path; non-Darwin + DOCKER_HOST in env direct-
execs; and a table test for the gate.

Related #703
Related #699

Co-Authored-By: Paperclip <noreply@paperclip.ing>

---------

Co-authored-by: Paperclip <noreply@paperclip.ing>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants