Skip to content

feat(cli): add --env flag to sandbox create/exec and fix env var passthrough#1730

Open
russellb wants to merge 2 commits into
NVIDIA:mainfrom
russellb:env-vars-fix-and-cli-support
Open

feat(cli): add --env flag to sandbox create/exec and fix env var passthrough#1730
russellb wants to merge 2 commits into
NVIDIA:mainfrom
russellb:env-vars-fix-and-cli-support

Conversation

@russellb
Copy link
Copy Markdown
Contributor

@russellb russellb commented Jun 3, 2026

Summary

Add --env KEY=VALUE (repeatable) to openshell sandbox create and openshell sandbox exec, and fix a bug where user-specified environment variables were silently dropped inside sandbox SSH sessions.

Related Issue

N/A — discovered during investigation of env var injection support for sandboxes.

Changes

CLI (new feature):

  • Add --env KEY=VALUE repeatable flag to sandbox create and sandbox exec subcommands
  • Reuse existing parse_key_value_pairs() helper for consistent KEY=VALUE parsing
  • Wire environment maps into SandboxSpec.environment and ExecSandboxRequest.environment protobuf fields (which were previously only accessible via the SDK/gRPC API)

Sandbox env passthrough (bug fix):

  • The SSH handler's apply_child_env() calls env_clear() to strip supervisor secrets before spawning user shells, but this also wiped user-specified env vars set on the container by compute drivers. Introduced in 5fd4885 (2026-02-26, "feat(sandbox): VS Code Remote-SSH support with platform detection fix").
  • Add OPENSHELL_USER_ENVIRONMENT sidecar env var: compute drivers serialize SandboxSpec.environment as JSON, the supervisor deserializes it at startup, and the SSH handler re-injects the values after env_clear()
  • All four compute drivers updated (Docker, Podman, Kubernetes, VM)
  • User-specified vars applied at lowest priority so proxy, TLS, and provider credentials still take precedence

Testing

  • cargo test -p openshell-cli — 18 tests pass (including 2 new: env passthrough + invalid format rejection)
  • cargo test across all modified crates — 1,252 tests pass
  • cargo clippy clean across all modified crates
  • End-to-end verified: rebuilt supervisor image, restarted Podman gateway, confirmed --env FOO=BAR appears in sandbox env output

Checklist

  • Follows Conventional Commits
  • No secrets or credentials committed

Add `--env KEY=VALUE` (repeatable) to `openshell sandbox create` and
`openshell sandbox exec`, wiring user-specified environment variables
into the existing `SandboxSpec.environment` and
`ExecSandboxRequest.environment` protobuf fields which were previously
only accessible via the SDK/gRPC API.

Fix a bug where user-specified environment variables set via
`SandboxSpec.environment` were silently dropped inside sandbox SSH
sessions. The SSH handler's `apply_child_env()` calls `env_clear()` to
strip supervisor-internal secrets before spawning user shells, but this
also wiped container-level env vars including user-specified ones. This
was introduced in 5fd4885 (2026-02-26, "feat(sandbox): VS Code
Remote-SSH support with platform detection fix") which added
`env_clear()` to fix a VS Code platform detection timeout caused by
leaked supervisor env vars.

The fix introduces an `OPENSHELL_USER_ENVIRONMENT` sidecar env var:
compute drivers (Docker, Podman, Kubernetes, VM) serialize the
user-specified environment as JSON into this variable at container
creation time, the sandbox supervisor deserializes it at startup, and
the SSH handler re-injects the values into child processes after
`env_clear()`. User-specified vars are applied at lowest priority so
proxy, TLS, and provider credentials still take precedence.

Signed-off-by: Russell Bryant <russell.bryant@gmail.com>
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Jun 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

.env("PATH", &path)
.env("TERM", term);

for (key, value) in user_environment {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want the user to be able to override the environment variables in the preceding block?

@johntmyers johntmyers added gator:in-review Gator is reviewing or awaiting PR review feedback test:e2e Requires end-to-end coverage and removed gator:in-review Gator is reviewing or awaiting PR review feedback labels Jun 4, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 4, 2026

Label test:e2e applied, but pull-request/1730 is at {"messa while the PR head is fada35e. A maintainer needs to comment /ok to test fada35ee1f3aaddd6b57ee3a7e7afcddb337b638 to refresh the mirror. Once the mirror catches up, re-run Branch E2E Checks from the Actions tab.

@johntmyers
Copy link
Copy Markdown
Collaborator

gator-agent

/ok to test fada35e

@johntmyers johntmyers added the gator:in-review Gator is reviewing or awaiting PR review feedback label Jun 4, 2026
/// Set by compute drivers from `SandboxSpec.environment`. The sandbox
/// supervisor deserializes this at startup and injects the variables into
/// SSH child processes (which use `env_clear()` for security isolation).
pub const USER_ENVIRONMENT: &str = "OPENSHELL_USER_ENVIRONMENT";
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agent analysis: In docker/podman/vm the trailing environment.extend(user_env) runs after inserting the sidecar, so a user var literally named OPENSHELL_USER_ENVIRONMENT clobbers the JSON sidecar → supervisor deserialization fails → all user env silently dropped. Edge case, but cheap to defend (reserve/reject the key, or
insert
the sidecar

@johntmyers johntmyers added the area:docs Documentation and examples label Jun 4, 2026
@johntmyers
Copy link
Copy Markdown
Collaborator

gator-agent

PR Review Status

Validation: this PR is project-valid as a concentrated CLI/sandbox feature and bug fix for environment injection and passthrough. No exact duplicate was found.
Head SHA: fada35ee1f3aaddd6b57ee3a7e7afcddb337b638

Review findings:

  • Blocking: crates/openshell-driver-vm/src/driver.rs builds required supervisor environment first, then extends with user env. That lets a create-time env value override supervisor-critical values such as OPENSHELL_ENDPOINT, sandbox identity, SSH socket path, TLS paths, or PATH. Docker/Kubernetes/Podman reapply required values after user env; VM should preserve the same invariant and add override-protection coverage.
  • Blocking: OPENSHELL_USER_ENVIRONMENT is user-clobberable in Docker, Podman, and VM because the sidecar is inserted before environment.extend(user_env). A user-supplied --env OPENSHELL_USER_ENVIRONMENT=... can replace the sidecar that the sandbox supervisor later parses. Kubernetes avoids this by upserting the sidecar after user env. Please reserve this key or insert the sidecar after user env consistently, with tests.
  • Needs resolution: sandbox exec --env is rendered as leading shell assignments by the gateway, so command-level env can override proxy/TLS/provider env for that command. That conflicts with the PR statement that proxy, TLS, and provider credentials still take precedence. Please either enforce reserved-key behavior for exec env or document/confirm the intended precedence.
  • Needs resolution: Podman currently keeps template-over-spec environment precedence, while Docker, Kubernetes, and VM use spec-over-template. Please make precedence consistent or explain why Podman should differ, with a conflict test.
  • Existing maintainer question remains open on whether user env should override the baseline child env block (HOME, USER, LOGNAME, SHELL, PATH, TERM).

Docs: missing for direct UX changes. The new --env flags and sandbox env behavior should be documented in the Fern docs under docs/; the existing sandbox docs page appears sufficient, so docs/index.yml likely does not need navigation changes. A maintainer-authored no-docs rationale would also satisfy this gate.

Tests/E2E: test:e2e is required because this touches sandbox lifecycle, supervisor/gateway exec behavior, and compute driver environment behavior. I applied test:e2e and posted /ok to test for the current head SHA.

Next state: gator:in-review

@johntmyers
Copy link
Copy Markdown
Collaborator

/ok to test fada35e

@johntmyers
Copy link
Copy Markdown
Collaborator

gator-agent

CI Follow-Up

Branch Checks failed on both Rust jobs at rust:format:check. The failure is PR-caused: the new USER_ENVIRONMENT read in crates/openshell-sandbox/src/lib.rs needs rustfmt.

Branch E2E is still running, Helm Lint passed, and the PR remains in gator:in-review because author changes are required for the review findings, docs, and formatting failure.

@johntmyers
Copy link
Copy Markdown
Collaborator

gator-agent

E2E Result

Branch E2E completed with one failing job: e2e / E2E (rust-podman-rootless). The failed test was websocket_text_placeholder_is_rewritten_in_sandbox; sandbox creation failed during SSH with RuntimeError("websocket upgrade failed"). Docker, regular Podman, Python, and Kubernetes E2E jobs passed.

This does not look directly caused by the env-var changes from the current evidence, but the required E2E status is red. The PR remains in gator:in-review; author changes are still needed for the review findings, docs, and rustfmt failure before pipeline monitoring can advance.

- VM driver: restructure build_guest_environment to apply user env
  before required driver vars, preventing user-specified values from
  overriding OPENSHELL_ENDPOINT, PATH, and other critical vars.
  Matches the pattern already used by Docker/Podman/Kubernetes.

- All drivers: insert OPENSHELL_USER_ENVIRONMENT sidecar after
  environment.extend(user_env) so a user-supplied key of the same
  name cannot clobber the JSON payload the supervisor parses.

- Podman driver: fix template/spec env precedence — template env is
  now applied first with spec overwriting, matching the other drivers
  (spec is user-specified and should win over image defaults).

- CLI: add parse_env_pairs() that rejects OPENSHELL_* prefixed keys
  with a clear error message. Both sandbox create and exec call sites
  updated to use it.

- SSH handler: filter out OPENSHELL_* keys from user_environment
  before injecting into child processes.

- Docs: add "Set Environment Variables" section to manage-sandboxes.mdx
  documenting --env for both create and exec, per-command override
  semantics, and the OPENSHELL_* reservation.

- Tests: add sandbox_create_env_rejects_reserved_prefix test.

- Fix rustfmt formatting in lib.rs, container.rs, driver.rs.
@russellb
Copy link
Copy Markdown
Contributor Author

russellb commented Jun 4, 2026

AI coding agent response — addressing review feedback from @mrunalp and gator-agent.

Fixes pushed in f190fa3

Blocking: VM driver override-protection

Restructured build_guest_environment in the VM driver to apply user env before required driver vars (OPENSHELL_ENDPOINT, SANDBOX_ID, PATH, etc.), matching the pattern already used by Docker, Podman, and Kubernetes. User-specified values can no longer override security-critical driver vars.

Blocking: OPENSHELL_USER_ENVIRONMENT clobberable

Three-layer defense:

  1. CLI: new parse_env_pairs() rejects any --env key starting with OPENSHELL_ with a clear error message.
  2. Drivers: OPENSHELL_USER_ENVIRONMENT sidecar is now inserted after environment.extend(user_env) in all four drivers, so even if a key slipped through, the sidecar overwrites it.
  3. SSH handler: apply_child_env() filters out OPENSHELL_* keys from user_environment before injecting into child processes.

Needs resolution: sandbox exec --env precedence

Exec env vars are rendered as leading shell assignments by the gateway — this is intentional per-command override behavior, consistent with standard Unix semantics. The user already has shell access and could run export http_proxy=... manually. Updated the PR description to document the actual precedence rather than claiming proxy/TLS/provider always wins on exec. Documented in the new docs section.

Needs resolution: Podman template/spec env precedence

Fixed. Template env is now applied first, then spec env overwrites — matching Docker, Kubernetes, and VM behavior (spec is user-specified and should win over image defaults).

Inline comment (@mrunalp): user env overriding baseline vars

Decision: user env can override baseline child env vars (HOME, USER, SHELL, PATH, TERM) but OPENSHELL_* keys are always filtered out. This gives users flexibility (e.g., custom PATH, different HOME) while protecting internal sandbox machinery.

CI: rustfmt failure

Fixed formatting in lib.rs, container.rs, and driver.rs.

Docs

Added "Set Environment Variables" section to docs/sandboxes/manage-sandboxes.mdx covering --env for both sandbox create and sandbox exec, per-command override semantics, and the OPENSHELL_* reservation. Also added --env to the exec flags table.

Tests

Added sandbox_create_env_rejects_reserved_prefix integration test verifying the OPENSHELL_* key rejection.

@johntmyers
Copy link
Copy Markdown
Collaborator

/ok to test f190fa3

@johntmyers
Copy link
Copy Markdown
Collaborator

johntmyers commented Jun 4, 2026

gator-agent

PR Review Status

Validation: this PR remains project-valid as a concentrated CLI/sandbox feature and bug fix for environment injection and passthrough.
Head SHA: f190fa3d1b2294ead2d9b5d1fbbca38ed20ed2db

Review findings:

  • Blocking: SandboxSpec.environment and template environment can still carry reserved OPENSHELL_* keys through SDK/gRPC paths because only the CLI rejects them. In Docker, user env is applied before the final required driver env block, but it is still extended into the runtime environment before the supervisor starts, so a non-CLI caller can affect supervisor-visible OPENSHELL_* values before crates/openshell-sandbox/src/ssh.rs filters child env. Please enforce the reserved prefix at the API/model or driver boundary and add driver/API-level coverage, especially for Docker.
  • Warning: parse_key_value_pairs still accepts an empty key such as --env =value; please reject empty env keys in parse_env_pairs.
  • Prior feedback mostly resolved: VM required env precedence, OPENSHELL_USER_ENVIRONMENT sidecar clobbering, Podman template/spec precedence, and docs coverage look addressed in the updated diff.

Docs: updated under docs/sandboxes/manage-sandboxes.mdx; no navigation update appears necessary.

Checks/E2E: test:e2e remains required and has passed at this head. Branch Checks, Core E2E, Helm Lint, and the non-required GPU E2E gate are all green.

Process blocker: GitHub currently reports mergeable_state: dirty, so the branch appears to need a merge-conflict update before it can proceed.

Next state: gator:blocked

@johntmyers johntmyers added gator:blocked Gator is blocked by process or repository gates and removed gator:in-review Gator is reviewing or awaiting PR review feedback labels Jun 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:docs Documentation and examples gator:blocked Gator is blocked by process or repository gates test:e2e Requires end-to-end coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants