feat(providers): add Docker Model Runner provider #1312

Closed
ericcurtin wants to merge 1 commit into NVIDIA:main from ericcurtin:feat/provider-model-runner

Conversation

@ericcurtin
Contributor

@ericcurtin ericcurtin commented May 11, 2026

Summary

  • Add model-runner as a built-in inference provider for OpenShell
  • Agents running inside sandboxes can route inference to Docker Model Runner via model-runner.docker.internal
  • No credentials required; Docker Model Runner is accessed over the Docker-internal network without authentication by default
  • model-runner is now a first-class inference routing type for inference.local (on par with Ollama, LM Studio, and other local providers)

Related Issue

N/A — new provider addition from the Docker ecosystem.

Changes

  • providers/model-runner.yaml — declarative profile: inference category, model-runner.docker.internal:80 endpoint, docker binary
  • crates/openshell-providers/src/providers/model_runner.rs — ModelRunnerProvider plugin; always returns a default discovered provider since the service is local
  • crates/openshell-providers/src/providers/mod.rs — expose model_runner module
  • crates/openshell-providers/src/profiles.rs — embed YAML at compile time
  • crates/openshell-providers/src/lib.rs — register plugin in ProviderRegistry; add model-runner and model_runner aliases to normalize_provider_type
  • crates/openshell-core/src/inference.rs — add InferenceProviderProfile for model-runner so it can be used with inference.local; default base URL is http://model-runner.docker.internal/engines/llama.cpp/v1; overridable via MODEL_RUNNER_BASE_URL
  • crates/openshell-server/src/inference.rs — add model-runner to the supported-types error message
  • docs/get-started/tutorials/inference-docker-model-runner.mdx — new tutorial on par with the Ollama tutorial
  • docs/get-started/tutorials/index.mdx — add Docker Model Runner tutorial card
  • docs/sandboxes/inference-routing.mdx — add Docker Model Runner tab to the Create a Provider section and Next Steps
  • docs/sandboxes/manage-providers.mdx — add model-runner to Supported Provider Types and Supported Inference Providers tables
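
As a rough illustration of two items in the list above (the type aliases and the overridable base URL), here is a minimal sketch. It is not the code from this PR: the function signatures are assumptions, the real `normalize_provider_type` handles many other provider types, and `resolve_base_url` and `base_url_from_env` are hypothetical helper names. Only the alias strings, the default URL, and the `MODEL_RUNNER_BASE_URL` variable name come from the description.

```rust
use std::env;

/// Default base URL quoted in the PR description.
const DEFAULT_BASE_URL: &str = "http://model-runner.docker.internal/engines/llama.cpp/v1";

/// Hypothetical stand-in for `normalize_provider_type`: maps the two
/// spellings named in the PR to one canonical id. The real function covers
/// other provider types and may have a different signature.
fn normalize_provider_type(input: &str) -> Option<&'static str> {
    match input {
        "model-runner" | "model_runner" => Some("model-runner"),
        _ => None,
    }
}

/// Hypothetical helper: use the override when it is set and non-blank,
/// otherwise fall back to the default. Taking the override as a parameter
/// keeps the function testable without touching the process environment.
fn resolve_base_url(override_url: Option<&str>) -> String {
    match override_url.map(str::trim) {
        // Drop a trailing slash so later path joins do not double it.
        Some(url) if !url.is_empty() => url.trim_end_matches('/').to_string(),
        _ => DEFAULT_BASE_URL.to_string(),
    }
}

/// Hypothetical helper: read the override from MODEL_RUNNER_BASE_URL.
fn base_url_from_env() -> String {
    resolve_base_url(env::var("MODEL_RUNNER_BASE_URL").ok().as_deref())
}
```

Under these assumptions, `normalize_provider_type("model_runner")` yields the canonical `model-runner` id, and `base_url_from_env()` returns the default URL unless the variable is set to a non-blank value.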

Testing

  • All 38 existing openshell-providers unit tests pass
  • Two unit tests added in model_runner.rs covering id and discovery
  • One unit test added in openshell-core for the model-runner inference profile
  • cargo clippy -p openshell-providers -p openshell-core — clean
  • cargo fmt --all — clean
  • Markdown lint — clean (0 errors)

Checklist

  • Follows Conventional Commits format
  • SPDX license headers present on all new files
  • No credentials or secrets introduced
  • Unit tests added for new provider plugin and inference profile
  • Profile YAML validates (covered by default_profiles_are_sorted_by_id test)
  • Documentation added on par with Ollama integration

@copy-pr-bot

copy-pr-bot Bot commented May 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@maxamillion
Collaborator

/ok to test 4158928

@maxamillion
Collaborator

@ericcurtin this looks good to me, fix the merge conflicts and I'll approve

@johntmyers johntmyers added the gator:blocked (Gator is blocked by process or repository gates) and test:e2e (Requires end-to-end coverage) labels Jun 3, 2026

Review comment on providers/model-runner.yaml:

protocol: rest
access: read-write
enforcement: enforce
binaries: [/usr/local/bin/docker]

Collaborator

Is docker making the API call or would it be a model harness, agent workload, etc?

Collaborator

In this instance it's the inference provider. Docker model runner is effectively an alternative to ollama.

@github-actions

github-actions Bot commented Jun 3, 2026

Label test:e2e applied for 4158928. Open the existing run and click Re-run all jobs to execute with the label set. The run will execute the standard E2E suite after building the required gateway and supervisor images once. The matching required CI gate status on this PR will flip green automatically once the run finishes.

@johntmyers
Collaborator

johntmyers commented Jun 3, 2026

gator-agent

PR Review Status

Validation: this is project-valid provider-v2 work for a built-in local inference provider. The merge-conflict blocker is resolved, and gator has moved this PR back to review.
Head SHA: 3fb5d0cabfc65a69ab66dacc342850dc9345edbc

Review findings:

  • Blocking: the intended local/no-credential flow is not wired through the OpenShell inference route resolver yet. MODEL_RUNNER_PROFILE correctly declares credential_key_names: &[], but crates/openshell-server/src/inference.rs::resolve_provider_route still calls find_provider_api_key(...).ok_or_else(...) before constructing the route. With the documented openshell provider create --name model-runner --type model-runner flow, that returns None and openshell inference set --provider model-runner ... fails with “no usable API key credential.” This is not a Docker credential pass-through concern; it is OpenShell rejecting the local no-auth provider before it can route to the Docker Model Runner endpoint.
  • Security / least privilege: providers/model-runner.yaml grants /usr/local/bin/docker for the provider endpoint. If that binary grant is needed for a specific sandbox workflow, please document why; otherwise the documented runtime path appears to be HTTP inference through inference.local rather than Docker CLI network access.
  • Maintainability / UX: ModelRunnerProvider::discover_existing still reports the provider as available unconditionally. Please probe Docker Model Runner availability or return None when the local service cannot be validated.
  • Tests: please add coverage for a model-runner provider with empty credentials through server route resolution or openshell inference set, plus a negative-path discovery test.
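
The first and third findings above can be sketched roughly as follows. This is an illustration under stated assumptions, not the OpenShell implementation: the `InferenceProviderProfile` and `Route` structs are trimmed-down hypothetical stand-ins, the function signatures are invented for the example, and a plain TCP connect is only one possible way to "probe availability". The `credential_key_names` field and the "no usable API key credential" message are taken from the review text.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Hypothetical, trimmed-down profile. Only `credential_key_names` is
/// named in the review; the real struct has more fields.
struct InferenceProviderProfile {
    credential_key_names: &'static [&'static str],
}

/// Hypothetical route type for this sketch.
#[derive(Debug, PartialEq)]
struct Route {
    base_url: String,
    api_key: Option<String>,
}

/// Sketch of a resolver that demands an API key only when the profile
/// declares credential key names. A profile with an empty list (such as
/// the local model-runner profile) resolves without one.
fn resolve_provider_route(
    profile: &InferenceProviderProfile,
    base_url: &str,
    api_key: Option<String>,
) -> Result<Route, String> {
    let api_key = if profile.credential_key_names.is_empty() {
        // No-auth provider: ignore any key and do not fail on its absence.
        None
    } else {
        Some(api_key.ok_or_else(|| "no usable API key credential".to_string())?)
    };
    Ok(Route { base_url: base_url.to_string(), api_key })
}

/// Assumed probe: a TCP connect with a timeout. A real check might issue
/// an HTTP request against the inference endpoint instead.
fn probe_endpoint(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Sketch of discovery that reports the provider only when the probe
/// succeeds, and returns None otherwise.
fn discover_existing(addr: SocketAddr) -> Option<&'static str> {
    probe_endpoint(addr, Duration::from_millis(250)).then_some("model-runner")
}
```

With this shape, a profile declaring `credential_key_names: &[]` resolves to a route with `api_key: None`, while a profile that lists key names still fails when no key is supplied. The negative discovery path is exercised by pointing `discover_existing` at an address with nothing listening.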

Docs: updated for the direct UX change, but currently inaccurate until no-credential routing works as documented.

Checks: Branch Checks are failing. The Rust jobs fail in grpc::provider::tests::list_provider_profiles_returns_built_in_profile_categories because the expected built-in profile list does not include model-runner. Helm Lint is green.

E2E: test:e2e is applied and Core E2E passed. test:e2e-gpu is not applied and GPU E2E is not required.

Next state: gator:in-review

@johntmyers
Collaborator

I'm also curious if there's been a full e2e smoke test done here? What does a deployment actually look like?

Add model-runner as a built-in inference provider so agents running
inside OpenShell sandboxes can route inference requests to the local
Docker Model Runner daemon via model-runner.docker.internal.

No credentials are required; Docker Model Runner is accessed over the
Docker-internal network without authentication by default.

Add an InferenceProviderProfile for model-runner in openshell-core so
the model-runner type can be used with inference.local. The default
base URL is http://model-runner.docker.internal/engines/llama.cpp/v1.
Override with MODEL_RUNNER_BASE_URL if needed.

Add documentation on par with the Ollama integration:
- Tutorial: docs/get-started/tutorials/inference-docker-model-runner.mdx
- model-runner added to the Supported Provider Types table
- model-runner added to the Supported Inference Providers table
- Docker Model Runner tab added to inference-routing.mdx

@ericcurtin ericcurtin force-pushed the feat/provider-model-runner branch from 4158928 to 3fb5d0c on June 3, 2026 21:07

@johntmyers
Collaborator

/ok to test 3fb5d0c

@johntmyers johntmyers added the gator:in-review (Gator is reviewing or awaiting PR review feedback) label and removed the gator:blocked (Gator is blocked by process or repository gates) label Jun 3, 2026

@ericcurtin
Contributor Author

I'm heading on PTO. Unless someone like @ilopezluna or @doringeman wants to take this over, it will probably stay stagnant... but I don't think they are vouched for.

@maxamillion
Collaborator

@johntmyers looking at the PR Review Status, I get the impression the coding agent is assuming that docker is doing some sort of pass through of provider and credentials instead of serving local inference similar to ollama.

@johntmyers
Collaborator

> @johntmyers looking at the PR Review Status, I get the impression the coding agent is assuming that docker is doing some sort of pass through of provider and credentials instead of serving local inference similar to ollama.

Ah yes, probably thinks that b/c I steered it that way. I'd still prefer not to populate legacy v1 providers with newer ones. So I think this could be reduced to only use the provider v2 profile. Since it could be stale for a while I'll close it and if someone wants to re-open and adjust that works for me.

@johntmyers johntmyers closed this Jun 3, 2026
@ericcurtin ericcurtin deleted the feat/provider-model-runner branch June 4, 2026 10:42