fix(gpu): select single CDI GPU defaults #1675
Open
elezar wants to merge 1 commit into
Prefer a single CDI-qualified device when Docker or Podman resolves the default GPU request to one GPU. Allow nvidia.com/gpu=all only as a WSL2 all-only compatibility fallback, using Docker daemon info and Podman's /dev/dxg probe to identify that case. Update driver docs, architecture notes, and GPU e2e coverage for the default selection behavior. Signed-off-by: Evan Lezar <elezar@nvidia.com>
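The selection rule described in this commit message could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `resolve_gpu_device`, its parameters, and the choice to pick the first concrete device in the list are all assumptions made for the example. The `wsl2_all_only` flag stands in for whatever the drivers derive from Docker daemon info or Podman's /dev/dxg probe.

```rust
/// CDI name that requests every GPU at once.
const ALL_GPUS: &str = "nvidia.com/gpu=all";

/// Hypothetical helper: resolve the CDI device to pass to the runtime.
///
/// - `explicit`: a device the user asked for by name, if any.
/// - `discovered`: CDI device names the runtime reports (assumed input shape).
/// - `wsl2_all_only`: true when the host was identified as a WSL2 all-only setup.
fn resolve_gpu_device(
    explicit: Option<&str>,
    discovered: &[&str],
    wsl2_all_only: bool,
) -> Option<String> {
    // Explicit requests pass through unchanged, including "nvidia.com/gpu=all".
    if let Some(device) = explicit {
        return Some(device.to_string());
    }

    // Bare request: prefer one concrete device over the "all" alias.
    // Picking the first entry is an assumption of this sketch.
    if let Some(device) = discovered.iter().find(|d| **d != ALL_GPUS) {
        return Some((*device).to_string());
    }

    // Only the WSL2 all-only case may fall back to "all".
    if wsl2_all_only && discovered.contains(&ALL_GPUS) {
        return Some(ALL_GPUS.to_string());
    }

    None
}
```

In this sketch a bare request on a host that exposes only `nvidia.com/gpu=all` resolves to nothing unless the WSL2 flag is set, which mirrors the "compatibility fallback only" wording above.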
elezar commented Jun 4, 2026
Kubernetes mirrors each limit into the matching request. VM accepts the fields
but currently ignores them.

GPU requests enter the driver layer through `SandboxSpec.gpu` and
Summary
Updates Docker and Podman GPU handling so a bare GPU request selects one concrete NVIDIA CDI device when possible.
Default nvidia.com/gpu=all fallback is allowed only for WSL2 all-only compatibility. Explicit GPU device requests pass through unchanged, including nvidia.com/gpu=all.
Related Issue
Closes #1477
Changes
DiscoveredDevices and Docker /info to allow WSL2 all-only fallback.
/dev/nvidiaN device nodes and uses /dev/dxg for WSL2 all-only fallback.
Testing
mise exec -- cargo fmt --check
mise exec -- cargo check -p openshell-core -p openshell-driver-docker -p openshell-driver-podman --tests
mise exec -- cargo test -p openshell-core -p openshell-driver-docker -p openshell-driver-podman gpu --tests
mise exec -- cargo test -p openshell-driver-docker -p openshell-driver-podman wsl2 --tests
mise exec -- cargo test -p openshell-core -p openshell-driver-podman all_only --tests
mise run pre-commit
Checklist