Skip to content

shim: open output FIFO write end probe-first#229

Open
austinvazquez wants to merge 2 commits into
containerd:mainfrom
austinvazquez:shim-stdio-rendezvous
Open

shim: open output FIFO write end probe-first#229
austinvazquez wants to merge 2 commits into
containerd:mainfrom
austinvazquez:shim-stdio-rendezvous

Conversation

@austinvazquez

Copy link
Copy Markdown
Member

Open the output FIFOs probing for the reader end first bounded by context and hardcoded timeout to prevent blocking. Similarly for Windows named pipes, spin if file not found bounded by context and same hardcoded timeout. Surface timeouts as DEADLINE_EXCEEDED errors.

Copilot AI review requested due to automatic review settings June 19, 2026 01:53

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the shim’s stdio binding behavior to avoid indefinite blocking when opening output FIFOs / named pipes by probing for the reader/server first with bounded waiting and surfacing deadline timeouts as context.DeadlineExceeded (which maps to gRPC DEADLINE_EXCEEDED).

Changes:

  • Windows: add dialPipeWaiting to poll through “pipe not found” until the server end appears or a timeout/context cancellation occurs.
  • Unix: add openFifoWriteRendezvous to open FIFO write-ends non-blocking and poll until a reader appears, then clear O_NONBLOCK for normal backpressure.
  • Add platform-specific unit tests covering “server/reader present”, “arrives late”, and “times out”.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
internal/shim/task/io_copystreams_windows.go Adds a context-bounded named-pipe dial helper that waits for pipe creation.
internal/shim/task/io_copystreams_windows_test.go Adds Windows tests validating the new named-pipe dial behavior.
internal/shim/task/io_copystreams_unix.go Adds FIFO rendezvous open logic to avoid blocking on missing readers and improve safety.
internal/shim/task/io_copystreams_unix_test.go Adds non-Windows tests for FIFO rendezvous open behavior (including symlink rejection).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/shim/task/io_copystreams_unix.go
Comment thread internal/shim/task/io_copystreams_windows.go
Comment thread internal/shim/task/io_copystreams_windows_test.go
Comment thread internal/shim/task/io_copystreams_unix_test.go Outdated
@austinvazquez austinvazquez force-pushed the shim-stdio-rendezvous branch from 29a710a to 293788b Compare June 19, 2026 02:51
The v1 shim opened a container's stdout/stderr FIFO with a blocking
O_WRONLY open before opening its own read-end holder. The write open
blocks until a reader is present, so a process whose stdio consumer
never opens the read end stalls the bind indefinitely.

Open the write end probe-first instead: O_WRONLY|O_NONBLOCK returns ENXIO
while no reader is present, so poll until a reader appears, the context
is cancelled, or a 30s deadline elapses, then clear O_NONBLOCK so
io.Copy's writes block on a full pipe (backpressure) rather than spin on
EAGAIN. Binding never stalls at start and fails fast (surfaced as
DEADLINE_EXCEEDED) if the reader never arrives. The read-end holder is
opened after the writer so a later reader detach does not EOF us.

Pass O_NOFOLLOW on the write open: for stdio the path is
client-supplied, and a symlink swapped in after daemon validated the
path must not redirect the shim's writes to an arbitrary file.
O_NOFOLLOW is deliberately NOT set on the read-end holder or stdin:
those go through containerd/fifo, which reopens by /proc/self/fd handle
(itself a symlink, already dev/ino-verified) and would fail that reopen
with ELOOP on Linux.

Signed-off-by: Austin Vazquez <austin.vazquez@docker.com>
winio.DialPipe retries only while a pipe exists but all of its server
instances are busy (ERROR_PIPE_BUSY); it fails immediately if the pipe
does not exist yet (ERROR_FILE_NOT_FOUND). For client-owned process I/O
the consumer is the pipe server and may not have called ListenPipe by the
time the shim binds stdio during Create/Exec, so the bind fails
spuriously.

Add dialPipeWaiting, which polls through "not found" until the context is
cancelled or a 30s deadline elapses, and use it for stdin, stdout, and
stderr. Each attempt uses winio.DialPipeContext so a busy pipe is retried
under ctx (observed within winio's 10ms granularity) rather than blocking
for a fixed per-dial timeout — callers can cancel promptly. For
daemon-owned stdio the pipe already exists, so the first dial succeeds —
same behavior as before.

Signed-off-by: Austin Vazquez <austin.vazquez@docker.com>
@austinvazquez austinvazquez force-pushed the shim-stdio-rendezvous branch from 293788b to 84199cb Compare June 19, 2026 03:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants