fix: make e2e tests pass reliably locally with Docker Desktop #13741

Open
glours wants to merge 1 commit into main from fix-e2e-local-failures

@glours glours commented Apr 15, 2026

What I did

  • Fix stale image/container reuse across test runs
  • Add registry readiness check and async removal polling
  • Skip the multi-arch test when the docker driver reports multi-platform support
  • Use t.Cleanup for reliable teardown, fix project name mismatches
  • Re-enable 4 previously skipped tests that now pass

Related issue
N/A

(not mandatory) A picture of a cute animal, if possible in relation to what you did

  - Fix stale image/container reuse across test runs
  - Add registry readiness check and async removal polling
  - Skip multi-arch test when docker driver supports it
  - Use t.Cleanup for reliable teardown, fix project name mismatches
  - Re-enable 4 previously skipped tests that now pass

Signed-off-by: Guillaume Lours <glours@users.noreply.github.com>
@glours glours marked this pull request as ready for review April 17, 2026 08:54
@glours glours requested a review from a team as a code owner April 17, 2026 08:54
@glours glours requested review from Copilot and ndeloof April 17, 2026 08:54
@glours glours self-assigned this Apr 17, 2026
Copilot AI left a comment

Pull request overview

This PR improves local reliability of the Go E2E test suite (especially on Docker Desktop) by reducing stale resource reuse between runs, adding readiness/cleanup polling, and adjusting a few tests to be more deterministic across environments.

Changes:

  • Increase polling timeouts and add explicit readiness checks to avoid flakiness on slower Docker Desktop startups.
  • Make teardown more reliable (e.g., t.Cleanup, explicit down, and polling for async --rm removal).
  • Re-enable previously skipped E2E tests and adjust test commands/project naming to avoid mismatches and reuse.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| pkg/utils/safebuffer.go | Increase RequireEventuallyContains poll timeout to accommodate Docker Desktop startup latency. |
| pkg/e2e/watch_test.go | Ensure watched services are built and add explicit poll timeouts for watch/rebuild waits. |
| pkg/e2e/publish_test.go | Add registry readiness polling prior to publish operations. |
| pkg/e2e/ps_test.go | Add pre-test cleanup to avoid stale state from prior failed runs. |
| pkg/e2e/pause_test.go | Re-enable the pause E2E test by removing the CI skip. |
| pkg/e2e/networks_test.go | Re-enable TestNetworkConfigChanged. |
| pkg/e2e/model_test.go | Conditionally skip based on docker-model plugin availability; align project usage and ensure --rm. |
| pkg/e2e/env_file_test.go | Fix project name consistency and ensure run uses --rm and explicit -p. |
| pkg/e2e/compose_test.go | Use t.Cleanup for teardown to improve reliability. |
| pkg/e2e/compose_run_test.go | Poll for async container auto-removal after --rm to avoid timing flakes. |
| pkg/e2e/compose_run_build_once_test.go | Use t.Cleanup-based teardown and avoid pre-run cleanup with random project names. |
| pkg/e2e/build_test.go | Skip the “docker driver lacks multi-arch support” assertion when buildx reports multi-platform support under the docker driver. |

Comment thread pkg/e2e/publish_test.go
Comment on lines +142 to +154
	// Wait for registry to be ready
	registryURL := "http://" + registry + "/v2/"
	poll.WaitOn(t, func(l poll.LogT) poll.Result {
		resp, err := http.Get(registryURL) //nolint:gosec,noctx
		if err != nil {
			return poll.Continue("registry not ready: %v", err)
		}
		_ = resp.Body.Close()
		if resp.StatusCode < 500 {
			return poll.Success()
		}
		return poll.Continue("registry not ready, status %d", resp.StatusCode)
	}, poll.WithTimeout(10*time.Second), poll.WithDelay(100*time.Millisecond))
Copilot AI Apr 17, 2026

The registry readiness poll uses http.Get without any request timeout. If the TCP connection succeeds but the server never responds (or a proxy/network issue stalls reads), this can hang indefinitely and bypass the poll timeout. Use an http.Client with a short Timeout, or use the existing HTTPGetWithRetry helper (pkg/e2e/framework.go), so that each attempt is bounded by a per-request timeout and the overall poll timeout remains effective.
