Skip to content

Fix broken artifact path matching#126

Open
refi64 wants to merge 1 commit into
mainfrom
wip/refi64/artifact-paths
Open

Fix broken artifact path matching#126
refi64 wants to merge 1 commit into
mainfrom
wip/refi64/artifact-paths

Conversation

@refi64

@refi64 refi64 commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

Older versions of gitlab-runner did not filter uploaded artifacts at all, regardless of the value of artifacts the user gave. Despite this, several downstreams (most notably places where obs-gitlab-runner is used) would specify the artifacts there anyway, with the usual syntax/semantics that gitlab-runner uses, for consistency and as a "safety" for when this would later be implemented.

In commit eccfdbc, artifact filtering was added, by treating artifacts entries as glob patterns and matching them against the artifact paths the runner instances proposed. However, gitlab-runner itself instead uses doublestar to traverse the current tree for artifacts, which has a few notable implications:

  • Paths are normalized first with filepath.Clean & filepath.ToSlash:

    https://github.com/bmatcuk/doublestar/blob/a9ad9e0ef4d6b7e4443090e9a7201d847a881711/utils.go#L74-L80

    So file and ./file// are both the same pattern. (We don't care too much about ToSlash for now, since that should only have an effect for the Windows path separator).

  • If a pattern matches a directory, the full subtree is added to the archive.

  • Wildcards * and ? retain the usual filesystem-level semantics of not crossing path components.

All of these behaviors were broken with the move to glob matches, which will only do an exact match on the full artifact path with the default glob options (require_literal_separator: false).

To fix this, both the declared artifact path and UploadableFile's path (for consistency) are normalized, and then the patterns are matched on successive ancestors of the current path, while telling glob that it needs to give the / special treatment.

Technically this leaves some other gitlab-runner behavior on the table, such as support for brace expansion, or the aforementioned Windows separator support, or handling absolute paths:

https://gitlab.com/gitlab-org/gitlab-runner/-/blob/fc6a8943d99928a9d293186be03674c333fd09fa/commands/helpers/file_archiver.go#L191

But that's all leaning more towards fringe functionality that we probably don't care about here (at least not for now).

@sjoerdsimons

Copy link
Copy Markdown
Collaborator

Is it broken or do you mean it didn't match gitlab's runner behaviour? If the former can you point to the commit that introduced the brokeness. In the latter case apart from the bits you documented is this change the behaviour of the runner in other ways that might impact users?

Older versions of gitlab-runner did not filter uploaded artifacts at
all, regardless of the value of `artifacts` the user gave. Despite this,
several downstreams (most notably places where obs-gitlab-runner is
used) would specify the artifacts there anyway, with the usual
syntax/semantics that gitlab-runner uses, for consistency and as a
"safety" for when this would later be implemented.

In commit eccfdbc, artifact filtering
was added, by treating `artifacts` entries as glob patterns and matching
them against the artifact paths the runner instances proposed. However,
gitlab-runner itself instead uses doublestar to traverse the current
tree for artifacts, which has a few notable implications:

- Paths are normalized first with filepath.Clean & filepath.ToSlash:

  https://github.com/bmatcuk/doublestar/blob/a9ad9e0ef4d6b7e4443090e9a7201d847a881711/utils.go#L74-L80

  So `file` and `./file//` are both the same pattern. (We don't care too
  much about ToSlash for now, since that should only have an effect for
  the Windows path separator).

- If a pattern matches a directory, the full subtree is added to the
  archive.

- Wildcards `*` and `?` retain the usual filesystem-level semantics of
  not crossing path components.

All of these behaviors were "broken" with the move to glob matches,
which will only do an exact match on the full artifact path with the
default glob options (`require_literal_separator: false`).

To fix this, both the declared artifact path and UploadableFile's path
(for consistency) are normalized, and then the patterns are matched on
successive ancestors of the current path, while telling `glob` that it
needs to give the `/` special treatment.

Technically this leaves some other gitlab-runner behavior on the table,
such as support for brace expansion, or the aforementioned Windows
separator support, or handling absolute paths:

https://gitlab.com/gitlab-org/gitlab-runner/-/blob/fc6a8943d99928a9d293186be03674c333fd09fa/commands/helpers/file_archiver.go#L191

But that's all leaning more towards fringe functionality that we
probably don't care about here (at least not for now).
@refi64 refi64 force-pushed the wip/refi64/artifact-paths branch from 1bfd08f to 7013664 Compare June 18, 2026 14:42
@refi64

refi64 commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator Author

I updated the commit message / PR body. I guess strictly speaking, it's not "broken" in that before we weren't doing any filtering at all, but I'd argue it's still "broken" in that we have a bunch of consumers of the runner (e.g. ci-package-builder) that actually give "correct" values for artifacts: anyway, except the syntax this crate was now using is very different and arguably nonsensical in context (* matching slashes?).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants