feat(workflows): add label-driven bug-test workflow (#3239) by BenBtg · Pull Request #3257 · github/spec-kit

BenBtg · 2026-06-30T14:05:50Z

Summary

Adds the third stage (assess → fix → test) of the semi-automated, human-gated bug pipeline, closing #3239. A new gh-aw agentic workflow bug-test triggers when a maintainer applies the bug-test label, runs the relevant tests in isolation against the fix, compiles a readable pass/fail report, and posts it back as a single comment on the originating issue.

Modeled on the existing bug-assess workflow for safety and trigger parity, and decoupled from Spec Kit specifics so other projects can reuse it.

What's included

- .github/workflows/bug-test.md: hand-authored agentic workflow source.
- .github/workflows/bug-test.lock.yml: compiled with gh aw compile v0.79.8 (do not hand-edit).

Behavior

- Label-driven trigger: fires on issues: labeled, gated to the bug-test label; bot-skip parity with bug-assess.
- Locates the fix under test: linked PR → named fix branch → current-checkout fallback, only ever from origin (untrusted references are recorded, never fetched/executed).
- Stack-agnostic test detection: uv+pytest (default for this repo), npm/pnpm/yarn, go, make; no hardcoded ecosystem.
- Isolated execution: tests run inside the firewalled runner, wrapped in a timeout, and are treated as untrusted code; raw logs are kept in $RUNNER_TEMP, never written to the working tree.
- Compiles a report: structured test-report.md with a one-line verdict, counts table, failures, and caveats.
- Verification mode: compares a generated fix against the historical fix for old/closed bugs to surface discrepancies and improve pipeline reliability.
- Posts back: one comment (≤65k chars) + one optional result label (tests-passing / tests-failing / tests-inconclusive).
- Safety parity: scoped read-only permissions (contents, issues, pull-requests), identical URL-safety / untrusted-input guardrails, maintainer remains the gatekeeper.
- Action consistency: pinned to actions/checkout@v7.0.0 to align with other workflows in the repo.
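The stack-agnostic detection and timeout-wrapped execution described above could be sketched roughly as follows. This is an illustration only, not the logic in bug-test.md: the marker-file mapping, the function names, the 600-second default, and the assumption that a Makefile exposes a `test` target are all hypothetical.

```python
import subprocess
from pathlib import Path

# Marker file -> test command, checked in order. The mapping is an
# assumption for illustration; the real workflow may detect stacks differently.
DETECTORS = [
    ("uv.lock", ["uv", "run", "pytest"]),
    ("pyproject.toml", ["uv", "run", "pytest"]),
    ("pnpm-lock.yaml", ["pnpm", "test"]),
    ("yarn.lock", ["yarn", "test"]),
    ("package.json", ["npm", "test"]),
    ("go.mod", ["go", "test", "./..."]),
    ("Makefile", ["make", "test"]),  # assumes a `test` target exists
]


def detect_test_command(root):
    """Return the first matching test command, or None if no stack is recognised."""
    root = Path(root)
    for marker, command in DETECTORS:
        if (root / marker).is_file():
            return list(command)
    return None


def run_tests(root, log_path, timeout_s=600):
    """Run the detected tests under a timeout; return a coarse verdict string."""
    command = detect_test_command(root)
    if command is None:
        return "inconclusive"
    # Raw output goes to log_path (e.g. somewhere under $RUNNER_TEMP),
    # not into the working tree.
    with open(log_path, "wb") as log:
        try:
            proc = subprocess.run(
                command,
                cwd=root,
                stdout=log,
                stderr=subprocess.STDOUT,
                timeout=timeout_s,
            )
        except (subprocess.TimeoutExpired, FileNotFoundError):
            # Timed out, or the test tool is not installed on the runner.
            return "inconclusive"
    return "passed" if proc.returncode == 0 else "failed"
```

Lockfiles are checked before package.json so that a pnpm or yarn project is not run with npm by accident.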

Acceptance criteria

- bug-test markdown workflow added under .github/workflows/ and compiled to its .lock.yml.
- Triggered by applying the bug-test label; runs tests in isolation against the fix.
- Compiles and posts the test outcome back to the issue.
- Supports validation against an old/closed bug to compare generated vs. historical fix.
- Maintainer remains the gatekeeper; consistent safety model with the other stages.
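Running tests "against the fix" depends on the lookup order listed under Behavior (linked PR → named fix branch → current checkout, origin only). A minimal sketch of that ordering, assuming hypothetical `Candidate` and `Resolution` types and a plain repository-name comparison as the origin check, might look like this:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """A possible location of the fix (all fields are illustrative)."""
    kind: str  # "linked-pr", "fix-branch", or "current-checkout"
    repo: str  # "owner/name" of the repository that holds the ref
    ref: str   # branch name or commit SHA


@dataclass
class Resolution:
    chosen: Optional[Candidate] = None
    untrusted: list = field(default_factory=list)  # recorded, never fetched


# Lower number wins: linked PR, then named fix branch, then current checkout.
PRIORITY = {"linked-pr": 0, "fix-branch": 1, "current-checkout": 2}


def resolve_fix(candidates, origin_repo):
    """Pick the highest-priority candidate that lives in the origin repository."""
    result = Resolution()
    for cand in sorted(candidates, key=lambda c: PRIORITY[c.kind]):
        if cand.repo != origin_repo:
            # References outside origin are only noted for the report.
            result.untrusted.append(cand)
        elif result.chosen is None:
            result.chosen = cand
    return result
```

When `chosen` stays `None`, a caller would report the run as inconclusive rather than fail.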

Notes

- The bug-fix stage (Implement label-driven bug fix workflow #3238) is not yet merged; bug-test consumes its output (PR/branch) but degrades gracefully when no fix artifact is found, reporting inconclusive.
- Result labels (tests-passing / tests-failing / tests-inconclusive) are applied only if they exist in the repo; a missing label is a soft no-op and does not block the comment.
- Compiled with gh aw v0.79.8. actions/checkout was manually pinned to v7.0.0 to align with repo standards (similar to dependabot PR chore(deps): bump actions/checkout from 6.0.3 to 7.0.0 #3064).
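The soft no-op for missing result labels and the comment size cap could be expressed along these lines. The helper names are hypothetical, the 65,000-character limit is an assumption read off the "≤65k chars" wording, and the truncation notice is invented for illustration.

```python
COMMENT_LIMIT = 65_000  # assumption based on the "≤65k chars" note

# Verdict -> result label, using the label names from this PR.
RESULT_LABELS = {
    "passed": "tests-passing",
    "failed": "tests-failing",
    "inconclusive": "tests-inconclusive",
}


def pick_label(verdict, repo_labels):
    """Return the result label to apply, or None if the repo does not define it."""
    label = RESULT_LABELS.get(verdict)
    # A missing label is a soft no-op: the comment is still posted.
    return label if label in repo_labels else None


def clamp_comment(body, limit=COMMENT_LIMIT):
    """Trim an over-long report so the posted comment stays within the limit."""
    if len(body) <= limit:
        return body
    notice = "\n\n_(report truncated)_"
    return body[: limit - len(notice)] + notice
```

Keeping label selection separate from posting means a label lookup failure never blocks the report itself.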

🤖 This PR was authored autonomously by GitHub Copilot (model: Claude Opus 4.8) on behalf of @BenBtg. Each commit carries an Assisted-by: trailer.

Add the third stage (assess → fix → test) of the semi-automated, human-gated bug pipeline. The `bug-test` agentic workflow triggers when a maintainer applies the `bug-test` label, runs the relevant tests in isolation against the fix, compiles a readable pass/fail report, and posts it back as a single issue comment.

- Locates the fix under test: linked PR → named fix branch → current checkout fallback, only ever from origin.
- Stack-agnostic test detection (uv+pytest, npm/pnpm/yarn, go, make) so it is decoupled from Spec Kit specifics and reusable by other projects.
- Runs tests under a timeout as untrusted code; scoped read-only permissions; same URL-safety / untrusted-input guardrails as bug-assess.
- Verification mode compares a generated fix against the historical fix for old/closed bugs to surface discrepancies.
- Optional single result label (tests-passing / tests-failing / tests-inconclusive).

Compiled bug-test.lock.yml with `gh aw compile`.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds a new label-driven gh-aw agentic workflow stage (bug-test) to run relevant tests against a proposed bug fix and post a single compiled test report back to the originating issue, completing the assess → fix → test pipeline.

Changes:

Introduces a hand-authored .github/workflows/bug-test.md workflow prompt/source for the test stage.
Adds the compiled .github/workflows/bug-test.lock.yml generated by gh aw compile for execution in GitHub Actions.

Show a summary per file

File	Description
.github/workflows/bug-test.md	Defines the label-triggered “bug-test” agent behavior (locate fix artifact, detect test stack, run with timeout, compile report, post comment/label).
.github/workflows/bug-test.lock.yml	Compiled, pinned GitHub Actions workflow generated from `bug-test.md` for actual execution.

Review details

Files reviewed: 1/2 changed files
Comments generated: 2
Review effort level: Low

… workflow

Align with repo standards (e.g. dependabot PR #3064, other workflows). Manually pinned in the compiled lock file for consistency.

Assisted-by: GitHub Copilot (model: Claude Opus 4.8, autonomous)
Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>

Copilot

Review details

Files reviewed: 1/2 changed files
Comments generated: 0 new
Review effort level: Low

mnriem · 2026-07-01T17:13:16Z

Thank you!

Copilot AI review requested due to automatic review settings June 30, 2026 14:05

Copilot started reviewing on behalf of BenBtg June 30, 2026 14:06

Copilot AI reviewed Jun 30, 2026

Comment thread .github/workflows/bug-test.lock.yml

Comment thread .github/workflows/bug-test.lock.yml

BenBtg marked this pull request as ready for review June 30, 2026 14:19

BenBtg requested a review from mnriem as a code owner June 30, 2026 14:19

BenBtg marked this pull request as draft June 30, 2026 14:24

BenBtg marked this pull request as ready for review June 30, 2026 14:27

BenBtg self-assigned this Jun 30, 2026

BenBtg marked this pull request as draft June 30, 2026 14:28

BenBtg marked this pull request as ready for review July 1, 2026 16:22

Copilot AI review requested due to automatic review settings July 1, 2026 16:22

Copilot started reviewing on behalf of BenBtg July 1, 2026 16:22

Copilot AI reviewed Jul 1, 2026

mnriem approved these changes Jul 1, 2026

mnriem merged commit ac6eef4 into main Jul 1, 2026
14 checks passed

mnriem merged 2 commits into main from benbtg-feat-3239-bug-test-workflow

3 participants
