1 change: 1 addition & 0 deletions .github/INVENTORY.md
@@ -123,6 +123,7 @@ This file is the exact path inventory for the live GitHub Copilot catalog in thi
- `.github/skills/local-agent-sync-external-resources/SKILL.md`
- `.github/skills/local-agent-sync-global-copilot-configs-into-repo/SKILL.md`
- `.github/skills/local-agent-sync-install-ai-resources/SKILL.md`
- `.github/skills/local-copilot-log-analyzer/SKILL.md`
- `.github/skills/mattpocock-caveman/SKILL.md`
- `.github/skills/openai-docx/SKILL.md`
- `.github/skills/openai-gh-address-comments/SKILL.md`
3 changes: 3 additions & 0 deletions .github/instructions/internal-python.instructions.md
@@ -11,7 +11,10 @@ This file is optimized for Copilot code review and should produce only evidenced
- Verify guard clauses and error handling make failure modes explicit.
- Flag unsafe input handling, shell invocation, or filesystem side effects.
- Check function and module boundaries for readability and cohesion.
- Flag behavioral configuration buried in helpers, services, or library modules instead of centralized at the correct boundary: a script entrypoint, `Configuration` section, settings module, adapter, application factory, or composition root.
- Do not flag stable domain invariants merely because they are constants near domain code.
- Verify type hints and public interfaces stay consistent with call sites.
- Flag manual formatting churn that fights the repository formatter; when Ruff is configured, prefer `ruff format` and Ruff diagnostics over subjective style edits.
- Report dependency usage that is unpinned or unnecessary for the change.
- Flag vendored libraries, wheelhouses, copied site-packages, or fallback dependency mirrors.
- Flag new external dependencies that are missing hash-locked pins in the owning requirements file.
Original file line number Diff line number Diff line change
@@ -23,6 +23,7 @@ Baseline owner: `internal-python`
| PY-M07 | `print()` instead of `logging` in application/library code | No log level control in production |
| PY-M08 | Missing unit tests for new public functions | Violates test coverage mandate |
| PY-M09 | Python tests outside repository-root `tests/` or without mirrored source paths | Breaks repository test discoverability and ownership mapping |
| PY-M10 | `rich`, emoji, tables, or panels outside human-facing CLI/reporting boundaries | Mixes terminal UI with importable logic or machine-readable output such as JSON |

## Minor

@@ -61,7 +62,7 @@ except:
try:
    result = fetch_data()
except requests.RequestException as exc:
-    logger.warning("⚠️ Fetch failed: %s", exc)
+    logger.warning("Fetch failed: %s", exc)
    raise
```

@@ -89,5 +90,5 @@ import logging
logger = logging.getLogger(__name__)

def process(data: list[dict]) -> None:
-    logger.info("ℹ️ Processing %d items", len(data))
+    logger.info("Processing %d items", len(data))
```
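
The PY-M10 row and the two emoji removals above share one idea: importable logic logs plain messages and returns data, while decoration belongs only at a human-facing boundary. A minimal sketch of that split, assuming hypothetical helpers `summarize`, `render_for_terminal`, and `render_as_json` (none of them appear in the repository):

```python
import json
import logging

logger = logging.getLogger(__name__)


def summarize(items: list[dict]) -> dict:
    """Importable logic: plain log message, machine-readable return value."""
    logger.info("Processing %d items", len(items))
    failed = sum(1 for item in items if item.get("status") == "failed")
    return {"total": len(items), "failed": failed}


def render_for_terminal(summary: dict) -> str:
    """Human-facing boundary: decoration is confined to this function."""
    marker = "✅" if summary["failed"] == 0 else "⚠️"
    return f"{marker} {summary['total']} items, {summary['failed']} failed"


def render_as_json(summary: dict) -> str:
    """Machine-readable boundary: no emoji, tables, or panels."""
    return json.dumps(summary, sort_keys=True)
```

Under this assumption a CLI entrypoint would call `render_for_terminal`, while any consumer that parses the output would call `render_as_json`.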
6 changes: 6 additions & 0 deletions .github/skills/internal-gateway-execute-plans/SKILL.md
@@ -50,6 +50,7 @@ consumes approved `compact` and `extended` retained plans.
- Reject `compact` folders outside the `mini-plan-*` convention.
- Ignore `questions.md` during execution.
- Maintain a compact execution state and prefer targeted rereads over full file re-ingestion unless new evidence invalidates current state.
- Use `Compact Evidence Reporting` for large validator output: read enough output to decide the state honestly, then retain command, exit code, material counts, header or schema checks, changed files, and exact gaps instead of pasting raw logs.
- Infer the execution strategy from `Plan profile`, folder shape, merged control-contract sections in `02-control.md` when applicable, and the validation path. Do not require a separate retained-plan consumer field.
- Audit only mandatory requirements that are applicable; do not convert specialist rules into universal policy.
- Use `superpowers-verification-before-completion` as the fresh-evidence owner; do not duplicate its mechanics.
@@ -71,6 +72,10 @@ scope, anti-scope, and validation path, then iterate:
5. Continue only while evidence improves and no stop condition fires.
6. Stop with `DONE`, a blocker, or an explicit evidence gap.

Apply `Compact Evidence Reporting` after each focused validation: preserve the
exact gap and proof path, but keep large outputs summarized unless the raw
output is itself the missing evidence.

Stop on scope drift, destructive action, owner conflict, missing validation
path, human approval need, secret exposure risk, or repeated non-improving
failures.
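
One way to picture `Compact Evidence Reporting` for a validator run is the sketch below. It is an illustration only: the function name `compact_evidence` and the dictionary keys are assumptions, and the skill itself stays runner-agnostic and does not prescribe any implementation.

```python
import subprocess


def compact_evidence(command: list[str], tail_lines: int = 5) -> dict:
    """Run a validator and keep a bounded summary instead of the raw log."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    stdout_lines = completed.stdout.splitlines()
    stderr_lines = completed.stderr.splitlines()
    return {
        "command": command,                       # what was run
        "exit_code": completed.returncode,        # pass/fail signal
        "stdout_line_count": len(stdout_lines),   # material counts
        "stderr_line_count": len(stderr_lines),
        # Only the last few lines are retained; the full log is not pasted.
        "stdout_tail": stdout_lines[-tail_lines:] if tail_lines > 0 else [],
    }
```

The retained dictionary carries the command, exit code, and counts; the raw output would be reread only when it is itself the missing evidence.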
@@ -195,6 +200,7 @@ Before declaring any closeout step complete:
- Hiding ownership conflicts instead of escalating a next owner and validation path.
- Continuing the Agentic Execution Loop after evidence stops improving or a
stop condition fires.
- Pasting raw large validator output when compact evidence would preserve the same proof.
- Packaging `DONE` while evidence gaps still require `APPLIED_UNVERIFIED`, `PARTIAL`, or `BLOCKED`.
- Declaring a non-`DONE` state without writing or updating the `<STATE>-plan-state.md` marker.
- Leaving stale `<STATE>-plan-state.md` markers behind after a state transition.
2 changes: 2 additions & 0 deletions .github/skills/internal-gateway-idea-brainstorming/SKILL.md
@@ -35,6 +35,7 @@ through retained-plan creation. It stops before execution.
- Same-conversation support-skill loading is not a lane change.
- Idea Gate 0 remains mandatory.
- Start with a bounded evidence pass ordered by risk. Read only the smallest local owner evidence needed to classify the request before asking questions.
- For large tabular files, generated reports, or long log exports, keep the bounded evidence pass aggregate-first: collect file sizes, schema or headers, counts, anomalies, and targeted slices before any deeper read.
- When authoritative platform semantics control feasibility or ownership, verify them early in the bounded evidence pass.
- This gateway is not a specialized execution owner. A concrete task may not be accepted for execution here until Idea Gate 0 is `grill-me satisfied` and `Critical Gate 2` is `confident`.
- For a direct concrete operation, emit `Specialization Checkpoint: gated`, explain that this owner cannot decide task ownership or execute yet, and continue with the bounded evidence pass plus mandatory `grill-me`.
@@ -59,6 +60,7 @@ State rules:
- If the incoming request is already concrete (file edit, command execution, validator run, or implementation step), start with `Specialization Checkpoint: gated` before Idea Gate 0.
- At `Specialization Checkpoint: gated`, name the recommended specialized owner (`internal-gateway-simple-task` by default, `internal-gateway-review` for defect-first review, `internal-gateway-critical-master` for pressure testing), but do not ask the user to keep this owner yet.
- Continue through the bounded evidence pass, mandatory `grill-me`, and critical gate before asking whether this owner should stay in charge of the task.
- Before the initial numbered block, keep large-file and large-log evidence compact: summarize counts, headers, anomalies, routes, and open gaps unless raw content is itself the missing evidence.
- After the evidence pass, load `grill-me` and ask one mandatory numbered bulk question block with recommendations and defaults.
- Before the initial numbered block, emit a compact facts/options summary derived from the bounded evidence pass.
- Ask further focused numbered bulk blocks only for unresolved, dependent, or reopened branches.
67 changes: 62 additions & 5 deletions .github/skills/internal-gateway-review/SKILL.md
@@ -9,16 +9,69 @@ description: Use when repository-owned work needs same-conversation defect-first

- `internal-code-review`
- `internal-high-level-review`
- `internal-ai-resource-review`
- `internal-copilot-audit`
- `internal-gateway-critical-master`
- `internal-gateway-writing-plans`
- `internal-agent-support-next-step`

-Portable review orchestrator. This skill owns review scope, lens selection,
-findings consolidation, critical support, and remediation-plan transition. It
-does not apply fixes.
+Treat this section as an audit and routing index, not a preload bundle. Load a
+referenced skill only when the domain, finding, blocker, or phase requires it.
+
+Portable review orchestrator. Owns review scope, lens selection, findings
+consolidation, critical support, and remediation-plan transition. It does not apply fixes.

Before any user-visible verdict, run a lightweight internal check for evidence,
severity, false positives, contrary evidence, and scope narrowing. Load
`internal-gateway-critical-master` only for a material challenge. Revise or
reopen when the check exposes a material gap.

See `references/review-gate.md` for the review output contract and gate states.

## Lens selection

Select the review lens from the changed-path families, not from a single default.
A diff may activate more than one lens; load each only when its evidence appears.

- Code lens (`internal-code-review`): Python, Bash, Terraform, Java, or
Node.js/TypeScript source changes.
- Systems lens (`internal-high-level-review`): architecture, workflow, or
cross-cutting impact beyond line-level defects.
- AI-resource lens (`internal-ai-resource-review`, with `internal-copilot-audit`
as the drift sub-lens): repository-owned AI customization assets, including
`.github/skills/**`, `.github/agents/*.agent.md`, `.github/prompts/*.prompt.md`,
`.github/instructions/**`, `AGENTS.md`, `.github/copilot-instructions.md`,
`.github/INVENTORY.md`, and `**/agents/openai.yaml`.

When the diff is mainly AI customization assets, the AI-resource lens is
mandatory and the code lens stays scoped to any embedded scripts only. Do not
let the code lens silently skip `.md` skill, agent, or instruction files.
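
The path-family routing described in this section could be sketched as follows. This is a hedged illustration, not the skill's mechanism: `select_lenses` and `CODE_SUFFIXES` are hypothetical names, the suffix list is an assumption derived from the languages named above, and `fnmatchcase` treats `**` the same as `*` (both also match `/`). The systems lens is left out because it depends on impact, not on paths.

```python
from fnmatch import fnmatchcase

# Patterns copied from the AI-resource lens description.
AI_RESOURCE_PATTERNS = (
    ".github/skills/**",
    ".github/agents/*.agent.md",
    ".github/prompts/*.prompt.md",
    ".github/instructions/**",
    "AGENTS.md",
    ".github/copilot-instructions.md",
    ".github/INVENTORY.md",
    "**/agents/openai.yaml",
)

# Assumed suffixes for Python, Bash, Terraform, Java, and Node.js/TypeScript.
CODE_SUFFIXES = (".py", ".sh", ".tf", ".java", ".js", ".ts")


def select_lenses(changed_paths: list[str]) -> dict[str, list[str]]:
    """Group changed paths by the review lens their path family activates."""
    lenses: dict[str, list[str]] = {}
    for path in changed_paths:
        if path.endswith(CODE_SUFFIXES):
            # Embedded scripts stay with the code lens, even under .github/skills/.
            lenses.setdefault("internal-code-review", []).append(path)
        elif any(fnmatchcase(path, pattern) for pattern in AI_RESOURCE_PATTERNS):
            # Markdown skills, agents, prompts, and instructions are never skipped.
            lenses.setdefault("internal-ai-resource-review", []).append(path)
    return lenses
```

A lens that has no matching paths is simply absent from the result, which mirrors the rule of loading each lens only when its evidence appears.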

## Token Discipline

Inspect diff and failing evidence first; avoid broad repository scans unless an
evidence gap requires one; never preload referenced skills; show at most 5
material findings unless exhaustive review is requested; summarize omitted
low-risk observations separately, not as findings.

Use `Compact Evidence Reporting` for large diffs, generated files, tabular
exports, and logs: keep findings defect-first, cite the smallest excerpt or
file point that proves impact, and avoid dumping large raw blocks when a
targeted excerpt plus evidence path preserves the same proof.

## Review To Plan Transition

Before creating, accepting, or routing a remediation plan, keep the review
defect-first and map every original material finding: `id`, `status` (`planned`,
`deferred`, `rejected`, or `residual`), `reason`, `next owner`, and `validation
expected`.

If remediation steps cover less than 100% of material findings, label the
output `partial remediation plan` and keep residual, deferred, or rejected
findings visible. A retained mini-plan is a coverage-preserving handoff authored
by `internal-gateway-writing-plans`; its job is plan creation, not fixes. This
gateway does not choose the execution owner.
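
The coverage map above can be read as a small data check. The sketch below assumes a hypothetical `FindingCoverage` record and `plan_label` helper; the label `remediation plan` for the fully covered case is also an assumption, since the skill only names `partial remediation plan`.

```python
from dataclasses import dataclass

ALLOWED_STATUSES = {"planned", "deferred", "rejected", "residual"}


@dataclass(frozen=True)
class FindingCoverage:
    id: str
    status: str
    reason: str
    next_owner: str
    validation_expected: str


def plan_label(material_ids: list[str], coverage: list[FindingCoverage]) -> str:
    """Label a plan after checking every material finding is mapped."""
    mapped = {entry.id for entry in coverage}
    missing = sorted(set(material_ids) - mapped)
    if missing:
        raise ValueError(f"unmapped material findings: {missing}")
    for entry in coverage:
        if entry.status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status for {entry.id}: {entry.status}")
    # Anything other than 100% planned coverage is a partial plan.
    if all(entry.status == "planned" for entry in coverage):
        return "remediation plan"
    return "partial remediation plan"
```

Deferred, rejected, or residual findings remain in the `coverage` list, so they stay visible in the handoff.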

## When to use

- The user asks for review of a concrete artifact, diff, workflow, or bundle.
@@ -27,7 +80,11 @@ See `references/review-gate.md` for the review output contract and gate states.
## Validation

- Findings stay defect-first.
- Lens selection matches the changed-path families; AI customization assets route to `internal-ai-resource-review` (drift via `internal-copilot-audit`) instead of being skipped by the code lens.
- Review flow preserves compact context: prioritize diff and failing evidence first, then expand only when an evidence gap remains.
-- Review output carries findings, severity, confidence, evidence gap, route or next owner, and a Review Gate outcome before the final verdict.
-- Retained remediation plans are authored by `internal-gateway-writing-plans`.
+- Large evidence may be reported compactly, but each material finding still keeps severity, confidence, evidence gap, counter-validation result, and route or next owner.
+- Review output carries findings, severity, confidence, evidence gap, counter-validation result, route or next owner, and a Review Gate outcome before the final verdict.
+- The review cannot present analysis to the user until counter-validation confirms it or reopens material gaps.
+- Remediation-plan transitions preserve a 100% material-finding coverage map or explicitly declare a `partial remediation plan`.
+- Retained remediation plans are authored by `internal-gateway-writing-plans` and preserve the coverage map.
- The gateway stops before fixes.
Original file line number Diff line number Diff line change
@@ -8,15 +8,19 @@ Use this reference when `internal-gateway-review` needs to package findings befo
- Severity
- Confidence
- Evidence gap
- Counter-validation
- Route or next owner
- Review Gate outcome

## Gate States

-- `review gate: satisfied` when the findings are specific, routed, and ready for the user-visible verdict.
-- `review gate: reopen` when material evidence is missing or the remediation choice needs more challenge.
+- `review gate: satisfied` when the findings are specific, routed, counter-validated, and ready for the user-visible verdict.
+- `review gate: reopen` when material evidence is missing, counter-validation exposes a material flaw, or the remediation choice needs more challenge.

## Boundary

- Keep the gate visible before any fixes.
- Run counter-validation before the final user-visible verdict; challenge each finding for evidence, severity, route, and contrary proof.
- For large diffs, generated files, logs, or tabular exports, keep evidence compact: cite the smallest excerpt or path that proves the finding and summarize omitted raw volume.
- Report only material self-critique results: corrections, confidence changes, evidence gaps, or confirmation that no material issue was found.
- Use the gate to route each actionable finding to the smallest next owner.
39 changes: 36 additions & 3 deletions .github/skills/internal-gateway-simple-task/SKILL.md
@@ -70,6 +70,33 @@ Classify every simple task before operational work as `full-gate`,
- If planning, review, critical pressure, or multi-phase validation becomes the
real job, `escalate`.

### Token Budget Gate

- Run a `Token Budget Gate` before choosing `trivial-skip` or `plan-mode` when
the user asks for low-token execution or the task centers on large tabular
files, log exports, repeated tool output, or broad file changes.
- For Copilot or debug log analysis, start with file size, model-call counts,
prompt or token aggregates, tool-span counts, result-size summaries, and
targeted slices; avoid full JSON dumps or prompt bodies unless they are the
missing evidence.
- Keep compact reporting runner-agnostic: ask for bounded summaries, exit
state, counts, anomalies, and evidence gaps, but do not require `jq`, `awk`,
shell flags, or terminal-only recipes unless they are already the local
workflow being analyzed.
- A cost checkpoint pauses before a new expensive tool burst, broad reread, or
repeated execution loop. It does not interrupt ordinary conversation,
grill-me questioning, or collaborative reasoning when no expensive tool
action is being launched.
- If the user explicitly asks for full output, deeper slices, or continued
execution, name the likely token or context impact before expanding and then
either proceed with the smallest bounded next slice or ask for confirmation
before the new expensive burst.
- Keep `trivial-skip` only for truly tiny local work with obvious validation
and no material completeness risk.
- If context pressure could hide required validation, data integrity, or route
ownership, prefer `plan-mode` and apply the `Plan Profile Selection Guard`
before proposing `compact`.
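
An aggregate-first pass over a log export, as the gate describes, might look like the sketch below. It assumes a JSON Lines export whose records carry hypothetical `type` and `tokens` fields; real Copilot or debug log schemas may differ, and `summarize_log` is an illustrative name, not an existing tool.

```python
import json
import os
from collections import Counter


def summarize_log(path: str, sample_size: int = 3, max_chars: int = 120) -> dict:
    """Collect size, counts, and short slices without loading full bodies."""
    kinds = Counter()
    total_tokens = 0
    records = 0
    malformed = 0
    samples: list[str] = []
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                continue
            records += 1
            if len(samples) < sample_size:
                # Targeted slice: a truncated prefix, never the whole record.
                samples.append(line[:max_chars])
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                malformed += 1
                continue
            if not isinstance(record, dict):
                malformed += 1
                continue
            kinds[str(record.get("type", "unknown"))] += 1
            tokens = record.get("tokens", 0)
            if isinstance(tokens, int):
                total_tokens += tokens
    return {
        "size_bytes": os.path.getsize(path),
        "records": records,
        "malformed": malformed,
        "by_type": dict(kinds),
        "total_tokens": total_tokens,
        "samples": samples,
    }
```

The summary stays small regardless of export size, so a deeper read can be deferred until a specific anomaly or count justifies it.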

## Simple Procedure

1. Inspect local files first.
@@ -147,9 +174,15 @@ the output shape changes.
### Profile

- Default to `compact` (`tmp/superpowers/mini-plan-*`) with
-`01-change-summary.md` and `02-execution.md`.
-- Use `extended` only when the task needs multi-slice execution, multiple
-independent validators, an articulated anti-scope, or external pins.
+`01-change-summary.md` and `02-execution.md` only when the task stays within
+one owner, one execution lane, one primary validation path, and low
+completeness risk.
+- Apply the `Plan Profile Selection Guard` before proposing `compact`.
+- Use `extended` when the task needs multi-slice execution, multiple
+independent validators, an articulated anti-scope, external pins,
+cross-skill token-discipline work, validator-impacting changes, or
+exports, generated reports, and datasets that need non-trivial
+reconciliation.

### Procedure

Original file line number Diff line number Diff line change
@@ -39,31 +39,55 @@ signal and ask for explicit confirmation before switching to plan mode.
provisioning).
- There is material risk that context pressure or chat limits will interrupt
the work before it can be validated.
- The task centers on large `.csv`, `.tsv`, `.xlsx`, JSON log exports,
repeated tool output, or broad file changes that would bloat chat context.
- The user is asking for a large refactor, migration, or cross-file mechanical
change that is safer as a tracked plan.

Do not switch to plan mode implicitly without declaring the detected signals
and asking for user confirmation.

-## Profile selection
+### Token Budget Gate

When the cost signals come mainly from context pressure instead of task count,
prefer a compact evidence posture rather than a raw-output posture. Keep same-chat
execution only for tiny local work; otherwise switch to `plan-mode` and let the
profile guard choose whether `compact` is still safe.

A cost checkpoint pauses before a new expensive tool burst, broad reread, or
multi-step execution loop. It does not interrupt ordinary conversation,
grill-me analysis, or collaborative study when no expensive tool action is
starting.

-- **Default `compact`**: use for a single owner, concrete target, one primary
-validation path, and one execution lane. Folder name follows
-`tmp/superpowers/mini-plan-*` and contains `01-change-summary.md` and
-`02-execution.md`.
-- **Use `extended` only when**: the task needs multi-slice execution, several
-independent validators, an articulated anti-scope, or external pins that must
-be tracked in a control file.
When the user explicitly asks for broader output, deeper analysis, or continued
execution, name the likely token or context impact first and then either
continue with the smallest bounded next slice or ask for confirmation before
the new expensive burst.

+## Profile selection
+
-When in doubt, prefer `compact`. A simple task that needs a plan usually does
-not need the overhead of an extended plan.
+- **Default `compact`**: use only when the task stays within a single owner,
+concrete target, one primary validation path, one execution lane, and low
+completeness risk. Folder name follows `tmp/superpowers/mini-plan-*` and
+contains `01-change-summary.md` and `02-execution.md`.
+- **Plan Profile Selection Guard**: escalate to `extended` when context or
+completeness risk is material, especially for cross-skill token-discipline
+work, validator-impacting changes, exports or generated reports, datasets
+that need non-trivial reconciliation, several independent validators, an
+articulated anti-scope, or external pins that must be tracked in a control
+file.

When profile safety is in doubt, prefer `extended` and state why. Prefer
`compact` only when the plan can record the contrary evidence that keeps
lower-context execution safe.
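
The guard above can be expressed as a small decision function. This is a sketch under stated assumptions: `PlanSignals` and `choose_profile` are hypothetical names, and the boolean fields are one possible encoding of the conditions listed in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanSignals:
    single_owner: bool
    concrete_target: bool
    single_validation_path: bool
    single_execution_lane: bool
    low_completeness_risk: bool
    cross_skill_token_discipline: bool = False
    validator_impacting: bool = False
    needs_reconciliation: bool = False
    several_validators: bool = False
    has_anti_scope: bool = False
    has_external_pins: bool = False


def choose_profile(signals: PlanSignals) -> tuple[str, list[str]]:
    """Return the plan profile plus the reasons that forced an escalation."""
    reasons: list[str] = []
    # Every compact precondition must hold.
    for name in ("single_owner", "concrete_target", "single_validation_path",
                 "single_execution_lane", "low_completeness_risk"):
        if not getattr(signals, name):
            reasons.append(f"missing compact precondition: {name}")
    # Any guard signal escalates to extended.
    for name in ("cross_skill_token_discipline", "validator_impacting",
                 "needs_reconciliation", "several_validators",
                 "has_anti_scope", "has_external_pins"):
        if getattr(signals, name):
            reasons.append(f"guard signal present: {name}")
    return ("extended" if reasons else "compact", reasons)
```

Returning the reasons alongside the profile matches the requirement to state why `extended` was preferred when profile safety is in doubt.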

## Confirmation rule for implicit triggers

For implicit cost-signal triggers, emit a short statement that:

1. Names the detected cost signals.
-2. Proposes `plan-mode` with a default `compact` profile.
+2. Proposes `plan-mode` with the safest profile suggested by the signals,
+defaulting to `compact` only when the profile guard stays clear.
3. Asks the user to confirm, decline, or choose `extended`.

Do not write the retained plan until the user confirms.