From 8509f65f6e7362b392f50a6bad78c719e426345c Mon Sep 17 00:00:00 2001 From: Diego Mauricio Lagos Date: Thu, 18 Jun 2026 22:10:29 +0200 Subject: [PATCH 01/11] feat: Enhance internal gateway writing plans with clarity on token management and file structuring --- .../internal-gateway-writing-plans/SKILL.md | 19 +++++++++++++++++-- .../references/plan-review-gate.md | 2 +- .../scripts/plan_authoring.py | 9 +++++++-- tests/test_plan_policy_contract.py | 4 ++++ 4 files changed, 29 insertions(+), 5 deletions(-) diff --git a/.github/skills/internal-gateway-writing-plans/SKILL.md b/.github/skills/internal-gateway-writing-plans/SKILL.md index f2b456a..e7a6a4e 100644 --- a/.github/skills/internal-gateway-writing-plans/SKILL.md +++ b/.github/skills/internal-gateway-writing-plans/SKILL.md @@ -44,7 +44,12 @@ New `compact` plans should use `tmp/superpowers/mini-plan-*`. | Profile | When | Required files | | --- | --- | --- | | `compact` | Single owner, concrete target, one validation path, low-to-medium risk, and one execution lane. Best fit for small/fast executors after positive handoff validation. | `01-change-summary.md`, `02-execution.md` | -| `extended` | Cross-family changes, higher risk, lower-context execution, multiple validators, or multi-slice execution state. Thinking-first profile with explicit control files and deterministic read order. | `01-change-summary.md`, `02-control.md`, `03-execution.md`, additional numbered files by category (`04-...`). | +| `extended` | Cross-family changes, higher risk, lower-context execution, multiple validators, or multi-slice execution state. Soft-limit profile: use judgment-based size review with completeness over compression, explicit control files, and deterministic read order. | `01-change-summary.md`, `02-control.md`, `03-execution.md`, additional numbered files by category (`04-...`). 
| + +Escalate to `extended` when completeness risk is material: exports, reports, or +datasets with non-trivial reconciliation; external API contracts +(credentials, pagination, retries, schema pinning); executive-facing output; +multiple validators; or synced always-on guidance edits. Do not use `compact` when the executor needs exact sources, target files, validators, blockers, or external pins that only `02-control.md` @@ -70,6 +75,8 @@ can provide. - Compact plans have a 2,000 estimated-token total budget measured as `ceil(UTF-8 bytes / 4)` across plan Markdown files. Keep `02-execution.md` under 1,500 estimated tokens. Treat warnings as required review inputs. +- For `extended`, treat token warnings as review inputs for completeness and + slicing. Prefer splitting into numbered files over compression. - `compact` uses exactly `01-change-summary.md` and `02-execution.md` during authoring. `extended` uses `01-change-summary.md`, `02-control.md`, `03-execution.md`, and optional higher numbered files. @@ -98,6 +105,11 @@ can provide. - For `extended`, implementation-contract sections are merged into `02-control.md` with these exact headings: `Sources`, `Candidate targets`, `Validation commands`, `Blockers and fallback rules`, and `External pins`. +- For `extended`, recommend adding deep companion files only when justified by + triggers, and keep them as recommendations (not ERROR-level required files): + `data-contract.md` for reconciled datasets and schema mappings, + `validation-runbook.md` for multi-validator troubleshooting or rollback paths, + and API/schema pin notes when external dependencies or credentials drive risk. - Apply a say-once rule: each control fact (target, owner, validator, blockers, pins, and source-item coverage) is written once in the owning file, and step files do not restate target/owner/validator. @@ -126,7 +138,10 @@ can provide. `questions.md` file for `compact`. 11. Run scope challenge and plan review gate for non-trivial plans. 12. 
Run `audit` first, then run `handoff-check`; execute only when ready. -13. Treat token warnings as review inputs for compression or split decisions, not as proof of measured savings. +13. Treat token warnings as review inputs, not as proof of measured savings. For + `extended`, prefer splitting into numbered files over compression, and never + compress away source pins, schema contracts, validation rules, stop + conditions, or failure-investigation steps. ## Validation diff --git a/.github/skills/internal-gateway-writing-plans/references/plan-review-gate.md b/.github/skills/internal-gateway-writing-plans/references/plan-review-gate.md index 32d2ff9..d3da491 100644 --- a/.github/skills/internal-gateway-writing-plans/references/plan-review-gate.md +++ b/.github/skills/internal-gateway-writing-plans/references/plan-review-gate.md @@ -33,7 +33,7 @@ or handoff. It checks clarity and validity without creating reviewer personas. | Open questions | Is `questions.md` present and set to `- none` for execution handoff, or explicitly blocking handoff? | | Lifecycle status | Is plan state explicit (`scaffold`, `ready`, or `closed`) so an executor does not infer readiness? | | Token discipline | Does the ledger define `Initial evidence pass` and `Reading budget` so the executor can classify the folder with the fewest safe reads? | -| Profile token budget | Is `compact` within the 2,000 estimated-token total budget, with `01-change-summary.md` under 300 and `02-execution.md` under 1,500, or escalated to `extended`? | +| Profile token budget | Is `compact` within the 2,000 estimated-token total budget, with `01-change-summary.md` under 300 and `02-execution.md` under 1,500, or escalated to `extended`? For `extended`, are soft limits reviewed with completeness over compression and split-by-slice decisions when files grow large? 
| ## Outcomes diff --git a/.github/skills/internal-gateway-writing-plans/scripts/plan_authoring.py b/.github/skills/internal-gateway-writing-plans/scripts/plan_authoring.py index bc09161..6fc4b05 100644 --- a/.github/skills/internal-gateway-writing-plans/scripts/plan_authoring.py +++ b/.github/skills/internal-gateway-writing-plans/scripts/plan_authoring.py @@ -709,11 +709,16 @@ def _token_warnings(plan_folder: Path, profile: str | None = None) -> list[str]: control_names = {"01-change-summary.md", "02-control.md"} control_tokens = sum(tokens for name, tokens in file_tokens if name in control_names) if total_tokens and control_tokens / total_tokens > 0.7: - warnings.append("Initial control read is disproportionately large; compress or split the control files.") + warnings.append("Initial control read is disproportionately large; prefer splitting control facts into numbered files by delivery slice.") for name, tokens in file_tokens: if tokens > 1200: - warnings.append(f"Estimated token weight is high for {name}; split or compress by delivery slice.") + if profile == "extended": + warnings.append( + f"Informational: estimated token weight is high for {name}; prefer splitting into numbered files by delivery slice." 
+ ) + else: + warnings.append(f"Estimated token weight is high for {name}; split or compress by delivery slice.") return warnings diff --git a/tests/test_plan_policy_contract.py b/tests/test_plan_policy_contract.py index 616bdc1..d584400 100644 --- a/tests/test_plan_policy_contract.py +++ b/tests/test_plan_policy_contract.py @@ -28,6 +28,10 @@ def test_writing_plans_declares_profile_only_handoff_contract() -> None: assert "mini-plan-*" in compact_reference assert "Decisioni aperte" in compact_reference assert "2,000 estimated tokens" in compact_reference + assert "completeness over compression" in writing_text + assert "Escalate to `extended`" in writing_text + assert "prefer splitting into numbered files over compression" in writing_text + assert "data-contract.md" in writing_text def test_executing_plans_accepts_compact_and_extended_consumers() -> None: From e4fc1177b756645a1768064dc955e6e3b2a053b8 Mon Sep 17 00:00:00 2001 From: Diego Mauricio Lagos Date: Thu, 18 Jun 2026 22:13:35 +0200 Subject: [PATCH 02/11] feat: Improve Python version handling in virtual environment setup --- tools/analyze_copilot_debug_log/run.sh | 33 ++++++++++++++++++-------- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/tools/analyze_copilot_debug_log/run.sh b/tools/analyze_copilot_debug_log/run.sh index d1a080c..dc28c56 100755 --- a/tools/analyze_copilot_debug_log/run.sh +++ b/tools/analyze_copilot_debug_log/run.sh @@ -60,8 +60,10 @@ load_required_python_version() { select_python_bin() { if [[ -n "$PYTHON_BIN" ]]; then + PYTHON_BIN_EXPLICIT=1 return fi + PYTHON_BIN_EXPLICIT=0 PYTHON_BIN="python$REQUIRED_PYTHON_MAJOR_MINOR" } @@ -70,6 +72,12 @@ verify_python_bin_version() { actual_version="$("$PYTHON_BIN" -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')" if [[ "$actual_version" == "$REQUIRED_PYTHON_MAJOR_MINOR" ]]; then + EXPECTED_PYTHON_MAJOR_MINOR="$REQUIRED_PYTHON_MAJOR_MINOR" + return + fi + + if [[ "$PYTHON_BIN_EXPLICIT" -eq 1 ]]; then + 
EXPECTED_PYTHON_MAJOR_MINOR="$actual_version" return fi @@ -82,27 +90,30 @@ verify_venv_version() { local venv_version if [[ ! -x "$venv_python" ]]; then - log_error "virtual environment is missing its Python interpreter: $venv_python" - exit 1 + return 1 fi - venv_version="$("$venv_python" -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')" - if [[ "$venv_version" == "$REQUIRED_PYTHON_MAJOR_MINOR" ]]; then - return + venv_version="$($venv_python -c 'import sys; print(f"{sys.version_info.major}.{sys.version_info.minor}")')" + if [[ "$venv_version" == "$EXPECTED_PYTHON_MAJOR_MINOR" ]]; then + return 0 fi - log_error "existing virtual environment uses Python $venv_version, but .python-version requires $REQUIRED_PYTHON_VERSION. Remove $VENV_DIR and rerun." - exit 1 + return 1 } ensure_venv() { if [[ -d "$VENV_DIR" ]]; then - verify_venv_version - return + if verify_venv_version; then + return + fi + rm -rf "$VENV_DIR" fi "$PYTHON_BIN" -m venv "$VENV_DIR" - verify_venv_version + if ! verify_venv_version; then + log_error "virtual environment uses an unexpected Python version after creation: $VENV_DIR" + exit 1 + fi } install_dependencies() { @@ -151,6 +162,8 @@ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" REPO_ROOT="$(cd "$SCRIPT_DIR/../.." 
&& pwd)" VENV_DIR="$SCRIPT_DIR/.venv" PYTHON_BIN="${PYTHON_BIN:-}" +PYTHON_BIN_EXPLICIT=0 +EXPECTED_PYTHON_MAJOR_MINOR="" PYTHON_VERSION_FILE="$REPO_ROOT/.python-version" REQUIREMENTS_FILE="$SCRIPT_DIR/requirements.txt" REQUIREMENTS_HASH_FILE="$VENV_DIR/.requirements.sha256" From 492f7b1e67a2fba07fba793354ad5f9802e24494 Mon Sep 17 00:00:00 2001 From: Diego Mauricio Lagos Date: Thu, 18 Jun 2026 22:26:26 +0200 Subject: [PATCH 03/11] feat: Add counter-validation requirements to internal gateway review and update tests --- .github/skills/internal-gateway-review/SKILL.md | 8 +++++++- .../internal-gateway-review/references/review-gate.md | 7 +++++-- tests/test_workflow_review_contract.py | 2 ++ 3 files changed, 14 insertions(+), 3 deletions(-) diff --git a/.github/skills/internal-gateway-review/SKILL.md b/.github/skills/internal-gateway-review/SKILL.md index 53834ce..426194a 100644 --- a/.github/skills/internal-gateway-review/SKILL.md +++ b/.github/skills/internal-gateway-review/SKILL.md @@ -17,6 +17,11 @@ Portable review orchestrator. This skill owns review scope, lens selection, findings consolidation, critical support, and remediation-plan transition. It does not apply fixes. +Before any user-visible review verdict, run a counter-validation pass that +challenges the draft analysis for missing evidence, false positives, severity +inflation, route errors, and ignored contrary evidence. Revise or reopen the +review before presenting the analysis when the critique exposes a material gap; use `internal-gateway-critical-master`. + See `references/review-gate.md` for the review output contract and gate states. ## When to use @@ -28,6 +33,7 @@ See `references/review-gate.md` for the review output contract and gate states. - Findings stay defect-first. - Review flow preserves compact context: prioritize diff and failing evidence first, then expand only when an evidence gap remains. 
-- Review output carries findings, severity, confidence, evidence gap, route or next owner, and a Review Gate outcome before the final verdict. +- Review output carries findings, severity, confidence, evidence gap, counter-validation result, route or next owner, and a Review Gate outcome before the final verdict. +- The review cannot present analysis to the user until counter-validation confirms it or reopens material gaps. - Retained remediation plans are authored by `internal-gateway-writing-plans`. - The gateway stops before fixes. diff --git a/.github/skills/internal-gateway-review/references/review-gate.md b/.github/skills/internal-gateway-review/references/review-gate.md index bb3e0f1..cb6f17d 100644 --- a/.github/skills/internal-gateway-review/references/review-gate.md +++ b/.github/skills/internal-gateway-review/references/review-gate.md @@ -8,15 +8,18 @@ Use this reference when `internal-gateway-review` needs to package findings befo - Severity - Confidence - Evidence gap +- Counter-validation - Route or next owner - Review Gate outcome ## Gate States -- `review gate: satisfied` when the findings are specific, routed, and ready for the user-visible verdict. -- `review gate: reopen` when material evidence is missing or the remediation choice needs more challenge. +- `review gate: satisfied` when the findings are specific, routed, counter-validated, and ready for the user-visible verdict. +- `review gate: reopen` when material evidence is missing, counter-validation exposes a material flaw, or the remediation choice needs more challenge. ## Boundary - Keep the gate visible before any fixes. +- Run counter-validation before the final user-visible verdict; challenge each finding for evidence, severity, route, and contrary proof. +- Report only material self-critique results: corrections, confidence changes, evidence gaps, or confirmation that no material issue was found. - Use the gate to route each actionable finding to the smallest next owner. 
diff --git a/tests/test_workflow_review_contract.py b/tests/test_workflow_review_contract.py index 50bb3c2..6d36a85 100644 --- a/tests/test_workflow_review_contract.py +++ b/tests/test_workflow_review_contract.py @@ -57,6 +57,8 @@ def test_review_gateway_exists_and_stops_before_fixes() -> None: assert "severity" in review_gate_lower assert "confidence" in review_gate_lower assert "evidence gap" in review_gate_lower + assert "counter-validation" in skill_text + assert "counter-validation" in review_gate_lower assert "route or next owner" in review_gate_lower From 653beb43e4fcdbb94c173f11a53c421be1f5f2ed Mon Sep 17 00:00:00 2001 From: Diego Mauricio Lagos Date: Fri, 19 Jun 2026 14:22:28 +0200 Subject: [PATCH 04/11] feat: Enhance Python project and script guidelines with new logging, reporting, and dependency management practices --- .../references/anti-patterns-python.md | 3 +- .../skills/internal-python-project/SKILL.md | 16 ++- .../references/common-mistakes.md | 2 + .../references/logging-and-reporting.md | 101 ++++++++++++++++ .../skills/internal-python-script/SKILL.md | 19 ++- .../references/common-mistakes.md | 2 + .../references/layout-and-templates.md | 42 +++++-- .../references/reporting.md | 114 ++++++++++++++++++ .github/skills/internal-python/SKILL.md | 3 + ...est_repository_workflow_policy_contract.py | 16 +++ 10 files changed, 299 insertions(+), 19 deletions(-) create mode 100644 .github/skills/internal-python-project/references/logging-and-reporting.md create mode 100644 .github/skills/internal-python-script/references/reporting.md diff --git a/.github/skills/internal-code-review/references/anti-patterns-python.md b/.github/skills/internal-code-review/references/anti-patterns-python.md index 26f7b69..53df0c9 100644 --- a/.github/skills/internal-code-review/references/anti-patterns-python.md +++ b/.github/skills/internal-code-review/references/anti-patterns-python.md @@ -23,6 +23,7 @@ Baseline owner: `internal-python` | PY-M07 | `print()` instead of 
`logging` in application/library code | No log level control in production | | PY-M08 | Missing unit tests for new public functions | Violates test coverage mandate | | PY-M09 | Python tests outside repository-root `tests/` or without mirrored source paths | Breaks repository test discoverability and ownership mapping | +| PY-M10 | `rich`, emoji, tables, or panels outside human-facing CLI/reporting boundaries | Mixes terminal UI with importable logic or machine-readable output such as JSON | ## Minor @@ -89,5 +90,5 @@ import logging logger = logging.getLogger(__name__) def process(data: list[dict]) -> None: - logger.info("ℹ️ Processing %d items", len(data)) + logger.info("Processing %d items", len(data)) ``` diff --git a/.github/skills/internal-python-project/SKILL.md b/.github/skills/internal-python-project/SKILL.md index fa44785..509244e 100644 --- a/.github/skills/internal-python-project/SKILL.md +++ b/.github/skills/internal-python-project/SKILL.md @@ -5,6 +5,11 @@ description: Use when creating or modifying Python package or application code w # Python Project Skill +## Referenced skills + +- `internal-python-script`: route CLI adapters, direct operator execution, and rich console reporting boundaries. +- `internal-tdd`: load for bugfixes, features, or project behavior changes with a meaningful public or service seam. + ## When to use - Services, use cases, adapters, packages, and modules in Python applications. @@ -38,17 +43,23 @@ description: Use when creating or modifying Python package or application code w - Choose async only when the workload is I/O-bound and the surrounding stack supports it cleanly. - Keep request or transport models, domain logic, and persistence concerns in separate modules. - Prefer a domain/service/adapter decomposition before adding generic catch-all modules. -- Keep reusable module and service logs neutral or structured; reserve emoji log formatting for outer operator-facing entrypoints. 
+- Keep reusable module and service logs neutral, structured, or framework-native. Log events should be parsable, searchable, and useful in production. +- Design professional reporting as a boundary concern: core project code returns typed results, events, or DTOs; adapters decide whether to render JSON, HTTP responses, framework logs, metrics, or human-facing CLI reports. +- No emoji or `rich` rendering inside importable domain, service, persistence, framework modules, or machine-readable output paths such as JSON. Use `rich` only in human-facing CLI adapter reporting. +- If a project exposes a CLI adapter, keep the CLI adapter thin and route its operator-facing reporting to the script boundary. A CLI adapter may use an `ExecutionReporter`; the core project code should not know that reporter exists. Load `references/examples.md` when you need a minimal module or test example. +Load `references/logging-and-reporting.md` when project code needs a professional logging/reporting layout, structured log context, result DTOs, adapter-owned rendering, or JSON versus human-output boundaries. + ## Testing - Follow the repository pytest defaults. - BDD-like names: `given_when_then` style. - Prefer fixtures, parameterization, and mocking only when they reduce duplication or isolate real external boundaries. - Use coverage reports to close meaningful behavioral gaps, not as a blanket 100% doctrine. -- For modify tasks: edit implementation first, run existing tests, then update tests only for intentional behavior changes. +- For bugfixes, features, and intentional behavior changes, start test-first through the public API, service boundary, adapter contract, or framework-owned seam: add or update the failing test, confirm it fails for the intended reason, then implement the smallest fix. 
+- For refactors, prose-only updates, generated fixtures, or mechanical formatting with no executable behavior change, run existing focused tests and syntax validation instead of manufacturing speculative tests. ## Architecture and framework guidance @@ -70,5 +81,6 @@ Load `references/common-mistakes.md` for the full mistake table. ## Validation - `python -m compileall ` (syntax check) +- `pip install --require-hashes -r requirements.txt` (dependency integrity check, only when requirements change) - `pytest tests/` (run tests) - Lint with project's configured linter. diff --git a/.github/skills/internal-python-project/references/common-mistakes.md b/.github/skills/internal-python-project/references/common-mistakes.md index 592bdf6..a4c1eb4 100644 --- a/.github/skills/internal-python-project/references/common-mistakes.md +++ b/.github/skills/internal-python-project/references/common-mistakes.md @@ -6,8 +6,10 @@ | Mutable default arguments (`def f(items=[])`) | Shared state between calls — classic Python gotcha | Use `None` default + create inside function | | Bare `except:` or `except Exception:` | Swallows `KeyboardInterrupt`, `SystemExit` | Catch specific exceptions | | No type hints on public API | Hard to understand contracts, no static analysis | Add type hints on function signatures | +| Updating dependency requirements without refreshed hashes | Reproducible installs break or drift silently | Regenerate exact pins and hashes, then validate with `pip install --require-hashes -r requirements.txt` | | Tests that depend on execution order | Fragile test suite, non-deterministic failures | Each test must be self-contained | | Forcing async into CPU-bound or simple flows | Adds complexity without throughput benefit | Keep it synchronous unless I/O concurrency is the real bottleneck | | Mocking internal implementation details | Makes tests brittle and hides real regressions | Mock only true external boundaries | +| Using `rich`, emoji, tables, or panels outside 
human-facing CLI adapter reporting | Mixes terminal UI with project behavior or machine-readable output such as JSON | Keep project logs neutral or structured, keep data output plain, and put `rich` reporting in a CLI adapter | | Treating line coverage as the goal | Inflates test volume without improving defect detection | Target coverage around changed behavior and risky paths | | God classes with 10+ methods | Hard to test, hard to reason about | Split by responsibility into focused classes | diff --git a/.github/skills/internal-python-project/references/logging-and-reporting.md b/.github/skills/internal-python-project/references/logging-and-reporting.md new file mode 100644 index 0000000..682e833 --- /dev/null +++ b/.github/skills/internal-python-project/references/logging-and-reporting.md @@ -0,0 +1,101 @@ +# Python Project Logging And Reporting + +Use this reference when Python project code needs professional logging, reporting layout, structured log context, result DTOs, adapter-owned rendering, or a clear boundary between JSON/data output and human-facing CLI reporting. + +## Boundary + +- Project internals should expose behavior through typed results, domain events, DTOs, return values, exceptions, or framework contracts. +- Domain, service, persistence, and framework modules should use standard `logging` or the repository framework's native logging. +- Logs from importable modules should be neutral, structured when useful, and parsable in production. +- Human-facing rendering belongs to adapters: CLI, admin command, report command, or delivery script. +- Machine-readable outputs such as JSON, API responses, event payloads, or exported files must stay plain data. Do not decorate them with `rich`, emoji, color, panels, or tables. +- A CLI adapter may use the script `ExecutionReporter` pattern or `rich`, but the project core should not import or know about that reporter. 
+ +## Professional Layout + +Prefer this ownership split when the project needs both reusable behavior and operator-facing reporting: + +```text +src/{package}/ +├── domain/ # entities, value objects, domain rules; no logging UI +├── services/ # use cases; structured logging and typed results +├── adapters/ +│ ├── cli.py # optional human-facing rendering boundary +│ ├── http.py # framework/API response boundary +│ └── persistence.py +└── observability.py # logger setup helpers only when the project owns setup +``` + +Use existing repository structure first. Do not create these folders just to satisfy the shape when the current project has a clearer convention. + +## Logging Shape + +Use stable event names and explicit context. Prefer values that help production search, alerting, and diagnosis. + +```python +from __future__ import annotations + +import logging +from dataclasses import dataclass +from pathlib import Path + +logger = logging.getLogger(__name__) + + +@dataclass(frozen=True) +class ImportSummary: + imported_count: int + skipped_count: int + output_path: Path + + +def import_records(source_path: Path, output_path: Path) -> ImportSummary: + logger.info( + "records_import_started", + extra={"source_path": source_path.as_posix(), "output_path": output_path.as_posix()}, + ) + + summary = ImportSummary(imported_count=12, skipped_count=1, output_path=output_path) + + logger.info( + "records_import_completed", + extra={ + "imported_count": summary.imported_count, + "skipped_count": summary.skipped_count, + "output_path": summary.output_path.as_posix(), + }, + ) + return summary +``` + +## Adapter Rendering + +Adapters translate project results into the output contract for that boundary. 
+ +```python +def summary_to_json(summary: ImportSummary) -> dict[str, object]: + return { + "imported_count": summary.imported_count, + "skipped_count": summary.skipped_count, + "output_path": summary.output_path.as_posix(), + } + + +def render_human_summary(summary: ImportSummary, reporter: object) -> None: + reporter.summary( + status="completed", + counts={"imported": summary.imported_count, "skipped": summary.skipped_count}, + produced_files=[summary.output_path], + diagnostics=[], + ) +``` + +The JSON adapter returns plain data. The human adapter may use `ExecutionReporter` or `rich` if the CLI/reporting boundary owns that dependency. + +## Review Checklist + +- Does the core return typed results or framework-native responses instead of printing? +- Are logs searchable and useful without terminal formatting? +- Are secrets, tokens, bearer values, passwords, credentials, and sensitive payloads omitted or redacted? +- Is JSON or other machine-readable output plain data? +- If `rich` appears, is it isolated to a human-facing CLI/reporting adapter with a dependency decision note? diff --git a/.github/skills/internal-python-script/SKILL.md b/.github/skills/internal-python-script/SKILL.md index 95ddd2b..678f3a5 100644 --- a/.github/skills/internal-python-script/SKILL.md +++ b/.github/skills/internal-python-script/SKILL.md @@ -5,6 +5,11 @@ description: Use when creating or modifying standalone Python scripts, CLIs, or # Python Script Skill +## Referenced skills + +- `internal-python-project`: route away when imported package, application, service, or framework behavior becomes the primary contract. +- `internal-tdd`: load for bugfixes, features, or script behavior changes with a meaningful executable seam. + ## When to use - New standalone Python scripts. 
@@ -32,12 +37,15 @@ description: Use when creating or modifying standalone Python scripts, CLIs, or - For operator-facing script work, crossing the 400-line threshold should move toward a toolkit or project structure according to the primary contract, not an ever-growing single entrypoint. - Keep policy checks focused on maintained source; generated outputs and large fixture data are excluded unless directly edited. - Prefer `argparse`, `pathlib.Path`, and small helper functions for operator-facing tools. -- Keep emoji logs at operator-facing boundaries such as start, success, warning, and failure states; keep reusable helpers free of decorative log formatting. +- Keep operator-facing console reporting centralized in a dedicated reporter, for example `ExecutionReporter`. Application logic should call semantic reporter methods instead of constructing styled strings or scattered `print()` calls. +- Use `rich` as the preferred console rendering library for polished human-facing CLI reports when the terminal experience is part of the contract. Keep it out of `--format json`, other machine-readable outputs, and reusable helper logic. +- Keep emoji, panels, tables, and color at human-facing boundaries such as banners, sections, success, warning, error, and summaries. Keep reusable helpers and machine-readable output paths free of decorative log formatting. +- Load `references/reporting.md` when a script needs professional console reporting, `rich` rendering, an `ExecutionReporter` shape, redaction rules, or verbose/debug output boundaries. - When a tool can be called from subdirectories, resolve the repository root explicitly instead of assuming the current working directory. - Use type hints on non-trivial public helpers and CLI-facing boundaries. - Use `asyncio` only when the script truly coordinates multiple I/O-bound tasks. - Reach for `pathlib`, context managers, and small helper functions before adding framework-like structure to a script. 
-- Add machine-readable output such as `--format json` only when the tool has a real automation consumer. Keep text output as the default operator path. +- Add machine-readable output such as `--format json` only when the tool has a real automation consumer. Keep text output as the default operator path, and do not decorate machine-readable output with `rich`, emoji, color, or tables. - When machine-readable output can become large and the script is agent-facing, add a bounded mode such as `--format compact` that preserves status, blocker or finding counts, key path evidence, and next action without dumping full detail. - Keep full `--format json` available for durable audit/debug use; do not replace it with compact mode. @@ -64,12 +72,15 @@ Dependency decision note - Keep the note short and task-specific. - Compare the standard library with realistic third-party candidates. - If the final choice uses external libraries, create or update the local `requirements.txt` before finishing the task. +- Keep exact pins and current hashes in `requirements.txt`. Use `pip-compile --generate-hashes` or an equivalent repository-approved workflow, then validate with `pip install --require-hashes -r requirements.txt` when the requirements file changes. - If several entrypoints share the same lock file, record the decision once at the shared toolkit `requirements.txt` rather than repeating it in every script. ## Layout and templates Load `references/layout-and-templates.md` when you need the default folder layout, a repo-aligned multi-tool toolkit layout, a minimal entry point, a hash-locked `requirements.txt`, or the launcher pattern. +Load `references/reporting.md` when the script needs a richer `ExecutionReporter`, `rich` console rendering, status tables, redaction behavior, or a final operator summary. + Keep these rules visible while drafting: - Use a dedicated tool folder or toolkit root rather than a loose top-level `.py` file. 
@@ -82,7 +93,8 @@ Keep these rules visible while drafting: - Follow the repository pytest defaults. - Use coverage reports to inspect missing behavior on touched code, not to force blanket 100% coverage. -- For modify tasks: edit implementation first, run existing tests, then update tests only for intentional behavior changes. +- For bugfixes, features, and intentional behavior changes, start test-first through the public CLI or stable helper seam: add or update the failing test, confirm it fails for the intended reason, then implement the smallest fix. +- For refactors, prose-only updates, generated fixtures, or mechanical formatting with no executable behavior change, run the existing focused tests plus `py_compile` or `compileall` instead of manufacturing speculative tests. - Prefer existing repository commands such as `make lint`, `make test`, or a shared script runner before inventing a one-off validation path. ## Runtime guidance @@ -99,5 +111,6 @@ Load `references/common-mistakes.md` for the full mistake table. 
- `python -m py_compile .py` (syntax check) - `bash -n run.sh` (launcher syntax check, only when `run.sh` exists) +- `pip install --require-hashes -r requirements.txt` (dependency integrity check, only when requirements change) - `pytest tests/` (run tests) - `python -m compileall ` or the repository's canonical shared runner when the tool already lives inside a maintained toolkit diff --git a/.github/skills/internal-python-script/references/common-mistakes.md b/.github/skills/internal-python-script/references/common-mistakes.md index d1f551c..e104cfd 100644 --- a/.github/skills/internal-python-script/references/common-mistakes.md +++ b/.github/skills/internal-python-script/references/common-mistakes.md @@ -9,6 +9,7 @@ | No argument parsing | Caller has to modify script source to change behavior | Use `argparse` for any configurable parameter | | Installing deps globally or without hash-locked version pinning | Non-reproducible environment and hidden setup drift | Keep dependencies in the local `requirements.txt` with exact pins and hashes | | Adding an empty `requirements.txt` to a stdlib-only tool | Adds noise and implies missing setup steps | Omit `requirements.txt` when the script uses only the standard library | +| Updating `requirements.txt` without refreshed hashes | Breaks reproducible installs and hides dependency drift | Regenerate exact pins and hashes, then validate with `pip install --require-hashes -r requirements.txt` | | Wrapping a stdlib-only script in Bash | Adds setup indirection without solving a real dependency problem | Document direct `python3