Merged
52 commits
d477060
docs: dogfood Spec Kit — bundler SDD artifacts + constitution
mnriem Jun 19, 2026
e108546
feat(bundler): add `specify bundle` subcommand for role-based setups
mnriem Jun 19, 2026
0993867
docs(contributing): document running the full test suite via project …
mnriem Jun 19, 2026
68f7d95
feat(bundler): wire real in-process primitive install + local-artifac…
mnriem Jun 19, 2026
dfe0b2a
docs(bundler): append Phase 8 convergence tasks from converge assessment
mnriem Jun 19, 2026
8240184
Implement Phase 8 convergence tasks (T046–T052)
mnriem Jun 19, 2026
291e9a4
converge: append Phase 9 (T053) — surface bundle trust indicator
mnriem Jun 19, 2026
0c31fd5
Implement T053 — surface bundle trust indicator in discovery
mnriem Jun 19, 2026
3fd1e54
docs: dogfood Spec Kit — bundler SDD artifacts + constitution
mnriem Jun 19, 2026
e9dc1c9
feat(bundler): add `specify bundle` subcommand for role-based setups
mnriem Jun 19, 2026
aabfd5a
docs(contributing): document running the full test suite via project …
mnriem Jun 19, 2026
b595eca
feat(bundler): wire real in-process primitive install + local-artifac…
mnriem Jun 19, 2026
beec873
docs(bundler): append Phase 8 convergence tasks from converge assessment
mnriem Jun 19, 2026
74a3dde
Implement Phase 8 convergence tasks (T046–T052)
mnriem Jun 19, 2026
de1c8ce
converge: append Phase 9 (T053) — surface bundle trust indicator
mnriem Jun 19, 2026
d55439e
Implement T053 — surface bundle trust indicator in discovery
mnriem Jun 19, 2026
d074d0e
Merge commit '0c31fd5beb87e7992c8a8bbd1438be382f9ef145' into mnriem/f…
mnriem Jun 19, 2026
b762e4b
fix(bundler): address PR review — annotations, Windows paths, HTTPS, …
mnriem Jun 19, 2026
9f5a542
fix(bundler): address PR review round 2 — nested output dir + file://…
mnriem Jun 19, 2026
1d85128
fix(bundler): address PR review round 3 — discovery UX + hardening
mnriem Jun 19, 2026
1155404
fix(bundler): address PR review round 4 + markdownlint exclusions
mnriem Jun 19, 2026
90f5dca
fix(bundler): address PR review round 5 + ignore generated files in w…
mnriem Jun 19, 2026
9ad7e9c
fix(bundler): collision-resistant catalog ids, canonical local paths,…
mnriem Jun 19, 2026
b180ec8
fix(bundler): confine config writes, guard indeterminate integration,…
mnriem Jun 19, 2026
9ecdadb
fix(bundler): Windows path handling + review round 8 hardening
mnriem Jun 19, 2026
ab6c81a
fix(bundler): normalize SemVer prerelease spellings before version pa…
mnriem Jun 19, 2026
a822001
fix(bundler): no collateral removal + enforce manifest-pinned versions
mnriem Jun 19, 2026
7816773
feat(scripts): add SPECIFY_INIT_DIR to target a member project from t…
PascalThuet Jun 19, 2026
9598e2f
chore: sync dogfooded .specify core scripts with SPECIFY_INIT_DIR
mnriem Jun 19, 2026
841d69b
fix(bundler): harden remote catalog fetch and config parsing
mnriem Jun 19, 2026
71f18d2
fix(bundler): tighten record read confinement, policy gate, and prece…
mnriem Jun 19, 2026
c3fc064
fix(bundler): no collateral refresh, catalog id integrity, loud info
mnriem Jun 19, 2026
beb978b
fix(bundler): confine catalog-config and integration-marker reads
mnriem Jun 19, 2026
1bf4e9a
fix(bundler): validate manifest tags, disambiguate derived ids by ful…
mnriem Jun 19, 2026
c970842
chore(gitattributes): retain whitespace exemption for constitution.md
mnriem Jun 19, 2026
1c766f2
fix(bundler): accurate uninstall count, confine catalog read, safe bu…
mnriem Jun 19, 2026
bee08e9
fix(bundler): validate requires/provides shapes in manifest and catalog
mnriem Jun 19, 2026
0e3ed20
fix(bundler): require README.md when building a bundle artifact
mnriem Jun 19, 2026
0fe457f
fix(bundler): validate record shapes; drop stale install --refresh claim
mnriem Jun 19, 2026
49dfade
Merge remote-tracking branch 'upstream/main' into mnriem/feat-bundler…
mnriem Jun 19, 2026
b14533b
fix(bundler): refuse symlinked .specify, reject bad url schemes, IPv6…
mnriem Jun 19, 2026
654e1da
fix(bundler): strict semver, narrow artifact skip, preserve priority 0
mnriem Jun 19, 2026
70b9292
fix(bundler): artifact regex for prerelease+build; clarify integratio…
mnriem Jun 19, 2026
3c44029
fix(bundler): --integration can't bypass clash guard; honest rollback…
mnriem Jun 19, 2026
139ef20
fix(bundler): validate component kind/id when loading records
mnriem Jun 19, 2026
54b37b2
fix(bundler): address review 4535234003 (7 findings)
mnriem Jun 19, 2026
b53f7e7
test(bundler): make update --integration help assertion ANSI-safe
mnriem Jun 19, 2026
9f86e3c
fix(bundler): preserve exec bits in artifacts; document install-time …
mnriem Jun 19, 2026
0eabe7a
test(bundler): skip exec-bit packager test on Windows
mnriem Jun 19, 2026
2b05e8e
fix(bundler): normalize prerelease spellings inside version constraints
mnriem Jun 19, 2026
02c3489
fix(bundler): validate schema versions and required record identity f…
mnriem Jun 19, 2026
8ad04a7
chore(bundler): scrub generated dogfooding scaffold before merge
mnriem Jun 19, 2026
6 changes: 5 additions & 1 deletion .gitattributes
@@ -1,3 +1,7 @@
* text=auto eol=lf

.github/workflows/*.lock.yml linguist-generated=true merge=ours -whitespace
.github/workflows/*.lock.yml linguist-generated=true merge=ours -whitespace
# The project constitution is the one dogfooding artifact carried forward.
# Keep it exempt from git's whitespace checks (git diff --check / CI) since its
# generated formatting is not hand-edited.
.specify/memory/constitution.md -whitespace
214 changes: 214 additions & 0 deletions .specify/memory/constitution.md
@@ -0,0 +1,214 @@
<!--
SYNC IMPACT REPORT
==================
Version change: (template/unratified) → 1.0.0
Bump rationale: Initial ratification of a concrete constitution for the brownfield
Spec Kit / specify-cli codebase, derived from an exhaustive multi-pass analysis of
the source tree, test suite, CI pipelines, and project conventions (AGENTS.md,
CONTRIBUTING.md, DEVELOPMENT.md). MAJOR baseline because it establishes binding
governance where none previously existed.

Principles defined:
I. Code Quality & Architectural Discipline
II. Test-Backed Change (NON-NEGOTIABLE)
III. CLI & User-Experience Consistency
IV. Offline-First Performance & Resource Discipline
V. Minimal Dependencies & Safe, Idempotent File Operations

Added sections:
- Security & Cross-Platform Constraints
- Development Workflow & Quality Gates
- Governance

Templates reviewed for alignment:
✅ .specify/templates/plan-template.md — generic "Constitution Check" gate (line 39)
remains valid; gates are now concretely populated by Principles I–V at plan time.
✅ .specify/templates/spec-template.md — no constitution-specific tokens; no change needed.
✅ .specify/templates/tasks-template.md — task categories (setup/foundational/story/polish)
already accommodate testing + performance + UX tasks mandated here; no change needed.
✅ .github/agents/speckit.*.agent.md — command guidance is agent-agnostic; no change needed.

Follow-up TODOs: none. RATIFICATION_DATE set to first adoption date below.
-->

# Spec Kit Constitution

Spec Kit (the `specify-cli` package and its bundled assets) is a local, offline-capable
developer CLI that bootstraps and operates Spec-Driven Development workflows for AI coding
agents. These principles are derived from the patterns the codebase already enforces. They
are binding on all changes — including the `specify bundle` subcommand and any future
command group, integration, extension, preset, or workflow.

## Core Principles

### I. Code Quality & Architectural Discipline

The codebase follows a strict, registry-driven, layered architecture, and all changes MUST
preserve it.

- **Separate the CLI surface from importable logic.** User-facing commands live in Typer
sub-apps (e.g. `commands/`, `*/_commands.py`); business logic lives in plain, importable
modules with no `@app.command()` decorators. New features MUST keep orchestration logic
testable independently of Typer.
- **Use the established extension pattern.** New agents/integrations MUST subclass one of the
standard base classes (`MarkdownIntegration`, `TomlIntegration`, `YamlIntegration`,
`SkillsIntegration`) and declare the required class attributes (`key`, `config`,
`registrar_config`, and `context_file` where applicable). Extending `IntegrationBase`
directly is permitted only when no base class fits, and the deviation MUST be justified.
- **Honor the single source of truth.** Built-ins are wired through the relevant registry
(e.g. `INTEGRATION_REGISTRY` via `_register_builtins()`), with imports and registrations
kept in alphabetical order. Duplicate keys MUST fail loudly rather than silently override.
- **Naming and typing are not optional.** Private modules/functions are `_`-prefixed and MUST
NOT be imported across package boundaries. Every new module begins with
`from __future__ import annotations` and uses modern type syntax (`dict[str, Any]`,
`str | None`); legacy `Dict`/`List`/`Optional` forms are rejected.
- **Package directories use underscores; keys keep their canonical (often hyphenated) form**
(e.g. package `kiro_cli/`, `key = "kiro-cli"`). For CLI-backed integrations the `key` MUST
match the executable name so `shutil.which(key)` resolves.

**Rationale:** A registry-plus-base-class architecture is what lets dozens of integrations,
extensions, and workflows coexist with minimal coupling. Drift here multiplies maintenance
cost and breaks the "add one subclass, register once, ship a test" contract.
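
A minimal sketch of the "fail loudly on duplicate keys" rule and the key-to-package naming convention described above. `REGISTRY`, `register`, `DuplicateKeyError`, and `package_dir_for` are hypothetical names used for illustration; they are not the project's actual `INTEGRATION_REGISTRY` machinery.

```python
class DuplicateKeyError(KeyError):
    """Raised when two integrations try to claim the same registry key."""


# Hypothetical stand-in for a real registry; illustrative only.
REGISTRY: dict[str, type] = {}


def register(cls: type) -> type:
    """Add ``cls`` to REGISTRY under ``cls.key``, refusing duplicates."""
    key = getattr(cls, "key", None)
    if not isinstance(key, str) or not key:
        raise ValueError(f"{cls.__name__} must declare a non-empty string 'key'")
    if key in REGISTRY:
        # Fail loudly instead of silently replacing the earlier registration.
        raise DuplicateKeyError(
            f"integration key {key!r} already registered by {REGISTRY[key].__name__}"
        )
    REGISTRY[key] = cls
    return cls


def package_dir_for(key: str) -> str:
    """Map a canonical key such as 'kiro-cli' to its package directory name."""
    return key.replace("-", "_")


class KiroCli:
    # The key keeps its hyphenated form; only the package directory changes.
    key = "kiro-cli"


register(KiroCli)
```

Registering `KiroCli` a second time raises `DuplicateKeyError`, and `package_dir_for("kiro-cli")` gives `kiro_cli`.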

### II. Test-Backed Change (NON-NEGOTIABLE)

Every behavioral change MUST be accompanied by automated tests, and the suite is a hard gate.

- **Tests gate merges.** CI runs `pytest` across a matrix of ubuntu + windows × Python 3.11,
3.12, and 3.13. Changes MUST pass on every cell of that matrix.
- **Parity invariants MUST hold.** Every integration MUST be present in the registry, have a
`CommandRegistrar` config entry where required, and ship a dedicated
`tests/integrations/test_integration_<key>.py` (hyphens in the key become underscores in the
filename). These are enforced by parametrized tests (e.g. `test_registry.py`) and MUST NOT
be weakened.
- **Follow pytest conventions.** Test modules/classes/functions use the `test_*` / `Test*`
naming the project configures, run under `--strict-markers`, and isolate state with
`tmp_path`, `monkeypatch`, and the autouse auth-isolation fixture. Platform-specific tests
MUST be guarded (e.g. `@requires_bash`) rather than left to fail.
- **Security and idempotency tests are mandatory categories.** Path-traversal rejection,
manifest hash integrity/symlink safety, and no-overwrite idempotency are covered by existing
suites; changes touching file writes, path handling, or setup scripts MUST extend (never
reduce) that coverage.
- **Network is mocked.** No test may make a real outbound network call; HTTP MUST be stubbed
so the suite is deterministic and offline-runnable.

**Rationale:** The breadth of supported agents and the offline/air-gapped guarantees can only
be sustained by exhaustive, parametrized tests. The parity and security suites are what stop a
single new integration from regressing the whole matrix.
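
The filename rule in the parity bullet above (hyphens in the key become underscores) can be sketched as follows. The helper names are hypothetical, and the check works on a plain set of path strings rather than on the real test tree.

```python
from pathlib import PurePosixPath


def integration_test_path(key: str) -> PurePosixPath:
    """Return the expected test module path for an integration key."""
    module_name = f"test_integration_{key.replace('-', '_')}.py"
    return PurePosixPath("tests/integrations") / module_name


def missing_test_files(keys: list[str], existing: set[str]) -> list[str]:
    """List keys whose dedicated test module is absent from ``existing``."""
    return [k for k in keys if str(integration_test_path(k)) not in existing]
```

A parametrized parity test could presumably assert that `missing_test_files(...)` is empty for the keys in the registry; how the real suite does this is not shown here.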

### III. CLI & User-Experience Consistency

The CLI presents one coherent surface; every command group MUST feel like the others.

- **Reuse the shared verb vocabulary.** Consumer-facing groups use the established verbs —
`list`, `add`/`install`, `remove`, `search`, `info`, `update`, plus `enable`/`disable` and
`set-priority` where relevant. New verbs MUST NOT be invented when an existing one fits, and
any genuinely new verb MUST be justified.
- **Mirror the catalog-stack model.** Catalog-backed groups MUST expose
`<group> catalog list|add|remove`, back it with a priority-ordered source stack (lower number
= higher precedence) plus per-source install policy (`install-allowed` vs `discovery-only`),
and fall back to a built-in default stack when no project config is present.
- **Register sub-apps the standard way.** Command groups are `typer.Typer(...)` instances
attached via `app.add_typer(child, name="...")`, preferably through a modular
`register(app)` function imported in `__init__.py`. Nesting MUST stay within ~2–3 levels.
- **Output is consistent and machine-friendly.** Human output uses the shared Rich
conventions (e.g. `[green]✓[/green]` success, `[red]Error:[/red]` + non-zero exit on
failure, actionable remediation in messages). Where a `--json` flag is offered, valid JSON
goes to stdout and all other logging is redirected to stderr.
- **Interactions are safe and idempotent.** Destructive actions show what will change before
confirming; "already installed / already present" outcomes succeed (exit 0) rather than
error. User-facing command groups MUST be documented under `docs/reference/`.

**Rationale:** Predictability is the product. Users learn one set of verbs, one catalog model,
and one output grammar, then apply them to every group — including `specify bundle`.

### IV. Offline-First Performance & Resource Discipline

Spec Kit is a local CLI; responsiveness, offline operability, and graceful degradation are the
performance contract.

- **`specify init` and core scaffolding MUST work fully offline** using bundled `core_pack`
assets. Asset resolution MUST prefer bundled assets, then a source checkout, before ever
reaching the network.
- **Network use is lazy, bounded, and degradable.** Network calls happen only on explicit
user commands, MUST set timeouts, MUST cache catalog results (1-hour TTL) and fall back to
stale cache on failure, and MUST surface offline/rate-limit conditions as clear messages
without crashing.
- **Keep startup cheap.** Avoid adding heavyweight work to import time. New optional
subsystems SHOULD prefer lazy loading over unconditional eager imports so that unrelated
commands (including `--help`) stay fast.
- **Filesystem writes are minimal and idempotent.** Installs MUST track files (SHA-256
manifests), avoid clobbering user-modified content, only uninstall files whose hash still
matches, and never follow symlinks out of the project root.
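
A minimal sketch of the "cache for one hour, fall back to stale on failure" behavior from the network bullet above. `CatalogCache` is a hypothetical in-memory class; the injected `fetch` callable stands in for a timeout-bounded HTTP request, and treating `OSError` as the failure signal is an assumption.

```python
import time
from typing import Callable

TTL_SECONDS = 3600  # the 1-hour TTL named above


class CatalogCache:
    """In-memory sketch; a real cache would presumably persist to disk."""

    def __init__(
        self,
        fetch: Callable[[], dict],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch
        self._clock = clock
        self._value: dict | None = None
        self._stored_at = 0.0

    def get(self) -> tuple[dict, str]:
        """Return (catalog, status); status is 'fresh', 'cached', or 'stale'."""
        now = self._clock()
        if self._value is not None and now - self._stored_at < TTL_SECONDS:
            return self._value, "cached"
        try:
            value = self._fetch()  # the fetcher is expected to set its own timeout
        except OSError:
            if self._value is None:
                raise  # nothing to fall back to; the caller reports the outage
            return self._value, "stale"
        self._value, self._stored_at = value, now
        return value, "fresh"
```

Returning the status alongside the data lets the command layer print a clear "using cached catalog" message instead of crashing when the network is down.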

**Rationale:** Developers run this tool in air-gapped, enterprise, and flaky-network
environments. Offline-first behavior and idempotent, hash-tracked file operations are what
make it safe and fast to run repeatedly.
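
The hash-tracked uninstall rule (remove only files whose hash still matches) might look like the sketch below. The manifest shape, a mapping from project-relative path to the SHA-256 recorded at install time, is an assumption for illustration, as are the function names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def uninstall(root: Path, manifest: dict[str, str]) -> tuple[list[str], list[str]]:
    """Remove tracked files whose content is unchanged; skip everything else.

    Returns (removed, skipped) as lists of project-relative paths.
    """
    removed: list[str] = []
    skipped: list[str] = []
    for rel, recorded in sorted(manifest.items()):
        target = root / rel
        if target.is_symlink() or not target.is_file():
            skipped.append(rel)  # symlinks and missing files are left alone
            continue
        if sha256_of(target) == recorded:
            target.unlink()
            removed.append(rel)
        else:
            skipped.append(rel)  # user-modified content is preserved
    return removed, skipped
```

A file edited after install no longer matches its recorded digest, so it lands in `skipped` and stays on disk.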

### V. Minimal Dependencies & Safe, Idempotent File Operations

The project guards its dependency surface and its on-disk footprint deliberately.

- **Zero new runtime dependencies by default.** The runtime dependency set is intentionally
small and pinned to a minimum major version. Adding a dependency requires maintainer
agreement and a justification that existing deps (typer, click, rich, pyyaml, packaging,
platformdirs, pathspec, json5, readchar) cannot serve the need. New subsystems SHOULD reuse
existing primitive machinery in-process rather than re-implementing or re-shipping it.
- **All paths are validated.** Any project-relative path derived from user/manifest/catalog
input MUST be confined to the project root (`Path.relative_to` checks) and reject traversal
payloads; symlink escapes MUST be refused.
- **Errors are explicit and chained.** Validate inputs up front, raise with actionable context
(offending field/value plus a hint), and use `raise ... from exc` to preserve causes. I/O
that can legitimately fail MUST degrade gracefully rather than emit a raw traceback.
- **Versioning follows SemVer.** User-visible and packaged behavior changes follow
MAJOR.MINOR.PATCH semantics; backward-incompatible changes MUST be called out and justified.

**Rationale:** A lean, pinned dependency set and hardened, idempotent file handling are what
keep the tool trustworthy in enterprise and air-gapped contexts and cheap to maintain.
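
A short sketch combining two of the rules above: confining a path to the project root with a `Path.relative_to` check, and chaining the cause with `raise ... from exc`. `confine` and `PathConfinementError` are hypothetical names, not the project's helpers.

```python
from pathlib import Path


class PathConfinementError(ValueError):
    """Raised when a path would escape the project root."""


def confine(root: Path, candidate: str) -> Path:
    """Resolve ``candidate`` under ``root`` and refuse anything that escapes it."""
    resolved_root = root.resolve()
    # resolve() collapses ".." and follows symlinks, so escapes show up here.
    resolved = (resolved_root / candidate).resolve()
    try:
        resolved.relative_to(resolved_root)
    except ValueError as exc:
        raise PathConfinementError(
            f"path {candidate!r} escapes project root {resolved_root}; "
            "use a project-relative path without '..'"
        ) from exc
    return resolved
```

Because the check runs on the resolved path, a traversal payload such as `sub/../../outside.txt` and an absolute path are both rejected, and the original `ValueError` stays available as `__cause__`.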

## Security & Cross-Platform Constraints

- **Cross-platform parity is required.** Code MUST run on Linux, macOS, and Windows and on
Python 3.11–3.13. Windows specifics (UTF-8 stream reconfiguration, bash-dependent tests
auto-skipping) MUST be respected; do not introduce POSIX-only assumptions without a guarded
fallback.
- **Security tooling is a gate.** CodeQL and the project's security test suites
(path-traversal, manifest/symlink hardening) MUST remain green. Network access MUST default
to off in tests and be opt-in, timeout-bounded, and credential-isolated at runtime.
- **Formatting is enforced.** `.editorconfig` rules (LF endings, final newline, no trailing
whitespace, 4-space Python / 2-space YAML-JSON-Markdown), `ruff check src/`, and
`markdownlint-cli2` MUST pass.

## Development Workflow & Quality Gates

- **Branch naming** follows `<type>/<number>-<short-slug>` (or `<type>/<short-slug>` with no
issue), with `<type>` ∈ {feat, fix, docs, community, chore}.
- **PRs are focused** and MUST: pass `ruff`, `pytest` (full matrix), markdown lint, and CodeQL;
add/extend tests for new behavior; update user-facing docs (`README.md`, `docs/`,
`spec-driven.md`) when behavior changes; and disclose any AI assistance used.
- **Slash-command-affecting changes** MUST be manually exercised through a coding agent and the
results reported in the PR, per CONTRIBUTING.md.
- **Large or cross-cutting changes** (new templates, arguments, command groups) MUST be agreed
with maintainers before implementation.

## Governance

This constitution supersedes ad-hoc conventions where the two conflict; the existing codebase
patterns it codifies remain authoritative references.

- **Authority.** Principles I–V are binding gates. The `## Constitution Check` section of the
plan template MUST be evaluated against these principles, and `/speckit.analyze` treats
conflicts with a MUST as CRITICAL. Violations are resolved by changing the spec, plan, or
tasks — not by diluting a principle.
- **Amendments.** Changes to this document require a PR with rationale, maintainer approval,
and a version bump per the policy below. Any amendment MUST propagate to dependent templates
and command guidance in the same change, recorded in the Sync Impact Report at the top of
this file.
- **Versioning policy (SemVer for governance).** MAJOR = backward-incompatible governance or
principle removal/redefinition; MINOR = a new principle/section or materially expanded
guidance; PATCH = clarifications and non-semantic refinements.
- **Compliance review.** Every PR and review MUST verify compliance with these principles.
Added complexity or any deviation MUST be justified in-PR (and, for plans, in the plan's
Complexity Tracking section). Unjustified violations block merge.

**Version**: 1.0.0 | **Ratified**: 2026-06-19 | **Last Amended**: 2026-06-19
12 changes: 10 additions & 2 deletions AGENTS.md
@@ -14,7 +14,7 @@ The toolkit supports multiple AI coding assistants, allowing teams to use their

Each AI agent is a self-contained **integration subpackage** under `src/specify_cli/integrations/<key>/`. The subpackage exposes a single class that declares all metadata and inherits setup/teardown logic from a base class. Built-in integrations are then instantiated and added to the global `INTEGRATION_REGISTRY` by `src/specify_cli/integrations/__init__.py` via `_register_builtins()`.

```text
src/specify_cli/integrations/
├── __init__.py # INTEGRATION_REGISTRY + _register_builtins()
├── base.py # IntegrationBase, MarkdownIntegration, TomlIntegration, YamlIntegration, SkillsIntegration
@@ -340,18 +340,21 @@ Some agents require custom processing beyond the standard template transformatio
### Copilot Integration

GitHub Copilot has unique requirements:

- Commands use `.agent.md` extension (not `.md`)
- Each command gets a companion `.prompt.md` file in `.github/prompts/`
- Installs `.vscode/settings.json` with prompt file recommendations
- Context file lives at `.github/copilot-instructions.md`

Implementation: Extends `IntegrationBase` with a custom `setup()` method that:

1. Processes templates with `process_template()`
2. Generates companion `.prompt.md` files
3. Merges VS Code settings

**Skills mode (`--skills`):** Copilot also supports an alternative skills-based layout
via `--integration-options="--skills"`. When enabled:

- Commands are scaffolded as `speckit-<name>/SKILL.md` under `.github/skills/`
- No companion `.prompt.md` files are generated
- No `.vscode/settings.json` merge
@@ -371,11 +374,13 @@ specify init my-project --integration copilot --integration-options="--skills"
### Forge Integration

Forge has special frontmatter and argument requirements:

- Uses `{{parameters}}` instead of `$ARGUMENTS`
- Strips `handoffs` frontmatter key (Forge-specific collaboration feature)
- Injects `name` field into frontmatter when missing

Implementation: Extends `MarkdownIntegration` with a custom `setup()` method that:

1. Inherits standard template processing from `MarkdownIntegration`
2. Adds extra `$ARGUMENTS` → `{{parameters}}` replacement after template processing
3. Applies Forge-specific transformations via `_apply_forge_transformations()`
@@ -385,11 +390,13 @@ Implementation: Extends `MarkdownIntegration` with custom `setup()` method that:
### Goose Integration

Goose is a YAML-format agent using Block's recipe system:

- Uses `.goose/recipes/` directory for YAML recipe files
- Uses `{{args}}` argument placeholder
- Produces YAML with `prompt: |` block scalar for command content

Implementation: Extends `YamlIntegration` (parallel to `TomlIntegration`):

1. Processes templates through the standard placeholder pipeline
2. Extracts title and description from frontmatter
3. Renders output as Goose recipe YAML (version, title, description, author, extensions, activities, prompt)
@@ -400,7 +407,7 @@ Implementation: Extends `YamlIntegration` (parallel to `TomlIntegration`):

Branches follow one of two patterns depending on whether an issue exists:

```text
<type>/<number>-<short-slug> # when an issue is created first
<type>/<short-slug> # when no issue exists (PR-only changes)
```
@@ -463,6 +470,7 @@ Disclosure is **continuous**, not a one-time event. A single AI-disclosure parag
3. **Incorrect `requires_cli` value**: Set to `True` only for agents that have a CLI tool; set to `False` for IDE-based agents.
4. **Wrong argument format**: Use `$ARGUMENTS` for Markdown agents, `{{args}}` for TOML agents.
5. **Skipping registration**: The import and `_register()` call in `_register_builtins()` must both be added.
6. **Running tests against the wrong environment**: Always run the suite inside this working tree's own virtualenv (`uv sync --extra test` then `.venv/bin/python -m pytest`, or activate the venv first). A bare `uv run pytest` can resolve to an ambient/global interpreter whose editable `.pth` points at a *different* worktree. The failure is sneaky: test collection still imports `specify_cli` successfully, but newly-added subpackages (e.g. a fresh `specify_cli/bundler/`) resolve as a stale namespace package and raise `ModuleNotFoundError`. If a brand-new subpackage imports under `python -c` but not under pytest, suspect environment contamination, not your code.
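
To confirm which checkout an interpreter actually imports from, a small diagnostic along these lines can help. It is a sketch using only the standard library; `module_location` and `resolves_inside` are hypothetical helper names, not part of the project.

```python
import importlib.util
from pathlib import Path


def module_location(name: str) -> list[Path]:
    """Return the directories (or file) a top-level module resolves to."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return []
    if spec.submodule_search_locations:  # packages, including namespace packages
        return [Path(p).resolve() for p in spec.submodule_search_locations]
    return [Path(spec.origin).resolve()] if spec.origin else []


def resolves_inside(name: str, worktree: Path) -> bool:
    """True when every location of ``name`` lives under ``worktree``."""
    locations = module_location(name)
    root = worktree.resolve()
    return bool(locations) and all(loc.is_relative_to(root) for loc in locations)
```

Run with the same interpreter that pytest uses, `resolves_inside("specify_cli", Path.cwd())` returning `False` from the repository root would point to the environment contamination described in item 6 rather than to a bug in the new subpackage.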

---

18 changes: 18 additions & 0 deletions CONTRIBUTING.md
@@ -95,6 +95,24 @@ uv run python -m pytest tests/test_agent_config_consistency.py -q

Run this when you change agent metadata, context update scripts, or integration wiring.

#### Running the full test suite

Install the test dependencies into the project's own virtual environment and run
`pytest` through that interpreter:

```bash
uv pip install -e ".[test]"
.venv/bin/python -m pytest tests -q # Windows: .venv\Scripts\python -m pytest tests -q
```

> **Note:** prefer `.venv/bin/python -m pytest` over a bare `uv run pytest`.
> If another Spec Kit checkout has an editable (`-e`) install registered in a
> shared/global environment, `uv run pytest` can resolve `specify_cli` to that
> *other* worktree, turning it into a partial namespace package that fails to
> import newly added subpackages. Running through the project `.venv` resolves
> `specify_cli` to this checkout's `src/`. This matches the gotcha documented in
> `AGENTS.md` (Common Pitfalls).

### Manual testing

#### Testing setup