Skip to content

feat(doctor): scan definitions for cleartext sensitive global args (swamp-club#483)#1476

Merged
stack72 merged 1 commit into
mainfrom
worktree-483
May 29, 2026
Merged

feat(doctor): scan definitions for cleartext sensitive global args (swamp-club#483)#1476
stack72 merged 1 commit into
mainfrom
worktree-483

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented May 29, 2026

Summary

Adds swamp doctor secrets, a read-only diagnostic that reports model definitions whose sensitive: true global arguments hold a cleartext literal value instead of a vault.get(...) expression, with value-free vault remediation guidance, and a non-zero exit when any are found (CI-gateable).

Follow-up to the at-rest write guard (swamp-club#480 / #1469). That guard only protects new writes at the YamlDefinitionRepository.save() chokepoint — it leaves two gaps a write-time guard cannot close:

  1. Legacy definitions authored before the guard still hold the secret in cleartext until re-saved.
  2. Datastore sync / migration copies definition YAML byte-for-byte, so a cleartext literal authored elsewhere lands on disk without passing through save().

This PR is the detection half: it scans what is already on disk and points the user at the fix.

What it does

  • Reuses the existing findLiteralSensitiveGlobalArgs domain rule (from fix: reject path traversal in definition names #480), so what the scan surfaces is exactly what a re-save would now refuse — no parallel detection logic.
  • Scans both the source-of-truth models/ tree (where datastore-synced definitions land after a pull) and .swamp/auto-definitions.
  • Read-only and value-free: never calls save(), never echoes the secret value in any output mode or the JSON payload.
  • Best-effort residual: definitions whose extension type cannot be resolved are reported as advisory "could not be assessed" warnings, not silently skipped.
  • Both log and --json output modes (json shape: { overallStatus, scanned, findings, unresolved }).
  • Scoped to model definitions. Workflow-definition global args and a sync-time guard are intentionally out of scope.

Files

  • src/domain/models/sensitive_field_extractor.tsbuildSensitiveArgRemediations (value-free, structured sibling of the guard's message builder)
  • src/libswamp/models/doctor_secrets.ts — the scan service + createDoctorSecretsDeps
  • src/presentation/renderers/doctor_secrets.ts — log + json renderers
  • src/cli/commands/doctor_secrets.ts + doctor.ts + mod.ts — command wiring/exports
  • design/doctor-secrets.md — design doc

Test Plan

  • Unit tests: domain remediation builder (incl. value-free assertions), service (leak / clean / vault-expr / unresolved-type / multi-def), renderer (log + json, pass/fail, value-free), CLI wiring.
  • Integration test (integration/doctor_secrets_test.ts): real YamlDefinitionRepository + real registry, planting a cleartext leak on disk in both the models/ and .swamp/auto-definitions trees.
  • End-to-end through the compiled binary in a scratch repo: clean scan (exit 0) and a planted-leak scan (exit 1) in both log and --json, confirming the secret never appears in output.
  • deno check, deno lint, deno fmt, full deno run test (6491 passed) all green.

Follow-ups

🤖 Generated with Claude Code

…wamp-club#483)

Adds `swamp doctor secrets`, a read-only diagnostic that reports model
definitions whose `sensitive: true` global arguments hold a cleartext
literal value instead of a `vault.get(...)` expression, with value-free
vault remediation guidance and a non-zero exit when any are found.

Follow-up to the at-rest write guard (swamp-club#480 / #1469): that guard
only protects new writes, leaving legacy definitions and definitions synced
byte-for-byte through a datastore unremediated. This is the detection half.

- Reuses the existing `findLiteralSensitiveGlobalArgs` domain rule, so what
  the scan surfaces is exactly what a re-save would now refuse.
- Scans both the source-of-truth `models/` tree (where datastore-synced
  definitions land after a pull) and `.swamp/auto-definitions`.
- Read-only and value-free: never calls save(), never echoes the secret.
- Best-effort residual: definitions whose extension type cannot be resolved
  are reported as advisory warnings, not silently skipped.
- Scoped to model definitions.

Verified end-to-end through the compiled binary (clean + planted-leak cases,
both log and json) and with an on-disk integration test covering the
models/ and auto-definitions trees.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. Log-mode scanning message is a bit verbose (src/presentation/renderers/doctor_secrets.ts): "Scanning definitions for cleartext sensitive global arguments…" is a long line. Consider "Scanning model definitions for cleartext sensitive arguments…" — it's shorter and the word "sensitive" carries the intent. Minor wording nit, not blocking.

  2. Remediation step 1 placeholder clarity (src/presentation/renderers/doctor_secrets.ts, line ~107): swamp vault put my-vault apiKey <value> uses <value> as a placeholder. It's clear enough, but <your-secret-value> would leave zero ambiguity for first-time users. Not blocking.

  3. Unresolved advisory ordering in fail path (src/presentation/renderers/doctor_secrets.ts): When there are both findings AND unresolved definitions, the unresolved warning appears after the "migrate it to a vault, then re-save" footer text. The footer is a clear call-to-action, and placing an advisory warning below it risks it being missed. Swapping the two (advisory warning → then footer) would make the failure block read findings → warnings → what-to-do. Suggestion only.

Verdict

PASS — Both output modes are complete and consistent with sibling doctor commands. Exit codes are correct (0/1), --repo-dir is present and documented, the JSON shape (overallStatus, scanned, findings, unresolved) gives script consumers everything they need, and the value-free design is correctly enforced across both renderers. No blocking issues.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Blocking Issues

None.

Suggestions

  1. Import ordering in src/libswamp/models/doctor_secrets.ts:30: The export type { SensitiveArgRemediation } sits between two import groups, which reads a bit unusually. Consider moving it after all imports or grouping it with the import it re-exports. Not functionally wrong — just a readability nit.

Summary

Clean, well-structured PR that adds a swamp doctor secrets diagnostic command. Key observations:

  • DDD alignment: Domain rule (buildSensitiveArgRemediations / findLiteralSensitiveGlobalArgs) stays in the domain layer, the scan service lives in libswamp (application service layer), rendering in presentation, CLI wiring in cli. Correct layering throughout.
  • Libswamp import boundary: Both the CLI command and renderer import exclusively from src/libswamp/mod.ts. Only the libswamp-internal test and the service itself import from internal paths. Correct.
  • Security: The scan is read-only and value-free — secret values never flow into findings, remediations, or output. Multiple tests explicitly assert no secret leakage (JSON.stringify(data).includes(SECRET) === false) at unit, service, renderer, and integration levels.
  • Test coverage: Comprehensive across all layers — domain (4 tests), service (5 tests), renderer (5 tests), CLI wiring (3 tests), and integration (2 tests with real on-disk repos). The integration tests cover both models/ and .swamp/auto-definitions trees.
  • Conventions: AGPLv3 headers present on all new files. AnyOptions pattern matches existing doctor commands. Both log and json output modes implemented. withTempDir cleanup uses the Windows .catch(() => {}) pattern. Test names follow the functionName: describes behavior convention.
  • Domain reuse: Correctly reuses the existing findLiteralSensitiveGlobalArgs domain rule rather than duplicating detection logic, ensuring the scan surfaces exactly what the write-time guard would reject.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None.

Medium

  1. Single-quote injection in vault expression templatesrc/domain/models/sensitive_field_extractor.ts:345: The expression template interpolates vaultName and vaultKey directly into single-quoted strings:

    expression: `\${{ vault.get('${vaultName}', '${vaultKey}') }}`

    If an extension author sets vaultName or vaultKey metadata containing a single quote (e.g., .meta({ sensitive: true, vaultName: "o'brien-vault" })), the generated expression becomes syntactically broken: ${{ vault.get('o'brien-vault', 'key') }}. The same breakage flows into the log renderer's swamp vault put command at src/presentation/renderers/doctor_secrets.ts:105.

    Impact: Broken remediation guidance for definitions using that extension type. Not a security issue (values come from extension author code, not user input), but it could confuse users who copy-paste the suggested command.

    Suggested fix: Escape or reject single quotes in vault name/key, or quote the values differently. This is a pre-existing limitation in the vault metadata contract so could also be addressed as a follow-up.

Low

  1. Dead error event variantsrc/libswamp/models/doctor_secrets.ts:80: DoctorSecretsEvent defines { kind: "error"; error: SwampError } and both renderers implement handlers for it, but the doctorSecrets generator never yields an error event. If findAllDefinitions() or getModelDef() throws, the exception propagates as an uncaught rejection through consumeStream, bypassing the error handler entirely. The process still exits non-zero (via the uncaught exception), so behavior is correct, but the error handler is unreachable dead code. This is consistent with other generators in the codebase that define-but-don't-yield error events, so not a regression.

  2. Unquoted arguments in remediation shell commandsrc/presentation/renderers/doctor_secrets.ts:105: The rendered command swamp vault put ${r.vaultName} ${r.vaultKey} <value> doesn't quote the vault name/key. If either contains spaces (from extension metadata), the displayed command would be wrong when copy-pasted. Theoretical since these come from extension author schema metadata, not user data.

Verdict

PASS — Clean, well-structured feature. The code correctly reuses the existing domain rule, properly separates concerns across domain/service/presentation layers, never leaks secret values, and has solid test coverage including integration tests with real on-disk repos. The medium finding is a pre-existing limitation in the vault expression format rather than a regression introduced by this PR.

@stack72 stack72 merged commit 44c62b6 into main May 29, 2026
11 checks passed
@stack72 stack72 deleted the worktree-483 branch May 29, 2026 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant