Skip to content

feat: npm run audit — read-only drift detector for state/dashboard divergence#27

Merged
dhruva-reddy merged 4 commits into
mainfrom
feature/audit-cmd
May 13, 2026
Merged

feat: npm run audit — read-only drift detector for state/dashboard divergence#27
dhruva-reddy merged 4 commits into
mainfrom
feature/audit-cmd

Conversation

@dhruva-reddy
Copy link
Copy Markdown
Contributor

Summary

Adds npm run audit -- <org> — a single-command read-only diagnostic for the state-vs-dashboard drift conditions that have been silently accumulating in customer-fork repos (orphan tools, byte-identical assistant clusters, state ghosts, etc.).

7 checks:

Rule Severity What it catches
orphan-yaml warn Local YAML files with no state entry (Scenario B: dashboard-rename leftovers)
state-ghost warn State entries pointing at UUIDs the dashboard no longer has
state-uuid-collision error Multiple slugs in state pointing at the same UUID (cascade-bug fingerprint)
content-identical warn Multiple slugs sharing the same lastPulledHash (PR #19 enables this)
sibling-base-slug warn Multiple slugs sharing the same base-slug — cascade-duplicate risk warning
dashboard-orphan warn Dashboard UUIDs not in state (suppressed by .vapi-ignore patterns)
inline-tools warn Assistants with non-empty model.tools blocks — suspected duplicate-spawn surface
fetch-failed warn A specific resource type's dashboard fetch threw; other types proceed normally

Exit code: 0 if clean, 1 on any finding. Safe to wire into CI.

Design: DI surface for stateLoader / listLocalIds / remoteFetcher / readAssistantTools — tests are filesystem-free and network-free. Mirrors validate-cmd.ts / validate.ts structure.

Concurrency: per-type dashboard fetches use Promise.allSettled so a transient API hiccup on one type emits a fetch-failed finding and skips the remote checks for that type instead of aborting the whole audit.

Why now

Today's manual cleanup on a customer-fork repo (mudflap-prod) drained 13 orphan endCall tool UUIDs and 7 duplicate assistant UUIDs across two orgs — a ~30-minute cross-reference exercise involving Node scripts, dashboard API calls, and YAML grep. The same cleanup pattern is needed for other customer forks (gitops-amazon3p, gitops-notable). This command makes it a single read.

Files changed

File Type LOC
src/audit.ts new ~490
src/audit-cmd.ts new ~115
tests/audit.test.ts new ~615 (28 tests, all passing)
src/pull.ts edit 1 word (functionexport function listExistingResourceIds)
package.json edit +1 script
README.md edit +1 command-table row

Test plan

  • npm run build — clean
  • npm test — 156 / 156 pass (28 new tests for audit, 128 prior)
  • npx @biomejs/biome check --write — clean
  • Smoke-run against in-memory synthetic state (collision + identical-hash + sibling-base + inline-tools + orphan-yaml): all 6 expected findings fire with correct severity bar and grouped output
  • Operator dogfood against a real customer-fork repo (planned for gitops-amazon3p as the next user)

Residual risks / known follow-ups

Surfaced during the code-review phase (non-blocking, deferred):

  1. credentials state section is intentionally not auditedVALID_RESOURCE_TYPES omits it by design, so the audit skips it silently. If credentials ever drift in state, audit won't detect. (Documented at the top of src/audit.ts.)
  2. Inline-tools rule swallows YAML parse failures — an unparseable assistant YAML produces zero inline-tools findings. validate catches parse errors via a separate code path. Could be promoted to its own unparseable-assistant rule in a follow-up.
  3. .vapi-ignore patterns must use slug form to suppress dashboard-orphan findings — bare UUIDs in .vapi-ignore won't work. Real-world .vapi-ignore content uses slug form, so unlikely to bite.
  4. No --summary flag yet — on a 200-assistant fork with deep drift, a single rename can produce 3–4 findings for one logical issue. Future v2 could collapse per-resource-id.
  5. No --fix flag — v1 is read-only by design; deletion stays a manual delete files → push → API-delete → pull sequence per safety.
  6. Per-rule JSDoc comments missing — top-of-file comment lists all 7, but each checkX function could use a 1-line invariant doc. Mechanical follow-up.

Diagnostic context

This command is part of a broader investigation into duplicate-tool / duplicate-assistant drift in customer-fork repos. A separate vapi-core ticket is being filed against the dashboard UI's assistant-editor save handler, which is the leading suspect for the spawn-source (see customer-fork improvements.md revision for the full diagnosis history).

Adds `npm run audit -- <org>` — single-command audit for the state-vs-dashboard
drift conditions that have been accumulating cruft in customer-fork repos.

Detects (read-only):
- orphan local YAML files (no state entry — Scenario B leftovers)
- state ghosts (state UUID missing on dashboard)
- state UUID collisions (cascade-duplicate fingerprint)
- content-identical resources (same lastPulledHash)
- sibling base-slug clusters (cascade-risk warning)
- dashboard orphans (UUID not in state; suppressed by .vapi-ignore)
- assistants with inline model.tools (suspected duplicate-spawn surface)

Exit code: 0 if clean, 1 if any findings.

Designed for DI: state loader, local file lister, remote fetcher are all
injectable, making tests filesystem-free and network-free.

Promotes `listExistingResourceIds` in src/pull.ts from `function` to
`export function` (one-word edit) to avoid duplicating the directory walker.

Tests in tests/audit.test.ts will be added in a follow-up commit on this
branch.
…ion)

Covers all 7 audit checks via DI fixtures (no filesystem, no network):
- orphan-yaml (3 cases)
- state-ghost (3 cases inc. fetchRemote=false short-circuit)
- state-uuid-collision (2 cases)
- content-identical (3 cases inc. missing-hash safety)
- sibling-base-slug (3 cases inc. cross-ref overlap)
- dashboard-orphan (4 cases inc. .vapi-ignore suppression)
- inline-tools (4 cases inc. async-Promise branch)

Plus: 1 integration test combining multiple checks, 1 exit-code mapping
test, 3 formatter tests.
Closes the gap surfaced by the test-writer phase: audit-cmd.ts had inlined
`findings.length === 0 ? 0 : 1` at every exit-code call site, so the
exit-code test in tests/audit.test.ts could only assert on a parallel
re-derivation rather than the real CLI behavior.

Extracts a tiny exported `exitCodeForFindings(findings)` helper and routes
both exit sites through it. Test imports and pins to the helper, so future
changes to the severity bar (e.g. a `--strict` flag in v2) will surface in
the existing assertion instead of silently drifting.

No behavior change. 155/155 tests pass.
Addresses two non-blocking code-review findings before opening the PR:

1. **Fail-fast → fail-graceful for dashboard fetches.** Switched the
   parallel per-type API calls from `Promise.all` to `Promise.allSettled`.
   A transient 500 / 429 / network blip on one resource type used to abort
   the entire audit, leaving the operator with zero findings instead of
   findings-for-the-types-that-succeeded.

   Now: each failed fetch emits a `fetch-failed` finding (severity: warn,
   message includes the underlying error). The per-type loop checks
   `remoteByType.has(type)` before running state-ghost and dashboard-orphan
   checks — preventing the would-be false-positive where an empty-array
   fallback marks every state entry as a ghost.

   New rule: `AuditRule = ... | "fetch-failed"`.

2. **README command table missing `audit`.** Added a row under the
   `validate` entry so operators discover the command from the same
   surface that lists `pull`/`push`/`cleanup`/`rollback`/etc.

New test pinning the fail-graceful path: one type's `remoteFetcher`
throws → exactly 1 `fetch-failed` finding for that type, 0 false-positive
state-ghost findings for any state entry of that type, and other types'
checks proceed normally.

Suite: 156/156 pass (+1 test).
@dhruva-reddy dhruva-reddy merged commit f7c159e into main May 13, 2026
@dhruva-reddy dhruva-reddy deleted the feature/audit-cmd branch May 13, 2026 19:28
dhruva-reddy added a commit that referenced this pull request May 13, 2026
Followup to #27. Agents discovering the engine surface read the command
tables in AGENTS.md (lines 62-72 and 797-830); both need to mention
`npm run audit` so downstream agents pick it up in normal workflow.

Added to:
- "Common commands" quick-reference table (after `validate`)
- "## Available Commands" bash block (after `validate`, before `sim`)

Docs-only PR — skip test-writer/code-reviewer per the always-apply rule
for docs-only changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant