feat(incident): resolve 6-char short ids in detail/get + add list --nums #26

Merged
ysyneu merged 1 commit into main from feat/incident-short-id-lookup
Jun 1, 2026


@ysyneu ysyneu commented Jun 1, 2026

Why

A user asked the AI-SRE agent to "analyze incident 311510". 311510 is an incident short id: the num shown in the UI, which is the trailing 6 hex characters of the 24-char Mongo ObjectID 6a12a4502f0a2396b3311510. The agent failed repeatedly:

| call | result |
| --- | --- |
| incident detail 311510 | HTTP 400: cannot unmarshal into an ObjectID, length must be 24 but it is 6 |
| incident get 311510 | HTTP 400: same error, via incident_ids |
| incident list --query 311510 | empty (default --since 24h; incident was ~8 days old) |

The backend does support short-id lookup — but only on POST /incident/list via the nums param (and a 6-char query auto-promotion). The single-incident verbs (detail/get) route the arg into ObjectID-typed fields, so a short id can never work there; and the one path that resolves it was hidden behind a narrow default time window.

The short id is non-unique by design (num is the ObjectID's trailing 3 bytes; the backend field is commented "short identifier, may be duplicated"), so it can't be soundly mapped onto the single-result /incident/info endpoint. Resolution therefore belongs in the CLI consumer layer (go-flashduty stays 1:1 with the API; /incident/info keeps its unambiguous contract).
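To make the relationship between a full id and its num concrete, here is a minimal sketch of the derivation described above. The helper name `shortNum` is hypothetical and is not taken from the CLI or the backend. The second id in `main` is made up purely to show how two distinct ObjectIDs can share the same trailing 3 bytes.

```go
package main

import (
	"fmt"
	"strings"
)

// shortNum returns the UI short id for a 24-char hex ObjectID:
// its trailing 6 hex characters (the last 3 bytes).
// Hypothetical helper for illustration only.
func shortNum(objectID string) (string, error) {
	if len(objectID) != 24 {
		return "", fmt.Errorf("object id length must be 24 but it is %d", len(objectID))
	}
	return strings.ToLower(objectID[18:]), nil
}

func main() {
	ids := []string{
		"6a12a4502f0a2396b3311510", // id quoted in this PR
		"6a12a4502f0a2396b4311510", // made-up id that collides on the num
	}
	for _, id := range ids {
		num, err := shortNum(id)
		fmt.Println(id, num, err)
	}
}
```

Because both ids map to the same num, a lookup keyed on the num alone has to be prepared to return more than one incident.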

What

  • incident detail <id> / incident get <id>: detect a 6-hex arg and resolve it to a full id via /incident/list with nums over the last 30 days (the list API caps its span at ~31 days):
    • unique hit → proceed with the resolved full id
    • multiple → surface the candidates and stop — never silently analyze the wrong incident
    • none → clear error pointing at the full-id fallback
    • a 24-char id (or any non-short value) passes through unchanged
  • incident list --nums for explicit short-id filtering
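The resolution rules in the list above can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: `resolveArg`, `isShortID`, and the `lister` type are hypothetical names, and `lister` stands in for the real /incident/list call with nums over the last 30 days.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// shortIDPattern matches exactly six hex characters.
var shortIDPattern = regexp.MustCompile(`^[0-9a-fA-F]{6}$`)

var (
	errNotFound  = errors.New("no incident found for short id")
	errAmbiguous = errors.New("short id matches multiple incidents")
)

// lister is a hypothetical stand-in for the /incident/list lookup by num.
// It returns the full ids of every incident whose num matches.
type lister func(num string) ([]string, error)

// isShortID reports whether arg looks like a 6-hex short id.
func isShortID(arg string) bool { return shortIDPattern.MatchString(arg) }

// resolveArg maps a positional argument to a full incident id.
// Anything that is not a 6-hex short id passes through unchanged.
func resolveArg(arg string, list lister) (string, error) {
	if !isShortID(arg) {
		return arg, nil
	}
	ids, err := list(arg)
	if err != nil {
		return "", err
	}
	switch len(ids) {
	case 0:
		return "", fmt.Errorf("%w %q in the last 30 days; retry with the full 24-char id", errNotFound, arg)
	case 1:
		return ids[0], nil
	default:
		// Never pick one silently: report the candidates and stop.
		return "", fmt.Errorf("%w %q: candidates %v", errAmbiguous, arg, ids)
	}
}

func main() {
	// Fake lookup that returns two incidents sharing one num.
	fake := func(num string) ([]string, error) {
		return []string{"6a12a4502f0a2396b3311510", "6a12a4502f0a2396b4311510"}, nil
	}
	_, err := resolveArg("311510", fake)
	fmt.Println(errors.Is(err, errAmbiguous), err)
}
```

Returning an error in the ambiguous case, rather than the first match, keeps the single-incident verbs from ever operating on an incident the user did not ask for.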

Verification

  • go build ./..., go vet ./..., full go test ./... — green; gofmt clean.
  • 6 new unit tests (incident_short_id_test.go) assert the exact wire payloads against the go-flashduty stub: nums:[…] + a 30-day span → /incident/list, then the resolved full id → /incident/info; ambiguity → candidate list with no /incident/info call; not-found → descriptive error; full id → passthrough with no resolve; --nums → array on the wire.
  • The backend's num-resolution is already empirically confirmed by the original session transcript (a since=14d query returned exactly 6a12a4502f0a2396b3311510).
  • A live binary-against-API run was not performed: api-dev keys are encrypted in pgy and the prod service key 500s on /incident/list, so I couldn't exercise the real endpoint with valid creds. The composed behavior is covered by the unit tests + the transcript evidence above.
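For readers who want a picture of the wire payload the unit tests assert on, the sketch below builds a list request carrying a nums array and a 30-day span. Only the `nums` key is named in this PR. The `start_time` and `end_time` field names, the use of Unix seconds, the string element type of `nums`, and the `listRequest` / `buildListRequest` names are all assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// lookbackSeconds is the 30-day resolution window described in this PR.
const lookbackSeconds int64 = 30 * 24 * 60 * 60

// listRequest is a hypothetical request body for /incident/list.
type listRequest struct {
	Nums      []string `json:"nums"`       // key named in the PR; element type assumed
	StartTime int64    `json:"start_time"` // assumed field name and unit
	EndTime   int64    `json:"end_time"`   // assumed field name and unit
}

// buildListRequest returns a request for one num over the 30 days
// ending at nowUnix (Unix seconds).
func buildListRequest(num string, nowUnix int64) listRequest {
	return listRequest{
		Nums:      []string{num},
		StartTime: nowUnix - lookbackSeconds,
		EndTime:   nowUnix,
	}
}

func main() {
	body, err := json.Marshal(buildListRequest("311510", time.Now().Unix()))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Taking the current time as a parameter keeps the builder deterministic, which is what makes an exact-payload assertion practical in a unit test.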

Notes

  • Scoped to the CLI. Follow-ups (separate PRs): expose nums on the MCP query_incidents tool and add a one-line short-id note to the flashduty skill.
  • make lint currently fails on a pre-existing toolchain skew (go.mod go 1.25.1 vs golangci-lint built against go1.24), unrelated to this change.

`incident detail <id>` and `incident get <id>` assumed a full 24-char
ObjectID and sent the positional arg straight into ObjectID-typed fields,
so a 6-char short id (the "num" shown in the UI, e.g. 311510) failed with
HTTP 400. The only backend path that accepts a num is /incident/list,
which the agent previously reached only by luck and a wide-enough --since.

Detect a 6-hex arg and resolve it to a full id via /incident/list with
`nums` over the last 30 days (the list API caps its query span at ~31 days):
  - unique hit  -> proceed with the resolved full id
  - multiple    -> surface the candidates (num is non-unique by design) and
                   stop; never silently analyze the wrong incident
  - none        -> clear error pointing at the full-id fallback
A 24-char id (or any non-short value) passes through unchanged.

Also add `incident list --nums` for explicit short-id filtering.
@ysyneu ysyneu merged commit 4ac23d1 into main Jun 1, 2026
12 checks passed
@ysyneu ysyneu deleted the feat/incident-short-id-lookup branch June 1, 2026 05:15
ysyneu added a commit that referenced this pull request Jun 2, 2026
Reconcile the full-OpenAPI-coverage feature with main's #26 (incident
6-char short-id resolution in detail/get + list --nums) and #27 (shell
tab-completion + install.sh auto-setup).

incident.go conflicts resolved to main's design:
- list: keep --nums (#26) plus the eval's "--since→--until window < 31d"
  help text.
- get: adopt #26's resolveIncidentArg→List-by-incident_ids body; the
  full-coverage branch's /incident/info get is superseded by #26's new
  `detail` command (which carries the no-window Info path). All 7
  incident short-id tests pass.

postmortem.go / status_page.go (deleted by the coverage branch in favor
of generated commands) kept deleted; #27's registerEnumFlag additions to
them drop with the files — generated post-mortem/status-page commands
cover those ops. completion.go + install.sh + the curated commands'
registerEnumFlag calls retained.