prompt-hook: HIGH-tier gate telemetry ('high-keyword'/'high-token') fires even when codegraph_explore errors or returns nothing, masking real failures from the gate's own measured recall funnel #1143
Summary
The prompt-hook's HIGH tier calls codegraph_explore and only writes the <codegraph_context>
injection when the result succeeds and has content, which is correctly guarded. But the telemetry
call right after it fires regardless of that guard, so a silent explore failure gets recorded as a
successful HIGH-tier fire rather than folding into the noop-* signal that the repo's own
telemetry docs describe as the gate's measured recall/precision funnel.
Root cause
gate(...) sits one indent level OUTSIDE the if (!result.isError && text.trim()) block, with no else, so it runs whether or not the injection actually happened.
```ts
// src/bin/codegraph.ts:1077-1079
const gate = (outcome: string): void => {
  try { getTelemetry().recordUsage('cli_command', `prompt-hook-gate-${outcome}`, true); } catch { /* never break the hook */ }
};
```
recordUsage's third argument (success/ok) is hardcoded true — the outcome name is the only
thing that varies, so gate('high-keyword')/gate('high-token') always records success regardless
of what codegraph_explore actually returned.
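A minimal, self-contained sketch of the pattern described above may make the failure mode easier to see. It does not reproduce the repository's code: ExploreResult, recorded, and runHighTier are hypothetical stand-ins, and the in-memory recorded array replaces the real getTelemetry().recordUsage(...) call.

```ts
// Hypothetical stand-in for whatever codegraph_explore returns.
interface ExploreResult {
  isError: boolean;
  text: string;
}

// Collects recorded outcomes in memory instead of calling real telemetry.
const recorded: Array<{ name: string; ok: boolean }> = [];

// Mirrors the quoted helper: the success flag is always true.
const gate = (outcome: string): void => {
  recorded.push({ name: `prompt-hook-gate-${outcome}`, ok: true });
};

// Pattern described in the report: the injection is guarded, the gate call is not.
function runHighTier(result: ExploreResult): string | null {
  let injected: string | null = null;
  if (!result.isError && result.text.trim()) {
    injected = `<codegraph_context>${result.text}</codegraph_context>`;
  }
  gate('high-keyword'); // runs even when nothing was injected
  return injected;
}

// An errored explore call: no injection, but a "successful" fire is recorded.
const injection = runHighTier({ isError: true, text: '' });
```

In this sketch, injection is null while recorded still holds one prompt-hook-gate-high-keyword entry with ok set to true, which is the mismatch the report describes.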
Why this matters here specifically
// docs/design/telemetry.md:75-81
The prompt hook additionally rolls up its gate DECISION as `cli_command`
counters named `prompt-hook-gate-<outcome>`, outcome ∈ `high-keyword` /
`high-token` / `medium-segment` / `nudge-projects` / `noop-shape` /
`noop-no-index` / `noop-unverified` — decision names only, never prompt
content. This is the gate's measured recall/precision funnel: a rising
`noop-*` share against the `high`/`medium` tiers is the signal that the
gate (keyword table or segment matching) is missing real questions.
That interpretation only holds if a high-* outcome reliably means the tier actually delivered
context. A silent codegraph_explore failure (transient error, or a query that happens to return no
relevant nodes) currently counts as a high-* success in that funnel instead of degrading it toward noop-* — the exact signal this telemetry exists to surface.
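To illustrate how misattributed fires could skew the funnel, here is a small sketch that computes the noop-* share from a set of outcome counters. The noopShare helper, the counter values, and the noop-explore-failed outcome name are all invented for illustration; none of them come from the repository or from real telemetry data.

```ts
// Hypothetical helper: share of gate decisions whose outcome starts with "noop-".
type Counters = Record<string, number>;

function noopShare(counters: Counters): number {
  let noop = 0;
  let total = 0;
  for (const name of Object.keys(counters)) {
    const count = counters[name] ?? 0;
    total += count;
    if (name.indexOf('noop-') === 0) noop += count;
  }
  return total === 0 ? 0 : noop / total;
}

// Made-up counts as currently recorded: every HIGH-tier attempt lands in high-*.
const asRecorded: Counters = { 'high-keyword': 80, 'high-token': 20, 'noop-shape': 100 };

// The same made-up period if 20 of the high-keyword attempts had silently failed
// and were counted under a placeholder noop-style outcome instead.
const asDelivered: Counters = {
  'high-keyword': 60,
  'high-token': 20,
  'noop-shape': 100,
  'noop-explore-failed': 20,
};

const recordedShare = noopShare(asRecorded);   // 100 / 200 = 0.5
const deliveredShare = noopShare(asDelivered); // 120 / 200 = 0.6
```

With these invented numbers the recorded noop-* share is 0.5 while the share reflecting delivered context would be 0.6, so the rollup would look healthier than the underlying behavior.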
Impact
I believe (not measured against live telemetry data — I don't have access to it) this understates
real HIGH-tier failures in the aggregate rollup, making the gate look more effective than it is on
any day/period where codegraph_explore errors or comes back empty for keyword/token-verified
prompts. I don't know how often that happens in practice.
Suggested fix
Move the gate(...) call inside the success branch, and add an explicit failure-outcome name to the else (or fold it into an existing noop-* name — that's a naming call I'd leave to you, since the
telemetry doc's outcome enum is presumably meant to stay a fixed, documented set):
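One possible shape of that change, as a sketch under stated assumptions rather than a patch against the real file: ExploreResult, recordedOutcomes, and runHighTierFixed are hypothetical stand-ins, and 'noop-explore-failed' is a placeholder outcome name, since the actual name (or the choice to reuse an existing noop-* name) is left to the maintainers.

```ts
// Hypothetical stand-in for whatever codegraph_explore returns.
interface ExploreResult {
  isError: boolean;
  text: string;
}

type HighTier = 'high-keyword' | 'high-token';

// In-memory stand-in for the real telemetry call.
const recordedOutcomes: string[] = [];
const gate = (outcome: string): void => {
  recordedOutcomes.push(`prompt-hook-gate-${outcome}`);
};

function runHighTierFixed(result: ExploreResult, tier: HighTier): string | null {
  if (!result.isError && result.text.trim()) {
    // Success branch: record the HIGH-tier outcome only when context is injected.
    gate(tier);
    return `<codegraph_context>${result.text}</codegraph_context>`;
  }
  // Failure branch: placeholder name, not part of the documented outcome set.
  gate('noop-explore-failed');
  return null;
}

const delivered = runHighTierFixed({ isError: false, text: 'fn foo' }, 'high-token');
const failed = runHighTierFixed({ isError: true, text: '' }, 'high-keyword');
const empty = runHighTierFixed({ isError: false, text: '   ' }, 'high-keyword');
```

Here only the first call records a high-* outcome; the errored and whitespace-only results both fall through to the placeholder failure outcome, so they would degrade the funnel instead of inflating it.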
Verification / scope
None of the following touch this line range (~1123-1147): Add GDScript and Godot scene graph support #1098, feat(cli): expose circular dependency detection #1012, feat(installer): add Pi as a native-extension agent target #992.
"telemetry inflated": nothing beyond feat(prompt-hook): graph-derived gate tier, confidence-tiered injection, gate telemetry #1136 itself (which introduced this code) and feat(telemetry): anonymous usage telemetry — documented schema, opt-out, public ingest worker #834 (the underlying telemetry module, unrelated to this specific bug).
Re-checked immediately before filing: no new overlapping issues since the earlier pass.
All line citations re-read directly just now, including the exact recordUsage(..., true) hardcoded success argument and the telemetry.md interpretation text.
Environment
Found on main (tip e699ee9, v1.2.0).