fix: prefer mental model abstraction in reflect to avoid unnecessary low-level retrieval by Oxygen56 · Pull Request #2012 · vectorize-io/hindsight

Oxygen56 · 2026-06-05T13:03:50Z

Summary

The reflect operation was unconditionally forcing the full hierarchical retrieval chain (search_mental_models -> search_observations -> recall) before allowing the agent to answer, even when mental model abstraction would provide adequate context. This PR adds budget-aware short-circuit logic to prefer higher-level abstractions when appropriate.

Problem

In run_reflect_agent, the forced_sequence always ran search_mental_models, then search_observations, then recall before switching to auto mode. Even when a fresh, relevant mental model could answer the query, the agent had to continue to lower-level retrieval, which increased latency and cost and could duplicate already-synthesized knowledge.

Changes

Budget-aware short-circuit in forced sequence (agent.py lines 599-618):
- low budget: Force only search_mental_models, then allow auto mode
- mid budget: Skip lower-level retrieval when mental models are fresh and non-empty
- high budget: Preserve the full hierarchical verification path (existing behavior)

Mental model freshness tracking (agent.py line 425):
- Added a mental_models_sufficient flag that records whether the retrieved mental models are enough to skip lower-level retrieval

Freshness evaluation (agent.py lines 982-990):
- After search_mental_models returns, check for mental models with is_stale=False and non-empty content
- Set mental_models_sufficient = True when at least one model meets these criteria
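
The freshness evaluation described above could look roughly like the following. This is a minimal sketch, not the code from agent.py: the function name `mental_models_are_sufficient`, the dict-shaped results, and the `content` key are assumptions made for illustration. Only `is_stale` and the "non-empty content" criterion come from the PR description.

```python
from typing import Any


def mental_models_are_sufficient(results: list[dict[str, Any]]) -> bool:
    """Return True if at least one mental model is fresh and has content.

    Hypothetical helper: the result shape (dicts with "is_stale" and
    "content" keys) is assumed, not taken from the real codebase.
    """
    for model in results:
        content = model.get("content") or ""
        # Assumption: a missing is_stale field is treated as stale, and
        # whitespace-only content is treated as empty.
        if model.get("is_stale", True) is False and content.strip():
            return True
    return False


# The agent would store the result in its mental_models_sufficient flag.
mental_models_sufficient = mental_models_are_sufficient(
    [{"is_stale": False, "content": "User prefers concise answers."}]
)
```

Treating a missing `is_stale` field as stale is the conservative choice here: it errs toward running the lower-level retrieval rather than skipping it.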

Behavior

| Budget | Before | After |
| --- | --- | --- |
| `low` | Force all 3 retrieval layers | Force only `search_mental_models`, then auto |
| `mid` | Force all 3 retrieval layers | Force `search_mental_models`; skip lower layers if fresh results found |
| `high` | Force all 3 retrieval layers | Force all 3 retrieval layers (unchanged) |

The default budget (low) now completes in fewer iterations when mental models are available, reducing latency and cost while maintaining answer quality.
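
The per-budget behavior can be expressed as one small decision function. The sketch below is illustrative only: `next_forced_tool`, `FULL_CHAIN`, and the iteration-index convention are hypothetical names, and the real forced_sequence logic in agent.py may be structured differently. Unknown budget values fall through to the full chain, which is also an assumption.

```python
from typing import Optional

# The three retrieval layers named in the PR, in forced order.
FULL_CHAIN = ("search_mental_models", "search_observations", "recall")


def next_forced_tool(
    budget: str, iteration: int, mental_models_sufficient: bool
) -> Optional[str]:
    """Return the tool to force at this iteration, or None for auto mode."""
    if iteration >= len(FULL_CHAIN):
        return None  # forced sequence exhausted; agent chooses freely
    if iteration == 0:
        return FULL_CHAIN[0]  # every budget starts with mental models
    if budget == "low":
        return None  # low: one forced step, then auto
    if budget == "mid" and mental_models_sufficient:
        return None  # mid: short-circuit on fresh, non-empty models
    return FULL_CHAIN[iteration]  # high, or mid without fresh models


for budget in ("low", "mid", "high"):
    steps = [next_forced_tool(budget, i, True) for i in range(3)]
    print(budget, steps)
```

With `mental_models_sufficient=True`, only the `high` budget still walks all three layers, which matches the "After" column of the table.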

Backward Compatibility

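- high budget: Unchanged behavior (full verification chain)
- mid budget: Conditional short-circuit, applied only when at least one returned mental model is fresh (is_stale=False) and non-empty
- low budget: Behavior changed to prefer speed, consistent with the documented budget semantics ("Prioritize speed over completeness"). Because low is the default budget, this changes the default behavior.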

Fixes #1971

The reflect operation was forcing lower-level retrieval even when the mental model abstraction would provide adequate context. This change adds budget-aware short-circuit logic:
- Low budget: force only search_mental_models, then allow auto
- Mid budget: skip lower retrieval when mental models are fresh
- High budget: preserve full hierarchical verification path

The check evaluates is_stale=False and non-empty content on mental models returned by search_mental_models. When at least one model meets these criteria, further forced retrieval steps are skipped.

Fixes vectorize-io#1971

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Oxygen56 wants to merge 1 commit into vectorize-io:main from Oxygen56:fix/reflect-retrieval-optimization-1971
