Memory extraction fixes + add_cosmos cadence + process_now full pipeline #20
Merged
aayush3011 merged 7 commits on Jun 3, 2026
Pull request overview
This PR hardens the core memory-extraction pipeline (facts + episodics) to prevent hallucinated/phantom facts, stop re-processing already-extracted turns, and make episodic memories dedupe/upsert by scope identity. It also fixes add_cosmos + process_now bypassing cadence so in-process callers run the complete pipeline (including procedural + user summary) and can correctly read user-scoped memories even when filtering by thread_id.
Changes:
- Add turn watermarking (`extracted_at`) to prevent repeated extraction over the same turns, plus a grounding warning heuristic for suspicious extracted facts.
- Redefine episodic identity to be deterministic by `(user_id, scope_type, scope_value)` and persist episodics in the `__episodic__` sentinel partition via upsert.
- Ensure `add_cosmos` triggers cadence for turn writes, and `process_now` runs the full 5-step in-process pipeline; update query-building/search to include user-scoped types when `thread_id` is provided.
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/unit/test_reconcile.py | Adds episodic reconciliation/identity tests and asserts new extract result shape. |
| tests/unit/test_process_now.py | Verifies in-process process_now runs all 5 steps and handles transient/permanent tail-step failures. |
| tests/unit/test_procedural_synthesis.py | Updates episodic fixtures to require/use text. |
| tests/unit/test_pipeline_confidence.py | Adds coverage for required episodic text, drop counters, and new embedding behavior. |
| tests/unit/test_memory_type_multi.py | Tests query builder’s thread_id OR-clause behavior for user-scoped types. |
| tests/unit/test_cosmos_memory_client.py | Updates add_cosmos behavior (thread_id required for turns, cadence trigger behavior). |
| tests/unit/store/test_memory_store.py | Validates read/search query behavior for user-scoped types when filtering by thread_id. |
| tests/unit/services/test_pipeline_service.py | Aligns episodic embedding content with new text field behavior. |
| tests/unit/services/test_extract_dry.py | Adds dry-run support for returning processed turns and tests extracted_at watermarking behavior. |
| tests/unit/aio/test_process_now.py | Async mirror of full-pipeline process_now tail-step behavior. |
| tests/unit/aio/test_cosmos_memory_client.py | Async add_cosmos cadence scheduling + background-task draining in tests. |
| tests/unit/aio/store/test_memory_store.py | Async mirror of read/search query behavior for user-scoped types. |
| pyproject.toml | Bumps package version to 0.1.0b2. |
| function_app/requirements.txt | Updates function app dependency pin to 0.1.0b2. |
| CHANGELOG.md | Adds 0.1.0b2 release notes and reformats headings. |
| azure/cosmos/agent_memory/store/memory_store.py | Forces cross-partition search when user-scoped types are in scope with thread_id. |
| azure/cosmos/agent_memory/services/pipeline.py | Adds extracted-turn filtering, episodic upsert-by-scope, grounding warnings, and extracted_at marking after persist. |
| azure/cosmos/agent_memory/services/_pipeline_helpers.py | Adds existing_episodics prompt formatting and fact grounding-check helper. |
| azure/cosmos/agent_memory/prompts/extract_memories.prompty | Tightens extraction rules (speaker discrimination, grounding constraints) and requires episodic text. |
| azure/cosmos/agent_memory/prompts/_schemas.py | Updates episodic schema to require text. |
| azure/cosmos/agent_memory/processors/base.py | Extends ProcessThreadResult with procedural and user_summary. |
| azure/cosmos/agent_memory/cosmos_memory_client.py | Makes add_cosmos cadence-aware for turns and expands process_now to include tail steps with transient-error swallowing. |
| azure/cosmos/agent_memory/aio/store/memory_store.py | Async cross-partition behavior for user-scoped types during search. |
| azure/cosmos/agent_memory/aio/services/pipeline.py | Async mirror of pipeline watermarking, episodic upsert-by-scope, and grounding warnings. |
| azure/cosmos/agent_memory/aio/cosmos_memory_client.py | Async add_cosmos cadence scheduling + async process_now tail-step behavior with transient-error swallowing. |
| azure/cosmos/agent_memory/_utils.py | Adds thread-id OR-clause support via query builder for user-scoped types. |
| azure/cosmos/agent_memory/_query_builder.py | Implements add_thread_id_or_user_scoped helper for SQL condition construction. |
| azure/cosmos/agent_memory/_container_routing.py | Introduces USER_SCOPED_MEMORIES_TYPES constant (episodic, procedural). |
| azure/cosmos/agent_memory/_base/base_client.py | Adds is_transient_tail_step_error classifier for process_now tail-step error handling. |
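The `thread_id` OR-clause described for `_query_builder.py` and `_utils.py` might look roughly like the sketch below. The real signature of `add_thread_id_or_user_scoped` is not shown on this page, so `thread_or_user_scoped_condition`, the `c.thread_id` and `c.type` field names, and the `@ust` parameter prefix are hypothetical. The two user-scoped types are taken from the `_container_routing.py` row.

```python
# Assumed values, per the table row for _container_routing.py.
USER_SCOPED_TYPES = ("episodic", "procedural")


def thread_or_user_scoped_condition(
    thread_id: str,
    user_scoped_types: tuple[str, ...] = USER_SCOPED_TYPES,
) -> tuple[str, list[dict]]:
    """Hypothetical helper: build a parameterized Cosmos SQL condition.

    Thread-scoped documents must match thread_id; user-scoped documents
    (which are not tied to one thread) match on type alone.
    """
    params = [{"name": "@thread_id", "value": thread_id}]
    if not user_scoped_types:
        # Nothing user-scoped requested: fall back to a plain thread filter.
        return "c.thread_id = @thread_id", params

    names = []
    for i, memory_type in enumerate(user_scoped_types):
        name = f"@ust{i}"
        names.append(name)
        params.append({"name": name, "value": memory_type})

    clause = f"(c.thread_id = @thread_id OR c.type IN ({', '.join(names)}))"
    return clause, params


clause, params = thread_or_user_scoped_condition("thread-42")
print(clause)
# (c.thread_id = @thread_id OR c.type IN (@ust0, @ust1))
```

Because user-scoped memories are stored outside the thread's partition (episodics live in the `__episodic__` sentinel partition), a condition like this only returns them if the query is allowed to run cross-partition, which is what the `memory_store.py` rows describe.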
added 2 commits
June 3, 2026 00:50
What this PR fixes
The extraction LLM was producing facts the user never asserted. Specific patterns fixed:
Callers using add_cosmos + process_now (instead of add_local + push_to_cosmos) silently bypassed every cadence env var (THREAD_SUMMARY_EVERY_N, FACT_EXTRACTION_EVERY_N, USER_SUMMARY_EVERY_N, etc.) and never fired procedural / user-summary synthesis.