fix(cypher): label-filtered edge traversal silently truncates at 10 results #412

Open
isc-tdyar wants to merge 1 commit into DeusData:main from isc-tdyar:fix/cypher-label-filter-truncation

@isc-tdyar

MATCH with label filters silently returns at most 10 results regardless of actual edge count

Discovered while using codebase-memory-mcp on a production codebase. A class with 37 methods returned only 10 via MATCH (c:Class)-[:DEFINES_METHOD]->(m:Method). No error, no warning, no truncation indicator — the query appeared to succeed. Querying without label filters (MATCH (c)-[:DEFINES_METHOD]->(m)) returned all 37.

Root cause

In execute_single(), bind_cap is set to scan_count — the number of nodes returned by the initial pattern scan. When matching a single class by name, scan_count = 1. The edge expansion loop then runs with max_new = bind_cap * CYP_GROWTH_10 = 10, and exits after collecting 10 edges.

// before
int bind_cap = scan_count > 0 ? scan_count : SKIP_ONE;
// bind_cap = 1, max_new = 10 — silently drops any edge beyond the 10th

// after
int bind_cap = scan_count > max_rows ? scan_count : (max_rows > 0 ? max_rows : SKIP_ONE);
// bind_cap = max_rows (default 100k), max_new = 1,000,000 — no truncation

This affects any label-filtered edge traversal where the initial pattern matches few nodes but each node has many edges: a single class with more than 10 methods, a single module with more than 10 imports, and so on. The bug is language-agnostic.
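
To make the arithmetic concrete, here is a minimal sketch of the two cap computations in isolation. It is not the code from execute_single(). The helper names bind_cap_before, bind_cap_after, and max_new_for are hypothetical, CYP_GROWTH_10 = 10 is inferred from the description above, and SKIP_ONE = 1 is an assumption.

```c
/* Assumed constants: the description implies CYP_GROWTH_10 == 10;
 * SKIP_ONE == 1 is a guess made for this sketch. */
enum { CYP_GROWTH_10 = 10, SKIP_ONE = 1 };

/* Cap as described before the patch: tied to the size of the initial scan. */
static inline int bind_cap_before(int scan_count) {
    return scan_count > 0 ? scan_count : SKIP_ONE;
}

/* Cap as described after the patch: the larger of scan_count and max_rows,
 * falling back to SKIP_ONE when neither is positive. */
static inline int bind_cap_after(int scan_count, int max_rows) {
    return scan_count > max_rows ? scan_count
                                 : (max_rows > 0 ? max_rows : SKIP_ONE);
}

/* Expansion budget derived from the cap (hypothetical helper). */
static inline long max_new_for(int bind_cap) {
    return (long)bind_cap * CYP_GROWTH_10;
}
```

With scan_count = 1 and max_rows = 100000, max_new_for(bind_cap_before(1)) gives 10, while max_new_for(bind_cap_after(1, 100000)) gives 1,000,000, matching the figures in the before/after comments.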

Regression test

A Python class with 15 methods indexed in full mode must return all 15 via MATCH (c:Class)-[:DEFINES_METHOD]->(m:Method) WHERE c.name = 'BigClass'. Added to tests/test_incremental.c.
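
The property this test checks can be modeled without the indexer. The sketch below is a hypothetical stand-in for the edge expansion loop, not the test added in tests/test_incremental.c. expand_edges and returns_all_edges are illustrative names, and the loop assumes the budget is checked once per candidate edge.

```c
#include <stddef.h>

/* Hypothetical stand-in for the edge expansion loop: walks edge_count
 * candidate edges and stops once max_new bindings have been collected. */
static inline size_t expand_edges(size_t edge_count, size_t max_new) {
    size_t collected = 0;
    for (size_t i = 0; i < edge_count; i++) {
        if (collected >= max_new) {
            break; /* budget exhausted: remaining edges are dropped silently */
        }
        collected++;
    }
    return collected;
}

/* Returns 1 when every edge survives expansion, 0 when the result is cut short. */
static inline int returns_all_edges(size_t edge_count, size_t max_new) {
    return expand_edges(edge_count, max_new) == edge_count;
}
```

Under this model, 15 edges with a budget of 10 yield 10 results, so the check fails. With a budget of 1,000,000, all 15 come back.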

Relation to existing issues

This is a distinct variant of the "silent incomplete index" pattern tracked in #391. The difference: the graph is correctly built (all 37 edges exist in SQLite), but the Cypher query engine silently caps traversal at 10. cbm_store_count_edges() returns the correct total; the truncation only appears when querying with label-typed patterns.

Happy to add more detail to the test if useful.

…esults

MATCH (c:Class)-[:DEFINES_METHOD]->(m:Method) returned at most 10 results
for any class, regardless of how many methods it actually has.

Root cause: bind_cap was set to scan_count (the number of nodes matched in
the initial pattern — typically 1 when querying a single class by name).
max_new = bind_cap * 10 = 10, so the edge expansion loop exited after
collecting 10 results. No error, no warning, no truncation indicator.

This is language-agnostic: any class with more than 10 methods in any
language was silently truncated. The fix is a one-line change:
  bind_cap = scan_count > max_rows ? scan_count : max_rows

Regression test: a Python class with 15 methods must return all 15 via
MATCH (c:Class)-[:DEFINES_METHOD]->(m:Method) with label filtering.
