Skip to content

fix(graph): cloud pull sha + cross-agent VFS parity (Cursor/Codex/Hermes)#228

Merged
efenocchi merged 32 commits into
mainfrom
fix/graph-pull-hash-mismatch
Jun 6, 2026
Merged

fix(graph): cloud pull sha + cross-agent VFS parity (Cursor/Codex/Hermes)#228
efenocchi merged 32 commits into
mainfrom
fix/graph-pull-hash-mismatch

Conversation

@efenocchi

@efenocchi efenocchi commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Summary

Code-graph fixes + cross-agent parity (no docs in this PR — the planning PRD is kept local only).

  1. fix(graph): cloud pull sha verification — P1. Push stored snapshot_sha256 over the stable fields only (excludes observation); pull was hashing the entire payload and comparing the two, so every real cloud row was rejected with snapshot_sha256 mismatch and team pull silently failed. Pull now parses the payload, validates nodes/links, and recomputes via computeSnapshotSha256 (the same function push uses).

  2. feat(graph): cross-agent VFS parity — the code graph previously worked only on Claude Code. New shared parser src/graph/graph-command.ts (tryGraphRead) is wired into Cursor, Codex, and Hermes pre-tool-use hooks; Cursor also gets the SessionStart graph inject and the auto-build hook (graph-on-stop) in its bundle + hook registration. Hardened over two rounds of Codex review (pipe-swallow, path traversal, quoted paths/flags, multi-file reads, per-agent cwd).

  3. feat(graph): hivemind-graph skill for claude-code / codex / hermes (Cursor & pi have no skill loader and rely on the inject).

Tests

  • tests/shared/graph/deeplake-pull.test.ts (+5 regression) and tests/shared/graph/graph-command.test.ts (16) green.
  • Full suite 3751 green; npm run build green. Pre-existing unrelated teardown error in session-notifications-hook.test.ts only.

Session Context

Session transcript

Summary by CodeRabbit

  • New Features

    • Multi-language code graph: TypeScript, JavaScript, and Python support; cross-file resolution (calls/imports/heritage) and new VFS endpoints: query/, impact/, neighborhood/, layers/, tour/, path/
  • Bug Fixes

    • Stronger cloud snapshot validation: rejects malformed or tampered payloads and verifies stable-field hashes before accepting snapshots
  • Documentation

    • Hivemind Graph skill guides added for multiple agents
  • Tests

    • Broad multi-language, cross-file, VFS, and regression test coverage
  • Chores

    • Added tree-sitter-python and ignored evals/ directory

@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds Python and JavaScript extraction, Phase 1.5 cross-file resolution, VFS query/render endpoints (query/impact/neighborhood/layers/tour/path), Deeplake snapshot validation, hook wiring and safe shell output, install/build updates for tree-sitter-python, and extensive tests.

Changes

Multi-language Graph Extraction and Resolution

Layer / File(s) Summary
Language extractor contracts and dispatcher
src/graph/extract/index.ts, src/graph/extract/python.ts, src/graph/extract/typescript.ts
Adds extractFile dispatcher and Python extractor; TypeScript extractor extended with raw_calls, import_bindings, and signature metadata; JS path remapping to javascript.
Cross-file resolution and snapshot pipeline
src/graph/resolve/cross-file.ts, src/graph/types.ts, src/graph/snapshot.ts
Implements cross-file call/import/heritage resolution, repoints relative imports, resolves heritage placeholders, annotates node degrees, and integrates resolution into buildSnapshot. Adds RawCall/ImportBinding types and Phase 1.5 metadata fields.

Graph VFS and Query Surface Expansion

Layer / File(s) Summary
VFS command parsing and routing
src/graph/graph-command.ts
Parses shell read commands for /graph/*, prevents traversal, synthesizes directory listings, and routes reads into handleGraphVfs.
Graph render endpoints
src/graph/render/impact.ts, src/graph/render/layers.ts, src/graph/render/neighborhood.ts, src/graph/render/path.ts, src/graph/render/tour.ts, src/graph/node-metadata.ts
Adds impact (BFS dependents), layers (path-based buckets), neighborhood (cross-file neighbors), path (shortest directed/undirected paths), tour (dependency-ordered walkthrough), and node degree annotation.
VFS dispatcher and search logic
src/graph/vfs-handler.ts
Extends VFS with query/impact/neighborhood/layers/tour/path; replaces substring find with token-ranked matching and single-token fuzzy fallback; implements renderQuery with 1-hop neighbor expansion and handle persistence.

Hook Integration and Output Transport

Layer / File(s) Summary
Hook installations and session context
src/cli/install-codex.ts, src/cli/install-cursor.ts, src/cli/install-hermes.ts, src/hooks/graph-on-stop.ts, src/hooks/cursor/session-start.ts, src/hooks/hermes/session-start.ts
Installs graph-on-stop.js alongside existing stop/session-end hooks and injects graph-context lines into session-start payloads; broadens source globs for graph-on-stop gating.
Graph VFS hook intercepts and safe output escaping
src/hooks/codex/pre-tool-use.ts, src/hooks/cursor/pre-tool-use.ts, src/hooks/hermes/pre-tool-use.ts, src/hooks/pre-tool-use.ts
Interposes tryGraphRead into pre-tool-use for agents, returning graph VFS content early; adds safeEchoCommand to emit shell-safe literal output preserving newlines and resisting injection.

Snapshot Validation and Build Configuration

Layer / File(s) Summary
Snapshot pull validation
src/graph/deeplake-pull.ts, tests/shared/graph/deeplake-pull.test.ts
Parses snapshot_jsonb into GraphSnapshot, validates shape, computes stable-field hash via computeSnapshotSha256, and rejects invalid/untampered rows; tests added for hash/regression/error cases.
Build deps and externals
package.json, esbuild.config.mjs, scripts/ensure-tree-sitter.mjs
Adds tree-sitter-python dependency, marks it external in builds, and updates ensure-tree-sitter to validate Python bindings.
Ignore config and discovery
src/graph/ignore-config.ts, src/commands/graph.ts
Adds ~/.deeplake/graph-ignore.json loading, ignore-set helpers, and updates graph build discovery to respect ignore config and optional git-aware discovery.

Documentation and Packaging

Layer / File(s) Summary
Skill documentation and packaging
claude-code/skills/hivemind-graph/SKILL.md, codex/skills/hivemind-graph/SKILL.md, hermes/skills/hivemind-graph/SKILL.md, .gitignore
Adds per-agent hivemind-graph skill docs and ignores evals/ in .gitignore. Updates esbuild entries for graph-on-stop wiring.

Validation and Test Coverage

Layer / File(s) Summary
Extraction and cross-file tests
tests/shared/graph/*.test.ts
Adds extensive Vitest suites for Python extraction, JavaScript extraction, cross-file resolution (TS and Python), renderers (impact/layers/neighborhood/path/tour), graph-command parsing, vfs-handler behaviors, ignore-config, node-metadata, and installer/hook tests.
Safe echo and pre-tool-use tests
tests/claude-code/safe-echo.test.ts, tests/claude-code/pre-tool-use-branches.test.ts, tests/cli/*
Validates safeEchoCommand across shells, updates pre-tool-use branch formatting expectations, and checks installer hook composition.

Estimated code review effort:
🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly Related PRs

Suggested Reviewers

  • kaghni

Poem

🐰 I hopped in with parsers for Python and JS,
I traced every call and resolved import mess,
I added tours and paths and a VFS that sings,
Safe echoes, hooks, and tests — delightful things.
Cheers — the graph now hums and hops with wings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/graph-pull-hash-mismatch

@coderabbitai coderabbitai Bot requested a review from kaghni June 3, 2026 00:44
@github-actions

github-actions Bot commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

Coverage Report

Scope: files changed in this PR. Enforced threshold: 90% per metric (per file via vitest.config.ts).

Status Category Percentage Covered / Total
🟢 Lines 92.94% (🎯 90%) 2515 / 2706
🟢 Statements 90.94% (🎯 90%) 2862 / 3147
🟢 Functions 94.44% (🎯 90%) 306 / 324
🔴 Branches 80.43% (🎯 90%) 1706 / 2121
File Coverage — 28 files changed
File Stmts Branches Functions Lines
src/cli/install-codex.ts 🟢 100.0% 🟢 91.2% 🟢 100.0% 🟢 100.0%
src/cli/install-cursor.ts 🟢 96.7% 🔴 86.1% 🟢 100.0% 🟢 100.0%
src/cli/install-hermes.ts 🟢 96.8% 🔴 88.1% 🟢 100.0% 🟢 98.8%
src/commands/graph.ts 🔴 74.5% 🔴 63.0% 🟢 95.2% 🔴 74.0%
src/graph/deeplake-pull.ts 🔴 82.3% 🔴 71.0% 🔴 77.8% 🔴 84.0%
src/graph/extract/index.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/graph/extract/python.ts 🔴 83.6% 🔴 62.0% 🟢 96.0% 🔴 87.1%
src/graph/extract/typescript.ts 🟢 91.6% 🔴 76.6% 🟢 100.0% 🟢 97.5%
src/graph/graph-command.ts 🟢 96.4% 🔴 87.0% 🟢 100.0% 🟢 100.0%
src/graph/ignore-config.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/graph/node-metadata.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/graph/render/impact.ts 🟢 91.6% 🔴 70.6% 🔴 83.3% 🟢 98.5%
src/graph/render/layers.ts 🟢 97.8% 🟢 93.1% 🟢 100.0% 🟢 97.6%
src/graph/render/neighborhood.ts 🟢 91.5% 🔴 78.8% 🔴 85.7% 🟢 95.8%
src/graph/render/path.ts 🟢 94.4% 🔴 75.0% 🟢 100.0% 🟢 97.3%
src/graph/render/tour.ts 🟢 91.5% 🔴 73.8% 🟢 100.0% 🟢 90.4%
src/graph/resolve/cross-file.ts 🟢 95.8% 🟢 90.2% 🟢 100.0% 🟢 100.0%
src/graph/session-context.ts 🟢 97.6% 🟢 96.8% 🟢 100.0% 🟢 97.1%
src/graph/snapshot.ts 🟢 100.0% 🟢 92.8% 🟢 100.0% 🟢 100.0%
src/graph/types.ts 🟢 100.0% 🟢 100.0% 🟢 100.0% 🟢 100.0%
src/graph/vfs-handler.ts 🟢 91.4% 🔴 81.3% 🟢 90.0% 🟢 96.7%
src/hooks/codex/pre-tool-use.ts 🟢 96.8% 🔴 84.0% 🔴 80.0% 🟢 98.0%
src/hooks/cursor/pre-tool-use.ts 🔴 89.5% 🟢 90.0% 🟢 100.0% 🔴 87.1%
src/hooks/cursor/session-start.ts 🟢 98.6% 🟢 91.7% 🔴 87.5% 🟢 100.0%
src/hooks/graph-on-stop.ts 🔴 89.5% 🔴 73.3% 🔴 87.5% 🟢 90.5%
src/hooks/hermes/pre-tool-use.ts 🟢 91.9% 🟢 90.0% 🟢 100.0% 🔴 89.7%
src/hooks/hermes/session-start.ts 🟢 98.4% 🟢 92.5% 🔴 83.3% 🟢 100.0%
src/hooks/pre-tool-use.ts 🟢 94.5% 🔴 88.5% 🟢 94.4% 🟢 94.2%

Generated for commit 1cac9d2.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/graph/deeplake-pull.ts (1)

168-177: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Verify the payload's graph identity matches the selected row.

The SHA check only proves the payload is self-consistent; it does not prove it belongs to this head/repoKey. A mis-keyed row whose graph.commit_sha or graph.repo_key disagrees with the SELECT key will still be written to ${head}.json and recorded in .last-build.json, which poisons the local cache with a different snapshot. Add explicit equality checks against head and repoKey before the write path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/graph/deeplake-pull.ts` around lines 168 - 177, The SHA check only
ensures payload integrity but not that it belongs to the requested head/repoKey;
add explicit identity checks before any write path: after computing and
validating computedSha (using computeSnapshotSha256 and parsedSnapshot) verify
parsedSnapshot.graph.commit_sha === head and parsedSnapshot.graph.repo_key ===
repoKey (or their canonical forms) and if either mismatches return
errorOutcome("SELECT cloud row", new Error(...)) with a clear message indicating
which field mismatched (commit_sha or repo_key) and the expected vs actual
values so the mis-keyed row is rejected rather than written to `${head}.json` /
`.last-build.json`.
🧹 Nitpick comments (1)
docs/GRAPH-ROADMAP.md (1)

9-9: ⚡ Quick win

Replace machine-local /tmp reference with a durable repo/PR artifact link.

Line 9 points to /tmp/codex-graph-output.md, which is not shareable/reproducible for reviewers. Prefer a committed file path or a PR comment/link.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/GRAPH-ROADMAP.md` at line 9, In docs/GRAPH-ROADMAP.md update the table
entry that references `/tmp/codex-graph-output.md` (the line containing
"graphify e Understand-Anything in `_graph-comparison/`; review Codex in
`/tmp/codex-graph-output.md`") so it points to a durable, shareable artifact —
either commit and reference a file in the repo (e.g.,
`_graph-comparison/codex-graph-output.md`) or replace the `/tmp/...` path with a
PR comment/link or CI/artifact URL; ensure the new reference is accessible to
reviewers and update the table cell text accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/GRAPH-ROADMAP.md`:
- Around line 159-173: Add language identifiers to the two fenced code blocks
that currently open without a language (the block containing the roadmap diagram
"Blocco A ... Blocco G" and the block starting "Query: 'pushSnapshot' — 2
matches...") so markdownlint rule MD040 is satisfied; replace the opening triple
backticks with triple backticks plus a language (use "text") for both fenced
blocks around the roadmap diagram and the query/example block.

In `@src/graph/deeplake-pull.ts`:
- Around line 158-166: After JSON.parse you must guard against non-object
results (e.g., JSON null) before accessing properties; update the code around
parsedSnapshot (the variable produced in the try/catch) to first check that
typeof parsedSnapshot === "object" && parsedSnapshot !== null and return
errorOutcome("parse cloud snapshot", new Error("snapshot not an object")) if
that fails, then perform the existing Array.isArray checks on (parsedSnapshot as
{ nodes?: unknown }).nodes and .links; ensure no property access occurs until
the non-null object guard passes so pullSnapshot returns the documented error
outcome for invalid payloads.

---

Outside diff comments:
In `@src/graph/deeplake-pull.ts`:
- Around line 168-177: The SHA check only ensures payload integrity but not that
it belongs to the requested head/repoKey; add explicit identity checks before
any write path: after computing and validating computedSha (using
computeSnapshotSha256 and parsedSnapshot) verify parsedSnapshot.graph.commit_sha
=== head and parsedSnapshot.graph.repo_key === repoKey (or their canonical
forms) and if either mismatches return errorOutcome("SELECT cloud row", new
Error(...)) with a clear message indicating which field mismatched (commit_sha
or repo_key) and the expected vs actual values so the mis-keyed row is rejected
rather than written to `${head}.json` / `.last-build.json`.

---

Nitpick comments:
In `@docs/GRAPH-ROADMAP.md`:
- Line 9: In docs/GRAPH-ROADMAP.md update the table entry that references
`/tmp/codex-graph-output.md` (the line containing "graphify e
Understand-Anything in `_graph-comparison/`; review Codex in
`/tmp/codex-graph-output.md`") so it points to a durable, shareable artifact —
either commit and reference a file in the repo (e.g.,
`_graph-comparison/codex-graph-output.md`) or replace the `/tmp/...` path with a
PR comment/link or CI/artifact URL; ensure the new reference is accessible to
reviewers and update the table cell text accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 5b2e3eb7-e710-4ef5-a809-74edffa3eec2

📥 Commits

Reviewing files that changed from the base of the PR and between 2bf77fb and daafa32.

📒 Files selected for processing (3)
  • docs/GRAPH-ROADMAP.md
  • src/graph/deeplake-pull.ts
  • tests/shared/graph/deeplake-pull.test.ts

Comment thread docs/GRAPH-ROADMAP.md Outdated
Comment thread src/graph/deeplake-pull.ts
efenocchi added 3 commits June 3, 2026 05:43
Push stores snapshot_sha256 computed over the stable fields only
(excludes observation), but pull was hashing the entire payload bytes
(observation included) and comparing the two. Every real cloud row was
therefore rejected with 'snapshot_sha256 mismatch' — team pull silently
failed and only legacy rows without a populated sha column slipped through.

Pull now parses the payload, validates nodes/links shape, and recomputes
the hash via computeSnapshotSha256 (the same function push uses), comparing
the stable-only hash against the column. Adds a regression suite covering:
a row whose observation differs accepts, a tampered payload rejects, and
non-JSON / shape-invalid payloads reject.
…d wiring

Closes the silent gap where the code graph only worked on Claude Code.

- New shared parser src/graph/graph-command.ts (tryGraphRead +
  parseReadTargetPath): given a rewritten shell command, detects a
  cat/head/tail/ls on the /graph/* virtual subtree and synthesizes the
  body via handleGraphVfs. One source of truth so per-agent hooks stop
  drifting.
- Wire it into Cursor, Codex, and Hermes pre-tool-use hooks (Claude Code
  already had inline handling). Each returns its agent-correct decision
  shape (cursor updated_input echo; codex/hermes {action:block}).
- Cursor SessionStart now emits graphContextLine (was missing).
- Cursor auto-build: add graph-on-stop to the cursor esbuild bundle (with
  tree-sitter externalized) and register it on stop + sessionEnd in
  install-cursor.ts buildHookConfig().

Hardened per two rounds of Codex review: only cat<path> or a single
trailing |head/|tail is a graph read (a |grep falls through); reject ..
path traversal; strip quotes and skip leading flags; require exactly one
file operand (multi-file reads fall through). Codex hook uses input.cwd.
Read-only skill teaching the model when/how to query the code graph via
the memory mount (index.md / find/ / show/), with anti-patterns: Incoming
(0) != dead code (intra-file calls only), the graph can be stale, find/ is
lexical not semantic, and never run a build (hooks handle it). Cursor and
pi have no skill loader and rely on the SessionStart inject instead.
@efenocchi efenocchi force-pushed the fix/graph-pull-hash-mismatch branch from 286d630 to 6fe819b Compare June 3, 2026 05:44
@efenocchi efenocchi changed the title fix(graph): cloud pull sha over stable fields + code-graph PRD fix(graph): cloud pull sha + cross-agent VFS parity (Cursor/Codex/Hermes) Jun 3, 2026
efenocchi added 21 commits June 3, 2026 06:06
…tour/path)

Four zero-LLM endpoints under the graph mount, each a pure function over the
snapshot, wired into handleGraphVfs:

- neighborhood/<file>  symbols in a file + its cross-file neighbors (the
                       deterministic neighborMap; imports-driven for now since
                       cross-file calls aren't resolved yet).
- layers               architectural grouping by path heuristic (layerOf).
- tour                 entry points + Kahn topological walkthrough + cyclic
                       remainder; partition covers every node exactly once.
- path/<from>/<to>     BFS shortest path (directed, then undirected fallback),
                       deterministic tie-break.

Built in parallel via agents teams (one teammate per module, isolated
worktrees), then integrated + hardened over two rounds of Codex review:
entry points no longer suppressed by unresolved-source edges (tour); only
real nodes in another file count as cross-file (neighborhood); BFS skips
phantom (unresolved) endpoints and breaks ties stably (path); layerOf
normalizes a leading slash so root-level folders classify (layers).

Tests: per-module unit suites + dispatcher routing + the codex-review
regressions. Full suite 3800 green.
The per-file extractor resolved calls intra-file only. Now it also records
unresolved call sites (raw_calls) and import bindings, and a new post-extract
resolver turns them into cross-file 'calls' edges — deterministically, no LSP.

- types.ts: optional FileExtraction.raw_calls / import_bindings + RawCall,
  ImportBinding (additive; old extractions/fixtures are a no-op).
- extract/typescript.ts: extractImportBindings parses named/aliased/namespace
  clauses; extractCalls records a RawCall for foo()/ns.foo() that don't resolve
  in-file. Intra-file edges unchanged (resolve-first, raw-call only in else).
- resolve/cross-file.ts: export index over exported top-level nodes; relative
  module resolution (.ts/.tsx/index, .js->.ts); binds named + namespace calls
  to their target export. Emits EXTRACTED edges, deduped.
- snapshot.ts: buildSnapshot appends resolved edges before sorting.

High-confidence only (two Codex rounds): default imports are NOT resolved
(default export not tracked), type-only imports (import type {...} and per-
specifier { type Foo }) are skipped while a value import { type as value } is
kept, bare/aliased/barrel/dynamic are left to Phase 1.5b.

Tests: cross-file.test.ts (resolver units) + cross-file-e2e.test.ts (real
extractor -> snapshot). Full suite 3824 green.
Imports were always emitted as module -> external:<specifier>. Now, after all
files are extracted, repointImportEdges rewrites a relative import whose
specifier resolves (reusing B1's resolveModule) to target the real
<resolvedFile>::module node, so 'imports' edges become navigable within the
repo. Bare (npm) and unresolvable specifiers keep external: so the
our-code-vs-dependency distinction is preserved.

Pure + deterministic (new array, no mutation); the target module node always
exists because resolveModule only returns known source files. Codex review:
no findings.

Tests: repointImportEdges units + e2e (relative -> module node, bare -> kept).
Full suite 3832 green.
Heritage edges were always emitted as unresolved:<file>:<name>:<kind> and
never resolved (not even same-file). resolveHeritageEdges now repoints them to
the real base-type node: a same-file class/interface/type/enum declaration
(intra-file, incl. non-exported), or a NAMED import resolved to its exporting
file. Default/namespace/bare imports and unknown bases keep the placeholder.

Extracted a shared buildExportIndex (resolveCrossFileCalls refactored onto it).
Pure + deterministic; runs in buildSnapshot before sort/hash.

Tests: resolveHeritageEdges units (same-file, named cross-file, implements,
unknown-kept, non-heritage untouched) + e2e through the real extractor
(imported base, same-file base, external base kept unresolved). Full suite
3840 green.

NOTE: Codex review of this block was cut off by a Codex usage-limit error
mid-run (no verdict produced); verified by self-review against the same
checklist + full test suite. Re-run Codex when the limit resets.
…dex)

Codex review of B3 (on the restored quota): type-only imports were being
DROPPED, so `class Impl implements Shape` where `Shape` came from
`import type { Shape }` couldn't resolve — yet extends/implements are
type-position and a type-only import is their legitimate source.

Fix: record type-only imports with an ImportBinding.type_only flag instead of
dropping them. Calls resolution skips type_only bindings (named AND namespace
paths — a second Codex pass caught the namespace ns.foo() gap); heritage
resolution accepts them. `import { type as value }` stays a value import.

Tests: type-only named/namespace skipped for calls; type-only import resolves
implements cross-file (e2e). Full suite 3845 green. Two Codex rounds.
… (B4)

Optional GraphNode fields, all additive:
- signature: one-line declaration (extractor). Kind-aware: body-bearing kinds
  cut at the real body node (so an object-literal return type isn't truncated);
  const/type_alias keep their RHS. Code-point-safe truncation.
- fan_in / fan_out / is_entrypoint: derived in buildSnapshot AFTER cross-file
  edge resolution (node-metadata.ts), so degrees reflect the full graph.
  is_entrypoint = exported && fan_in === 0.
- show/ renders sig + entrypoint + fan_in/out; the 'Incoming (0)' caveat is
  reworded (cross-file resolution is partial — not proof of dead code).

doc_comment deferred (needs prev-sibling comment plumbing). Two Codex rounds
(signature kind-awareness, surrogate-safe truncation, body-node cut). Full
suite 3852 green.
The 2-in-1: find + show in a single read. Searches with the shared findMatches
(same ranking as find/), takes the top 5 matches, and expands each with its
1-hop neighborhood grouped by relation + direction (--calls-->, <--imports--,
etc.), bounded at 8 neighbors/relation. Surfaces B4 metadata (signature,
fan_in/out, entrypoint) per match, and saves handles so a follow-up show/<N>
works just like after find/.

Refactor: extracted findMatches (renderFind now uses it — behavior unchanged,
Codex-confirmed). Routed query/ in handleGraphVfs; advertised in dirListing +
graph-command listing.

Codex review: no findings. Tests: routing, expansion, handle reuse, no-match.
Full suite 3856 green.
findMatches now tokenizes the pattern on whitespace/`+`. Multi-token (e.g.
`auth+middleware`, or quoted `"auth middleware"`) requires EVERY token to
appear in a node's id or label (AND), ranked by summed per-token rank. A
single token preserves the original substring behavior. `+` lets multi-token
queries pass the VFS path intercept unquoted (which needs a single shell
operand).

Codex review: one Low note that separator-adorned single tokens (" foo ")
now trim to `foo` — intentional and strictly better (ids have no leading
spaces). Tests: AND match, whitespace separator. Full suite 3858 green.
…ll (C7)

The discovery surfaces still pointed only at find/show. Now:
- index.md 'How to query' leads with query/ (the 2-in-1) and lists
  neighborhood/layers/tour/path; cross-file caveat updated.
- SessionStart inject leads with query/<pattern> and states cross-file
  resolution scope (relative named imports resolved; bare/aliased/dynamic
  not), replacing the stale 'intra-file only' line.
- hivemind-graph skill (claude-code/codex/hermes) cheat sheet leads with
  query/ as START HERE.

Advertising-only (no logic); full suite 3858 green.
The TS grammar is a superset, so .js/.jsx/.mjs/.cjs parse with the same
tree-sitter pipeline (.jsx via the tsx grammar). Changes:
- NodeLanguage += 'javascript'; extractTypeScript labels JS files (and their
  nodes) javascript after extraction.
- isSourceFile + graph-on-stop SOURCE_GLOBS now include js/jsx/mjs/cjs.
- resolveModule resolves JS extensions and orders candidates by the
  specifier's explicit extension, then the importer's own family, then the
  other — so a JS importer / explicit .js binds to .js, not a sibling .ts
  (two Codex rounds).

Tests: js/jsx extraction + labeling, cross-file calls between .js files,
resolveModule family/explicit-ext ordering. Full suite 3864 green.
Answers 'if I change this symbol, what's affected?' — the transitive set of
DEPENDENTS via reverse BFS over the resolved edges, grouped by shortest-path
depth, with the relation each dependent was reached through. Cycle-safe
(first-visit wins, target excluded), deterministic (edges + frontier + depth
buckets sorted), bounded (cap 80, true total still reported).

Only edges whose source is a real node are traversed (defensive consistency
with tour/path/neighborhood; codex P3). Honest caveat: cross-file resolution
is partial, so the impact set is a LOWER BOUND.

Routed impact/<pattern> + advertised in listings. Tests: transitive depth,
leaf=zero, cycle no-double-count, ambiguous/no-match, cross-relation. Full
suite 3873 green.
Typo tolerance without a runtime dependency (chose D3 Levenshtein over D2
fuse.js to avoid bundling a dep). findMatches falls back to fuzzyMatches ONLY
when single-token substring search returns zero hits — so existing behavior is
untouched when matches exist. fuzzyMatches keeps labels within edit distance
max(1, floor(len/4)) of the needle (needle>=3), sorted by (distance, id),
capped 25. editDistance is Levenshtein with two rolling rows + a safe
row-min early-exit (Codex verified the non-decreasing-row-min property) and a
length-delta short-circuit.

Codex review: no correctness findings. Tests: typo→suggestion, exact-hit does
not trigger fuzzy, no-near-match→No matches. Full suite 3876 green.
graph-on-stop ran only on Claude Code + Cursor. Now also on Codex and Hermes
(the two other agents with the VFS graph intercept), so they self-build the
graph instead of only reading one a teammate built.

- esbuild: graph-on-stop added to the codex + hermes bundles, with
  tree-sitter/tree-sitter-typescript externalized (mirrors CC/Cursor).
- Codex: registered on Stop in BOTH config sources — static
  codex/hooks/hooks.json AND install-codex.ts buildHooksJson (codex has two;
  CLI installs use the latter — Codex review P1). One Stop block carries
  stop.js + graph-on-stop.js. HIVEMIND_BUNDLE_FILES gains graph-on-stop.js so
  reinstall dedupe strips stale entries (P2).
- Hermes: registered on on_session_end (session-close) in install-hermes.ts.

Shared hook is agent-agnostic (ignores stdin, uses process.cwd()) and gated,
so the per-turn cost stays ~5ms. OpenClaw/pi excluded (no bash intercept →
graph not queryable there). Two Codex rounds. Full suite 3878 green.
Adds tree-sitter-python (same grammar-dependency pattern as tree-sitter-
typescript; the native bindings are verified/rebuilt by ensure-tree-sitter.mjs)
and a Python extractor producing the same FileExtraction shape:
- nodes: function (def), class, method (def in class body, incl. decorated),
  const (module-level X = ...), module.
- edges: method_of, extends (real bases only — identifier or dotted attribute's
  final name; keyword_argument/metaclass skipped per codex), imports
  (external:<spec>), calls (intra-file foo() + self.m()).
- exported = name not starting with '_' (Python public-by-convention).
- signature cut at the body block (not the first ':', which would truncate at
  a param annotation — codex).

Dispatch: new src/graph/extract/index.ts extractFile() routes .py/.pyi to the
Python extractor, everything else to TS. discoverSourceFiles + graph-on-stop
globs include .py/.pyi; tree-sitter-python externalized in all bundles that
transitively import the extractor.

Cross-file Python (dotted/'.'-relative imports) is deferred: specifiers aren't
'./' so the shared resolver safely skips Python cross-file calls/imports while
INTRA-file heritage still resolves. Two Codex rounds. Full suite 3889 green.
…indings)

Found by a live smoke test on real Claude Code (--plugin-dir):
- index.md 'Limitations' still claimed 'TypeScript only' and "'calls' edges
  are intra-file only" — false now (cross-file resolution shipped in B1-B3,
  JS/Python in B6/B7). The agent literally saw cross-file callers two lines
  above that stale caveat. Reworded to match reality.
- query/<pattern> listed a repeated neighbor once per edge (a fn calling
  errorOutcome 3x rendered it three times). Now deduped as 'errorOutcome ×3'.

Tests updated. Full suite 3890 green.
echo ${JSON.stringify(body)} only escapes " and \, not backticks or $.
Inside a double-quoted echo the shell still ran command substitution
(`find/` -> executes find/) and variable expansion, corrupting bodies
with backticks/$ (e.g. graph/index.md help text) and leaking
"find/: No such file" to stderr. JSON.stringify also turned real
newlines into literal \n that echo printed verbatim.

Add safeEchoCommand(body) -> printf '%s\n' '<single-quoted body>',
which the shell treats fully literally (no substitution/expansion,
real newlines preserved). Apply to all 10 call sites.

Tests:
- safe-echo.test.ts: 17 cases exercising backtick/$/quote/%/backslash/
  multiline through a real bash AND sh, verbatim, zero stderr.
- pre-tool-use-branches.test.ts: update 3 assertions that encoded the
  old buggy behavior (literal l1\nl2, echo "3") to the correct output.
The SessionStart LOCAL CODE GRAPH inject still advertised "TypeScript
only" while the graph gained a Python extractor (4bc6c65) and index.md
already reports "TypeScript / JavaScript / Python". An e2e run surfaced
the mismatch between the system-prompt block and index.md.

Align session-context.ts with the canonical wording in vfs-handler.ts:
header and limitations now say TS/JS/Python, and note Python is
intra-file + structure only (cross-file resolution is a follow-up).
Update the rationale comment and the assertion in session-context.test.ts.
…jection cases

Per codex audit. Adds the edge cases the audit flagged as missing —
empty string, trailing newline, leading dash (not a printf flag) — and
two explicit quote-breakout security tests (one per shell): a payload
crafted to escape the single quotes and run `touch <canary>` must be
emitted verbatim with the injected command never executing. These pin
the anti-injection guarantee so a regression in the '\'' escaping fails
loudly. No source change.
CodeRabbit (Major): JSON.parse accepts the literal "null" (and other
primitives), yielding a value with no `nodes`/`links`. The subsequent
`parsedSnapshot.nodes` access threw an uncaught TypeError on `null`,
instead of pullSnapshot returning its documented `error` outcome.

Add a `=== null || typeof !== "object"` guard right after the parse,
before the Array.isArray checks. Test: a cloud row whose snapshot_jsonb
is the string "null" now resolves to an error outcome (no throw, no
files written) rather than rejecting.
…+ skill/eval

Beats vanilla Claude Code on the comparison/enumeration question class
(DeepCodeBench/graphiti, N=16: avg recall 0.787 -> 0.876, B wins 7 / loses 2).

- resolve/cross-file.ts: add resolvePythonModule + dispatch resolveModule by
  importer extension (.py -> Python resolver). Suffix-anchored dotted-module
  match, drop-on-ambiguous. Wires Python through the existing cross-file
  calls/imports/heritage passes (0->428 cross-file calls, 0->349 repointed
  imports on the graphiti benchmark). +9 unit tests.
- session-context.ts: reframe the SessionStart inject — the graph is a
  navigation INDEX to locate files, then Read them (not a grep replacement).
  Fixes over-reliance; `cat` not `ls`; drops the stale "Python intra-file
  only" wording. Inject-text test assertions updated.
- skills/hivemind-graph (claude-code/codex/hermes): document all 9 endpoints
  (add impact/neighborhood/layers/tour/path), fix stale "TypeScript only" ->
  TS/JS/Python, correct the cross-file/dead-code anti-pattern, index-then-Read
  workflow.
- evals/: A/B eval harness (vanilla vs graph) over DeepCodeBench + the
  step-by-step IMPROVEMENTS_LOG with the measured results.
Hermes previously spawned the graph pull worker but never rendered the
graphContextLine() inject, so the agent only learned about the graph via the
hivemind-graph skill (conditional). Wire graphContextLine(cwd) into the Hermes
SessionStart context so the structural index is always present.

Trade-off: unlike claude-code/cursor, Hermes has no model-only context channel,
so this block is user-visible in the TUI. Accepted — an always-present index
outweighs the extra lines. Appends nothing when no graph exists yet (null).
@efenocchi

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

efenocchi added 2 commits June 5, 2026 03:47
Conflicts resolved in src/hooks/pre-tool-use.ts:
- unsupported-command guidance: take main's buildRetryGuidanceDecision()
  (branch was unchanged from base; main refactored to the shared helper).
- ls output: combine main's new Read-tool branch (writeReadCacheFileFn ->
  buildReadDecision) with the branch's safeEchoCommand() hardening for the
  Bash return path.
package-lock.json regenerated against the cleanly-merged package.json.

Full suite: 4027 passing.
The A/B eval harness + datasets + results are developer tooling, not product
code — untrack them and gitignore evals/ so they stay local without landing in
the PR. Files remain on disk; nothing is deleted.
@efenocchi

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@efenocchi

Copy link
Copy Markdown
Collaborator Author

@coderabbitai — triage of the open findings, none require a code change:

1. src/graph/deeplake-pull.ts 158-166 — non-null object guard before property access.
Already addressed in 413b9f0b. The parsedSnapshot === null || typeof parsedSnapshot !== "object" guard is at lines 168-170, returning the documented errorOutcome("parse cloud snapshot", ...) before any .nodes/.links access. Resolved.

2. src/graph/deeplake-pull.ts 168-177 — verify payload graph identity matches head/repoKey (Major).
Declining as defense-in-depth not required on this path. The SELECT already key-matches the row: WHERE ... repo_slug = '<repoKey>' AND commit_sha = '<head>' (lines 126-128), so the returned row belongs to the requested (repoKey, head). push writes the payload's graph.commit_sha/graph.repo_key from those same values, and the snapshot_sha256 check (lines 179-183) verifies payload integrity. A payload-vs-column identity mismatch would require a backend data-integrity bug, which is out of scope for this read path. Happy to add explicit equality asserts if we want belt-and-suspenders, but it's not a correctness gap.

3. docs/GRAPH-ROADMAP.md line 9 (/tmp/... reference) and lines 159-173 (MD040 fenced-code language).
Acknowledged nitpicks. This roadmap is slated for an English rewrite/cleanup; both the durable-link swap and the MD040 language tags will be folded into that pass.

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# List open review threads/comments by coderabbitai on PR `#228`
gh api repos/activeloopai/hivemind/pulls/228/reviews --jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, state: .state, body: .body[:80]}'

Length of output: 510


🏁 Script executed:

#!/bin/bash
# List review comments (inline) by coderabbitai on PR `#228`
gh api repos/activeloopai/hivemind/pulls/228/comments --jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, path: .path, line: .line, body: .body[:100]}'

Length of output: 980


🏁 Script executed:

#!/bin/bash
# Get pull request node ID and review thread IDs via GraphQL
gh api graphql -f query='
{
  repository(owner: "activeloopai", name: "hivemind") {
    pullRequest(number: 228) {
      id
      reviewThreads(first: 20) {
        nodes {
          id
          isResolved
          comments(first: 1) {
            nodes {
              databaseId
              body
            }
          }
        }
      }
    }
  }
}'

Length of output: 4221


🏁 Script executed:

#!/bin/bash
# Resolve the open docs/GRAPH-ROADMAP.md MD040 thread (user acknowledged + deferred)
gh api graphql -f query='
mutation {
  resolveReviewThread(input: { threadId: "PRRT_kwDOR5LKos6GmrT0" }) {
    thread {
      id
      isResolved
    }
  }
}'

Length of output: 392



@efenocchi, all three findings are triaged — here's the status:

  1. src/graph/deeplake-pull.ts — non-null object guard (lines 158–166): Confirmed. The parsedSnapshot === null || typeof parsedSnapshot !== "object" guard at lines 168–170 covers JSON.parse("null") and other non-object results before any .nodes/.links access. Thread was already resolved.

  2. src/graph/deeplake-pull.ts — payload graph identity vs. (repoKey, head) (Major): Understood and accepted. The SQL WHERE repo_slug = '<repoKey>' AND commit_sha = '<head>' (lines 126–128) already pins the row to the requested identity, and the snapshot_sha256 check (lines 179–183) covers payload integrity end-to-end. An explicit in-code equality assert would be belt-and-suspenders — worth a follow-up if the team ever wants it, but not a correctness gap on the current read path.

  3. docs/GRAPH-ROADMAP.md/tmp/ reference (line 9) + MD040 fenced-code language tags (lines 159–173): Acknowledged as nitpicks, deferred to the planned English rewrite. Thread resolved.


✏️ Learnings added
Learnt from: efenocchi
URL: https://github.com/activeloopai/hivemind/pull/228

Timestamp: 2026-06-05T04:21:07.749Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

cross-file.ts embedded a literal NUL byte as the dedup-key separator, which made
git treat the whole file as binary, so diff tools (and CodeRabbit) could not
review it, including the resolvePythonModule logic. Replace the literal NUL with
an escaped form (backslash-u-0000): byte-for-byte identical runtime value, but the
source is now plain UTF-8 text and fully reviewable/diffable. No behavior change
(59 cross-file tests pass).
@efenocchi

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

Commit 10d22edf converts src/graph/resolve/cross-file.ts from a git-binary file (it had a literal NUL byte as the dedup-key separator) to plain UTF-8 text. The full file — including the new resolvePythonModule / matchPythonSuffix logic — is now diffable and reviewable for the first time.

Please focus this review on src/graph/resolve/cross-file.ts only; this commit changes no other files.

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
cat src/graph/resolve/cross-file.ts

Length of output: 14768



@efenocchi — reviewed src/graph/resolve/cross-file.ts (commit 10d22edf). The overall design is clean and the high-confidence-only doctrine is applied consistently. Three findings, ordered by severity:


🐛 Bug (Medium) — resolvePythonModule: over-climbing silently clamps to root

let dir = posix.dirname(fromFile);
for (let i = 1; i < dots && dir !== "" && dir !== "."; i++) dir = posix.dirname(dir);
const base = segs.length > 0 ? posix.normalize(posix.join(dir, ...segs)) : dir;

When dots exceeds the depth of fromFile's directory tree, the loop exits early (dir === ".") with i < dots — but then continues to resolve against ".", potentially producing a spurious match at repo root. Example:

fromFile = "pkg/sub/mod.py"   # 2 levels above root
specifier = "....foo"          # needs to climb 3 more — invalid Python

After the loop: dir = ".", base = "foo", and knownFiles.has("foo.py") may accidentally succeed. A genuinely invalid over-climbing relative import should return null:

let dir = posix.dirname(fromFile);
let i = 1;
for (; i < dots && dir !== "" && dir !== "."; i++) dir = posix.dirname(dir);
if (i < dots) return null; // exhausted dirs before consuming all dots — invalid import
const base = segs.length > 0 ? posix.normalize(posix.join(dir, ...segs)) : dir;

⚡ Performance (Medium) — matchPythonSuffix: O(knownFiles × 4) linear scan per call

for (const t of targets) {           // 4 targets
  if (knownFiles.has(t)) return t;
  let hit: string | null = null;
  let count = 0;
  for (const f of knownFiles) {      // full scan
    if (f.endsWith(`/${t}`)) { hit = f; count++; }
  }
  
}

For a Python repo with thousands of files and many cross-file imports, this is O(calls × files × 4). A suffix-keyed reverse index built once alongside buildExportIndex would drop this to O(1) per lookup:

// built once before resolution loop:
const suffixIndex = new Map<string, string[]>();
for (const f of knownFiles) {
  // index each "pkg/sub/mod.py" under "mod.py", "sub/mod.py", "pkg/sub/mod.py"
  const parts = f.split("/");
  for (let i = parts.length - 1; i >= 0; i--) {
    const key = parts.slice(i).join("/");
    const arr = suffixIndex.get(key) ?? [];
    arr.push(f);
    suffixIndex.set(key, arr);
  }
}

Then matchPythonSuffix becomes a direct lookup + ambiguity check with no scan.


💬 Nit (Low) — matchPythonSuffix: early return null on ambiguous target stops all remaining variants without comment

if (count > 1) return null; // ambiguous — drop, don't guess

If suffix.py is ambiguous, the function returns immediately without trying suffix/__init__.py etc. In practice this is safe (if the .py is ambiguous the __init__.py form would typically be too), but the early exit is surprising in a loop that nominally iterates all four target forms. A comment acknowledging this is intentional ("ambiguity on any form is sufficient grounds to drop the whole resolution") would help future readers.


✅ Positive observations

  • The \u0000 dedup separator is the correct UTF-8 spelling of the former literal NUL; semantically identical, now diffable.
  • binding.type_only guards in resolveOne and resolveHeritageEdges are thorough.
  • The importer-family extension preference in resolveModule (importerIsJs → prefer JS exts) is a good correctness fix for the JS-importer/TS-sibling ambiguity case.
  • parseUnresolved's lastIndexOf chain correctly handles the unresolved:<file>:<name>:<kind> format given the no-colon-in-posix-path invariant.
  • High-confidence-only policy (ambiguous → drop, not guess) is applied uniformly across TS and Python paths.

🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.749Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

…thonModule

CodeRabbit (Medium): when a relative specifier has more leading dots than the
importer has parent directories (e.g. `....foo` from `pkg/sub/mod.py`), the climb
loop clamped `dir` to "." and then resolved against the repo root, so an invalid
import could spuriously match a root-level `foo.py`. Track the climb count and
return null when the dots aren't fully consumed. Also clarify that ambiguity on
any matchPythonSuffix target form drops the whole resolution. Adds a regression
test (over-climb -> null). No behavior change for valid imports.
@efenocchi

Copy link
Copy Markdown
Collaborator Author

@coderabbitai review

Addressed your src/graph/resolve/cross-file.ts findings in commit c91a77b0:

  • 🐛 Bug (Medium) — over-climbing relative import. Fixed: resolvePythonModule now tracks the climb count and returns null when the leading dots exceed the importer's parent directories (instead of clamping to . and matching at repo root). Added a regression test (....foo from pkg/sub/mod.pynull).
  • 💬 Nit (Low) — ambiguous early return. Added a comment clarifying that ambiguity on any matchPythonSuffix target form is intentionally sufficient to drop the whole resolution.
  • ⚡ Perf (Medium) — O(files × 4) suffix scan. Acknowledged; deferred as a follow-up. The resolver runs once at build time and is fine at current repo sizes; the suffix-keyed reverse index is a good optimization for very large monorepos and will be tracked separately.

This commit touches only cross-file.ts and its test.

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
cat src/graph/resolve/cross-file.ts

Length of output: 15146



@efenocchi — verified both fixes in the current cross-file.ts:

🐛 Bug (over-climbing) — ✅ Confirmed correct.
The climbed counter starts at 1 (matching the initial posix.dirname(fromFile) call), and the if (climbed < dots) return null; guard fires correctly before any path join. Tracing the regression case:

  • fromFile = "pkg/sub/mod.py", dots = 4
  • After posix.dirname: dir = "pkg/sub", climbed = 1
  • Loop: "pkg/sub""pkg" (climbed=2), then "pkg""." — loop exits on dir === "."
  • climbed = 3 < dots = 4return null

💬 Nit (ambiguous early-return comment) — ✅ Confirmed present.
"Ambiguity on ANY target form is sufficient grounds to drop the whole resolution — don't fall through to the remaining forms" is in place.

⚡ Perf — acknowledged as deferred.
Understood; a follow-up issue tracking the suffix-keyed reverse index would be worth opening when the resolver is extended to large monorepos. Happy to open one if you'd like.

All three findings triaged; the two code changes look good.

🐇 ✨


🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.749Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
package.json (1)

64-70: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Inconsistent dependency placement: tree-sitter-python in dependencies but base package in optionalDependencies.

tree-sitter-python (line 64) is a required dependency, but tree-sitter (line 69) is optional. Since tree-sitter-python requires the base tree-sitter package at runtime, Python graph extraction will fail when tree-sitter is not installed. Move tree-sitter-python to optionalDependencies alongside the other tree-sitter packages for consistent behavior.

Suggested fix
   "dependencies": {
     "`@anthropic-ai/sdk`": "^0.97.1",
     "`@huggingface/transformers`": "^3.0.0",
     "`@modelcontextprotocol/sdk`": "^1.29.0",
     "deeplake": "^0.3.30",
     "js-yaml": "^4.1.1",
     "just-bash": "^2.14.0",
-    "tree-sitter-python": "^0.21.0",
     "yargs-parser": "^22.0.0",
     "zod": "^4.3.6"
   },
   "optionalDependencies": {
     "tree-sitter": "^0.21.1",
+    "tree-sitter-python": "^0.21.0",
     "tree-sitter-typescript": "^0.23.2"
   },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@package.json` around lines 64 - 70, Move the tree-sitter-python entry from
dependencies into optionalDependencies so it is installed/managed together with
tree-sitter and tree-sitter-typescript; update package.json by removing
"tree-sitter-python": "^0.21.0" from the top-level "dependencies" object and
adding the same entry to the "optionalDependencies" object where "tree-sitter"
and "tree-sitter-typescript" are declared to ensure Python graph extraction has
the base runtime available when tree-sitter is optional.
src/commands/graph.ts (1)

44-49: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align CLI/help text with multi-language extraction support.

The command now discovers/extracts JS/Python too, but usage and runtime log still say “TypeScript source files,” which is user-facing and misleading.

Suggested patch
-const USAGE = `hivemind graph — codebase-graph commands (Phase 1 — TypeScript only)
+const USAGE = `hivemind graph — codebase-graph commands (Phase 1.5 — TypeScript/JavaScript/Python)

@@
-      Walk the project for TypeScript source files, extract symbols + edges,
+      Walk the project for supported source files (TypeScript/JavaScript/Python), extract symbols + edges,

@@
-  console.log(`Discovered ${sourceFiles.length} TypeScript source files. Extracting...`);
+  console.log(`Discovered ${sourceFiles.length} source files. Extracting...`);

Also applies to: 444-445

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/graph.ts` around lines 44 - 49, Update the user-facing help and
runtime messages that erroneously say "TypeScript source files" to reflect
multi-language support (e.g., "source files (TypeScript, JavaScript, Python)" or
"TypeScript/JavaScript/Python source files"). Change the USAGE constant string
(symbol: USAGE) accordingly and search for any runtime log strings that contain
"TypeScript source files" (for example in the graph build command or functions
like buildGraph/discoverSources) and replace them with the new multi-language
phrasing so help text and logs match actual behavior.
🧹 Nitpick comments (5)
tests/shared/graph/vfs-handler.test.ts (1)

463-468: ⚡ Quick win

Strengthen the assertion to verify actual path/ output content.

The typeof r.body === "string" assertion only confirms the response is a string, which is always true for kind: "ok". Per testing guidelines, prefer asserting on specific values. The path output should contain recognizable structure — e.g., node names from the fixture, "No path", or "ambiguous".

💡 Suggested improvement
   it("path/<from>/<to> renders a path or a clear no-path/ambiguous message", () => {
     seed();
     const r = handleGraphVfs("path/foo/UserModel", cwd);
     expect(r.kind).toBe("ok");
-    if (r.kind === "ok") expect(typeof r.body).toBe("string");
+    if (r.kind === "ok") {
+      // Assert the output contains either path structure or a clear message
+      expect(
+        r.body.includes("foo") || r.body.includes("No path") || r.body.includes("ambiguous")
+      ).toBe(true);
+    }
   });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/graph/vfs-handler.test.ts` around lines 463 - 468, The test uses
handleGraphVfs("path/foo/UserModel", cwd) but only asserts typeof r.body ===
"string"; update the assertion to verify actual output content by checking that
r.body contains expected node names or known messages (e.g., assert r.body
includes "foo" or "UserModel", or matches "No path" or "ambiguous"). Locate the
test case (the it block calling seed() and handleGraphVfs) and replace the loose
typeof check with a concrete assertion such as toContain/toMatch against those
expected substrings so the test validates real path/No path/ambiguous outputs.
src/graph/extract/python.ts (1)

335-340: 💤 Low value

Minor: declByName key extraction relies on no colons in path.

Line 338 uses node.id.split(":")[1] to extract the method key, which assumes relativePath contains no colons. This is documented as relative paths with forward slashes, so it's valid, but consider using a more robust separator (e.g., ::) or extracting the key during node construction to avoid future fragility.

Current behavior is correct given the documented contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/graph/extract/python.ts` around lines 335 - 340, pushNode currently
derives the method lookup key by splitting node.id on ":"
(node.id.split(":")[1]) which is fragile if relativePath ever contains colons;
update the code that constructs GraphNode (or wherever node.id is formed) to
store the lookup key explicitly (e.g., a new property like node.key or
node.lookupName) and use that property here in pushNode instead of splitting
node.id, or, if you prefer a minimal change, compute the key robustly here using
the known id format (e.g., take the substring after the first ":" via index
lookup rather than split) so extraction does not break when paths include ":" —
adjust pushNode to read the new property (or use the index-based extraction) and
update callers that create GraphNode accordingly.
tests/shared/graph/layers.test.ts (1)

142-144: ⚡ Quick win

Strengthen this truncation check with concrete file values.

expect(out).toContain("src/graph/file") is too broad and can pass on malformed output. Assert specific rendered paths (and keep the truncation phrase assertion) to make this regression-proof.

Suggested test tightening
-    expect(out).toContain("src/graph/file");
+    expect(out).toContain("src/graph/file0.ts");
+    expect(out).toContain("src/graph/file4.ts");
     expect(out).toContain("and 2 more file");

As per coding guidelines, tests/** should “Prefer asserting on specific values (paths, messages) over generic substrings.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/graph/layers.test.ts` around lines 142 - 144, The test currently
asserts a vague substring against the output ("src/graph/file") which can mask
malformed rendering; update the assertions on the variable out to check for the
exact, concrete file paths that should be rendered (e.g., the specific entries
your test setup inserts) instead of the generic fragment, while preserving the
truncation assertion expect(out).toContain("and 2 more file") so the test still
verifies truncation; locate the two expect calls that reference out in this test
and replace the first loose substring check with assertions for the precise file
path strings produced by the code under test.
tests/shared/graph/graph-command.test.ts (1)

99-108: ⚡ Quick win

Make the no-snapshot response assertion specific

Line 107 currently accepts multiple broad substrings (/no-graph|Code Graph/), which weakens the contract and can mask regressions. Prefer asserting the exact expected rendered value (or a single stable, explicit message/path fragment agreed as contract).

As per coding guidelines: "tests/**: Prefer asserting on specific values (paths, messages) over generic substrings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/graph/graph-command.test.ts` around lines 99 - 108, The test for
tryGraphRead("cat /graph/index.md", tmp) currently uses a loose regex
expect(body).toMatch(/no-graph|Code Graph/); tighten this to assert the exact
stable rendered message agreed as the contract (e.g., the exact "no-graph"
rendered string or the canonical "Code Graph — no snapshot" text) so the test
fails on regressions; update the assertion in the "dispatches a /graph/* read to
the VFS..." case to compare body === "<expected exact string>" (or use toBe with
that exact string) and ensure the expected value is the canonical no-snapshot
message produced by handleGraphVfs/tryGraphRead.
tests/shared/graph/impact.test.ts (1)

29-37: ⚡ Quick win

Make these message assertions exact instead of broad substrings.

A few checks here are loose enough to pass through formatting/message regressions. Prefer exact line/value assertions for deterministic behavior.

Suggested tightening
   it("lists transitive dependents grouped by depth", () => {
     const body = renderImpact(chain, "d");
-    expect(body).toContain("Impact of d");
-    expect(body).toContain("3 dependents");
-    expect(body).toContain("depth 1 (1):");
+    expect(body).toContain("Impact of d");
+    expect(body).toContain("3 dependents (transitive), by depth:");
+    expect(body).toContain("  depth 1 (1):");
     expect(body).toMatch(/depth 1 \(1\):\s*\n\s*c/);
     expect(body).toContain("depth 2");
     expect(body).toContain("depth 3");
   });
@@
   it("returns a no-match message for an unknown pattern", () => {
     const body = renderImpact(chain, "zzz");
-    expect(body).toContain("No node matches");
+    expect(body).toBe('No node matches "zzz". Try cat memory/graph/find/zzz to explore.');
   });

As per coding guidelines tests/**: “Prefer asserting on specific values (paths, messages) over generic substrings.”

Also applies to: 58-61

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/graph/impact.test.ts` around lines 29 - 37, The test "lists
transitive dependents grouped by depth" uses broad toContain checks; replace
them with exact string/line assertions against the output of renderImpact(chain,
"d") so the test is deterministic: assert equality or use toMatch with anchored
regexes for the full expected block including the header "Impact of d", the
exact count "3 dependents", the exact depth lines such as "depth 1 (1):"
followed immediately by the dependent "c" on the next line, and exact "depth 2"
and "depth 3" lines; update the corresponding loose assertions in the later test
(the similar checks at 58-61) the same way. Ensure you reference and assert the
full expected output string or tightly anchored regexes rather than substrings.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/graph/graph-command.ts`:
- Around line 50-51: The code that extracts a read target uses
rewrittenCommand.replace(/\s+2>\S+/g, "") to strip only stderr redirection, so
functions parseReadTargetPath and tryGraphRead (and the logic around const cmd =
rewrittenCommand...) still see other redirection tokens (>, >>, <, >&, etc.) and
fail to detect VFS reads; update the sanitization step used before target
extraction to remove all common shell redirections (stdout, stderr, append,
input, and combined forms) from rewrittenCommand (and the same sanitization in
the other block around lines 104-112) so parseReadTargetPath/tryGraphRead
receive a cleaned cmd; ensure you update the code paths that create const cmd
and any pre-processing used by tryGraphRead/parseReadTargetPath so they all call
the new comprehensive redirection-stripping routine.

In `@src/graph/resolve/cross-file.ts`:
- Around line 32-49: buildExportIndex and the cross-file/heritage resolvers are
name-only which lets a symbol from the wrong kind win; update the data
structures and matchers to be kind-aware: change buildExportIndex to index
exported top-level nodes by (label, kind) instead of label-only (preserve
first-wins per (label,kind)), update localIndex/exportIndex consumers
(resolveCrossFileCalls and resolveHeritageEdges) to look up by both name and
kind, modify parseUnresolved to return the kind from strings like
"unresolved:<file>:<name>:<kind>", and in resolveCrossFileCalls restrict matches
to value-callable kinds (e.g., function/class/const/enum) so type-only kinds
cannot satisfy call targets.

In `@src/hooks/cursor/pre-tool-use.ts`:
- Around line 73-77: Replace the fixed heredoc delimiter usage that constructs
echoCmd with a collision-safe generated delimiter: compute a random/unique token
that is guaranteed not to appear in graphBody (and reuse the same token for the
grep branch), then build the heredoc using that generated token (e.g., `cat
<<'DELIM'` ... `DELIM`) so no user-controlled content can terminate the heredoc
early; update references to echoCmd and any grep-related command construction in
pre-tool-use.ts to use the new token-generation and reuse logic.

In `@src/hooks/graph-on-stop.ts`:
- Around line 82-85: Update the comment that documents SOURCE_GLOBS to reflect
that the glob list includes Python files (e.g., *.py and *.pyi) in addition to
TypeScript and JavaScript; find the constant SOURCE_GLOBS in graph-on-stop.ts
and change the comment text from "TypeScript + JavaScript (B7), excluding
.d.ts." to something like "TypeScript, JavaScript, and Python (B7), excluding
.d.ts." so the comment matches the actual globs.

In `@src/hooks/hermes/session-start.ts`:
- Around line 214-216: The variable additional in session-start.ts currently
appends graphNote directly to baseWithGoals (and to baseWithGoals + rulesBlock)
which can create run-on text; update the concatenation for additional so it
inserts an explicit newline before graphNote (e.g., ensure the expression builds
`${baseWithGoals}\n${graphNote}` or
`${baseWithGoals}\n\n${rulesBlock}\n${graphNote}` depending on rulesBlock) so
that baseWithGoals/rulesBlock and graphNote are always separated by a newline;
adjust the expression that defines additional accordingly (references:
additional, baseWithGoals, rulesBlock, graphNote).

In `@src/hooks/pre-tool-use.ts`:
- Around line 478-488: The not-found branch currently constructs an unsafe
host-shell command via buildAllowDecision(`echo ${JSON.stringify(notFound)}`,
...); replace that echoed literal with a command produced by safeEchoCommand so
untrusted virtualPath characters cannot be interpreted by the shell;
specifically, call safeEchoCommand(notFound) (or
safeEchoCommand(JSON.stringify(notFound)) if you want the same quoting) and pass
its return value into buildAllowDecision while leaving the Read-path
(writeReadCacheFileFn + buildReadDecision) unchanged.

In `@tests/claude-code/safe-echo.test.ts`:
- Around line 68-69: The test is brittle because it asserts a regex on shell
error text; instead assert deterministically that the sentinel file does not
exist. Replace the runVia(...) + regex assertion with a platform-independent
existence check (e.g., use Node's fs.existsSync or fs.promises.stat) against the
canary variable and assert it returns false (or rejects/not found). Update the
test that currently calls runVia(shell, `ls '${canary}' ...`) and the
expect(...) that inspects lsOut+lsErr to use the fs-based existence check so the
assertion is exact and locale-independent.

---

Outside diff comments:
In `@package.json`:
- Around line 64-70: Move the tree-sitter-python entry from dependencies into
optionalDependencies so it is installed/managed together with tree-sitter and
tree-sitter-typescript; update package.json by removing "tree-sitter-python":
"^0.21.0" from the top-level "dependencies" object and adding the same entry to
the "optionalDependencies" object where "tree-sitter" and
"tree-sitter-typescript" are declared to ensure Python graph extraction has the
base runtime available when tree-sitter is optional.

In `@src/commands/graph.ts`:
- Around line 44-49: Update the user-facing help and runtime messages that
erroneously say "TypeScript source files" to reflect multi-language support
(e.g., "source files (TypeScript, JavaScript, Python)" or
"TypeScript/JavaScript/Python source files"). Change the USAGE constant string
(symbol: USAGE) accordingly and search for any runtime log strings that contain
"TypeScript source files" (for example in the graph build command or functions
like buildGraph/discoverSources) and replace them with the new multi-language
phrasing so help text and logs match actual behavior.

---

Nitpick comments:
In `@src/graph/extract/python.ts`:
- Around line 335-340: pushNode currently derives the method lookup key by
splitting node.id on ":" (node.id.split(":")[1]) which is fragile if
relativePath ever contains colons; update the code that constructs GraphNode (or
wherever node.id is formed) to store the lookup key explicitly (e.g., a new
property like node.key or node.lookupName) and use that property here in
pushNode instead of splitting node.id, or, if you prefer a minimal change,
compute the key robustly here using the known id format (e.g., take the
substring after the first ":" via index lookup rather than split) so extraction
does not break when paths include ":" — adjust pushNode to read the new property
(or use the index-based extraction) and update callers that create GraphNode
accordingly.

In `@tests/shared/graph/graph-command.test.ts`:
- Around line 99-108: The test for tryGraphRead("cat /graph/index.md", tmp)
currently uses a loose regex expect(body).toMatch(/no-graph|Code Graph/);
tighten this to assert the exact stable rendered message agreed as the contract
(e.g., the exact "no-graph" rendered string or the canonical "Code Graph — no
snapshot" text) so the test fails on regressions; update the assertion in the
"dispatches a /graph/* read to the VFS..." case to compare body === "<expected
exact string>" (or use toBe with that exact string) and ensure the expected
value is the canonical no-snapshot message produced by
handleGraphVfs/tryGraphRead.

In `@tests/shared/graph/impact.test.ts`:
- Around line 29-37: The test "lists transitive dependents grouped by depth"
uses broad toContain checks; replace them with exact string/line assertions
against the output of renderImpact(chain, "d") so the test is deterministic:
assert equality or use toMatch with anchored regexes for the full expected block
including the header "Impact of d", the exact count "3 dependents", the exact
depth lines such as "depth 1 (1):" followed immediately by the dependent "c" on
the next line, and exact "depth 2" and "depth 3" lines; update the corresponding
loose assertions in the later test (the similar checks at 58-61) the same way.
Ensure you reference and assert the full expected output string or tightly
anchored regexes rather than substrings.

In `@tests/shared/graph/layers.test.ts`:
- Around line 142-144: The test currently asserts a vague substring against the
output ("src/graph/file") which can mask malformed rendering; update the
assertions on the variable out to check for the exact, concrete file paths that
should be rendered (e.g., the specific entries your test setup inserts) instead
of the generic fragment, while preserving the truncation assertion
expect(out).toContain("and 2 more file") so the test still verifies truncation;
locate the two expect calls that reference out in this test and replace the
first loose substring check with assertions for the precise file path strings
produced by the code under test.

In `@tests/shared/graph/vfs-handler.test.ts`:
- Around line 463-468: The test uses handleGraphVfs("path/foo/UserModel", cwd)
but only asserts typeof r.body === "string"; update the assertion to verify
actual output content by checking that r.body contains expected node names or
known messages (e.g., assert r.body includes "foo" or "UserModel", or matches
"No path" or "ambiguous"). Locate the test case (the it block calling seed() and
handleGraphVfs) and replace the loose typeof check with a concrete assertion
such as toContain/toMatch against those expected substrings so the test
validates real path/No path/ambiguous outputs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 918a860e-a946-40d9-9144-651118e1c980

📥 Commits

Reviewing files that changed from the base of the PR and between daafa32 and 10d22ed.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json, !**/package-lock.json
📒 Files selected for processing (54)
  • .gitignore
  • claude-code/skills/hivemind-graph/SKILL.md
  • codex/hooks/hooks.json
  • codex/skills/hivemind-graph/SKILL.md
  • esbuild.config.mjs
  • hermes/skills/hivemind-graph/SKILL.md
  • package.json
  • scripts/ensure-tree-sitter.mjs
  • src/cli/install-codex.ts
  • src/cli/install-cursor.ts
  • src/cli/install-hermes.ts
  • src/commands/graph.ts
  • src/graph/deeplake-pull.ts
  • src/graph/extract/index.ts
  • src/graph/extract/python.ts
  • src/graph/extract/typescript.ts
  • src/graph/graph-command.ts
  • src/graph/node-metadata.ts
  • src/graph/render/impact.ts
  • src/graph/render/layers.ts
  • src/graph/render/neighborhood.ts
  • src/graph/render/path.ts
  • src/graph/render/tour.ts
  • src/graph/resolve/cross-file.ts
  • src/graph/session-context.ts
  • src/graph/snapshot.ts
  • src/graph/types.ts
  • src/graph/vfs-handler.ts
  • src/hooks/codex/pre-tool-use.ts
  • src/hooks/cursor/pre-tool-use.ts
  • src/hooks/cursor/session-start.ts
  • src/hooks/graph-on-stop.ts
  • src/hooks/hermes/pre-tool-use.ts
  • src/hooks/hermes/session-start.ts
  • src/hooks/pre-tool-use.ts
  • tests/claude-code/pre-tool-use-branches.test.ts
  • tests/claude-code/safe-echo.test.ts
  • tests/cli/cli-install-codex-fs.test.ts
  • tests/cli/cli-install-hermes.test.ts
  • tests/shared/graph/cross-file-e2e.test.ts
  • tests/shared/graph/cross-file.test.ts
  • tests/shared/graph/deeplake-pull.test.ts
  • tests/shared/graph/graph-command.test.ts
  • tests/shared/graph/impact.test.ts
  • tests/shared/graph/javascript.test.ts
  • tests/shared/graph/layers.test.ts
  • tests/shared/graph/neighborhood.test.ts
  • tests/shared/graph/node-metadata.test.ts
  • tests/shared/graph/path.test.ts
  • tests/shared/graph/python-cross-file.test.ts
  • tests/shared/graph/python.test.ts
  • tests/shared/graph/session-context.test.ts
  • tests/shared/graph/tour.test.ts
  • tests/shared/graph/vfs-handler.test.ts
✅ Files skipped from review due to trivial changes (8)
  • .gitignore
  • claude-code/skills/hivemind-graph/SKILL.md
  • hermes/skills/hivemind-graph/SKILL.md
  • tests/cli/cli-install-codex-fs.test.ts
  • src/graph/render/neighborhood.ts
  • codex/skills/hivemind-graph/SKILL.md
  • tests/claude-code/pre-tool-use-branches.test.ts
  • codex/hooks/hooks.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/graph/deeplake-pull.ts
  • tests/shared/graph/deeplake-pull.test.ts

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
package.json (1)

64-70: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Inconsistent dependency placement: tree-sitter-python in dependencies but base package in optionalDependencies.

tree-sitter-python (line 64) is a required dependency, but tree-sitter (line 69) is optional. Since tree-sitter-python requires the base tree-sitter package at runtime, Python graph extraction will fail when tree-sitter is not installed. Move tree-sitter-python to optionalDependencies alongside the other tree-sitter packages for consistent behavior.

Suggested fix
   "dependencies": {
     "`@anthropic-ai/sdk`": "^0.97.1",
     "`@huggingface/transformers`": "^3.0.0",
     "`@modelcontextprotocol/sdk`": "^1.29.0",
     "deeplake": "^0.3.30",
     "js-yaml": "^4.1.1",
     "just-bash": "^2.14.0",
-    "tree-sitter-python": "^0.21.0",
     "yargs-parser": "^22.0.0",
     "zod": "^4.3.6"
   },
   "optionalDependencies": {
     "tree-sitter": "^0.21.1",
+    "tree-sitter-python": "^0.21.0",
     "tree-sitter-typescript": "^0.23.2"
   },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@package.json` around lines 64 - 70, Move the tree-sitter-python entry from
dependencies into optionalDependencies so it is installed/managed together with
tree-sitter and tree-sitter-typescript; update package.json by removing
"tree-sitter-python": "^0.21.0" from the top-level "dependencies" object and
adding the same entry to the "optionalDependencies" object where "tree-sitter"
and "tree-sitter-typescript" are declared to ensure Python graph extraction has
the base runtime available when tree-sitter is optional.
src/commands/graph.ts (1)

44-49: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align CLI/help text with multi-language extraction support.

The command now discovers/extracts JS/Python too, but usage and runtime log still say “TypeScript source files,” which is user-facing and misleading.

Suggested patch
-const USAGE = `hivemind graph — codebase-graph commands (Phase 1 — TypeScript only)
+const USAGE = `hivemind graph — codebase-graph commands (Phase 1.5 — TypeScript/JavaScript/Python)

@@
-      Walk the project for TypeScript source files, extract symbols + edges,
+      Walk the project for supported source files (TypeScript/JavaScript/Python), extract symbols + edges,

@@
-  console.log(`Discovered ${sourceFiles.length} TypeScript source files. Extracting...`);
+  console.log(`Discovered ${sourceFiles.length} source files. Extracting...`);

Also applies to: 444-445

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/graph.ts` around lines 44 - 49, Update the user-facing help and
runtime messages that erroneously say "TypeScript source files" to reflect
multi-language support (e.g., "source files (TypeScript, JavaScript, Python)" or
"TypeScript/JavaScript/Python source files"). Change the USAGE constant string
(symbol: USAGE) accordingly and search for any runtime log strings that contain
"TypeScript source files" (for example in the graph build command or functions
like buildGraph/discoverSources) and replace them with the new multi-language
phrasing so help text and logs match actual behavior.
🧹 Nitpick comments (5)
tests/shared/graph/vfs-handler.test.ts (1)

463-468: ⚡ Quick win

Strengthen the assertion to verify actual path/ output content.

The typeof r.body === "string" assertion only confirms the response is a string, which is always true for kind: "ok". Per testing guidelines, prefer asserting on specific values. The path output should contain recognizable structure — e.g., node names from the fixture, "No path", or "ambiguous".

💡 Suggested improvement
   it("path/<from>/<to> renders a path or a clear no-path/ambiguous message", () => {
     seed();
     const r = handleGraphVfs("path/foo/UserModel", cwd);
     expect(r.kind).toBe("ok");
-    if (r.kind === "ok") expect(typeof r.body).toBe("string");
+    if (r.kind === "ok") {
+      // Assert the output contains either path structure or a clear message
+      expect(
+        r.body.includes("foo") || r.body.includes("No path") || r.body.includes("ambiguous")
+      ).toBe(true);
+    }
   });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/graph/vfs-handler.test.ts` around lines 463 - 468, The test uses
handleGraphVfs("path/foo/UserModel", cwd) but only asserts typeof r.body ===
"string"; update the assertion to verify actual output content by checking that
r.body contains expected node names or known messages (e.g., assert r.body
includes "foo" or "UserModel", or matches "No path" or "ambiguous"). Locate the
test case (the it block calling seed() and handleGraphVfs) and replace the loose
typeof check with a concrete assertion such as toContain/toMatch against those
expected substrings so the test validates real path/No path/ambiguous outputs.
src/graph/extract/python.ts (1)

335-340: 💤 Low value

Minor: declByName key extraction relies on no colons in path.

Line 338 uses node.id.split(":")[1] to extract the method key, which assumes relativePath contains no colons. This is documented as relative paths with forward slashes, so it's valid, but consider using a more robust separator (e.g., ::) or extracting the key during node construction to avoid future fragility.

Current behavior is correct given the documented contract.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/graph/extract/python.ts` around lines 335 - 340, pushNode currently
derives the method lookup key by splitting node.id on ":"
(node.id.split(":")[1]) which is fragile if relativePath ever contains colons;
update the code that constructs GraphNode (or wherever node.id is formed) to
store the lookup key explicitly (e.g., a new property like node.key or
node.lookupName) and use that property here in pushNode instead of splitting
node.id, or, if you prefer a minimal change, compute the key robustly here using
the known id format (e.g., take the substring after the first ":" via index
lookup rather than split) so extraction does not break when paths include ":" —
adjust pushNode to read the new property (or use the index-based extraction) and
update callers that create GraphNode accordingly.
tests/shared/graph/layers.test.ts (1)

142-144: ⚡ Quick win

Strengthen this truncation check with concrete file values.

expect(out).toContain("src/graph/file") is too broad and can pass on malformed output. Assert specific rendered paths (and keep the truncation phrase assertion) to make this regression-proof.

Suggested test tightening
-    expect(out).toContain("src/graph/file");
+    expect(out).toContain("src/graph/file0.ts");
+    expect(out).toContain("src/graph/file4.ts");
     expect(out).toContain("and 2 more file");

As per coding guidelines, tests/** should “Prefer asserting on specific values (paths, messages) over generic substrings.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/graph/layers.test.ts` around lines 142 - 144, The test currently
asserts a vague substring against the output ("src/graph/file") which can mask
malformed rendering; update the assertions on the variable out to check for the
exact, concrete file paths that should be rendered (e.g., the specific entries
your test setup inserts) instead of the generic fragment, while preserving the
truncation assertion expect(out).toContain("and 2 more file") so the test still
verifies truncation; locate the two expect calls that reference out in this test
and replace the first loose substring check with assertions for the precise file
path strings produced by the code under test.
tests/shared/graph/graph-command.test.ts (1)

99-108: ⚡ Quick win

Make the no-snapshot response assertion specific

Line 107 currently accepts multiple broad substrings (/no-graph|Code Graph/), which weakens the contract and can mask regressions. Prefer asserting the exact expected rendered value (or a single stable, explicit message/path fragment agreed as contract).

As per coding guidelines: "tests/**: Prefer asserting on specific values (paths, messages) over generic substrings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/graph/graph-command.test.ts` around lines 99 - 108, The test for
tryGraphRead("cat /graph/index.md", tmp) currently uses a loose regex
expect(body).toMatch(/no-graph|Code Graph/); tighten this to assert the exact
stable rendered message agreed as the contract (e.g., the exact "no-graph"
rendered string or the canonical "Code Graph — no snapshot" text) so the test
fails on regressions; update the assertion in the "dispatches a /graph/* read to
the VFS..." case to compare body === "<expected exact string>" (or use toBe with
that exact string) and ensure the expected value is the canonical no-snapshot
message produced by handleGraphVfs/tryGraphRead.
tests/shared/graph/impact.test.ts (1)

29-37: ⚡ Quick win

Make these message assertions exact instead of broad substrings.

A few checks here are loose enough to pass through formatting/message regressions. Prefer exact line/value assertions for deterministic behavior.

Suggested tightening
   it("lists transitive dependents grouped by depth", () => {
     const body = renderImpact(chain, "d");
-    expect(body).toContain("Impact of d");
-    expect(body).toContain("3 dependents");
-    expect(body).toContain("depth 1 (1):");
+    expect(body).toContain("Impact of d");
+    expect(body).toContain("3 dependents (transitive), by depth:");
+    expect(body).toContain("  depth 1 (1):");
     expect(body).toMatch(/depth 1 \(1\):\s*\n\s*c/);
     expect(body).toContain("depth 2");
     expect(body).toContain("depth 3");
   });
@@
   it("returns a no-match message for an unknown pattern", () => {
     const body = renderImpact(chain, "zzz");
-    expect(body).toContain("No node matches");
+    expect(body).toBe('No node matches "zzz". Try cat memory/graph/find/zzz to explore.');
   });

As per coding guidelines tests/**: “Prefer asserting on specific values (paths, messages) over generic substrings.”

Also applies to: 58-61

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/shared/graph/impact.test.ts` around lines 29 - 37, The test "lists
transitive dependents grouped by depth" uses broad toContain checks; replace
them with exact string/line assertions against the output of renderImpact(chain,
"d") so the test is deterministic: assert equality or use toMatch with anchored
regexes for the full expected block including the header "Impact of d", the
exact count "3 dependents", the exact depth lines such as "depth 1 (1):"
followed immediately by the dependent "c" on the next line, and exact "depth 2"
and "depth 3" lines; update the corresponding loose assertions in the later test
(the similar checks at 58-61) the same way. Ensure you reference and assert the
full expected output string or tightly anchored regexes rather than substrings.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/graph/graph-command.ts`:
- Around line 50-51: The code that extracts a read target uses
rewrittenCommand.replace(/\s+2>\S+/g, "") to strip only stderr redirection, so
functions parseReadTargetPath and tryGraphRead (and the logic around const cmd =
rewrittenCommand...) still see other redirection tokens (>, >>, <, >&, etc.) and
fail to detect VFS reads; update the sanitization step used before target
extraction to remove all common shell redirections (stdout, stderr, append,
input, and combined forms) from rewrittenCommand (and the same sanitization in
the other block around lines 104-112) so parseReadTargetPath/tryGraphRead
receive a cleaned cmd; ensure you update the code paths that create const cmd
and any pre-processing used by tryGraphRead/parseReadTargetPath so they all call
the new comprehensive redirection-stripping routine.

In `@src/graph/resolve/cross-file.ts`:
- Around line 32-49: buildExportIndex and the cross-file/heritage resolvers are
name-only which lets a symbol from the wrong kind win; update the data
structures and matchers to be kind-aware: change buildExportIndex to index
exported top-level nodes by (label, kind) instead of label-only (preserve
first-wins per (label,kind)), update localIndex/exportIndex consumers
(resolveCrossFileCalls and resolveHeritageEdges) to look up by both name and
kind, modify parseUnresolved to return the kind from strings like
"unresolved:<file>:<name>:<kind>", and in resolveCrossFileCalls restrict matches
to value-callable kinds (e.g., function/class/const/enum) so type-only kinds
cannot satisfy call targets.

In `@src/hooks/cursor/pre-tool-use.ts`:
- Around line 73-77: Replace the fixed heredoc delimiter usage that constructs
echoCmd with a collision-safe generated delimiter: compute a random/unique token
that is guaranteed not to appear in graphBody (and reuse the same token for the
grep branch), then build the heredoc using that generated token (e.g., `cat
<<'DELIM'` ... `DELIM`) so no user-controlled content can terminate the heredoc
early; update references to echoCmd and any grep-related command construction in
pre-tool-use.ts to use the new token-generation and reuse logic.

In `@src/hooks/graph-on-stop.ts`:
- Around line 82-85: Update the comment that documents SOURCE_GLOBS to reflect
that the glob list includes Python files (e.g., *.py and *.pyi) in addition to
TypeScript and JavaScript; find the constant SOURCE_GLOBS in graph-on-stop.ts
and change the comment text from "TypeScript + JavaScript (B7), excluding
.d.ts." to something like "TypeScript, JavaScript, and Python (B7), excluding
.d.ts." so the comment matches the actual globs.

In `@src/hooks/hermes/session-start.ts`:
- Around line 214-216: The variable additional in session-start.ts currently
appends graphNote directly to baseWithGoals (and to baseWithGoals + rulesBlock)
which can create run-on text; update the concatenation for additional so it
inserts an explicit newline before graphNote (e.g., ensure the expression builds
`${baseWithGoals}\n${graphNote}` or
`${baseWithGoals}\n\n${rulesBlock}\n${graphNote}` depending on rulesBlock) so
that baseWithGoals/rulesBlock and graphNote are always separated by a newline;
adjust the expression that defines additional accordingly (references:
additional, baseWithGoals, rulesBlock, graphNote).

In `@src/hooks/pre-tool-use.ts`:
- Around line 478-488: The not-found branch currently constructs an unsafe
host-shell command via buildAllowDecision(`echo ${JSON.stringify(notFound)}`,
...); replace that echoed literal with a command produced by safeEchoCommand so
untrusted virtualPath characters cannot be interpreted by the shell;
specifically, call safeEchoCommand(notFound) (or
safeEchoCommand(JSON.stringify(notFound)) if you want the same quoting) and pass
its return value into buildAllowDecision while leaving the Read-path
(writeReadCacheFileFn + buildReadDecision) unchanged.

In `@tests/claude-code/safe-echo.test.ts`:
- Around line 68-69: The test is brittle because it asserts a regex on shell
error text; instead assert deterministically that the sentinel file does not
exist. Replace the runVia(...) + regex assertion with a platform-independent
existence check (e.g., use Node's fs.existsSync or fs.promises.stat) against the
canary variable and assert it returns false (or rejects/not found). Update the
test that currently calls runVia(shell, `ls '${canary}' ...`) and the
expect(...) that inspects lsOut+lsErr to use the fs-based existence check so the
assertion is exact and locale-independent.

---

Outside diff comments:
In `@package.json`:
- Around line 64-70: Move the tree-sitter-python entry from dependencies into
optionalDependencies so it is installed/managed together with tree-sitter and
tree-sitter-typescript; update package.json by removing "tree-sitter-python":
"^0.21.0" from the top-level "dependencies" object and adding the same entry to
the "optionalDependencies" object where "tree-sitter" and
"tree-sitter-typescript" are declared to ensure Python graph extraction has the
base runtime available when tree-sitter is optional.

In `@src/commands/graph.ts`:
- Around line 44-49: Update the user-facing help and runtime messages that
erroneously say "TypeScript source files" to reflect multi-language support
(e.g., "source files (TypeScript, JavaScript, Python)" or
"TypeScript/JavaScript/Python source files"). Change the USAGE constant string
(symbol: USAGE) accordingly and search for any runtime log strings that contain
"TypeScript source files" (for example in the graph build command or functions
like buildGraph/discoverSources) and replace them with the new multi-language
phrasing so help text and logs match actual behavior.

---

Nitpick comments:
In `@src/graph/extract/python.ts`:
- Around line 335-340: pushNode currently derives the method lookup key by
splitting node.id on ":" (node.id.split(":")[1]) which is fragile if
relativePath ever contains colons; update the code that constructs GraphNode (or
wherever node.id is formed) to store the lookup key explicitly (e.g., a new
property like node.key or node.lookupName) and use that property here in
pushNode instead of splitting node.id, or, if you prefer a minimal change,
compute the key robustly here using the known id format (e.g., take the
substring after the first ":" via index lookup rather than split) so extraction
does not break when paths include ":" — adjust pushNode to read the new property
(or use the index-based extraction) and update callers that create GraphNode
accordingly.

In `@tests/shared/graph/graph-command.test.ts`:
- Around line 99-108: The test for tryGraphRead("cat /graph/index.md", tmp)
currently uses a loose regex expect(body).toMatch(/no-graph|Code Graph/);
tighten this to assert the exact stable rendered message agreed as the contract
(e.g., the exact "no-graph" rendered string or the canonical "Code Graph — no
snapshot" text) so the test fails on regressions; update the assertion in the
"dispatches a /graph/* read to the VFS..." case to compare body === "<expected
exact string>" (or use toBe with that exact string) and ensure the expected
value is the canonical no-snapshot message produced by
handleGraphVfs/tryGraphRead.

In `@tests/shared/graph/impact.test.ts`:
- Around line 29-37: The test "lists transitive dependents grouped by depth"
uses broad toContain checks; replace them with exact string/line assertions
against the output of renderImpact(chain, "d") so the test is deterministic:
assert equality or use toMatch with anchored regexes for the full expected block
including the header "Impact of d", the exact count "3 dependents", the exact
depth lines such as "depth 1 (1):" followed immediately by the dependent "c" on
the next line, and exact "depth 2" and "depth 3" lines; update the corresponding
loose assertions in the later test (the similar checks at 58-61) the same way.
Ensure you reference and assert the full expected output string or tightly
anchored regexes rather than substrings.

In `@tests/shared/graph/layers.test.ts`:
- Around line 142-144: The test currently asserts a vague substring against the
output ("src/graph/file") which can mask malformed rendering; update the
assertions on the variable out to check for the exact, concrete file paths that
should be rendered (e.g., the specific entries your test setup inserts) instead
of the generic fragment, while preserving the truncation assertion
expect(out).toContain("and 2 more file") so the test still verifies truncation;
locate the two expect calls that reference out in this test and replace the
first loose substring check with assertions for the precise file path strings
produced by the code under test.

In `@tests/shared/graph/vfs-handler.test.ts`:
- Around line 463-468: The test uses handleGraphVfs("path/foo/UserModel", cwd)
but only asserts typeof r.body === "string"; update the assertion to verify
actual output content by checking that r.body contains expected node names or
known messages (e.g., assert r.body includes "foo" or "UserModel", or matches
"No path" or "ambiguous"). Locate the test case (the it block calling seed() and
handleGraphVfs) and replace the loose typeof check with a concrete assertion
such as toContain/toMatch against those expected substrings so the test
validates real path/No path/ambiguous outputs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 918a860e-a946-40d9-9144-651118e1c980

📥 Commits

Reviewing files that changed from the base of the PR and between daafa32 and 10d22ed.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json, !**/package-lock.json
📒 Files selected for processing (54)
  • .gitignore
  • claude-code/skills/hivemind-graph/SKILL.md
  • codex/hooks/hooks.json
  • codex/skills/hivemind-graph/SKILL.md
  • esbuild.config.mjs
  • hermes/skills/hivemind-graph/SKILL.md
  • package.json
  • scripts/ensure-tree-sitter.mjs
  • src/cli/install-codex.ts
  • src/cli/install-cursor.ts
  • src/cli/install-hermes.ts
  • src/commands/graph.ts
  • src/graph/deeplake-pull.ts
  • src/graph/extract/index.ts
  • src/graph/extract/python.ts
  • src/graph/extract/typescript.ts
  • src/graph/graph-command.ts
  • src/graph/node-metadata.ts
  • src/graph/render/impact.ts
  • src/graph/render/layers.ts
  • src/graph/render/neighborhood.ts
  • src/graph/render/path.ts
  • src/graph/render/tour.ts
  • src/graph/resolve/cross-file.ts
  • src/graph/session-context.ts
  • src/graph/snapshot.ts
  • src/graph/types.ts
  • src/graph/vfs-handler.ts
  • src/hooks/codex/pre-tool-use.ts
  • src/hooks/cursor/pre-tool-use.ts
  • src/hooks/cursor/session-start.ts
  • src/hooks/graph-on-stop.ts
  • src/hooks/hermes/pre-tool-use.ts
  • src/hooks/hermes/session-start.ts
  • src/hooks/pre-tool-use.ts
  • tests/claude-code/pre-tool-use-branches.test.ts
  • tests/claude-code/safe-echo.test.ts
  • tests/cli/cli-install-codex-fs.test.ts
  • tests/cli/cli-install-hermes.test.ts
  • tests/shared/graph/cross-file-e2e.test.ts
  • tests/shared/graph/cross-file.test.ts
  • tests/shared/graph/deeplake-pull.test.ts
  • tests/shared/graph/graph-command.test.ts
  • tests/shared/graph/impact.test.ts
  • tests/shared/graph/javascript.test.ts
  • tests/shared/graph/layers.test.ts
  • tests/shared/graph/neighborhood.test.ts
  • tests/shared/graph/node-metadata.test.ts
  • tests/shared/graph/path.test.ts
  • tests/shared/graph/python-cross-file.test.ts
  • tests/shared/graph/python.test.ts
  • tests/shared/graph/session-context.test.ts
  • tests/shared/graph/tour.test.ts
  • tests/shared/graph/vfs-handler.test.ts
✅ Files skipped from review due to trivial changes (8)
  • .gitignore
  • claude-code/skills/hivemind-graph/SKILL.md
  • hermes/skills/hivemind-graph/SKILL.md
  • tests/cli/cli-install-codex-fs.test.ts
  • src/graph/render/neighborhood.ts
  • codex/skills/hivemind-graph/SKILL.md
  • tests/claude-code/pre-tool-use-branches.test.ts
  • codex/hooks/hooks.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/graph/deeplake-pull.ts
  • tests/shared/graph/deeplake-pull.test.ts
🛑 Comments failed to post (7)
src/graph/graph-command.ts (1)

50-51: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle standard shell redirections before target extraction.

parseReadTargetPath/tryGraphRead only strip 2>.... Common reads like cat /graph/index.md >/tmp/x or head /graph/index.md < /tmp/in leave redirection tokens as operands, so this path returns null and misses VFS interception.

Suggested patch
+function stripSimpleRedirections(cmd: string): string {
+  // Best-effort: remove simple stdin/stdout/stderr redirects used in read commands.
+  return cmd.replace(/\s+(?:\d+>>?|\d+>|>>|>|<)\s*\S+/g, "");
+}
+
 export function parseReadTargetPath(rewrittenCommand: string): string | null {
-  const cmd = rewrittenCommand.replace(/\s+2>\S+/g, "").trim();
+  const cmd = stripSimpleRedirections(rewrittenCommand).trim();
@@
 export function tryGraphRead(rewrittenCommand: string, cwd: string): string | null {
-  const ls = rewrittenCommand.replace(/\s+2>\S+/g, "").trim().match(/^ls\s+(?:-\S+\s+)*(\S+)\s*$/);
+  const ls = stripSimpleRedirections(rewrittenCommand).trim().match(/^ls\s+(?:-\S+\s+)*(\S+)\s*$/);

Also applies to: 104-112

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/graph/graph-command.ts` around lines 50 - 51, The code that extracts a
read target uses rewrittenCommand.replace(/\s+2>\S+/g, "") to strip only stderr
redirection, so functions parseReadTargetPath and tryGraphRead (and the logic
around const cmd = rewrittenCommand...) still see other redirection tokens (>,
>>, <, >&, etc.) and fail to detect VFS reads; update the sanitization step used
before target extraction to remove all common shell redirections (stdout,
stderr, append, input, and combined forms) from rewrittenCommand (and the same
sanitization in the other block around lines 104-112) so
parseReadTargetPath/tryGraphRead receive a cleaned cmd; ensure you update the
code paths that create const cmd and any pre-processing used by
tryGraphRead/parseReadTargetPath so they all call the new comprehensive
redirection-stripping routine.
src/graph/resolve/cross-file.ts (1)

32-49: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="src/graph/resolve/cross-file.ts"

# Show file header + the referenced regions with line numbers
sed -n '1,120p' "$FILE"
echo "----"
sed -n '120,220p' "$FILE"
echo "----"
sed -n '220,320p' "$FILE"
echo "----"

# Also grep for the key functions to see their actual signatures/uses
rg -n "buildExportIndex|parseUnresolved|resolveOne|resolveHeritageEdges|calls|implements|extends|EXPORTABLE_KINDS|HERITAGE_KINDS" "$FILE"

Repository: activeloopai/hivemind

Length of output: 14689


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "unresolved:<file>:|UNRESOLVED_PREFIX|unresolved:" src | head -n 200

Repository: activeloopai/hivemind

Length of output: 1172


🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n "UNRESOLVED_PREFIX|unresolved:" src | head -n 200

Repository: activeloopai/hivemind

Length of output: 1172


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE_TS="src/graph/extract/typescript.ts"
FILE_PY="src/graph/extract/python.ts"
FILE_CACHE="src/graph/cache.ts"
FILE_SNAPSHOT="src/graph/snapshot.ts"
FILE_CROSS="src/graph/resolve/cross-file.ts"

echo "---- $FILE_TS (around unresolved generator)"
sed -n '700,780p' "$FILE_TS"
echo "---- $FILE_CROSS (parseUnresolved + resolveHeritageEdges context)"
sed -n '140,280p' "$FILE_CROSS"
echo "---- $FILE_CACHE (unresolved relocation semantics)"
sed -n '180,260p' "$FILE_CACHE"
echo "---- $FILE_SNAPSHOT (unresolved placeholder usage)"
sed -n '50,110p' "$FILE_SNAPSHOT"

Repository: activeloopai/hivemind

Length of output: 13515


Make cross-file export/heritage resolution kind-aware to prevent wrong edge targets.

In src/graph/resolve/cross-file.ts, buildExportIndex is a name-only (source_file -> label -> id) “first wins” map and is used by resolveCrossFileCalls, so if a module exports multiple declarations with the same name but different kind (type/value spaces), the calls edge can bind to the wrong node. Similarly, resolveHeritageEdges receives unresolved:<file>:<name>:<kind>, but parseUnresolved discards <kind> and both the same-file localIndex and cross-file exportIndex resolve heritage by name only, so extends/implements edges can also target the wrong kind.

Suggested direction
- function buildExportIndex(nodes: readonly GraphNode[]): Map<string, Map<string, string>> {
+ function buildExportIndex(nodes: readonly GraphNode[]): Map<string, Map<string, Map<GraphNode["kind"], string>>> {
   ...
-    if (!m.has(n.label)) m.set(n.label, n.id); // first wins (no clashes in valid TS)
+    let byKind = m.get(n.label);
+    if (!byKind) { byKind = new Map(); m.set(n.label, byKind); }
+    if (!byKind.has(n.kind)) byKind.set(n.kind, n.id);
 }

- function parseUnresolved(target: string): { file: string; name: string } | null
+ function parseUnresolved(target: string): { file: string; name: string; kind: GraphNode["kind"] } | null

Use kind-aware selection:

  • for resolveCrossFileCalls, restrict to value-callable kinds (e.g., function/class/const/enum) instead of allowing type-only kinds to win on label;
  • for resolveHeritageEdges, return kind from parseUnresolved and key localIndex/exportIndex by (name, kind) rather than name-only.

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @src/graph/resolve/cross-file.ts around lines 32 - 49, buildExportIndex and
the cross-file/heritage resolvers are name-only which lets a symbol from the
wrong kind win; update the data structures and matchers to be kind-aware: change
buildExportIndex to index exported top-level nodes by (label, kind) instead of
label-only (preserve first-wins per (label,kind)), update localIndex/exportIndex
consumers (resolveCrossFileCalls and resolveHeritageEdges) to look up by both
name and kind, modify parseUnresolved to return the kind from strings like
"unresolved:::", and in resolveCrossFileCalls restrict matches
to value-callable kinds (e.g., function/class/const/enum) so type-only kinds
cannot satisfy call targets.


</details>

<!-- fingerprinting:phantom:poseidon:hawk -->

<!-- This is an auto-generated comment by CodeRabbit -->

</blockquote></details>
<details>
<summary>src/hooks/cursor/pre-tool-use.ts (1)</summary><blockquote>

73-77: _⚠️ Potential issue_ | _🔴 Critical_ | _⚡ Quick win_

**Use a collision-safe literal-output builder for heredoc commands.**

The fixed delimiter allows command injection if `graphBody` contains a standalone `__HIVEMIND_RESULT__` line; trailing lines would execute in the host shell. Please switch to a generated delimiter that is guaranteed absent from payload, and reuse it for both graph and grep branches.

<details>
<summary>🔧 Proposed fix</summary>

```diff
+function toLiteralCatCommand(body: string): string {
+  let marker = "__HIVEMIND_RESULT__";
+  while (new RegExp(`(^|\\n)${marker}(\\n|$)`).test(body)) {
+    marker = `__HIVEMIND_RESULT__${Math.random().toString(36).slice(2)}`;
+  }
+  return `cat <<'${marker}'\n${body}\n${marker}`;
+}
+
   const graphBody = tryGraphRead(rewritten, input.cwd ?? process.cwd());
   if (graphBody !== null) {
     log(`graph vfs intercept: ${command.slice(0, 80)}`);
-    const echoCmd = `cat <<'__HIVEMIND_RESULT__'\n${graphBody}\n__HIVEMIND_RESULT__`;
+    const echoCmd = toLiteralCatCommand(graphBody);
     process.stdout.write(JSON.stringify({
       permission: "allow",
       updated_input: { command: echoCmd },
       agent_message: "[Hivemind graph]",
     }));
     return;
   }
@@
-    const echoCmd = `cat <<'__HIVEMIND_RESULT__'\n${result}\n__HIVEMIND_RESULT__`;
+    const echoCmd = toLiteralCatCommand(result);

As per coding guidelines, the key invariant is that a memory-touching flow must never end up executing unintended host-shell commands.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hooks/cursor/pre-tool-use.ts` around lines 73 - 77, Replace the fixed
heredoc delimiter usage that constructs echoCmd with a collision-safe generated
delimiter: compute a random/unique token that is guaranteed not to appear in
graphBody (and reuse the same token for the grep branch), then build the heredoc
using that generated token (e.g., `cat <<'DELIM'` ... `DELIM`) so no
user-controlled content can terminate the heredoc early; update references to
echoCmd and any grep-related command construction in pre-tool-use.ts to use the
new token-generation and reuse logic.
src/hooks/graph-on-stop.ts (1)

82-85: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the SOURCE_GLOBS comment to include Python.

The comment says TypeScript+JavaScript, but the glob list now also includes *.py and *.pyi. Keeping this aligned avoids future maintenance drift.

Suggested patch
- * in src/commands/graph.ts — TypeScript + JavaScript (B7), excluding .d.ts.
+ * in src/commands/graph.ts — TypeScript + JavaScript + Python, excluding .d.ts.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hooks/graph-on-stop.ts` around lines 82 - 85, Update the comment that
documents SOURCE_GLOBS to reflect that the glob list includes Python files
(e.g., *.py and *.pyi) in addition to TypeScript and JavaScript; find the
constant SOURCE_GLOBS in graph-on-stop.ts and change the comment text from
"TypeScript + JavaScript (B7), excluding .d.ts." to something like "TypeScript,
JavaScript, and Python (B7), excluding .d.ts." so the comment matches the actual
globs.
src/hooks/hermes/session-start.ts (1)

214-216: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add an explicit newline before graphNote to avoid run-on context text.

additional currently concatenates graphNote directly, which can merge lines in the user-visible Hermes context.

🔧 Proposed fix
-  const graphNote = graphContextLine(cwd) ?? "";
-  const additional = (rulesBlock
-    ? `${baseWithGoals}\n\n${rulesBlock}`
-    : baseWithGoals) + graphNote;
+  const graphNote = graphContextLine(cwd);
+  const baseAdditional = rulesBlock
+    ? `${baseWithGoals}\n\n${rulesBlock}`
+    : baseWithGoals;
+  const additional = graphNote
+    ? `${baseAdditional}\n${graphNote}`
+    : baseAdditional;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hooks/hermes/session-start.ts` around lines 214 - 216, The variable
additional in session-start.ts currently appends graphNote directly to
baseWithGoals (and to baseWithGoals + rulesBlock) which can create run-on text;
update the concatenation for additional so it inserts an explicit newline before
graphNote (e.g., ensure the expression builds `${baseWithGoals}\n${graphNote}`
or `${baseWithGoals}\n\n${rulesBlock}\n${graphNote}` depending on rulesBlock) so
that baseWithGoals/rulesBlock and graphNote are always separated by a newline;
adjust the expression that defines additional accordingly (references:
additional, baseWithGoals, rulesBlock, graphNote).
src/hooks/pre-tool-use.ts (1)

478-488: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Use safeEchoCommand in the not-found path branch.

Line 487 still emits echo ${JSON.stringify(notFound)}. Because notFound includes virtualPath (tool-input-derived), backticks/$() can be evaluated by the host shell in this fallback.

🔧 Suggested fix
-      return buildAllowDecision(`echo ${JSON.stringify(notFound)}`, `[DeepLake] not found: ${virtualPath}`);
+      return buildAllowDecision(safeEchoCommand(notFound), `[DeepLake] not found: ${virtualPath}`);

As per coding guidelines, “the key invariant to flag is that a memory-touching command must never be handed to the real host shell.”

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

      // The path was normalized to a concrete file but no VFS row exists — that
      // is "not found", NOT an unsupported command. Falling through to the
      // generic retry guidance would loop the agent on an already-valid shape.
      logFn(`virtual path not found: ${virtualPath}`);
      const notFound = `${virtualPath}: No such file or directory`;
      if (input.tool_name === "Read") {
        const file_path = writeReadCacheFileFn(input.session_id, virtualPath, notFound);
        return buildReadDecision(file_path, `[DeepLake] not found: ${virtualPath}`);
      }
      return buildAllowDecision(safeEchoCommand(notFound), `[DeepLake] not found: ${virtualPath}`);
    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/hooks/pre-tool-use.ts` around lines 478 - 488, The not-found branch
currently constructs an unsafe host-shell command via buildAllowDecision(`echo
${JSON.stringify(notFound)}`, ...); replace that echoed literal with a command
produced by safeEchoCommand so untrusted virtualPath characters cannot be
interpreted by the shell; specifically, call safeEchoCommand(notFound) (or
safeEchoCommand(JSON.stringify(notFound)) if you want the same quoting) and pass
its return value into buildAllowDecision while leaving the Read-path
(writeReadCacheFileFn + buildReadDecision) unchanged.
tests/claude-code/safe-echo.test.ts (1)

68-69: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert canary absence with an exact sentinel value, not a generic error regex.

The current check depends on shell/locale error text. Prefer a fixed-value existence check to make this deterministic.

🔧 Suggested fix
-      const { stdout: lsOut, stderr: lsErr } = runVia(shell, `ls '${canary}' 2>&1 || true`);
-      expect(lsOut + lsErr).toMatch(/No such file or directory/);
+      const { stdout: existsState } = runVia(shell, `[ -e '${canary}' ] && printf 'exists' || printf 'missing'`);
+      expect(existsState).toBe("missing");

As per coding guidelines, “tests/**: Prefer asserting on specific values (paths, messages) over generic substrings.”

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

      const { stdout: existsState } = runVia(shell, `[ -e '${canary}' ] && printf 'exists' || printf 'missing'`);
      expect(existsState).toBe("missing");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/claude-code/safe-echo.test.ts` around lines 68 - 69, The test is
brittle because it asserts a regex on shell error text; instead assert
deterministically that the sentinel file does not exist. Replace the runVia(...)
+ regex assertion with a platform-independent existence check (e.g., use Node's
fs.existsSync or fs.promises.stat) against the canary variable and assert it
returns false (or rejects/not found). Update the test that currently calls
runVia(shell, `ls '${canary}' ...`) and the expect(...) that inspects
lsOut+lsErr to use the fs-based existence check so the assertion is exact and
locale-independent.

…respect)

Dependency / build-output directories could pollute the code graph: discovery
only skipped a small hardcoded set + dotdirs, so non-dot Python virtualenvs
(venv/, env/) and their site-packages were parsed into the graph.

- src/graph/ignore-config.ts: user-editable ~/.deeplake/graph-ignore.json,
  seeded on first build with a broad default set (node_modules, venv, env,
  __pycache__, site-packages, dist, build, ...), listing directory names to skip
  (matched by basename). Conservative about source-like names (packages/bin/lib
  are NOT defaulted). + pathHasIgnoredSegment helper.
- src/commands/graph.ts: discovery now prefers `git ls-files --cached --others
  --exclude-standard`, so the repo's .gitignore is honored EXACTLY (anchoring,
  nested rules, subtrees) rather than a naive basename parse; the name set is
  applied on top as a safety net for tracked junk. Manual walk() stays the
  fallback for non-git repos. Also fixes the stale "TypeScript source files" log.
- Single discoverSourceFiles path, shared by every agent's auto-build
  (graph-on-stop -> runBuildCommand) -> ignore is uniform across assistants.

Verified e2e: junk under venv/ env/ third_party/ leaves the snapshot unchanged
(node count + zero junk markers). +13 unit tests.
Comment thread src/graph/ignore-config.ts Fixed

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/commands/graph.ts (1)

50-55: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update CLI help text to reflect multi-language support.

isSourceFile() now includes JS/Python extensions, but the help text still says “TypeScript only / TypeScript source files,” which is now inaccurate.

Suggested diff
-const USAGE = `hivemind graph — codebase-graph commands (Phase 1 — TypeScript only)
+const USAGE = `hivemind graph — codebase-graph commands (Phase 1)

 Usage:
   hivemind graph build [--cwd <path>]
-      Walk the project for TypeScript source files, extract symbols + edges,
+      Walk the project for supported source files (TS/JS/Python), extract symbols + edges,
       and write a snapshot to ~/.hivemind/graphs/<repo-key>/snapshots/<commit-sha>.json.

Also applies to: 691-697

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/commands/graph.ts` around lines 50 - 55, The CLI help text in the USAGE
constant (and the other help string that still says “TypeScript only”) is
outdated because isSourceFile() now supports JS and Python; update the wording
in USAGE and the duplicate help block (the other string that mentions
“TypeScript source files”) to reflect multi-language support (e.g.,
“multi-language (TypeScript, JavaScript, Python)” or
“TypeScript/JavaScript/Python”) so the usage message accurately describes which
source files are processed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/commands/graph.ts`:
- Around line 50-55: The CLI help text in the USAGE constant (and the other help
string that still says “TypeScript only”) is outdated because isSourceFile() now
supports JS and Python; update the wording in USAGE and the duplicate help block
(the other string that mentions “TypeScript source files”) to reflect
multi-language support (e.g., “multi-language (TypeScript, JavaScript, Python)”
or “TypeScript/JavaScript/Python”) so the usage message accurately describes
which source files are processed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 89f11590-b9a2-443c-be4b-edf173a917d6

📥 Commits

Reviewing files that changed from the base of the PR and between c91a77b and 8542b5a.

📒 Files selected for processing (3)
  • src/commands/graph.ts
  • src/graph/ignore-config.ts
  • tests/shared/graph/ignore-config.test.ts

CodeRabbit (Minor): the `hivemind graph` USAGE text and file-header comment still
said "TypeScript only / TypeScript source files" after JS + Python support
landed. Update both to "TypeScript / JavaScript / Python". No behavior change.
@efenocchi

Copy link
Copy Markdown
Collaborator Author

@coderabbitai Addressed in the latest commit: the hivemind graph USAGE string and the file-header comment now say "TypeScript / JavaScript / Python" instead of "TypeScript only / TypeScript source files". (The Discovered N source files build log was already de-TypeScript'd in the ignore-config commit.) No other "TypeScript only" references remain in src/commands/graph.ts.

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -n "TypeScript" src/commands/graph.ts

Length of output: 276



@efenocchi — confirmed. Both occurrences are updated:

  • Line 8 (file-header): "TypeScript/JavaScript/Python"
  • Line 51 (USAGE string): "TypeScript / JavaScript / Python"

No stale "TypeScript only" references remain in src/commands/graph.ts. ✅

🐇 ✨


🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.749Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 181
File: bundle/cli.js:5103-5106
Timestamp: 2026-05-19T00:42:51.377Z
Learning: Repo: activeloopai/hivemind
In src/skillify/state-dir.ts (and bundled dist), getStateDir() must treat HIVEMIND_STATE_DIR as unset when it is empty or whitespace-only (uses .trim() and length>0), falling back to ~/.deeplake/state/skillify. Verified by tests in tests/claude-code/skillify-state.test.ts (“empty/whitespace HIVEMIND_STATE_DIR is treated as unset”).

Learnt from: kaghni
Repo: activeloopai/hivemind PR: 97
File: src/hooks/shared/autoupdate.ts:122-146
Timestamp: 2026-05-07T08:28:25.207Z
Learning: In `src/hooks/shared/autoupdate.ts`, the `autoUpdate` function intentionally does NOT include a machine-wide lock/cooldown. An initial draft had a flock-style lock at `~/.deeplake/.autoupdate.lock`, but after a latency-fix refactor to fire-and-forget detached spawn (commit 601aa5c), the lock is released immediately after dispatching the child — defeating the purpose. The proper home for the lock is `src/cli/update.ts:runUpdate()` so the long-running `hivemind update` process owns the lock for its full lifetime. `npm install -g` has its own internal locking that prevents corruption in the meantime, so concurrent invocations are either both no-ops (already at latest) or one wins idempotently. This is tracked as a separate follow-up.

efenocchi added 2 commits June 5, 2026 06:21
CodeQL flagged a check-then-use race: loadGraphIgnore did existsSync(path) then
later writeFileSync(path). Read the config directly instead (reacting to the
read result, not a prior existsSync), and seed defaults with `flag: "wx"` —
an atomic create-exclusive that fails if the file already exists, so a
concurrent build can never be clobbered. Behavior preserved (9 tests pass).
…s to >=90

The PR coverage comment was red on branches (aggregate 80%). Close the gaps on
the two files this PR introduces and enforce them per-file (vitest.config.ts):

- ignore-config.ts: 75% -> 100% branches (per-field type fallbacks for a present
  config with non-array ignoreDirs / non-boolean respectGitignore).
- cross-file.ts: 85.7% -> 90.2% branches (relative `.`/package __init__.py and
  __init__.pyi resolution, relative-resolves-to-null fall-through, empty
  specifier guard). Now 100% lines.

Both added to the enforced thresholds block at 90/90/90/90.

@kaghni kaghni left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai pi and openclaw are missing for hivemind-graph skills and corresponding graph implmentation if needed can you create an issue for that?

@@ -17,7 +17,7 @@ import { createRequire } from 'node:module';

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai what tree-sitter is and is used for?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaghni, great question! Here's a concise overview:


What is tree-sitter?

tree-sitter is an incremental parsing library and parser generator. It builds a concrete Abstract Syntax Tree (AST) from source code — a structured, hierarchical representation of the code's syntax (functions, classes, expressions, imports, etc.).

Key characteristics:

  • Fast & incremental: can re-parse only the changed portions of a file.
  • Error-tolerant: produces a partial tree even for syntactically invalid code.
  • Language-agnostic: each language ships as a separate grammar package (e.g., tree-sitter-typescript, tree-sitter-python).
  • Native (C/C++) core with bindings for many languages including Node.js — which is why scripts/ensure-tree-sitter.mjs exists to handle platform-specific compilation on e.g. Linux ARM64 where prebuilt binaries may not be available.

How is it used in this PR?

This PR uses tree-sitter to power the code graph extraction pipeline:

File Role
src/graph/extract/typescript.ts Parses .ts/.tsx/.js/.jsx files into an AST, then walks it to extract nodes (functions, classes, types) and edges (calls, imports, inheritance)
src/graph/extract/python.ts Same but for .py/.pyi files using the Python grammar
src/graph/extract/index.ts Dispatches to the right extractor based on file extension

The extracted AST data feeds into the graph snapshot stored in Deeplake, which agents (Cursor, Codex, Hermes) then query via the VFS (memory/graph/) to answer structural/relational questions about the codebase — without reading raw source files.

In short: tree-sitter turns source code into a structured graph of symbols and their relationships.


🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.760Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

Comment thread src/cli/install-codex.ts
// graph-on-stop.js (code-graph auto-build, G3). Single block (not two)
// mirrors the static codex/hooks/hooks.json and keeps one entry per
// event for the merge/dedupe logic.
Stop: [stopBlockWithGraph(30)],

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai so do we fire graph on every stop? why?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaghni — yes, it fires on every Stop, and it's intentional. Codex doesn't expose a dedicated session_end event; Stop fires after every assistant turn and is the only signal available for post-session work (the same reason stop.js also fires unconditionally — see the existing learning in context).

The cost is kept low by an internal git-diff threshold gate inside graph-on-stop (the SOURCE_GLOBS check in src/hooks/graph-on-stop.ts): the hook inspects whether any TS/JS/Python source files changed since the last build. If nothing has changed, it exits in well under a second without rebuilding anything. An actual graph rebuild only happens when the diff crosses the configured source-file count threshold.

So the pattern is:

  • Fire on every Stop — because there's no other hook to attach to.
  • Gate the expensive work — skip the rebuild cheaply when sources haven't changed.

This mirrors what Cursor and Hermes do with their own session-end equivalents.


🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 98
File: codex/bundle/stop.js:1402-1408
Timestamp: 2026-05-06T23:01:21.197Z
Learning: In `codex/bundle/stop.js` (and `src/hooks/codex/stop.ts`), `forceSessionEndTrigger` is called unconditionally on every Codex `Stop` event by deliberate design. Codex has no separate `session_end` hook — `Stop` fires on every assistant turn and is the only available signal for session completion. Using `tryStopCounterTrigger` instead would silently break skill mining on short sessions (counter never reaches threshold, session closes, worker never fires). The per-projectKey worker lock (`tryAcquireWorkerLock` with a 10-min stale TTL in `src/skilify/state.ts`) makes the over-triggering cheap: subsequent Stops within the same session find nothing past the watermark and exit in <1s. This matches the pattern used by Cursor, Hermes, and Pi agents.

Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.760Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be too much is there other way closer to session end

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaghni — unfortunately Codex doesn't expose anything closer to session end. The full hook menu is:

Hook Fires when
SessionStart Session opens
UserPromptSubmit User sends a message
PreToolUse / PostToolUse Around each tool call
Stop After every assistant turn

There's no SessionEnd, SessionClose, or equivalent. Stop is the last available signal — the same reason stop.js uses it too.

The practical mitigation that's already in place is the git-diff gate inside graph-on-stop.ts: on each Stop it runs a quick git diff against the last-build marker. If no TS/JS/Python source files changed, the script exits in well under a second without touching the graph. An actual rebuild only happens when source files have changed since the last build.

So the overhead on a typical "stop with no code changes" is just one node process spin-up + a git diff. That said, if even that feels like too much churn, a debounce/cooldown approach would be the cleanest middle ground — write a timestamp to a state file on each successful build and skip the git diff entirely if the last build was less than, say, 5 minutes ago. That caps process spin-up cost to at most once per cooldown window while still guaranteeing a build fires eventually. Would you like me to open a follow-up issue for that, or sketch out what the change would look like?


🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 98
File: codex/bundle/stop.js:1402-1408
Timestamp: 2026-05-06T23:01:21.197Z
Learning: In `codex/bundle/stop.js` (and `src/hooks/codex/stop.ts`), `forceSessionEndTrigger` is called unconditionally on every Codex `Stop` event by deliberate design. Codex has no separate `session_end` hook — `Stop` fires on every assistant turn and is the only available signal for session completion. Using `tryStopCounterTrigger` instead would silently break skill mining on short sessions (counter never reaches threshold, session closes, worker never fires). The per-projectKey worker lock (`tryAcquireWorkerLock` with a 10-min stale TTL in `src/skilify/state.ts`) makes the over-triggering cheap: subsequent Stops within the same session find nothing past the watermark and exit in <1s. This matches the pattern used by Cursor, Hermes, and Pi agents.

Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.760Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@efenocchi can you check if Stop hook runs on every agent assitant turn it might be too much

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CodeRabbit to keep the graph consistent maybe is better to move the logic at the stop hook in the other AI assistants as well. the price should be contained since the graph to be recomputed is just for the modified files. Am I wrong? having a different logic between different AI assistants doesn't look good to me, but I might be wrong

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@efenocchi — your intuition is right that the trigger frequency is asymmetric across agents today, and it's worth understanding exactly where:

Agent Graph-on-stop trigger Fires every turn?
Codex Stop ✅ Yes (no session_end exists)
Cursor stop + sessionEnd ✅ Yes (both hooks wired)
Hermes on_session_end only ❌ No — session end only

So the real gap is Hermes: it builds the graph once at session close, while Codex and Cursor rebuild after every turn that touched source files. For a long Hermes session with lots of edits, the graph may lag until the very end.

Your reasoning on cost is correct — the git-diff gate in graph-on-stop.ts means each firing is cheap when nothing changed. The actual rebuild is incremental in the sense that only files changed since the last build marker trigger re-extraction, so re-firing on every Hermes turn would not be wildly expensive.

Concrete suggestion: wire graph-on-stop.js into Hermes' pre_tool_call or a per-turn equivalent (if Hermes exposes one), matching the Codex/Cursor pattern. If Hermes has no per-turn hook other than on_session_end, then the current design is already using the best available signal — the asymmetry is forced by the agent's API surface, not a deliberate design choice.

Would you like me to:

  1. Check what per-turn hooks Hermes exposes to confirm whether parity is achievable, and open a follow-up issue to track the Hermes alignment?
  2. Or sketch what install-hermes.ts would look like if a per-turn hook is available?

🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 98
File: codex/bundle/stop.js:1402-1408
Timestamp: 2026-05-06T23:01:21.197Z
Learning: In `codex/bundle/stop.js` (and `src/hooks/codex/stop.ts`), `forceSessionEndTrigger` is called unconditionally on every Codex `Stop` event by deliberate design. Codex has no separate `session_end` hook — `Stop` fires on every assistant turn and is the only available signal for session completion. Using `tryStopCounterTrigger` instead would silently break skill mining on short sessions (counter never reaches threshold, session closes, worker never fires). The per-projectKey worker lock (`tryAcquireWorkerLock` with a 10-min stale TTL in `src/skilify/state.ts`) makes the over-triggering cheap: subsequent Stops within the same session find nothing past the watermark and exit in <1s. This matches the pattern used by Cursor, Hermes, and Pi agents.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CodeRabbit is it done asynchronously for all the ai assistants, right?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check graph-on-stop for spawn/exec/async patterns
fd 'graph-on-stop' --type f
echo "---"
cat src/hooks/graph-on-stop.ts 2>/dev/null || fd 'graph-on-stop.ts' --exec cat {}
echo "---"
# Also check if there's a spawn with detached flag anywhere in hooks
rg -n "detached|unref|spawn|fork|background" src/hooks/ -A2

Length of output: 56655



@efenocchi — mostly yes, but the mechanism differs slightly per agent:

Agent Async mechanism How it avoids blocking the user
Claude Code Explicit async: true in hooks.json Agent launches the process and immediately moves on; hook runs to completion in its own lifetime
Codex 30s timeout on the Stop hook; no explicit async flag Codex's Stop fires after the assistant turn; the gate exits in ~5ms on a skip, well before any timeout
Cursor stop + sessionEnd hooks Cursor's hook system runs commands non-blocking
Hermes on_session_end Session is already closing, so it's inherently non-blocking

The key design note from the source doc (Lines 42–44 of graph-on-stop.ts):

Process model: runs in-process (no detached spawn). With async: true, Claude Code launches us and moves on; we run to completion in our own lifetime and exit. No daemon, no PID file, no cross-platform issues.

So there's no detached background daemon for any agent — it's a regular Node process per invocation. The "non-blocking" guarantee on the user side comes from a combination of:

  1. The 5ms skip path (rate-limit / no-source-changes gate) — the vast majority of Stop firings hit this and exit almost instantly.
  2. The async: true flag (Claude Code) or the agent's own post-turn hook model (Cursor/Hermes) that doesn't tie hook completion to the next user interaction.

For Codex specifically, since it lacks a native async flag, the practical safety net is purely the gate speed — but a slow rebuild (cold, ~2.5s) during a Codex session could theoretically delay the next prompt if Codex serializes hook execution. That's an area worth watching if users report sluggishness after large commits.


🧬 Code Graph Analysis Results

src/cli/install-codex.ts

Snippet: Hook command wiring (events → commands)

Lines 13-49

function hookCmd(bundleFile: string, timeout: number, matcher?: string): Record<string, unknown> {
  const block: Record<string, unknown> = {
    hooks: [{
      type: "command",
      command: `node "${join(PLUGIN_DIR, "bundle", bundleFile)}"`,
      timeout,
    }],
  };
  if (matcher) block.matcher = matcher;
  return block;
}

function buildHooksJson(): Record<string, unknown> {
  return {
    hooks: {
      SessionStart: [hookCmd("session-start.js", 10, "startup|resume")],
      UserPromptSubmit: [hookCmd("capture.js", 10)],
      PreToolUse: [hookCmd("pre-tool-use.js", 10, "Bash")],
      PostToolUse: [hookCmd("capture.js", 15)],
      // One Stop matcher-block with TWO commands — stop.js (capture) +
      // graph-on-stop.js (code-graph auto-build, G3). Single block (not two)
      // mirrors the static codex/hooks/hooks.json and keeps one entry per
      // event for the merge/dedupe logic.
      Stop: [stopBlockWithGraph(30)],
    },
  };
}

/** Stop block carrying both the capture stop hook and the graph auto-build. */
function stopBlockWithGraph(timeout: number): Record<string, unknown> {
  return {
    hooks: [
      { type: "command", command: `node "${join(PLUGIN_DIR, "bundle", "stop.js")}"`, timeout },
      { type: "command", command: `node "${join(PLUGIN_DIR, "bundle", "graph-on-stop.js")}"`, timeout },
    ],
  };
}

Snippet: Codex feature flag enabling (sync CLI call)

Lines 191-210

function tryEnableCodexHooks(): void {
  // codex 0.130.0 renamed the `codex_hooks` feature flag to `hooks`. The legacy
  // key still works but prints a deprecation warning on every startup, so we
  // enable the new name and strip the legacy key if it lingers from an older
  // install of this plugin.
  //
  // The strip is gated on a successful enable: on pre-0.130 codex (or if the
  // codex CLI isn't on PATH) the `hooks` feature is unknown and the call
  // throws — in that case we must leave any existing `codex_hooks = true`
  // entry alone, otherwise we'd silently disable hooks on the user's box.
  let enabled = false;
  try {
    execFileSync("codex", ["features", "enable", "hooks"], { stdio: "ignore" });
    enabled = true;
  } catch {
    // codex CLI may not be on PATH (e.g., running under a separate user) or
    // the codex version pre-dates the rename; not fatal.
  }
  if (enabled) stripLegacyCodexHooksKey();
}

Snippet: Install flow (writes hooks.json, links skills, version stamp)

Lines 228-260

export function installCodex(): void {
  const srcBundle = join(pkgRoot(), "codex", "bundle");
  const srcSkills = join(pkgRoot(), "codex", "skills");

  if (!existsSync(srcBundle)) {
    throw new Error(`Codex bundle missing at ${srcBundle}. Run 'npm run build' first.`);
  }

  ensureDir(PLUGIN_DIR);
  copyDir(srcBundle, join(PLUGIN_DIR, "bundle"));
  if (existsSync(srcSkills)) copyDir(srcSkills, join(PLUGIN_DIR, "skills"));

  tryEnableCodexHooks();
  // Idempotent: only rewrite hooks.json when the merged result actually
  // changed. An unconditional rewrite (even byte-identical) changes the file
  // Codex fingerprints and re-triggers its "Hooks need review" trust prompt on
  // every install/update. Skipping the no-op write keeps the user from being
  // re-prompted each time.
  if (!writeJsonIfChanged(HOOKS_PATH, mergeHooksJson(buildHooksJson()))) {
    log(`  Codex          hooks.json unchanged — skipped rewrite (no re-trust prompt)`);
  }

  ensureDir(AGENTS_SKILLS_DIR);
  const skillTarget = join(PLUGIN_DIR, "skills", "deeplake-memory");
  if (existsSync(skillTarget)) {
    symlinkForce(skillTarget, SKILL_LINK);
  } else {
    warn(`  Codex          skill source missing at ${skillTarget}; skipping symlink`);
  }

  writeVersionStamp(PLUGIN_DIR, getVersion());
  log(`  Codex          installed -> ${PLUGIN_DIR}`);
}

src/cli/util.ts

Snippet: Write hooks.json only if content changed

Lines 73-83

export function writeJsonIfChanged(path: string, obj: unknown): boolean {
  const next = JSON.stringify(obj, null, 2) + "\n";
  if (existsSync(path)) {
    try {
      if (readFileSync(path, "utf-8") === next) return false; // unchanged → no write
    } catch { /* unreadable → fall through and rewrite */ }
  }
  ensureDir(dirname(path));
  writeFileSync(path, next);
  return true;
}

src/cli/version.ts

Snippet: Version stamp source

Lines 5-12

export function getVersion(): string {
  try {
    const pkg = JSON.parse(readFileSync(join(pkgRoot(), "package.json"), "utf-8"));
    return pkg.version ?? "0.0.0";
  } catch {
    return "0.0.0";
  }
}

🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 98
File: codex/bundle/stop.js:1402-1408
Timestamp: 2026-05-06T23:01:21.197Z
Learning: In `codex/bundle/stop.js` (and `src/hooks/codex/stop.ts`), `forceSessionEndTrigger` is called unconditionally on every Codex `Stop` event by deliberate design. Codex has no separate `session_end` hook — `Stop` fires on every assistant turn and is the only available signal for session completion. Using `tryStopCounterTrigger` instead would silently break skill mining on short sessions (counter never reaches threshold, session closes, worker never fires). The per-projectKey worker lock (`tryAcquireWorkerLock` with a 10-min stale TTL in `src/skilify/state.ts`) makes the over-triggering cheap: subsequent Stops within the same session find nothing past the watermark and exit in <1s. This matches the pattern used by Cursor, Hermes, and Pi agents.

Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.760Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

@@ -0,0 +1,394 @@
/**

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai why we have python extracter what's the usecase for hivemind

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaghni Great question! Here's the context:

What is Hivemind's graph feature?

Hivemind builds a deterministic, AST-derived code graph of your repository and stores it in Deeplake (a local tensor/vector DB). AI coding agents (Cursor, Codex, Hermes, Claude Code) can then query this graph through a virtual filesystem (/graph/*) to answer structural questions about the codebase — without having to read raw source files.

The graph captures:

  • Nodes: functions, classes, methods, types, modules
  • Edges: calls, imports, extends, implements, method_of
  • Metadata: fan-in/fan-out degree, entrypoint flags, signatures

Why a Python extractor specifically?

This PR (#228) is part of a multi-language expansion (TypeScript → TypeScript + JavaScript + Python). The Python extractor (src/graph/extract/python.ts) uses tree-sitter-python to parse Python source and emit the same FileExtraction shape as the TypeScript extractor, so the rest of the pipeline (cross-file resolution, snapshot build, VFS queries) is language-agnostic.

Practical use case:

When an agent wants to know "what calls my_function across the whole repo?" or "what does MyClass inherit from?", it queries /graph/neighborhood/my_function.py instead of grep-scanning every file. This is:

  • Faster (pre-indexed at build time)
  • Structural (AST-accurate, not text-search)
  • Token-efficient (returns a compact summary, not full file contents)

Without Python support, any repo with Python files would have an incomplete graph — missing Python symbols and their relationships to TS/JS code.


🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.760Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

Comment thread src/graph/vfs-handler.ts
@@ -30,6 +30,11 @@ import { readLastBuild } from "./last-build.js";
import { repoDir } from "./snapshot.js";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coderabbitai what vfs-handler is what it does what's it's usage

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kaghni — here's a breakdown:


src/graph/vfs-handler.ts — What It Is

It's a Virtual File System (VFS) handler for the code graph. It intercepts agent tool calls (like cat, head, tail) that target paths under ~/.deeplake/memory/graph/ and synthesizes responses on the fly from the local snapshot — no real files exist at those paths.


What It Does

The main exported function is:

handleGraphVfs(subpath: string, cwd: string): GraphVfsResult

It receives the portion of the path after .../graph/ and routes it to a renderer:

Endpoint What it renders
index.md Graph overview: commit, node/edge counts, top files, edge kinds
find/<pattern> Substring search on node IDs + labels; saves numbered handles
show/<N or pattern> Node detail + 1-hop neighbors (callers/callees/imports/heritage)
query/<pattern> Combined find + 1-hop expand of top matches (start here)
impact/<pattern> Transitive dependents (blast radius) of a symbol
neighborhood/<file> All symbols in a file + its cross-file neighbors
layers Architectural layer breakdown by path heuristic
tour Dependency-ordered walkthrough of the whole graph
path/<from>/<to> Shortest path between two symbol patterns

All rendering is local — it reads the .json snapshot file from disk (written by hivemind graph build or pull) and never makes network calls.


How It's Used

It's wired into the pre-tool-use hooks for each agent. The flow is:

sequenceDiagram
    participant Agent as AI Agent (Cursor/Codex/Hermes)
    participant Hook as pre-tool-use hook
    participant GC as graph-command.ts (tryGraphRead)
    participant VFS as vfs-handler.ts (handleGraphVfs)
    participant Disk as Local snapshot .json

    Agent->>Hook: Tool call: cat ~/.deeplake/memory/graph/query/foo
    Hook->>GC: tryGraphRead(rewrittenCmd, cwd)
    GC->>VFS: handleGraphVfs("query/foo", cwd)
    VFS->>Disk: readFileSync(snapPath)
    VFS-->>GC: { kind: "ok", body: "..." }
    GC-->>Hook: synthesized text
    Hook-->>Agent: block + return synthesized response
Loading

Specifically:

  • src/graph/graph-command.ts (tryGraphRead) parses the command and calls handleGraphVfs
  • src/hooks/cursor/pre-tool-use.ts, src/hooks/codex/pre-tool-use.ts, and src/hooks/hermes/pre-tool-use.ts all call tryGraphRead and block the real tool execution when it returns content

This lets agents query the code graph by simply reading virtual paths — no special API needed.


🧠 Learnings used
Learnt from: efenocchi
Repo: activeloopai/hivemind PR: 0
File: :0-0
Timestamp: 2026-06-05T04:21:07.760Z
Learning: In activeloopai/hivemind src/graph/deeplake-pull.ts (TypeScript), the `pullSnapshot` function's SQL SELECT is already keyed on `(repo_slug, commit_sha)` matching the requested `(repoKey, head)` pair (lines 126-128). An additional in-code payload identity check (comparing payload's graph.commit_sha/graph.repo_key against the query parameters) is intentionally omitted as non-required defense-in-depth; the `snapshot_sha256` check (lines 179-183) covers payload integrity. Do not flag the absence of this identity cross-check as a correctness gap.

@efenocchi efenocchi merged commit 3b05c78 into main Jun 6, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants