
Proposal: First-class integration with OpenCodeReview — benchmark results & collaboration offer #1056


@lizhengfeng101

Hi @colbymchenry,

I'm the author of OpenCodeReview (OCR), an open-source AI code review CLI. We recently integrated CodeGraph as an MCP tool provider in our review pipeline and ran a benchmark across 200 real-world pull requests from production repositories. The results were impressive enough that I wanted to share them and explore a deeper collaboration.

What we did

OCR spawns CodeGraph via MCP (codegraph serve --mcp) during code review. The agent uses codegraph_explore alongside our built-in tools (file_read, code_search, etc.) to understand code structure before generating review comments. We evaluated on the same 200 PRs (6 were auto-skipped as test-only changes) with and without CodeGraph, using Claude Opus 4.6.

Benchmark results (200 PRs, Claude Opus 4.6)

Review quality improvement

| Metric | Without CodeGraph | With CodeGraph | Change |
| --- | --- | --- | --- |
| Precision | 30.6% (285/931) | 31.9% (307/963) | +1.3pp |
| Recall | 18.9% (285/1505) | 20.4% (307/1505) | +1.5pp |
| True positive issues found | 285 | 307 | +7.7% |
| Zero-line comments (low quality) | 1.27% | 0.72% | -43% |

CodeGraph helped the agent find 22 more real issues across the same PR set while simultaneously reducing low-quality (zero-line) comments by 43%.
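As a quick sanity check, the percentages in the table can be recomputed from the raw counts. This is only a sketch of the arithmetic: the dictionary keys are illustrative names, and it assumes 931 and 963 are the total comments produced and 1505 is the total number of known issues.

```python
# Counts copied from the review-quality table; key names are illustrative.
runs = {
    "without": {"tp": 285, "comments": 931, "known_issues": 1505},
    "with": {"tp": 307, "comments": 963, "known_issues": 1505},
}


def precision(run):
    """Share of emitted comments that were true positives."""
    return run["tp"] / run["comments"]


def recall(run):
    """Share of known issues that were found."""
    return run["tp"] / run["known_issues"]


def relative_gain(before, after):
    """Relative change in true positives, as a fraction."""
    return (after["tp"] - before["tp"]) / before["tp"]


for name, run in runs.items():
    print(f"{name}: precision={precision(run):.1%} recall={recall(run):.1%}")
print(f"true-positive gain: {relative_gain(runs['without'], runs['with']):.1%}")
```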

Efficiency improvement

| Metric | Without CodeGraph | With CodeGraph | Change |
| --- | --- | --- | --- |
| Total tool calls | 7,363 (avg 37/PR) | 7,107 (avg 36/PR) | -3.5% |
| file_read calls | 3,101 (avg 15) | 2,779 (avg 14) | -10.4% |
| code_search calls | 2,253 (avg 11) | 1,973 (avg 10) | -12.4% |
| file_find calls | 128 | 114 | -10.9% |
| Wall-clock time | 4h 46m 21s | 4h 41m 32s | -1.7% |

Even though CodeGraph added 328 codegraph_explore calls, net tool calls still decreased: the agent read fewer files and ran fewer searches because codegraph_explore gave it the structural context it needed upfront. This matches the design goal described in CodeGraph's docs: making the agent's answer sufficient to stop it from reading further.
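The net effect can be worked through from the table. The sketch below uses only the numbers reported above; the variable names are ours, and the residual at the end simply covers tools the table does not break out.

```python
# Tool-call counts copied from the efficiency table.
without = {"total": 7363, "file_read": 3101, "code_search": 2253, "file_find": 128}
with_cg = {"total": 7107, "file_read": 2779, "code_search": 1973, "file_find": 114}
explore_calls = 328  # codegraph_explore calls added in the CodeGraph run


def pct_change(before, after):
    """Relative change in percent; negative means fewer calls."""
    return (after - before) / before * 100


# Calls saved on the three built-in tools listed in the table.
saved = sum(without[k] - with_cg[k] for k in ("file_read", "code_search", "file_find"))
net = with_cg["total"] - without["total"]

# Whatever is left after accounting for the savings and the new explore
# calls comes from tools that are not broken out in the table.
other_tools_delta = net + saved - explore_calls

print(f"saved on built-in tools: {saved}")
print(f"net change in total calls: {net} ({pct_change(without['total'], with_cg['total']):.1f}%)")
print(f"change in tools not listed: {other_tools_delta:+d}")
```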

Token usage

| Metric | Without CodeGraph | With CodeGraph |
| --- | --- | --- |
| Total tokens | 79.5M | 89.6M |
| Output tokens | 2.0M | 2.0M |
| Avg per PR | 410K | 462K |

Input tokens increased ~13% (roughly 77.5M to 87.6M, taking total minus output), which reflects CodeGraph's added context, but output tokens stayed flat at 2.0M: the agent used the extra context to make better decisions, not to write more.
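For readers who want to reproduce the token figures, here is a small sketch of the arithmetic. It assumes input tokens are simply total minus output, and that the per-PR average is taken over the 194 PRs actually reviewed (200 minus the 6 auto-skipped), which is the divisor that reproduces the reported averages.

```python
# Token totals in millions, copied from the token-usage table.
total = {"without": 79.5, "with": 89.6}
output = {"without": 2.0, "with": 2.0}
reviewed_prs = 200 - 6  # assumption: skipped PRs are excluded from the average

# Input tokens derived as total minus output.
input_tokens = {k: total[k] - output[k] for k in total}
input_growth_pct = (
    (input_tokens["with"] - input_tokens["without"]) / input_tokens["without"] * 100
)

# Average total tokens per reviewed PR, in thousands.
avg_per_pr_k = {k: total[k] * 1_000_000 / reviewed_prs / 1_000 for k in total}

print(f"input growth: {input_growth_pct:.1f}%")
for k, v in avg_per_pr_k.items():
    print(f"{k}: {v:.0f}K tokens per PR")
```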

Current integration approach

Right now, OCR treats CodeGraph as an external MCP server configured by the user:

{
  "mcp_servers": {
    "codegraph": {
      "command": "codegraph",
      "args": ["serve", "--mcp"],
      "setup": "codegraph init && codegraph index"
    }
  }
}

This works, but requires users to install CodeGraph separately and configure it manually. We'd like to make it zero-config.
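To make the shape of that config concrete, here is a minimal sketch of how a client could turn it into a launch command and a list of setup steps. OCR itself is a Go binary, so this Python is illustrative only; the helper names are hypothetical, and splitting "setup" on && is a simplification rather than how OCR necessarily interprets that field.

```python
import json
import shlex

# The same config as shown above.
CONFIG = """
{
  "mcp_servers": {
    "codegraph": {
      "command": "codegraph",
      "args": ["serve", "--mcp"],
      "setup": "codegraph init && codegraph index"
    }
  }
}
"""


def server_command(config_text, name):
    """Return the argv used to spawn the named MCP server."""
    server = json.loads(config_text)["mcp_servers"][name]
    return [server["command"], *server["args"]]


def setup_steps(config_text, name):
    """Split the setup string into one argv per step (simplified '&&' handling)."""
    setup = json.loads(config_text)["mcp_servers"][name].get("setup", "")
    return [shlex.split(step) for step in setup.split("&&") if step.strip()]


print(server_command(CONFIG, "codegraph"))
print(setup_steps(CONFIG, "codegraph"))
```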

Proposal: make CodeGraph a built-in provider in OCR

We want to ship CodeGraph as a built-in, zero-configuration code intelligence provider in OCR. When a user runs ocr review, OCR would automatically:

  1. Discover a locally installed codegraph binary (PATH lookup)
  2. If not found, download the platform-specific bundle from GitHub Releases (same self-heal logic as npm-shim.js) and cache it in ~/.codegraph/bundles/
  3. Run codegraph init && codegraph index on the target repo
  4. Connect via MCP and register tools
  5. Clean up after review
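Steps 1 and 2 could be sketched roughly as follows. This is not OCR's actual implementation (OCR is written in Go): the function name, the return labels, and the assumed cache layout of one subdirectory per bundle are all hypothetical, and the download itself is left out.

```python
import os
import shutil
from pathlib import Path

DEFAULT_CACHE = Path.home() / ".codegraph" / "bundles"


def find_codegraph(cache_dir=DEFAULT_CACHE, which=shutil.which):
    """Locate a codegraph binary: PATH first, then the bundle cache.

    Returns (path, source), where source is "path", "cache", or
    "download-needed" when nothing usable was found.
    """
    on_path = which("codegraph")
    if on_path:
        return Path(on_path), "path"

    # Hypothetical cache layout: <cache_dir>/<bundle>/codegraph[.exe]
    for candidate in sorted(cache_dir.glob("*/codegraph*")):
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return candidate, "cache"

    # The caller would fetch the platform bundle from GitHub Releases here.
    return None, "download-needed"


binary, source = find_codegraph()
print(binary, source)
```

Passing `which` as a parameter keeps the lookup easy to test without touching the real PATH.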

Users can disable it with ocr config set codegraph.enabled false or OCR_NO_CODEGRAPH=1.
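A minimal sketch of that opt-out check, assuming the environment variable takes precedence over the config value, that the feature is on by default, and that the config is available as a flat dictionary keyed by "codegraph.enabled" (all three are assumptions for illustration):

```python
import os


def codegraph_enabled(config, env=None):
    """Decide whether the built-in CodeGraph provider should run."""
    env = os.environ if env is None else env
    # Assumption: OCR_NO_CODEGRAPH=1 wins over any config setting.
    if env.get("OCR_NO_CODEGRAPH") == "1":
        return False
    # Assumption: enabled unless the user explicitly turned it off.
    return bool(config.get("codegraph.enabled", True))


print(codegraph_enabled({}, env={}))
print(codegraph_enabled({"codegraph.enabled": False}, env={}))
print(codegraph_enabled({}, env={"OCR_NO_CODEGRAPH": "1"}))
```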

What we need from CodeGraph (all optional, nothing blocking)

The integration works today with no changes to CodeGraph. But these would improve the experience:

  1. --no-daemon / --no-watch CLI flags — We currently pass CODEGRAPH_NO_DAEMON=1 and CODEGRAPH_NO_WATCH=1 as env vars. Explicit flags would make the subprocess invocation more self-documenting.

  2. Stable version manifest — A version.json release asset listing the latest version and checksums would let us check for updates without hitting GitHub API rate limits.

  3. codegraph index --timeout <duration> — For large repos, we kill the index process externally after a timeout. A built-in timeout with a graceful partial-index checkpoint would be cleaner.
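For context on items 1 and 3, this is roughly what the current workaround looks like, sketched in Python rather than OCR's Go. The two environment variables are the ones named above; the helper names and the timeout value in the usage note are illustrative. Note that the timeout here is a hard kill, which is exactly why a built-in checkpointing timeout would be nicer.

```python
import os
import subprocess
import sys


def codegraph_env(base=None):
    """Copy the environment and disable CodeGraph's daemon and watcher."""
    env = dict(os.environ if base is None else base)
    env["CODEGRAPH_NO_DAEMON"] = "1"
    env["CODEGRAPH_NO_WATCH"] = "1"
    return env


def run_with_timeout(argv, timeout_s, env=None):
    """Run a command; return True on exit code 0, False on failure or timeout.

    subprocess.run kills the child when the timeout expires, so any
    partial work the child had done is not checkpointed.
    """
    try:
        return subprocess.run(argv, env=env, timeout=timeout_s).returncode == 0
    except subprocess.TimeoutExpired:
        return False


# Harmless stand-in command so the sketch runs without codegraph installed.
print(run_with_timeout([sys.executable, "-c", "pass"], timeout_s=10))
```

With CodeGraph installed, the indexing step would be something like `run_with_timeout(["codegraph", "index"], timeout_s=600, env=codegraph_env())`, where 600 seconds is an arbitrary example value.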

None of these are blockers — we can ship the integration with CodeGraph as-is.
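To illustrate item 2, here is one possible shape for a version.json manifest and how a client might verify a downloaded bundle against it. The schema (a "version" string plus a "checksums" map of asset name to SHA-256 hex digest), the asset name, and the version value are all hypothetical; no such file exists in CodeGraph releases today.

```python
import hashlib
import json


def verify_asset(manifest_text, asset_name, data):
    """Check downloaded bytes against the SHA-256 listed in the manifest."""
    manifest = json.loads(manifest_text)
    expected = manifest.get("checksums", {}).get(asset_name)
    if expected is None:
        return False  # asset not listed, treat as unverified
    return hashlib.sha256(data).hexdigest() == expected


# Build a toy manifest for a fake payload so the sketch is self-contained.
payload = b"example bundle bytes"
manifest_text = json.dumps(
    {
        "version": "0.0.0",  # placeholder, not a real release
        "checksums": {"example-bundle.tar.gz": hashlib.sha256(payload).hexdigest()},
    }
)

print(verify_asset(manifest_text, "example-bundle.tar.gz", payload))
print(verify_asset(manifest_text, "example-bundle.tar.gz", b"tampered"))
```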

What we offer in return

  • Promotion in OCR's README and docs: CodeGraph as the recommended code intelligence provider, with a link to the CodeGraph repo.
  • Real-world case study: We're happy to be featured as an integration case study in CodeGraph's docs/README, as an AI code review tool that uses CodeGraph to find 7.7% more true-positive issues on real PRs (285 to 307 in our benchmark).
  • Upstream contributions: We're willing to submit PRs for any of the improvements above.
  • Ongoing benchmark data: As we iterate, we can share updated benchmark results to help validate CodeGraph improvements.

About OpenCodeReview

  • Open-source AI code review CLI for Git repositories
  • Supports multiple LLM providers (Anthropic, OpenAI, custom)
  • Distributed via npm (@alibaba-group/open-code-review) and GitHub Releases
  • Go binary, cross-compiled for 6 platforms (darwin/linux/windows x amd64/arm64)
  • GitHub: https://github.com/alibaba/open-code-review

Looking forward to your thoughts! Happy to discuss any technical details or alternative approaches.
