Merged
37 changes: 36 additions & 1 deletion CONTRIBUTING.md
@@ -39,10 +39,45 @@ CodeLens uses a modular engine architecture. To add a new analysis capability:
4. **Implement the engine** following the pattern of existing engines (return `{status, workspace, findings, summary}`)
5. **Add a command module** in `commands/yourfeature.py` with `add_args(subparser)` and `execute(args)` functions
6. **Add tests** in `tests/`
7. **Update documentation** in `SKILL.md`, `SKILL-QUICK.md`, `README.md`, and `CHANGELOG.md`
7. **Sync command counts** — see "Syncing Command Counts" below; do NOT hand-edit the count in `README.md`, `SKILL.md`, `SKILL-QUICK.md`, `pyproject.toml`, `skill.json`, or `scripts/mcp_server.py`
8. **Update documentation** in `SKILL.md`, `SKILL-QUICK.md`, `README.md`, and `CHANGELOG.md`

Commands auto-register via `commands/__init__.py` — no manual wiring needed.
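
As a rough illustration of steps 4 and 5, a command module might look like the sketch below. Only the `add_args(subparser)` / `execute(args)` function names and the `{status, workspace, findings, summary}` result keys come from the steps above; the `run_engine` helper, the `"ok"` status value, the `workspace` argument, and the assumption that `add_args` receives the command's own parser are all hypothetical.

```python
import argparse
from typing import Any, Dict, List


def run_engine(workspace: str) -> Dict[str, Any]:
    """Hypothetical engine: returns the four keys named in step 4."""
    findings: List[Dict[str, Any]] = []  # a real engine would populate this
    return {
        "status": "ok",  # assumed status value
        "workspace": workspace,
        "findings": findings,
        "summary": {"total": len(findings)},
    }


def add_args(subparser: argparse.ArgumentParser) -> None:
    """Register this command's arguments (assumed: a single workspace path)."""
    subparser.add_argument("workspace", nargs="?", default=".")


def execute(args: argparse.Namespace) -> Dict[str, Any]:
    """Run the engine with the parsed arguments and return its result."""
    return run_engine(args.workspace)


if __name__ == "__main__":
    # Manual wiring for demonstration only; the project auto-registers commands.
    parser = argparse.ArgumentParser(prog="codelens")
    subparsers = parser.add_subparsers(dest="command")
    add_args(subparsers.add_parser("yourfeature"))
    print(execute(parser.parse_args(["yourfeature", "demo"])))
```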

### Syncing Command Counts (issue #38)

The number of CLI commands and MCP tools must never be hand-edited in
documentation or metadata files — it drifts every time a command is added or
removed. The single source of truth is the runtime `COMMAND_REGISTRY` (and
`_TOOL_DEFINITIONS` for MCP static tools). The `scripts/sync_command_count.py`
helper propagates the runtime count into every doc/metadata file.

When you add or remove a command:

```bash
# 1. Run the sync helper in --check mode to see what would change:
PYTHONPATH=scripts python3 scripts/sync_command_count.py --check

# 2. Apply the changes:
PYTHONPATH=scripts python3 scripts/sync_command_count.py --apply

# 3. Update the strict regression sentinel in tests/test_integration.py
# (TestModuleStructure.test_command_registry_has_all_commands)
# to match the new len(COMMAND_REGISTRY). This is the ONE place where
# the count is intentionally hardcoded — it is the regression anchor.

# 4. Verify:
PYTHONPATH=scripts python3 -m pytest tests/test_command_count.py tests/test_integration.py::TestModuleStructure -v
```
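
To make the `--check` / `--apply` distinction concrete, here is a minimal sketch of the kind of rewrite such a helper could perform on one file's text. It is not the actual `scripts/sync_command_count.py`; the `sync_text` function and the regular expression are assumptions, and the pattern only covers the phrasings "N commands" and "N CLI commands".

```python
import re
from typing import Tuple

# Assumed phrasing: "<N> commands" or "<N> CLI commands", case-insensitive.
COUNT_PATTERN = re.compile(r"\b(\d+)(\s+(?:CLI\s+)?commands\b)", re.IGNORECASE)


def sync_text(text: str, count: int) -> Tuple[str, int]:
    """Return (updated_text, number_of_stale_mentions) for one file's text."""
    stale = 0

    def _replace(match: "re.Match[str]") -> str:
        nonlocal stale
        if int(match.group(1)) != count:
            stale += 1
        # Keep the original wording and casing, swap only the number.
        return f"{count}{match.group(2)}"

    return COUNT_PATTERN.sub(_replace, text), stale


if __name__ == "__main__":
    updated, stale = sync_text("through 57 CLI commands. ## All 56 Commands", 64)
    print(stale)    # a check mode would report this and write nothing
    print(updated)  # an apply mode would write this back to the file
```

A pattern this loose would also rewrite unrelated prose such as "add 3 commands", so a real helper presumably needs per-file anchors or an allow-list.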

The test suite enforces this in CI:

- `tests/test_command_count.py::test_all_docs_in_sync_with_command_registry`
fails if any doc/metadata file mentions a stale count.
- `tests/test_integration.py::TestModuleStructure::test_command_registry_has_all_commands`
fails if `len(COMMAND_REGISTRY)` changes in either direction (strict `==`,
not `>=`).
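
The reason for strict equality can be sketched as follows. This is an illustrative stand-in, not the project's test: the `assert_registry_size` helper and the fake registry are hypothetical, and `EXPECTED_COMMAND_COUNT = 64` is taken from the count used elsewhere in this change.

```python
from typing import Mapping

EXPECTED_COMMAND_COUNT = 64  # the one intentionally hardcoded value


def assert_registry_size(
    registry: Mapping[str, object],
    expected: int = EXPECTED_COMMAND_COUNT,
) -> None:
    """Fail when the registry grows OR shrinks; `>=` would miss additions."""
    actual = len(registry)
    if actual != expected:
        raise AssertionError(
            f"COMMAND_REGISTRY has {actual} commands, expected exactly {expected}; "
            "update the sentinel and re-run the sync helper"
        )


def test_command_registry_has_all_commands() -> None:
    # Stand-in for the real registry so the sketch is self-contained.
    fake_registry = {f"cmd{i}": None for i in range(64)}
    assert_registry_size(fake_registry)
```

With a `>=` comparison, adding a 65th command would pass silently and the docs could drift again; with `==`, the contributor is forced to touch the sentinel and is reminded to run the sync step.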

### Adding New Language Parsers

1. **Check tree-sitter support** for the language
12 changes: 6 additions & 6 deletions README.md
@@ -2,12 +2,12 @@

> **Before an AI writes a new class/id/function, CodeLens must be checked. This is not optional.**

CodeLens is an AI-native code intelligence platform that gives AI agents **full visibility** into a codebase before they write any code. It prevents collision, overwrite of existing logic, security vulnerabilities, and dead code through 57 CLI commands, an MCP server with 55 tools (50 static + 5 dynamic), AST-based taint analysis, live CVE/OSV scanning, a plugin system with OWASP Top 10 + Compliance rule packs, a true graph data model (nodes + edges) for structural code queries, and token-efficient `--format compact` output for high-volume agent workflows (issue #17).
CodeLens is an AI-native code intelligence platform that gives AI agents **full visibility** into a codebase before they write any code. It prevents collision, overwrite of existing logic, security vulnerabilities, and dead code through 64 CLI commands, an MCP server with 62 tools (51 static + 11 dynamic), AST-based taint analysis, live CVE/OSV scanning, a plugin system with OWASP Top 10 + Compliance rule packs, a true graph data model (nodes + edges) for structural code queries, and token-efficient `--format compact` output for high-volume agent workflows (issue #17).

## Features

- **57 CLI Commands** — From basic scan/query to AST taint analysis, CVE scanning, plugin management, auto-fix, dashboards, CI/CD quality gates, and `graph-schema` for cheap graph-shape introspection
- **MCP Server (55 Tools)** — Native AI agent integration via Model Context Protocol (JSON-RPC over stdio), 50 statically-defined tools + 5 dynamically discovered, every tool accepts a `format` parameter (`json`/`markdown`/`ai`/`sarif`/`compact`)
- **64 CLI Commands** — From basic scan/query to AST taint analysis, CVE scanning, plugin management, auto-fix, dashboards, CI/CD quality gates, and `graph-schema` for cheap graph-shape introspection
- **MCP Server (62 Tools)** — Native AI agent integration via Model Context Protocol (JSON-RPC over stdio), 51 statically-defined tools + 11 dynamically discovered, every tool accepts a `format` parameter (`json`/`markdown`/`ai`/`sarif`/`compact`)
- **Token-Efficient Compact Output (v8.2, issue #17)** — `--format compact` produces single-char-key JSON with abbreviated types, omitted null fields, and relative paths — ~50% smaller than `json` on real trace output. Combined with `--limit`/`--offset` pagination, 5 structural queries now cost <5k tokens (down from 30-80k)
- **AST Taint Engine** — Tree-sitter based taint analysis with return-value propagation, scope hierarchy, and branch condition refinement
- **Live CVE/OSV Scanning** — Real-time vulnerability data from OSV.dev API with SQLite cache, 9 ecosystems (PyPI, npm, crates.io, Go, Maven, NuGet, RubyGems, Pub, Hex)
@@ -225,8 +225,8 @@ codelens/
│ ├── changelog.md # Older changelog (per-version highlights)
│ └── agent-integration.md # AI agent integration guide
├── scripts/
│ ├── codelens.py # CLI entry point (56 commands registered)
│ ├── mcp_server.py # MCP JSON-RPC server (54 tools)
│ ├── codelens.py # CLI entry point (64 commands registered)
│ ├── mcp_server.py # MCP JSON-RPC server (62 tools)
│ ├── registry.py # Registry read/write/build
│ ├── persistent_registry.py # SQLite persistent storage (WAL mode)
│ ├── base_parser.py # Base tree-sitter parser
@@ -283,7 +283,7 @@ codelens/
│ ├── plugin_system.py # Plugin system & marketplace
│ ├── pre_commit_hook.py # Git pre-commit hook integration
│ ├── utils.py # Shared utilities (version, helpers)
│ ├── commands/ # One file per CLI command (auto-registered, 57 commands incl. graph-schema)
│ ├── commands/ # One file per CLI command (auto-registered, 64 commands)
│ ├── formatters/ # Output formatters (markdown, sarif, compact)
│ ├── parsers/ # Tree-sitter + fallback parsers
│ │ ├── html_parser.py, css_parser.py, js_frontend_parser.py, js_backend_parser.py
12 changes: 6 additions & 6 deletions SKILL-QUICK.md
@@ -113,7 +113,7 @@ $CLI list --limit 5 --offset 10 --format compact # → paginated + co
| "Cross-file taint" | `dataflow` | `taint` (taint is single-file, AST-deep) |
| "Auto-fix issues" | `fix` | `check` (check just gates, doesn't fix) |

## All 56 Commands
## All 64 Commands

### Setup & Lifecycle (8+)
`init` · `scan [--incremental] [--max-files N] [--full]` · `validate` · `detect` · `watch [--debounce SECS] [--git-mode] [--interval SECS]` · `git-status` · `migrate` · `serve` · `lsp-status`
@@ -145,19 +145,19 @@ $CLI list --limit 5 --offset 10 --format compact # → paginated + co
### Tooling (1)
`plugin <install|list|search|update|info|validate>`

**Total: 60 commands** (56 original + `graph-schema` #17 + `architecture` #19 + `resolve-types` #13 + `git-status` #14; verified via `commands/__init__.py` auto-registration)
**Total: 64 commands** (auto-registered via `commands/__init__.py`; rerun `python3 scripts/sync_command_count.py --apply` after adding/removing a command)

## MCP Server (58 Tools)
## MCP Server (62 Tools)

Start the MCP server for AI agent integration:

```bash
python3 scripts/codelens.py serve
```

Exposes 58 tools as `codelens_<command>` (e.g., `codelens_query`, `codelens_taint`, `codelens_graph_schema`, `codelens_architecture`, `codelens_resolve_types`, `codelens_git_status`):
- 51 statically-defined tools (full JSON schemas in `mcp_server.py`) including `codelens_graph_schema` (#17), `codelens_architecture` (#19), `codelens_resolve_types` (#13), and `codelens_git_status` (#14)
- 7 dynamically-discovered tools (`benchmark`, `dashboard`, `history`, `lsp-status`, `migrate`, `diff`, `resolve-types`)
Exposes 62 tools as `codelens_<command>` (e.g., `codelens_query`, `codelens_taint`, `codelens_graph_schema`, `codelens_architecture`, `codelens_resolve_types`, `codelens_git_status`):
- 51 statically-defined tools (full JSON schemas in `mcp_server.py`)
- 11 dynamically-discovered tools (auto-discovered from `COMMAND_REGISTRY`; long-running `watch` and `serve` are excluded)
- Every tool accepts a `format` parameter (`json`/`markdown`/`ai`/`sarif`/`compact`). Use `format: "compact"` for token-efficient responses (~50% smaller than `json`).
- `watch` and `serve` itself are excluded (long-running)
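
The naming and exclusion rules in the list above can be sketched like this. The function names and the `EXCLUDED` constant are hypothetical, and the hyphen-to-underscore mapping is inferred from the examples (`graph-schema` becomes `codelens_graph_schema`); the real discovery logic in `mcp_server.py` may differ.

```python
from typing import Iterable, List, Set

# Long-running commands that are not exposed as tools (per the list above).
EXCLUDED = frozenset({"watch", "serve"})


def tool_name(command: str) -> str:
    """Map a CLI command name to its assumed MCP tool name."""
    return "codelens_" + command.replace("-", "_")


def dynamic_tools(commands: Iterable[str], static_tools: Set[str]) -> List[str]:
    """Tools discovered from the registry that have no static definition."""
    return sorted(
        tool_name(c)
        for c in commands
        if c not in EXCLUDED and tool_name(c) not in static_tools
    )


if __name__ == "__main__":
    registry = ["query", "graph-schema", "watch", "serve", "benchmark", "lsp-status"]
    static = {"codelens_query", "codelens_graph_schema"}
    print(dynamic_tools(registry, static))
```

The counts quoted above are consistent with this model: 64 commands minus the 2 excluded ones gives 62 tools, which matches 51 static plus 11 dynamic.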

4 changes: 2 additions & 2 deletions SKILL.md
@@ -1,10 +1,10 @@
---
name: codelens
description: >
CodeLens — AI-Native Code Intelligence. 56 commands for AI-powered code analysis,
CodeLens — AI-Native Code Intelligence. 64 commands for AI-powered code analysis,
security auditing, quality scoring, AST-based taint analysis, live CVE scanning,
and pre-write safety checks. Supports 28+ languages with tree-sitter + regex
fallback parsing. MCP server exposes 54 tools for AI agent integration.
fallback parsing. MCP server exposes 62 tools for AI agent integration.
For quick command reference with validated output schemas, see SKILL-QUICK.md.
For version history, see CHANGELOG.md.
---
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -5,7 +5,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "codelens"
version = "8.2.0"
description = "Live Codebase Reference Intelligence — 45 commands for AI-powered code analysis, security auditing, and quality scoring"
description = "Live Codebase Reference Intelligence — 64 commands for AI-powered code analysis, security auditing, and quality scoring"
readme = "README.md"
license = {text = "MIT"}
requires-python = ">=3.8"
30 changes: 29 additions & 1 deletion scripts/codelens.py
@@ -769,8 +769,28 @@ def compute_confidence_distribution_flat(result: Dict[str, Any]) -> Dict[str, in
# ─── CLI Entry Point ──────────────────────────────────────────

def main():
    # Command count is derived from COMMAND_REGISTRY at runtime so it can never
    # drift from the actual number of registered commands (issue #38). The
    # `--command-count` flag below prints it for scripts / CI; the description
    # also includes it so `--help` is self-documenting.
    from commands import COMMAND_REGISTRY as _cli_registry_for_count
    _command_count = len(_cli_registry_for_count)

    parser = argparse.ArgumentParser(
        description=f"CodeLens v{CODELENS_VERSION} — Live Codebase Reference Intelligence (Tree-sitter Edition)"
        description=(
            f"CodeLens v{CODELENS_VERSION} — Live Codebase Reference Intelligence "
            f"(Tree-sitter Edition). {_command_count} commands available; run "
            f"`python3 scripts/codelens.py --command-count` to print just the count."
        )
    )
    # Quick introspection flag — prints the runtime command count and exits.
    # Used by tests / CI / sync_command_count.py to verify the registry size.
    parser.add_argument(
        "--command-count",
        action="store_true",
        default=False,
        help="Print the runtime command count (len(COMMAND_REGISTRY)) and exit. "
             "Single source of truth for issue #38 reconciliation.",
    )
    subparsers = parser.add_subparsers(dest="command", help="Available commands")

@@ -836,6 +856,14 @@ def main():
print(json.dumps({"status": "error", "error": str(e)}, indent=2))
sys.exit(0)

    # Handle --command-count as a special top-level flag (issue #38):
    # prints just the runtime command count and exits. Used by tests, CI,
    # and sync_command_count.py to verify the registry size without parsing
    # the full --help output.
    if "--command-count" in sys.argv:
        print(_command_count)
        sys.exit(0)

    # Pre-parse to capture global flags before subparser overwrites them
    global_format = None
    global_top = None
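
The hunk above handles `--command-count` by scanning `sys.argv`, which also matches the string when it appears as an argument value to a subcommand. One alternative, sketched here under the assumption that the flag is only meant to work at the top level, is a custom `argparse.Action` that prints and exits during parsing, in the style of the built-in `version` action. `PrintCountAction` and `build_parser` are hypothetical names, not part of this change.

```python
import argparse
from typing import Any, Mapping, Optional, Sequence


class PrintCountAction(argparse.Action):
    """Print a precomputed count and exit as soon as the flag is parsed."""

    def __init__(self, option_strings: Sequence[str], dest: str, count: int, **kwargs: Any) -> None:
        # nargs=0 makes this a bare flag that consumes no value.
        super().__init__(option_strings, dest, nargs=0, **kwargs)
        self.count = count

    def __call__(
        self,
        parser: argparse.ArgumentParser,
        namespace: argparse.Namespace,
        values: Any,
        option_string: Optional[str] = None,
    ) -> None:
        print(self.count)
        parser.exit(0)


def build_parser(registry: Mapping[str, object]) -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="codelens")
    parser.add_argument(
        "--command-count",
        action=PrintCountAction,
        count=len(registry),  # extra keyword is forwarded to the action class
        help="Print the runtime command count and exit.",
    )
    return parser


if __name__ == "__main__":
    # Demo with a two-entry stand-in registry.
    build_parser({"scan": None, "query": None}).parse_args(["--command-count"])
```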
2 changes: 1 addition & 1 deletion scripts/graph_model.py
@@ -13,7 +13,7 @@
- New tables `graph_nodes` and `graph_edges` are additive (prefixed `graph_`
to avoid colliding with any existing table name).
- The flat registry tables and JSON files are untouched.
- All 56 existing CLI commands continue to work unchanged.
- All 64 existing CLI commands continue to work unchanged.

Schema:
graph_nodes(
2 changes: 1 addition & 1 deletion scripts/mcp_server.py
@@ -4,7 +4,7 @@

Implements the MCP specification (2025-03-26) over stdio (JSON-RPC 2.0).
Provides persistent server mode with in-memory registry caching, sub-millisecond
query latency after initial scan, and automatic tool discovery for all 45+ CodeLens commands.
query latency after initial scan, and automatic tool discovery for all 64 CodeLens commands.

Usage:
python3 codelens.py serve # Start MCP server (stdio transport)