Token reduction + routing reliability, provider/cost fixes, native ABI guard by vishalveerareddy123 · Pull Request #73 · Fast-Editor/Lynkr

vishalveerareddy123 · 2026-06-09T04:41:36Z

Summary

Token-reduction features plus reliability fixes uncovered while running Claude Code through Lynkr end-to-end.

Token reduction (A1–A5, inspired by 9router)

RTK compressors: grep (per-file match cap), dedup_log (collapse repeated lines), smart_truncate (head/tail fallback).
Request bypass: canned responses for Claude CLI housekeeping (Warmup / count / title-prefill / isNewTopic) — no provider round-trip. CLI-only, always on.
MCP tool dedup: strip built-in WebSearch/WebFetch when Exa/Tavily MCP tools are present. Always on.
Caveman: opt-in terse-output system-prompt injector (off by default).
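
As a rough illustration of two of the compressors named above, the sketch below collapses consecutive repeated lines and falls back to a head/tail cut for long output. It is not the code in tool-result-compressor; the function names `dedupLog` and `smartTruncate`, the `(xN)` suffix, the omission marker, and the line limits are all assumptions made for this example.

```javascript
// Hypothetical sketch of dedup_log: collapse runs of identical consecutive lines.
function dedupLog(text) {
  const out = [];
  let prev = null;
  let count = 0;
  const flush = () => {
    if (prev === null) return;
    out.push(count > 1 ? `${prev} (x${count})` : prev);
  };
  for (const line of text.split("\n")) {
    if (line === prev) {
      count += 1;
      continue;
    }
    flush();
    prev = line;
    count = 1;
  }
  flush();
  return out.join("\n");
}

// Hypothetical sketch of smart_truncate: keep the head and the tail, drop the middle.
function smartTruncate(text, maxLines = 40, headLines = 20) {
  const lines = text.split("\n");
  if (lines.length <= maxLines) return text;
  const tailLines = maxLines - headLines;
  const omitted = lines.length - maxLines;
  return [
    ...lines.slice(0, headLines),
    `... [${omitted} lines omitted] ...`,
    ...lines.slice(lines.length - tailLines),
  ].join("\n");
}
```

Keeping both ends of the output is a common choice for build and test logs, where the command banner sits at the top and the failure summary at the bottom.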

Routing reliability

Pass client tools through in client/passthrough mode — previously the model's tool calls were dropped as "hallucinated", stalling agentic sessions.
De-escalate risk false-positives: drop over-broad session/token substring keywords that forced ordinary requests to COMPLEX.
Session→provider affinity: keep a tool-bearing conversation on one provider so tool_call_id linkage doesn't break (fixes Azure/Moonshot 400s).
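
To see why the substring keywords caused false positives, consider the minimal sketch below. The keyword names come from this PR; the `riskHits` helper and the exact shape of the lists are assumptions for illustration, not the risk-analyzer's real structure.

```javascript
// Illustrative keyword lists: BEFORE includes the two over-broad entries,
// AFTER keeps only the secret-oriented ones mentioned in this PR.
const BEFORE = ["session", "token", "secret", "credential", "api-key"];
const AFTER = ["secret", "credential", "api-key"];

// Hypothetical helper: return every keyword that appears as a substring.
function riskHits(text, keywords) {
  const lower = text.toLowerCase();
  return keywords.filter((k) => lower.includes(k));
}

const benign = "refactor src/sessions/store.js and tokenizer.js";
console.log(riskHits(benign, BEFORE)); // matches on "session" and "token"
console.log(riskHits(benign, AFTER)); // no matches
console.log(riskHits("rotate the api-key in the secret store", AFTER)); // still flagged
```

With the broad list, an ordinary refactoring request trips two keywords; with the narrowed list it trips none, while a request that really concerns secrets is still caught.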

Providers, cost & telemetry

Moonshot: remap deprecated moonshot-v1-auto; pin temperature:1 + top_p:0.95 for kimi-k2.x thinking models.
OpenAI-format converter: replace empty-string content with a space (Kimi "tokenization failed").
Telemetry: request_text/response_text columns (+ migration), captured text + per-request cost_usd on success/fallback — builds the routing-ML training corpus.
Deterministic cost resolution: replace the registry's bidirectional fuzzy substring match (confidently-wrong prices) with an ordered ladder — override → exact → prefix-strip → alias → date-normalize → longest-prefix → unknown. Unknown models record cost_usd=null (+ warn) instead of a guess. Adds MODEL_PRICE_OVERRIDES.
Native ABI guard: postinstall rebuilds native deps on a Node-version mismatch (better-sqlite3 was silently disabling telemetry/sessions/memory after a Node bump).
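
The two request-shaping fixes (pinned sampling parameters and the empty-content replacement) could be sketched as follows. This is an assumption-laden example: the function name `normalizeRequest` and the `/^kimi-k2/` model test are invented here, and the real code is split across the Moonshot provider and the OpenAI-format converter.

```javascript
// Hypothetical normalizer combining both fixes on an OpenAI-format request body.
function normalizeRequest(body) {
  const messages = body.messages.map((m) =>
    // An empty string is replaced with a single space to avoid "tokenization failed".
    m.content === "" ? { ...m, content: " " } : m
  );
  const out = { ...body, messages };
  // Assumed model-name check; the PR only says "kimi-k2.x thinking models".
  if (/^kimi-k2/.test(String(out.model))) {
    out.temperature = 1;
    out.top_p = 0.95;
  }
  return out;
}
```

The function returns a new object rather than mutating its input, so the caller's original request stays available for fallback to another provider.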

Dashboard

Configured Providers now derives from TIER_* ∩ configured credentials, shows per-provider tiers, and flags tiers pointing at an unconfigured provider.

Tests

New: token-reduction.test.js (17), session-affinity.test.js, model-registry-cost.test.js; all in test:unit.
Verified live end-to-end (200, SIMPLE→ollama, cost + text recorded in telemetry).

Follow-up (not in this PR)

Trivial-turn fast path: unwrap the <system-reminder> scaffolding before the greeting/force-local check so a wrapped "Hi" routes SIMPLE instead of COMPLEX.
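
One possible shape for that follow-up is sketched below. Since it is not part of this PR, everything here is an assumption: the function names, the reading of "unwrap" as removing whole `<system-reminder>` blocks, and the tiny greeting pattern.

```javascript
// Hypothetical: drop <system-reminder>...</system-reminder> blocks before classification.
function unwrapSystemReminders(text) {
  return text.replace(/<system-reminder>[\s\S]*?<\/system-reminder>/g, "").trim();
}

// Hypothetical greeting check applied to the unwrapped text.
function isTrivialGreeting(text) {
  return /^(hi|hello|hey)[.!]?$/i.test(unwrapSystemReminders(text));
}

const wrapped = "<system-reminder>project context here</system-reminder>\nHi";
console.log(isTrivialGreeting(wrapped));
```

The point of the ordering is that the check runs on what the user typed, not on the scaffolding the client wraps around it.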

🤖 Generated with Claude Code

…aveman

Port category-A token-reduction features inspired by 9router:

- tool-result-compressor: add grep (per-file match cap), dedup_log (collapse repeated lines), and smart_truncate (head/tail fallback) compressors.
- bypass: short-circuit Claude CLI housekeeping (Warmup, count, title prefill, isNewTopic title extraction) with a canned response, so there is no provider round-trip. Always on; CLI-only so real work is never affected.
- tool-dedup: strip built-in WebSearch/WebFetch when an equivalent MCP tool (Exa/Tavily) is present. Always on.
- caveman: opt-in terse-output system-prompt injector (lite/full/ultra) to reduce output tokens; off by default.

Adds test/token-reduction.test.js (17 cases).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
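
The tool-dedup rule described in this commit can be pictured with a short sketch. The `dedupTools` name and the way an Exa/Tavily MCP tool is recognised (a case-insensitive `mcp__` name containing `exa` or `tavily`) are assumptions; the actual detection logic may differ.

```javascript
const BUILT_IN_WEB_TOOLS = ["WebSearch", "WebFetch"];

// Hypothetical: drop the built-in web tools only when an equivalent MCP tool is present.
function dedupTools(tools) {
  const hasMcpWebTool = tools.some((t) => /^mcp__.*(exa|tavily)/i.test(t.name));
  if (!hasMcpWebTool) return tools;
  return tools.filter((t) => !BUILT_IN_WEB_TOOLS.includes(t.name));
}
```

Each tool definition sent to the provider costs input tokens on every request, so removing a redundant pair shrinks the prompt without changing what the model can do.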

…er affinity

- risk-analyzer: remove the over-broad 'session' and 'token' substring keywords that force-escalated ordinary requests to COMPLEX (they matched benign paths like src/sessions/* and tokenizer.js). Real secrets stay covered by secret/credential/api-key. Genuine auth/security paths still flag high.
- session-affinity: new module + wiring in determineProviderSmart. When a conversation already carries tool_use/tool_result history, reuse the provider it first routed to, so tool_call_id linkage doesn't break across providers (the Azure/Moonshot 400 "tool_call_id not found" errors). Fresh turns still route per-tier.

Adds test/session-affinity.test.js.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
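
A minimal sketch of the affinity rule, assuming Anthropic-style content blocks and an in-memory map keyed by session id. `createAffinity`, `hasToolHistory`, and the `routeByTier` callback are hypothetical names; the real module is wired into determineProviderSmart and may store state differently.

```javascript
// Hypothetical: true when any message carries tool_use or tool_result blocks.
function hasToolHistory(messages) {
  return messages.some(
    (m) =>
      Array.isArray(m.content) &&
      m.content.some((b) => b.type === "tool_use" || b.type === "tool_result")
  );
}

// Hypothetical factory so each caller (or test) gets its own pin table.
function createAffinity() {
  const pinned = new Map(); // sessionId -> provider the session first routed to
  return function pickProvider(sessionId, messages, routeByTier) {
    const first = pinned.get(sessionId);
    // Tool-bearing conversations stay on the provider they started with.
    if (first !== undefined && hasToolHistory(messages)) return first;
    const provider = routeByTier(messages);
    if (first === undefined) pinned.set(sessionId, provider);
    return provider;
  };
}
```

Turns without tool history still go through normal tier routing, which matches the "fresh turns still route per-tier" behaviour described above.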

- Gate smart tool selection to server mode only. In client/passthrough mode the client (Claude Code) owns tool execution, so stripping its tools made the model emit calls for removed tools that were then dropped as "hallucinated", stalling the session. Tools now pass through intact.
- Wire request bypass (early), MCP tool dedup, and caveman injection into processMessage / runAgentLoop.
- Thread session id (_sessionId) for provider affinity.
- config: add caveman flags (CAVEMAN_ENABLED/LEVEL).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…I guard

Providers:

- Moonshot: remap deprecated moonshot-v1-auto (400 "tokenization failed") to a fixed model; pin temperature=1 and top_p=0.95 for kimi-k2.x thinking models (they reject other values).
- openrouter-utils: replace empty-string message content with a space so Kimi/OpenAI-compatible providers don't 400 on "tokenization failed".

Telemetry / cost:

- telemetry: add request_text + response_text columns (+ migration for existing DBs); capture prompt/response text (truncated) and per-request cost_usd (model-registry pricing x tokens) on success + fallback paths. Builds the training corpus for the routing ML models.

Native modules:

- scripts/check-native.js + postinstall: detect a better-sqlite3 ABI mismatch (Node upgrade) and rebuild native optional deps. Best-effort, never fails install. Fixes telemetry/sessions/memory silently going dark after a Node bump.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
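
The detection half of the native guard might look roughly like the sketch below. It is not the contents of scripts/check-native.js: `checkNative` and its injected `load` callback are hypothetical, and the example only classifies the failure; it does not run a rebuild. It relies on Node reporting ABI mismatches with the `ERR_DLOPEN_FAILED` code and a message that mentions `NODE_MODULE_VERSION`.

```javascript
// Hypothetical probe: try to load a native module and classify any failure.
// `load` is injected so the check can be exercised without the real dependency.
function checkNative(load) {
  try {
    load();
    return { ok: true, needsRebuild: false };
  } catch (err) {
    const abiMismatch =
      err.code === "ERR_DLOPEN_FAILED" ||
      /NODE_MODULE_VERSION/.test(String(err && err.message));
    // Best-effort: report the problem instead of throwing, so install never fails.
    return { ok: false, needsRebuild: abiMismatch };
  }
}
```

In a postinstall script the callback would typically be `() => require("better-sqlite3")`, with a rebuild attempted only when `needsRebuild` is true.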

Derive the Configured Providers list from TIER_* + MODEL_PROVIDER intersected with configured credentials, instead of listing every provider that merely has an endpoint set (which always showed llamacpp/lmstudio via their default endpoints). Show the tiers each provider serves, and flag tiers pointing at a provider with no credentials. Also documents the caveman env flags in .env.example.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
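
The intersection described here can be sketched as below, assuming the TIER_* settings have already been parsed into a plain `{ tier: provider }` object and the credentialed providers into a `Set`. The function name, the input shapes, and the returned structure are all assumptions for illustration.

```javascript
// Hypothetical: group tiers by provider, keeping only providers with credentials,
// and collect tiers that point at a provider without any.
function summarizeProviders(tiers, credentialed) {
  const providers = {};
  const unconfigured = [];
  for (const [tier, provider] of Object.entries(tiers)) {
    if (credentialed.has(provider)) {
      (providers[provider] = providers[provider] || []).push(tier);
    } else {
      unconfigured.push({ tier, provider });
    }
  }
  return { providers, unconfigured };
}

const summary = summarizeProviders(
  { SIMPLE: "ollama", MEDIUM: "moonshot", COMPLEX: "azure" },
  new Set(["ollama", "azure"])
);
console.log(summary);
```

A provider that has a default endpoint but is referenced by no tier never appears in the result, which is the behaviour change this commit describes.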

…ing match

Replace the bidirectional `includes()` fuzzy match (which returned confidently wrong prices for unrelated names) with a deterministic ladder:

override → exact → provider-prefix-strip → alias → date-normalize → longest-prefix (one-directional, length-bounded) → unknown

- Unknown models now return `{ ...DEFAULT_COST, unknown: true }` and computeCostUsd records cost_usd=null (+ warn once) instead of a fabricated guess, so the bandit/cost-optimizer aren't poisoned.
- Add MODEL_PRICE_OVERRIDES env so operators can pin prices the registry lacks.
- Each result carries a `resolution` tag for debuggability.

Adds test/model-registry-cost.test.js (incl. a regression test that a name containing a real key as a substring no longer false-matches).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
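
The ladder can be illustrated with a small self-contained sketch. The price table, the alias map, the contents of `DEFAULT_COST`, the per-million-token unit, and the `overrides` argument (standing in for parsed MODEL_PRICE_OVERRIDES, whose format is not shown in this PR) are placeholders invented for the example. The longest-prefix step approximates "length-bounded" by requiring a `-` boundary after the registry key.

```javascript
// Placeholder registry: made-up model names and prices (USD per 1M tokens).
const PRICES = {
  "model-a": { input: 1, output: 2 },
  "model-a-mini": { input: 0.1, output: 0.2 },
};
const ALIASES = { "a-latest": "model-a" };
const DEFAULT_COST = { input: 0, output: 0 };

function resolveCost(name, overrides = {}) {
  const tag = (cost, resolution) => ({ ...cost, resolution });
  if (overrides[name]) return tag(overrides[name], "override");
  if (PRICES[name]) return tag(PRICES[name], "exact");
  // Strip a provider prefix such as "vendor/model".
  const bare = name.includes("/") ? name.slice(name.indexOf("/") + 1) : name;
  if (PRICES[bare]) return tag(PRICES[bare], "prefix-strip");
  if (ALIASES[bare]) return tag(PRICES[ALIASES[bare]], "alias");
  // Drop a trailing date suffix like -2025-01-31 or -20250131.
  const undated = bare.replace(/-\d{4}-?\d{2}-?\d{2}$/, "");
  if (PRICES[undated]) return tag(PRICES[undated], "date-normalize");
  // One-directional: a registry key must prefix the requested name, never the reverse.
  const key = Object.keys(PRICES)
    .filter((k) => undated.startsWith(k + "-"))
    .sort((a, b) => b.length - a.length)[0];
  if (key) return tag(PRICES[key], "longest-prefix");
  return { ...DEFAULT_COST, unknown: true, resolution: "unknown" };
}

function computeCostUsd(name, inputTokens, outputTokens, overrides) {
  const cost = resolveCost(name, overrides);
  if (cost.unknown) return null; // record "no price" rather than a guess
  return (inputTokens * cost.input + outputTokens * cost.output) / 1e6;
}
```

Under these placeholder tables, a name such as `my-model-a-fork` contains a registry key as a substring but resolves to unknown, which is the class of false match the regression test mentioned above guards against.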

…ures

- Benchmark Results: switch to the Lynkr-vs-LiteLLM numbers from BENCHMARK_REPORT.md (53% / 87.6% token reduction with cost, 171ms cache, tier-routing comparison, ~50% monthly cost projection) + run date/versions.
- Advanced Features: document always-on optimizations (smart tool selection, RTK compression, MCP tool dedup, request bypass), opt-in caveman terse mode, and the cost-tracking / MODEL_PRICE_OVERRIDES behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…+ TOON)

Document the in-process tool_result compression that was previously undocumented:

- RTK pattern compressors (test/git/grep/lint/build/container/json/dir/file + dedup_log/smart_truncate), tier-aware size thresholds, and tee lossless recovery via GET /tee/:id.
- TOON binary JSON encoding with its config (TOON_ENABLED/MIN_BYTES/FAIL_OPEN).

Adds an overview row and fixes the phase-count labels.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- Correct the curl installer's printed port from 8080 to 8081 (matches .env.example/README; the installer copies .env.example which uses 8081, so the old instructions were wrong for real installs).
- `npm install --production` → `--omit=dev` (keeps optionalDependencies; the postinstall native-ABI guard runs here).
- After install, probe better-sqlite3 and tell the user whether telemetry / memory / sessions are enabled, with a `npm run rebuild-native` hint if the native module isn't loadable. Lynkr still runs without it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Update the GitHub Pages homepage version indicators (JSON-LD softwareVersion and the hero badge) from 9.3.2 to 9.4.6 to match the latest npm release. The benchmark-provenance line is left at v9.3.2 since that records the version the published benchmarks were actually run on.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vishal veerareddy and others added 10 commits June 8, 2026 21:40

vishalveerareddy123 wants to merge 10 commits into main from feat/token-reduction-rtk
