filter reasoning tokens out of Edit/Apply output and chat utility calls #12420
Open
achmelev wants to merge 3 commits into
Conversation
Add renderChatMessageWithoutThinking() to messageContent.ts, which returns an empty string for role: "thinking" chunks instead of rendering them as plain text. Replace renderChatMessage() with renderChatMessageWithoutThinking() at all call sites where thinking content must not appear in output:
- streamLines() in diff/util.ts (Apply and Edit-with-rules path)
- _streamComplete() in all provider implementations (Edit-without-rules path)
- BaseLLM.chat() in llm/index.ts (title generation, repo map summarisation, context retrieval tool selection, next-edit prediction, conversation compaction)
…gacy slash commands
Extends the renderChatMessageWithoutThinking fix to six remaining call sites: recursiveStream (diff buffer), toMarkDown (history export), and the four built-in legacy slash commands (commit, review, draftIssue, onboard) that streamed chunks directly to the UI.
…er integrity
The buffer in recursiveStream must be a faithful copy of all model output so the recursive continuation path can resume from the correct position. Thinking content is already filtered downstream by streamLines() in diff/util.ts, so no thinking leaks to the diff pipeline or UI.
Description
Problem
When using a reasoning model (Anthropic extended thinking, DeepSeek R1, xAI Grok, or any provider that emits `delta.reasoning_content`/`thinking_delta`) for Edit (Cmd+I) or Apply, the model's internal reasoning is written into the file alongside the actual code changes.

Continue converts all provider-specific reasoning formats into an internal `ChatMessage` with `role: "thinking"`. The bug is that `renderChatMessage()` in `core/util/messageContent.ts` handles `"thinking"` in the same switch branch as `"assistant"`, as sketched below.
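A minimal sketch of the offending branch, using simplified types (the real `ChatMessage` and message-part definitions in `core/` are richer):

```typescript
// Simplified types for illustration only; the real definitions differ.
type TextPart = { type: "text"; text: string };
type MessagePart = TextPart | { type: "imageUrl"; imageUrl: { url: string } };
interface ChatMessage {
  role: "user" | "assistant" | "system" | "thinking";
  content: string | MessagePart[];
}

function renderChatMessage(message: ChatMessage): string {
  switch (message.role) {
    case "user":
    case "assistant":
    case "thinking": // BUG: thinking falls through and renders like assistant text
      return typeof message.content === "string"
        ? message.content
        : message.content
            .filter((part): part is TextPart => part.type === "text")
            .map((part) => part.text)
            .join("");
    default:
      return "";
  }
}
```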
This makes thinking content indistinguishable from code output at every point downstream. The existing post-processing filters in `streamDiffLines.ts` (`filterEnglishLinesAtStart`, `filterCodeBlockLines`, etc.) operate only on string content and have no awareness of message roles; they provide no meaningful protection. The `reasoning: false` LLM option is a best-effort hint to the provider and does not suppress chunks that have already arrived. The result: reasoning text enters the diff pipeline tagged as `DiffLine { type: "new" }` and is accepted into the file as code.

This is tracked in GitHub issue #11590 ("Edit Model failed to apply, but just paste its reasoning process to my code").
The same root cause affects several internal one-shot utility calls that use `BaseLLM.chat()`, a non-streaming wrapper around `streamChat()` that accumulates all chunks into a single completion string. Without filtering, thinking content is silently merged into that string and corrupts the results of chat title generation, repo map summarisation, context retrieval tool selection, next-edit prediction, and conversation compaction.

Fix
New function
A new `renderChatMessageWithoutThinking()` is added to `core/util/messageContent.ts`, sketched below. `renderChatMessage()` itself is left unchanged because `gui/src/pages/gui/Chat.tsx` legitimately reads thinking content to populate the `ThinkingBlockPeek` collapsible UI component.
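A sketch of the new function, under the simplified types above (the actual implementation may differ in details):

```typescript
// New: identical to renderChatMessage(), except thinking chunks render as "".
function renderChatMessageWithoutThinking(message: ChatMessage): string {
  if (message.role === "thinking") {
    return "";
  }
  return renderChatMessage(message);
}
```

Delegating to `renderChatMessage()` keeps the two renderers in lockstep for every other role.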
Call site changes — three categories

Category 1 — Streaming chat, line-oriented output (`core/diff/util.ts`)
`streamLines()` is the conversion point from `ChatMessage` chunks to strings for all Edit/Apply paths: the main edit flow (`streamDiffLines` → `recursiveStream` → `llm.streamChat()`), lazy apply (`streamLazyApply`), and lazy replace. A single fix here covers all three, as the sketch below shows.
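A simplified sketch of that conversion point (the real `streamLines()` signature and buffering logic differ):

```typescript
// Sketch: ChatMessage chunks become output lines. With the fix, thinking
// chunks render as "" and never contribute characters to any line.
async function* streamLines(
  chunks: AsyncIterable<ChatMessage>,
): AsyncGenerator<string> {
  let buffer = "";
  for await (const chunk of chunks) {
    buffer += renderChatMessageWithoutThinking(chunk); // was renderChatMessage(chunk)
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // hold back the trailing partial line
    yield* lines;
  }
  if (buffer.length > 0) {
    yield buffer; // flush the final unterminated line
  }
}
```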
Category 2 — Streaming complete, provider uses Chat API internally (9 provider files)
When Edit runs without rules, `llm.streamComplete()` is called. Each provider implements `_streamComplete()` by calling its Chat API internally and converting `ChatMessage` chunks to strings before yielding them. The fix is applied at that conversion point in each provider: `OpenAI.ts`, `Anthropic.ts`, `Bedrock.ts`, `Gemini.ts`, `VertexAI.ts`, `Cohere.ts`, `Cloudflare.ts`, `Flowise.ts`, `CustomLLM.ts`. The shared pattern is sketched below.
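A hypothetical provider illustrating the pattern, with simplified signatures (the real methods also take options and an abort signal):

```typescript
// Hypothetical ExampleProvider: _streamComplete() wraps the provider's chat
// streaming and converts each ChatMessage chunk to a string as it yields.
abstract class ExampleProvider {
  protected abstract _streamChat(
    messages: ChatMessage[],
  ): AsyncGenerator<ChatMessage>;

  protected async *_streamComplete(prompt: string): AsyncGenerator<string> {
    const messages: ChatMessage[] = [{ role: "user", content: prompt }];
    for await (const chunk of this._streamChat(messages)) {
      // One-line fix at the chunk-to-string boundary in each provider.
      yield renderChatMessageWithoutThinking(chunk); // was renderChatMessage(chunk)
    }
  }
}
```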
Category 3 — Non-streaming chat (`core/llm/index.ts`)
`BaseLLM.chat()` is a non-streaming wrapper around `streamChat()` used for one-shot utility calls (title generation, repo map summarisation, tool selection, next-edit prediction, conversation compaction). Without the fix, thinking content would be merged into the returned completion string, as the sketch below illustrates.
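A simplified sketch of the accumulation (the real `BaseLLM.chat()` is a class method that also handles logging and options):

```typescript
// Sketch: chunks collapse into one completion string, so any unfiltered
// thinking chunk would be silently concatenated into the returned result.
async function chat(
  streamChat: (messages: ChatMessage[]) => AsyncGenerator<ChatMessage>,
  messages: ChatMessage[],
): Promise<ChatMessage> {
  let completion = "";
  for await (const chunk of streamChat(messages)) {
    completion += renderChatMessageWithoutThinking(chunk); // was renderChatMessage(chunk)
  }
  return { role: "assistant", content: completion };
}
```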
Additional call sites (follow-up)
The same fix is applied to `core/util/historyUtils.ts` (`toMarkDown()`) and the four built-in legacy slash commands (`/commit`, `/review`, `/draftIssue`, `/onboard`), which all stream `ChatMessage` chunks directly to the chat UI; the shared shape is sketched below.
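The slash-command shape, sketched with a hypothetical `runLegacySlashCommand` (each real command builds its own prompt):

```typescript
// Sketch: legacy slash commands stream chunk text straight to the chat UI,
// so every yielded string must already have thinking content removed.
async function* runLegacySlashCommand(
  llm: { streamChat: (msgs: ChatMessage[]) => AsyncGenerator<ChatMessage> },
  prompt: string,
): AsyncGenerator<string> {
  const history: ChatMessage[] = [{ role: "user", content: prompt }];
  for await (const chunk of llm.streamChat(history)) {
    yield renderChatMessageWithoutThinking(chunk); // was renderChatMessage(chunk)
  }
}
```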
Call site intentionally not changed
`core/edit/recursiveStream.ts` accumulates streamed chunks into an internal buffer intended as a faithful reproduction of all model output, for use by a recursive continuation mechanism (currently inactive) that re-prompts the model when it hits its token limit mid-edit. Stripping thinking content from this buffer would corrupt the continuation context. Thinking is filtered downstream by `streamLines()` (Category 1 above), so nothing reaches the diff pipeline or the UI regardless.

AI Code Review
@continue-review

Checklist
Screen recording or screenshot
Not applicable
Tests
No tests added; existing tests pass (except tests requiring an API key, which could not be run)
Summary by cubic
Prevents reasoning/"thinking" tokens from being inserted into files and utility outputs by filtering them at render time. Fixes cases where Edit/Apply and internal helpers pasted model reasoning into code (fixes #11590).

- Added `renderChatMessageWithoutThinking()` to drop `role: "thinking"` chunks.
- Applied it in `streamLines()` (Edit/Apply), all providers' `_streamComplete()` paths, `BaseLLM.chat()`, `toMarkDown()`, and the legacy slash commands (`/commit`, `/review`, `/draftIssue`, `/onboard`).
- Left `renderChatMessage()` unchanged for the chat UI; left the `recursiveStream` buffer unfiltered to preserve continuation context.

Written for commit ef340a0.
renderChatMessageWithoutThinking()to droprole: "thinking"chunks.streamLines()(Edit/Apply), all providers’_streamComplete()paths,BaseLLM.chat(),toMarkDown(), and legacy slash commands (/commit,/review,/draftIssue,/onboard).renderChatMessage()unchanged for chat UI; leftrecursiveStreambuffer unfiltered to preserve continuation context.Written for commit ef340a0. Summary will update on new commits. Review in cubic