fix: handle large (1M) context windows for Opus 4.x and compaction by akhil29 · Pull Request #3519 · tailcallhq/forgecode

akhil29 · 2026-06-15T23:25:21Z

Two related fixes so that large context windows (e.g. Claude Opus 4.x at 1M tokens) are used fully, instead of being treated as 200K or having compaction capped to a small hardcoded threshold.

1. `get_context_length()` returned 200K for 1M-token Opus models

crates/forge_app/src/dto/anthropic/response.rs matched the prefix claude-opus-4- and returned 200_000, which incorrectly captured claude-opus-4-6/4-7/4-8 — all 1M-token models per crates/forge_repo/src/provider/provider.json (e.g. "id": "claude-opus-4-8", "context_length": 1000000).

Fix: add an explicit 1M branch for the claude-opus-4-6/4-7/4-8 prefixes before the generic claude-opus-4- 200K branch. Older Opus 4.x models (4, 4-1) remain at 200K. Added test_get_context_length_opus_1m_models.

Note: this get_context_length table is the fallback for the native Anthropic /models path; the vertex_ai_anthropic path already reads context_length from provider.json.

2. Default compaction `token_threshold` (100K) ignored large windows

The effective compaction trigger is min(configured_token_threshold, context_window * token_threshold_percentage) (crates/forge_domain/src/agent.rs). The embedded default token_threshold = 100000 (crates/forge_config/.forge.toml) made the trigger min(100K, 0.7 × 1M) = 100K on a 1M model — only ~10% of the window, firing compaction far too early and leaving ~900K tokens unused.

Fix:

Remove the hardcoded token_threshold = 100000 from the embedded default config.
In compaction_threshold, when token_threshold is unset, derive it from the model's context window (70%); when it is explicitly set, keep treating it as an absolute cap (min with the window-derived value) for safety headroom.

Resulting behavior:

Model context window	configured `token_threshold`	effective threshold
128K (codex-spark)	none	89.6K (70%)
200K	none	140K (70%)
1M (Opus)	none	700K (70%)
1M (Opus)	100K (explicit)	100K (respected as cap)

Updated test_compaction_threshold_uses_context_window_percentage_when_unset and added test_compaction_threshold_large_window_not_capped_to_hardcoded_default. Existing tests that explicitly set token_threshold are unaffected (they exercise the Some cap branch).

Testing

No Rust toolchain was available in my local environment to run cargo test, so I'm relying on CI. The changes are small and the affected unit tests were updated to match the new behavior.

CLAassistant · 2026-06-15T23:25:30Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Akhil Appana seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

amitksingh1490 · 2026-06-16T03:55:22Z

 message_threshold = 200
 on_turn_end = false
 retention_window = 6
-token_threshold = 100000


This should be kept same as user can configure it.

Good point — done. I've restored token_threshold = 100000 in crates/forge_config/.forge.toml so it stays user-configurable.

To still fix the large-window issue I made compact.token_threshold optional in compaction_threshold (crates/forge_domain/src/agent.rs):

When set (the shipped default 100K, or any user value) → treated as an absolute cap: min(token_threshold, 70% × context_window), preserving the small-window headroom safety.

When unset → derived purely from the context window (70%), so large windows (e.g. 1M Opus) aren't capped to a small hardcoded value.

So the default behavior is unchanged for everyone, the config knob is preserved, and users on large-context models can either raise token_threshold or unset it to get automatic window-based sizing. Force-pushed as e375e34.

Two related fixes so large context windows are used fully instead of being treated as 200K / capped to a small hardcoded compaction threshold. 1. get_context_length() returned 200K for 1M-token Opus models. The generic `claude-opus-4-` prefix branch captured claude-opus-4-6/4-7/4-8, which are 1M-token models. Add an explicit 1M branch before it. 2. Make the compaction `token_threshold` optional so it no longer forces a small cap on large windows. The configurable default is kept in crates/forge_config/.forge.toml (token_threshold = 100000) so users can still tune it. When it is set it is treated as an absolute cap (lower of it and 70% of the context window) preserving headroom on small windows; when it is unset the threshold is derived purely from the context window (70%), so large windows (e.g. 1M-token models) are not capped to a small hardcoded value. Fixes tailcallhq#3518

github-actions Bot added os: windows Windows-specific issue or feature. type: fix Iterations on existing features or infrastructure. labels Jun 15, 2026

amitksingh1490 reviewed Jun 16, 2026

View reviewed changes

akhil29 force-pushed the fix/large-context-window-handling branch from 9483f26 to e375e34 Compare June 16, 2026 05:51

[autofix.ci] apply automated fixes

a7d04fb

akhil29 mentioned this pull request Jun 16, 2026

User .forge.toml [compact] settings silently ignored: agent-priority merge overrides user config (token_threshold uncontrollable) #3526

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: handle large (1M) context windows for Opus 4.x and compaction#3519

fix: handle large (1M) context windows for Opus 4.x and compaction#3519
akhil29 wants to merge 2 commits into
tailcallhq:mainfrom
akhil29:fix/large-context-window-handling

akhil29 commented Jun 15, 2026

Uh oh!

CLAassistant commented Jun 15, 2026

Uh oh!

amitksingh1490 Jun 16, 2026

Uh oh!

akhil29 Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

akhil29 commented Jun 15, 2026

1. get_context_length() returned 200K for 1M-token Opus models

2. Default compaction token_threshold (100K) ignored large windows

Testing

Uh oh!

CLAassistant commented Jun 15, 2026

Uh oh!

amitksingh1490 Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

akhil29 Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

1. `get_context_length()` returned 200K for 1M-token Opus models

2. Default compaction `token_threshold` (100K) ignored large windows