diff --git a/docs/llmservice/models/claude-fable-5.md b/docs/llmservice/models/claude-fable-5.md new file mode 100644 index 0000000..09e98d5 --- /dev/null +++ b/docs/llmservice/models/claude-fable-5.md @@ -0,0 +1,53 @@ +# Claude Fable 5 + +## Overview + +Claude Fable 5 is a high-capability Anthropic model available on B.AI for advanced reasoning, coding, long-context analysis, and agentic workflows. It is designed for complex tasks that require sustained context, tool-assisted execution, and high-quality structured outputs. Specific capabilities, context limits, tool support, and availability may vary by B.AI model catalog and platform configuration. + +## Key Features + +* **Advanced Reasoning**: Suitable for complex analytical, technical, and professional knowledge tasks. +* **Software Engineering Workflows**: Designed for coding assistance, debugging, refactoring, code review, and multi-step implementation planning. +* **Long-Context Tasks**: Supports extended analysis across large codebases, long documents, and multi-turn work sessions when enabled by the platform configuration. +* **Agentic and Tool-Assisted Workflows**: Suitable for workflows that rely on tool use, function calling, code execution, MCP, or compatible agent environments. +* **Multimodal Understanding**: Supports text and image input for document, screenshot, chart, and diagram understanding where available. + +## Best Use Cases + +* **Complex Software Engineering**: Large feature work, repository-scale refactors, migration planning, bug investigation, and code review. +* **Extended Agentic Workflows**: Multi-step tasks that require planning, tool use, verification, and sustained context over longer sessions. +* **Research and Knowledge Work**: Analysis and synthesis across technical documents, legal or financial materials, and structured research sources. 
+* **Visual Document Analysis**: Understanding screenshots, diagrams, charts, PDFs, and other image-based materials when supported by the workflow. + +## Capabilities and Limitations + +| Capability | Description | +| :----------------- | :-------------------------------------------------------------------------------------------------- | +| **Reasoning** | Advanced reasoning for complex professional and technical tasks | +| **Coding** | Strong coding, debugging, refactoring, and code review capabilities | +| **Agentic** | Suitable for long-running tool workflows and multi-step agent tasks | +| **Computer Use** | Can support browser and desktop interaction through compatible tools and environments | +| **Multimodal** | Text and image input; text output | +| **Context Window** | Up to 1,000,000 tokens, subject to platform configuration | +| **Max Output** | Up to 128,000 tokens, subject to platform configuration | +| **Tool Use** | Function calling, code execution, MCP support, adaptive thinking, and compatible agent workflows | +| **Multilingual** | Strong multilingual performance across major world languages | + +### Known Limitations + +* Specific capability availability may depend on the B.AI integration, Anthropic platform support, plan settings, and rollout status. +* Web access, code execution, computer use, and external actions require compatible tools or integrations. +* Image input is supported, but native audio or video input is not listed for this model. +* Public evaluations, third-party comparisons, policy behavior, and implementation details may change over time, so they are not treated as fixed guarantees in this documentation. 
+ +## Credits Usage + +| Model | Input (Credits/Token) | Cache Write (Credits/Token) | Cache Read (Credits/Token) | Output (Credits/Token) | Web Search (Credits/Use) | Billing Notes | +| :--- | --------------------: | --------------------------: | -------------------------: | ---------------------: | -----------------------: | :--- | +| **Claude Fable 5** | `10.00` | `12.50` | `1.00` | `50.00` | `10,000` | - | + +:::info Pricing note +Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records. +::: + +* **Prompt caching**: Cache writes are charged at 1.25x base input price for the 5-minute TTL option, or 2x base input price for the 1-hour TTL option. Cache reads are charged at 0.1x base input price. Prompt caching requires a minimum of 1,024 tokens. diff --git a/docs/llmservice/models/claude-sonnet-5.md b/docs/llmservice/models/claude-sonnet-5.md new file mode 100644 index 0000000..51ab466 --- /dev/null +++ b/docs/llmservice/models/claude-sonnet-5.md @@ -0,0 +1,50 @@ +# Claude Sonnet 5 + +## Overview + +Claude Sonnet 5, released by Anthropic on June 30, 2026, is the next generation of the Sonnet family and a drop-in upgrade for Claude Sonnet 4.6. It is designed for stronger agentic behavior, coding, tool use, computer-use workflows, and knowledge work at Sonnet-tier pricing. + +## Key Features + +* **Agentic Task Execution**: Designed to plan, use tools such as browsers and terminals, and complete multi-step work more reliably than Claude Sonnet 4.6. +* **Adaptive Thinking by Default**: Requests run with adaptive thinking unless `thinking: {type: "disabled"}` is passed; the `effort` parameter controls the capability, latency, and token-spend tradeoff. 
+* **1M Context Window**: Supports a 1M-token context window by default, with no smaller context variant and no long-context surcharge. +* **Broad Platform Availability**: Available through the Claude API, Claude Code, Claude Platform on AWS, Amazon Bedrock, Google Cloud, Microsoft Foundry preview, and Claude consumer plans. + +## Best Use Cases + +* **Production Agent Workflows**: Multi-step automation, agentic search, tool-heavy reasoning, browser/terminal workflows, and long-running delegated tasks. +* **Software Engineering**: Coding, debugging, refactoring, code review, test-fix loops, and brownfield repository work where follow-through matters. +* **High-Volume Knowledge Work**: Research, analysis, structured extraction, business operations, legal research, customer workflows, and internal productivity tools that need a balance of capability and cost. + +## Capabilities and Limitations + +| Capability | Description | +| :------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| **Reasoning** | Adaptive thinking is on by default; `effort` supports `low`, `medium`, `high`, `xhigh`, and `max`, with `high` as the default. Anthropic reports stronger agentic, coding, and knowledge-work performance than Sonnet 4.6. | +| **Creative Writing** | Supports general text generation and long-form writing; Anthropic recommends re-evaluating style prompts because prose style may shift versus earlier Sonnet models. | +| **Multimodal** | Text and image (vision) input; text output. Native audio/video input or generation is not listed. | +| **Response Speed** | Rated "Fast" for comparative latency in Anthropic's model table. Lower effort can reduce latency and token usage; higher effort increases thinking/tool-use depth. | +| **Context Window** | 1M tokens.
| +| **Max Output** | 128K tokens. | +| **Tool Use** | Supports the same tools and platform features as Claude Sonnet 4.6, except that Priority Tier is not available; tool use is triggered more readily at higher effort levels. | +| **Multilingual** | Official docs state multilingual support across current Claude models, but do not publish a separate Sonnet 5 language benchmark. | + +### Known Limitations + +* Manual extended thinking (`thinking: {type: "enabled", budget_tokens: N}`) has been removed and returns a 400 error; use adaptive thinking with `effort` instead. +* Non-default sampling parameters (`temperature`, `top_p`, `top_k`) return a 400 error; use system instructions for tone and style control. +* Assistant message prefilling remains unsupported and returns a 400 error. +* The new tokenizer produces approximately 30% more tokens than the Claude Sonnet 4.6 tokenizer for the same text, so token budgets, context usage, and equivalent-request costs should be remeasured. +* Cybersecurity safeguards are enabled by default; high-risk or prohibited cybersecurity requests may be refused with `stop_reason: "refusal"`. + +## Credits Usage + +| Model | Pricing Period | Input (Credits/Token) | 5m Cache Write (Credits/Token) | 1h Cache Write (Credits/Token) | Cache Read (Credits/Token) | Output (Credits/Token) | Web Search (Credits/Use) | +| :--- | :--- | --------------------: | -----------------------------: | -----------------------------: | -------------------------: | ---------------------: | -----------------------: | +| **Claude Sonnet 5** | Through Aug 31, 2026 | `2.00` | `2.50` | `4.00` | `0.20` | `10.00` | `10,000` | +| **Claude Sonnet 5** | From Sep 1, 2026 | `3.00` | `3.75` | `6.00` | `0.30` | `15.00` | `10,000` | + +:::info Pricing note +The main pricing table shows the currently effective standard reference price. For Claude Sonnet 5, the current standard reference price applies through August 31, 2026; the higher rates listed in the table above take effect on September 1, 2026.
Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records. +::: diff --git a/docs/llmservice/pricing-and-usage.md b/docs/llmservice/pricing-and-usage.md index 6115ff8..484059b 100644 --- a/docs/llmservice/pricing-and-usage.md +++ b/docs/llmservice/pricing-and-usage.md @@ -33,10 +33,12 @@ The platform uses a unified Credits system to measure and settle usage across al | GPT-5 Mini | 0.25 | 0.25 | 0.025 | 2.00 | 10,000 | | GPT-5.4 Nano | 0.20 | 0.20 | 0.02 | 1.25 | 10,000 | | GPT-5 Nano | 0.05 | 0.05 | 0.005 | 0.40 | - | +| Claude Fable 5 | 10.00 | 12.50 | 1.00 | 50.00 | 10,000 | | Claude Opus 4.8 | 5.00 | 6.25 | 0.50 | 25.00 | 10,000 | | Claude Opus 4.7 | 5.00 | 6.25 | 0.50 | 25.00 | 10,000 | | Claude Opus 4.6 | 5.00 | 6.25 | 0.50 | 25.00 | 10,000 | | Claude Opus 4.5 | 5.00 | 6.25 | 0.50 | 25.00 | 10,000 | +| Claude Sonnet 5 | 2.00 | 2.50 | 0.20 | 10.00 | 10,000 | | Claude Sonnet 4.6 | 3.00 | 3.75 | 0.30 | 15.00 | 10,000 | | Claude Sonnet 4.5 | 3.00 | 3.75 | 0.30 | 15.00 | 10,000 | | Claude Haiku 4.5 | 1.00 | 1.25 | 0.10 | 5.00 | 10,000 | @@ -44,6 +46,10 @@ The platform uses a unified Credits system to measure and settle usage across al | Gemini 3.5 Flash | 1.50 | 1.50 | 0.15 | 9.00 | 14,000 | | Gemini 3 Flash | 0.50 | 0.50 | 0.05 | 3.00 | 14,000 | +:::caution Main table scope +The main pricing table shows the currently effective standard reference price for each model. The `Cache Write` column represents the billing rate when cache writing occurs; it does not imply a unified cache TTL across all models. Cache behavior, retention time, and extended caching options may vary by model provider. If a model has special caching rules, 1-hour cache write pricing, or time-based pricing, please refer to the corresponding model detail page. 
+::: + :::info Pricing note Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records. ::: diff --git a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/claude-fable-5.md b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/claude-fable-5.md new file mode 100644 index 0000000..c355efe --- /dev/null +++ b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/claude-fable-5.md @@ -0,0 +1,53 @@ +# Claude Fable 5 + +## 概述 + +Claude Fable 5 是 B.AI 上可用的 Anthropic 高能力模型,面向复杂推理、代码任务、长上下文分析和 Agent 工作流。它适合需要持续上下文、工具辅助执行和高质量结构化输出的复杂任务。具体能力、上下文长度、工具支持和可用状态可能会随 B.AI 模型目录和平台配置调整。 + +## 核心特性 + +* **高级推理能力**:适合复杂分析、技术任务和专业知识工作。 +* **软件工程工作流**:面向代码辅助、调试、重构、代码审查和多步骤实现规划。 +* **长上下文任务**:在平台配置支持时,可用于大型代码库、长文档和多轮工作会话的持续分析。 +* **Agent 与工具辅助工作流**:适合依赖工具调用、函数调用、代码执行、MCP 或兼容 Agent 环境的工作流。 +* **多模态理解**:在可用场景下,支持文本和图像输入,可用于文档、截图、图表和技术示意图理解。 + +## 适用场景 + +* **复杂软件工程**:大型功能开发、仓库级重构、迁移规划、Bug 排查和代码审查。 +* **长时间 Agent 工作流**:需要规划、工具调用、验证和持续上下文保持的多步骤任务。 +* **研究与知识工作**:技术文档、法律或金融材料以及结构化研究资料的分析与综合。 +* **视觉文档分析**:在工作流支持时,可处理截图、图表、PDF 和其他图像型材料。 + +## 能力与限制 + +| 能力维度 | 说明 | +| :--- | :--- | +| **推理能力** | 适合复杂专业任务和技术任务的高级推理 | +| **编程能力** | 具备较强的编码、调试、重构和代码审查能力 | +| **Agent 能力** | 适合长时间工具调用工作流和多步骤 Agent 任务 | +| **计算机操作** | 可通过兼容工具和环境支持浏览器及桌面交互 | +| **多模态能力** | 支持文本和图像输入;输出为文本 | +| **上下文窗口** | 最高 1,000,000 tokens,具体以平台配置为准 | +| **最大输出** | 最高 128,000 tokens,具体以平台配置为准 | +| **工具调用** | 支持函数调用、代码执行、MCP、自适应思考和兼容 Agent 工作流 | +| **多语言能力** | 在主要世界语言上具备较强的多语言表现 | + +### 已知限制 + +* 具体能力可用性可能取决于 B.AI 集成、Anthropic 平台支持、套餐配置和功能上线状态。 +* 联网访问、代码执行、计算机操作和外部动作需要兼容工具或集成支持。 +* 支持图像输入,但该模型未标明原生音频或视频输入能力。 +* 公开评测、第三方对比、策略行为和实现细节可能随时间变化,因此本文档不将其作为固定承诺。 + +## 积分消耗 + +| 模型名称 | 输入 (Credits/Token) | Cache Write (Credits/Token) | Cache Read 
(Credits/Token) | 输出 (Credits/Token) | 网页搜索(Credits/次) | 计费说明 | +| :--- | --------------------: | --------------------------: | -------------------------: | -------------------: | ---------------------: | :--- | +| **Claude Fable 5** | `10.00` | `12.50` | `1.00` | `50.00` | `10,000` | - | + +:::info 价格说明 +文档价格为 B.AI 平台模型标准参考价,仅供基础计费说明使用。B.AI 可能会通过充值赠送及账户权益等方式,为用户提供更低的实际使用成本。具体价格、赠送积分及账户权益请以平台页面展示及最终账单为准。 +::: + +* **Prompt caching**:缓存写入按基础输入价格的 1.25x 计费(5 分钟 TTL),或按基础输入价格的 2x 计费(1 小时 TTL)。缓存读取按基础输入价格的 0.1x 计费。使用 Prompt caching 时,最低需要 1,024 tokens。 diff --git a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/claude-sonnet-5.md b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/claude-sonnet-5.md new file mode 100644 index 0000000..0d4feeb --- /dev/null +++ b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/claude-sonnet-5.md @@ -0,0 +1,50 @@ +# Claude Sonnet 5 + +## 概述 + +Claude Sonnet 5 是 Anthropic 于 2026 年 6 月 30 日发布的新一代 Sonnet 系列模型,可作为 Claude Sonnet 4.6 的直接升级版本。该模型面向更强的 Agent 行为、代码能力、工具调用、计算机使用工作流和知识工作场景,并保持 Sonnet 级别的价格定位。 + +## 核心特性 + +* **Agentic Task Execution**:面向规划、浏览器和终端等工具使用,以及更可靠的多步骤任务完成能力。 +* **默认自适应思考**:请求默认使用 adaptive thinking,除非传入 `thinking: {type: "disabled"}`;`effort` 参数用于控制能力、延迟和 token 消耗之间的权衡。 +* **1M 上下文窗口**:默认支持 1M tokens 上下文窗口,没有更小上下文变体,也没有长上下文额外费用。 +* **广泛平台可用性**:可通过 Claude API、Claude Code、AWS 上的 Claude Platform、Amazon Bedrock、Google Cloud、Microsoft Foundry preview 和 Claude 消费者计划使用。 + +## 适用场景 + +* **生产级 Agent 工作流**:多步骤自动化、Agentic Search、工具密集型推理、浏览器/终端工作流,以及长时间委托任务。 +* **软件工程**:代码生成、调试、重构、代码审查、测试修复循环,以及既有代码仓库维护。 +* **高频知识工作**:研究、分析、结构化抽取、业务运营、法律研究、客户工作流和内部生产力工具。 + +## 能力与限制 + +| 能力维度 | 说明 | +| :--- | :--- | +| **推理能力** | 默认启用 adaptive thinking;`effort` 支持 `low`、`medium`、`high`、`xhigh` 和 `max`,默认值为 `high`。Anthropic 表示其 Agent、代码和知识工作能力强于 Sonnet 4.6。 | +| **创意写作** | 支持通用文本生成和长文写作;Anthropic 建议重新评估样式提示词,因为相较早期 Sonnet 模型,文本风格可能发生变化。 | +| **多模态能力** | 
支持文本和图像输入、文本输出、多语言能力和视觉理解;未列出原生音频/视频输入或生成能力。 | +| **响应速度** | 在 Anthropic 模型表中被标记为 "Fast";较低 effort 可降低延迟和 token 使用,较高 effort 会增加思考和工具使用深度。 | +| **上下文窗口** | 1M tokens | +| **最大输出** | 128K tokens | +| **工具调用** | 支持与 Claude Sonnet 4.6 相同的工具和平台能力集,但不支持 Priority Tier;更高 effort 下更容易触发工具使用。 | +| **多语言能力** | 官方文档表示当前 Claude 模型支持多语言,但未单独公布 Sonnet 5 的语言覆盖基准。 | + +### 已知限制 + +* 已移除手动 extended thinking(`thinking: {type: "enabled", budget_tokens: N}`),使用该参数会返回 400 错误;请改用 adaptive thinking 和 `effort`。 +* 非默认采样参数(`temperature`、`top_p`、`top_k`)会返回 400 错误;建议使用 system instructions 控制语气和风格。 +* 不支持 assistant message prefilling,使用时会返回 400 错误。 +* 新 tokenizer 对同一文本可能比 Claude Sonnet 4.6 多产生约 30% tokens,因此 token 预算、上下文使用量和等效请求成本需要重新评估。 +* 默认启用网络安全防护;高风险或禁止的网络安全请求可能会以 `stop_reason: "refusal"` 被拒绝。 + +## 积分消耗 + +| 模型名称 | 价格周期 | 输入 (Credits/Token) | 5m Cache Write (Credits/Token) | 1h Cache Write (Credits/Token) | Cache Read (Credits/Token) | 输出 (Credits/Token) | 网页搜索(Credits/次) | +| :--- | :--- | --------------------: | -----------------------------: | -----------------------------: | -------------------------: | -------------------: | ---------------------: | +| **Claude Sonnet 5** | 截至 2026 年 8 月 31 日 | `2.00` | `2.50` | `4.00` | `0.20` | `10.00` | `10,000` | +| **Claude Sonnet 5** | 2026 年 9 月 1 日起 | `3.00` | `3.75` | `6.00` | `0.30` | `15.00` | `10,000` | + +:::info 价格说明 +价格总表展示当前生效的标准参考价。Claude Sonnet 5 当前标准参考价适用至 2026 年 8 月 31 日。文档价格为 B.AI 平台模型标准参考价,仅供基础计费说明使用。B.AI 可能会通过充值赠送及账户权益等方式,为用户提供更低的实际使用成本。具体价格、赠送积分及账户权益请以平台页面展示及最终账单为准。 +::: diff --git a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/pricing-and-usage.md b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/pricing-and-usage.md index 608b934..f487eba 100644 --- a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/pricing-and-usage.md +++ b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/pricing-and-usage.md @@ -33,10 +33,12 @@ | GPT-5 Mini | 0.25 | 0.25 | 0.025 | 2.00 | 
10,000 | | GPT-5.4 Nano | 0.20 | 0.20 | 0.02 | 1.25 | 10,000 | | GPT-5 Nano | 0.05 | 0.05 | 0.005 | 0.40 | - | +| Claude Fable 5 | 10.00 | 12.50 | 1.00 | 50.00 | 10,000 | | Claude Opus 4.8 | 5.00 | 6.25 | 0.50 | 25.00 | 10,000 | | Claude Opus 4.7 | 5.00 | 6.25 | 0.50 | 25.00 | 10,000 | | Claude Opus 4.6 | 5.00 | 6.25 | 0.50 | 25.00 | 10,000 | | Claude Opus 4.5 | 5.00 | 6.25 | 0.50 | 25.00 | 10,000 | +| Claude Sonnet 5 | 2.00 | 2.50 | 0.20 | 10.00 | 10,000 | | Claude Sonnet 4.6 | 3.00 | 3.75 | 0.30 | 15.00 | 10,000 | | Claude Sonnet 4.5 | 3.00 | 3.75 | 0.30 | 15.00 | 10,000 | | Claude Haiku 4.5 | 1.00 | 1.25 | 0.10 | 5.00 | 10,000 | @@ -44,6 +46,10 @@ | Gemini 3.5 Flash | 1.50 | 1.50 | 0.15 | 9.00 | 14,000 | | Gemini 3 Flash | 0.50 | 0.50 | 0.05 | 3.00 | 14,000 | +:::caution 价格总表说明 +价格总表展示每个模型当前生效的标准参考价。表中的“缓存写入(Cache Write)”表示发生缓存写入时的计费价格,不代表所有模型使用统一的缓存有效期。不同模型厂商的缓存策略、缓存有效期和扩展缓存能力可能不同;如模型存在特殊缓存规则、1 小时缓存写入价格或分时间生效的价格,请以对应模型详情页说明为准。 +::: + :::info 价格说明 文档价格为 B.AI 平台模型标准参考价,仅供基础计费说明使用。B.AI 可能会通过充值赠送及账户权益等方式,为用户提供更低的实际使用成本。具体价格、赠送积分及账户权益请以平台页面展示及最终账单为准。 ::: diff --git a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/sidebars.js b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/sidebars.js index 6f78563..f000ff6 100644 --- a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/sidebars.js +++ b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/sidebars.js @@ -241,6 +241,8 @@ const sidebars = { 'llmservice/models/claude-opus-4-6', 'llmservice/models/claude-opus-4-7', 'llmservice/models/claude-opus-4-8', + 'llmservice/models/claude-fable-5', + 'llmservice/models/claude-sonnet-5', 'llmservice/models/claude-sonnet-4-5', 'llmservice/models/claude-sonnet-4-6', 'llmservice/models/deepseek-v3.2', diff --git a/sidebars.js b/sidebars.js index 89f65e9..c8a4d9d 100644 --- a/sidebars.js +++ b/sidebars.js @@ -238,6 +238,8 @@ const sidebars = { 'llmservice/models/claude-opus-4-6', 'llmservice/models/claude-opus-4-7', 'llmservice/models/claude-opus-4-8', + 
'llmservice/models/claude-fable-5', + 'llmservice/models/claude-sonnet-5', 'llmservice/models/claude-sonnet-4-5', 'llmservice/models/claude-sonnet-4-6', 'llmservice/models/deepseek-v3.2',
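Review note: the cache rates added in this diff can be checked against the multipliers quoted in the Claude Fable 5 prompt-caching note (1.25x for 5-minute cache writes, 2x for 1-hour cache writes, 0.1x for cache reads). The sketch below is a minimal, hypothetical consistency check and is not part of the B.AI platform or any SDK. The function names are invented for illustration. It assumes that the same multipliers apply to Claude Sonnet 5 (the page does not say so, but its table values match them), that the September 1, 2026 switch happens on a calendar-date boundary, and that charges are simply tokens multiplied by the per-token rate.

```python
from datetime import date
from decimal import Decimal

# Multipliers quoted in the Claude Fable 5 prompt-caching note.
CACHE_WRITE_5M = Decimal("1.25")
CACHE_WRITE_1H = Decimal("2")
CACHE_READ = Decimal("0.1")

# Assumption: the Sonnet 5 price change takes effect at the start of this date.
SONNET_5_PRICE_CHANGE = date(2026, 9, 1)


def cache_rates(base_input: Decimal) -> dict[str, Decimal]:
    """Derive cache rates (Credits/Token) from a base input rate."""
    return {
        "cache_write_5m": base_input * CACHE_WRITE_5M,
        "cache_write_1h": base_input * CACHE_WRITE_1H,
        "cache_read": base_input * CACHE_READ,
    }


def sonnet_5_rates(on: date) -> dict[str, Decimal]:
    """Return the Sonnet 5 reference rates that the model page lists for a date."""
    if on >= SONNET_5_PRICE_CHANGE:
        base_input, output = Decimal("3.00"), Decimal("15.00")
    else:
        base_input, output = Decimal("2.00"), Decimal("10.00")
    return {"input": base_input, "output": output, **cache_rates(base_input)}


def estimate_credits(rates: dict[str, Decimal], input_tokens: int,
                     output_tokens: int, cache_read_tokens: int = 0) -> Decimal:
    """Hypothetical estimate: tokens multiplied by the matching per-token rate."""
    return (rates["input"] * input_tokens
            + rates["output"] * output_tokens
            + rates["cache_read"] * cache_read_tokens)


if __name__ == "__main__":
    # Fable 5: base input 10.00; the table lists 12.50 (write) and 1.00 (read).
    print(cache_rates(Decimal("10.00")))
    print(estimate_credits(sonnet_5_rates(date(2026, 8, 31)), 1000, 200))
```

With a base input rate of `10.00`, the derived 5-minute write and read rates match the `12.50` and `1.00` shown for Claude Fable 5. The 1-hour write rate is derived from the note only and does not appear in that table. Because the Sonnet 5 tokenizer is reported to produce roughly 30% more tokens for the same text, estimates should use token counts measured on the new model and not counts carried over from Sonnet 4.6.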