docs: make English the default docs language and fix link paths

2026-07-18 12:07:15 +08:00 · 2026-05-31 17:52:22 +08:00
parent 126649f70f
commit 7bf4ef3d05
231 changed files with 8999 additions and 8974 deletions
--- a/docs/models/mimo.mdx
+++ b/docs/models/mimo.mdx
@@ -1,15 +1,15 @@
 ---
-title: 小米 MiMo
-description: 小米 MiMo 模型配置（文本对话 + 图像理解 + 语音合成）
+title: MiMo
+description: Xiaomi MiMo model configuration (Text Chat + Image Understanding + Text-to-Speech)
 ---

-小米 MiMo 是原生全模态大模型，单 `mimo_api_key` 即可同时启用文本对话、图像理解与语音合成。
+Xiaomi MiMo is a native omni-modal large model. A single `mimo_api_key` enables text chat, image understanding, and text-to-speech all at once.

 <Tip>
-  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, so there is no need to edit the configuration file manually.
 </Tip>

-## 文本对话
+## Text Chat

 ```json
 {
@@ -19,24 +19,24 @@ description: 小米 MiMo 模型配置（文本对话 + 图像理解 + 语音合
 }
 ```

-| 参数 | 说明 |
+| Parameter | Description |
 | --- | --- |
-| `model` | 默认推荐 `mimo-v2.5-pro`，也可使用 `mimo-v2.5` |
-| `mimo_api_key` | 在 [MiMo 开放平台](https://platform.xiaomimimo.com/console/api-keys) 创建 |
-| `mimo_api_base` | 可选，默认为 `https://api.xiaomimimo.com/v1` |
+| `model` | Default recommendation: `mimo-v2.5-pro`; `mimo-v2.5` is also supported |
+| `mimo_api_key` | Create one in the [MiMo Open Platform](https://platform.xiaomimimo.com/console/api-keys) |
+| `mimo_api_base` | Optional, defaults to `https://api.xiaomimimo.com/v1` |

-### 模型选择
+### Model Selection

-| 模型 | 适用场景 |
+| Model | Use Case |
 | --- | --- |
-| `mimo-v2.5-pro` | 旗舰，原生全模态 + Agent 能力，最高 100 万 tokens 上下文 |
-| `mimo-v2.5` | 综合版，原生全模态（文本 / 图像 / 视频 / 音频） |
+| `mimo-v2.5-pro` | Flagship: native omni-modal + Agent capability, up to 1M tokens context |
+| `mimo-v2.5` | General-purpose, native omni-modal (text / image / video / audio) |

-## 思考模式
+## Thinking Mode

-MiMo V2.5 系列默认开启「思考模式」：模型在输出最终回答前会先输出 `reasoning_content`（思维链），提升复杂任务表现。
+The MiMo V2.5 series enables "thinking mode" by default: the model emits `reasoning_content` (chain-of-thought) before the final answer, improving performance on complex tasks.

-通过全局配置 `enable_thinking` 控制是否展示（也可在 Web 控制台 - 配置页面切换）：
+Use the global `enable_thinking` flag to toggle visibility (also switchable from the Web Console settings):

 ```json
 {
@@ -44,14 +44,14 @@ MiMo V2.5 系列默认开启「思考模式」：模型在输出最终回答前
 }
 ```

-## 图像理解
+## Image Understanding

-配置 `mimo_api_key` 后，Agent 的 Vision 工具可以自动使用 MiMo 视觉模型：
+Once `mimo_api_key` is configured, the Agent's Vision tool can automatically use MiMo's vision models:

-- 当主模型本身是多模态时（`mimo-v2.5-pro` / `mimo-v2.5`），直接由主模型识别图像，无需额外配置
-- 当主模型是其他厂商时，Vision 工具会根据顺序自动 fallback 到 `mimo-v2.5-pro`
+- When the main model itself is multimodal (`mimo-v2.5-pro` / `mimo-v2.5`), images are handled directly by the main model with no extra setup.
+- When the main model belongs to another vendor, the Vision tool automatically falls back to `mimo-v2.5-pro` according to its fallback order.

-如需手动指定 Vision 模型，可在配置文件中显式配置：
+To force a specific Vision model, set it explicitly in the configuration:

 ```json
 {
@@ -64,7 +64,7 @@ MiMo V2.5 系列默认开启「思考模式」：模型在输出最终回答前
 }
 ```

-## 语音合成
+## Text-to-Speech (TTS)

 ```json
 {
@@ -74,62 +74,63 @@ MiMo V2.5 系列默认开启「思考模式」：模型在输出最终回答前
 }
 ```

-| 参数 | 说明 |
+| Parameter | Description |
 | --- | --- |
-| `text_to_voice_model` | 当前仅支持 `mimo-v2.5-tts`（预置音色 + 唱歌模式） |
-| `tts_voice_id` | 预置音色名（中文音色直接使用中文名作为 ID） |
+| `text_to_voice_model` | Currently only `mimo-v2.5-tts` is supported (preset voices + singing mode) |
+| `tts_voice_id` | Preset voice name (Chinese voice IDs use the Chinese name directly) |

-### 预置音色
+### Preset Voices

-| 音色 ID | 说明 |
+| Voice ID | Description |
 | --- | --- |
-| `冰糖` | 中文 · 女声（默认） |
-| `茉莉` | 中文 · 女声 |
-| `苏打` | 中文 · 男声 |
-| `白桦` | 中文 · 男声 |
-| `Mia` | 英文 · 女声 |
-| `Chloe` | 英文 · 女声 |
-| `Milo` | 英文 · 男声 |
-| `Dean` | 英文 · 男声 |
+| `Mia` | English · Female |
+| `Chloe` | English · Female |
+| `Milo` | English · Male |
+| `Dean` | English · Male |
+| `冰糖` | Chinese · Female (default) |
+| `茉莉` | Chinese · Female |
+| `苏打` | Chinese · Male |
+| `白桦` | Chinese · Male |

-也可在 Web 控制台的「模型管理 → 语音合成」下拉框中可视化选择。

-### 风格控制
+You can also pick a voice visually from the Web Console under "Model Management → Text-to-Speech".

-MiMo TTS 支持在合成文本中嵌入 **音频标签** 来控制情绪、语调、方言、角色甚至唱歌。标签需出现在 **最终被合成为语音的文本（即 Agent 回复内容）** 中，整体风格标签写在开头：
+### Style Control
+
+MiMo TTS supports embedding **audio tags** in the synthesis text to control emotion, tone, dialect, persona, and even singing. Tags must appear in the **text that will be synthesized to speech (i.e. the Agent's reply)**, with the overall style tag placed at the very beginning:

 ```
-(风格)待合成内容
+(style)content-to-synthesize
 ```

-支持半角 `()`、全角 `（）` 或 `[]` 三种括号。常见风格示例：
+Half-width `()`, full-width `（）`, and `[]` brackets are all accepted. Both Chinese and English style descriptors work, so pick whichever language expresses the intended style most precisely. Common examples:

-| 类型 | 示例标签 |
+| Category | Example tags |
 | --- | --- |
-| 基础情绪 | `开心` `悲伤` `愤怒` `恐惧` `惊讶` `兴奋` `委屈` `平静` `冷漠` |
-| 复合情绪 | `怅然` `欣慰` `无奈` `愧疚` `释然` `忐忑` `动情` |
-| 整体语调 | `温柔` `高冷` `活泼` `严肃` `慵懒` `俏皮` `深沉` `干练` `凌厉` |
-| 音色定位 | `磁性` `醇厚` `清亮` `空灵` `稚嫩` `苍老` `甜美` `沙哑` |
-| 人设腔调 | `夹子音` `御姐音` `正太音` `大叔音` `台湾腔` |
-| 方言 | `东北话` `四川话` `河南话` `粤语` |
-| 角色扮演 | `孙悟空` `林黛玉` |
-| 唱歌 | `唱歌`（等价于 `sing` / `singing`） |
+| Basic emotions | `happy` `sad` `angry` `fear` `surprised` `excited` `aggrieved` `calm` `indifferent` |
+| Compound emotions | `wistful` `relieved` `helpless` `guilty` `at ease` `uneasy` `touched` |
+| Overall tone | `gentle` `aloof` `lively` `serious` `languid` `playful` `deep` `sharp` `cutting` |
+| Voice character | `magnetic` `mellow` `bright` `ethereal` `childlike` `aged` `sweet` `husky` |
+| Persona | `squeaky` `mature lady` `young boy` `uncle` `Taiwanese accent` |
+| Dialect | `Northeastern` `Sichuan` `Henan` `Cantonese` |
+| Role-play | `Sun Wukong` `Lin Daiyu` |
+| Singing | `sing` / `singing` |

-示例：
+Examples:

-- (磁性)夜已经深了，城市还在呼吸。
-- (东北话)哎呀妈呀，这天儿也忒冷了吧！
-- (粤语)呢个真係好正啊！
-- (唱歌)原谅我这一生不羁放纵爱自由…
+- `(magnetic)The night is deep, and the city is still breathing.`
+- `(gentle)Take a breath. You've got this.`
+- `(serious)This is the final warning before the system reboots.`
+- `(singing)Oh, when the saints go marching in…`

-也可以在文本任意位置插入细粒度音频标签来控制呼吸、笑声、停顿等，例如：
+You can also insert fine-grained audio tags at any position in the text to control breathing, laughter, pauses, etc. For example:

 ```
-（紧张，深呼吸）呼……冷静，冷静。（语速加快）自我介绍我背了五十遍了，应该没问题。
+(nervous, deep breath) Phew… stay calm, stay calm. (faster pace) I've rehearsed this intro fifty times, it'll be fine.
 ```

-完整标签列表参见 [MiMo 语音合成文档](https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/speech-synthesis-v2.5)。
+See the [MiMo speech synthesis documentation](https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/speech-synthesis-v2.5) for the full tag list.

 <Tip>
-  CowAgent 在调用 TTS 时会将 Agent 的回复原文（含 `(...)` 标签）直接送入 MiMo 合成。你可以在人设 / 系统提示词里要求模型「在回复开头用 `(风格)` 标签控制语气」，即可让 IM 渠道（微信 / 飞书 / 钉钉 / 企微）的语音回复带上情绪、方言、唱歌等效果。
+  When CowAgent calls TTS, the Agent's reply text (including any `(...)` tags) is forwarded directly to MiMo for synthesis. Tell the model in its persona / system prompt to "prefix replies with a `(style)` tag to control the tone", and IM channels (WeChat / Feishu / DingTalk / WeCom) will play voice replies with the corresponding emotion, dialect, or even singing.
 </Tip>
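
Reviewer note: the style-tag convention documented above (one overall tag at the very start of the reply, in `()`, `（）`, or `[]`) can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: `split_style_tag` and `with_style` are hypothetical names, not part of CowAgent or the MiMo API, and the regular expression covers only the three bracket forms listed in the doc.

```python
import re

# Hypothetical helpers for illustration; not part of CowAgent or the MiMo API.
# Matches one leading tag in half-width (), full-width （）, or [] brackets.
_LEADING_TAG = re.compile(r"^\s*(?:\(([^()]*)\)|（([^（）]*)）|\[([^\[\]]*)\])")


def split_style_tag(text):
    """Return (style, remainder); style is None when no leading tag is present."""
    match = _LEADING_TAG.match(text)
    if not match:
        return None, text
    # Exactly one of the three alternatives matched; pick its captured group.
    style = next(g for g in match.groups() if g is not None).strip()
    return style, text[match.end():].lstrip()


def with_style(style, text):
    """Prefix text with a half-width (style) tag, replacing any existing leading tag."""
    _, body = split_style_tag(text)
    return f"({style}){body}"


if __name__ == "__main__":
    print(split_style_tag("(magnetic)The night is deep, and the city is still breathing."))
    print(with_style("gentle", "Take a breath. You've got this."))
```

A helper like this could be used to check that a reply starts with a single overall style tag before it is handed to TTS. Fine-grained tags in the middle of the text are left untouched, since only the leading tag is matched.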