diff --git a/.gitignore b/.gitignore
index 0612e1e3..de10c0b7 100644
--- a/.gitignore
+++ b/.gitignore
@@ -32,7 +32,6 @@ plugins/banwords/lib/__pycache__
 !plugins/role
 !plugins/keyword
 !plugins/linkai
-!plugins/agent
 !plugins/cow_cli
 client_config.json
 ref/
diff --git a/README.md b/README.md
index 951e5e37..1edf1d3f 100644
--- a/README.md
+++ b/README.md
@@ -1,918 +1,257 @@
-<p align="center"><img src= "https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="CowAgent" width="550" /></p>
+<p align="center"><img src="https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="CowAgent" width="420" /></p>
 
 <p align="center">
   <a href="https://github.com/zhayujie/CowAgent/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/CowAgent" alt="Latest release"></a>
   <a href="https://github.com/zhayujie/CowAgent/blob/master/LICENSE"><img src="https://img.shields.io/github/license/zhayujie/CowAgent" alt="License: MIT"></a>
   <a href="https://github.com/zhayujie/CowAgent"><img src="https://img.shields.io/github/stars/zhayujie/CowAgent?style=flat-square" alt="Stars"></a> <br/>
-  [中文] | [<a href="docs/en/README.md">English</a>] | [<a href="docs/ja/README.md">日本語</a>]
+  [English] | [<a href="docs/zh/README.md">中文</a>] | [<a href="docs/ja/README.md">日本語</a>]
 </p>
 
-**CowAgent** 是基于大模型的超级 AI 助理，能够主动思考和任务规划、操作计算机和外部资源、创造和执行 Skills、拥有长期记忆和知识库并不断成长，比 OpenClaw 更轻量和便捷。CowAgent 支持灵活切换多种模型，能处理文本、语音、图片、文件等多模态消息，可接入微信、飞书、钉钉、企微智能机器人、QQ、企微自建应用、微信公众号、网页中使用，7*24小时运行于你的个人电脑或服务器中。
+**CowAgent** is an open-source super AI assistant that proactively plans tasks, controls your computer and external services, creates and runs Skills, and grows alongside you through a personal knowledge base and long-term memory — a reference implementation of Agent Harness engineering.
+
+CowAgent is lightweight, easy to deploy, and built to extend. Plug in any major LLM provider and run it 24/7 on a personal computer or server, across the web and all major IM platforms.
 
 <p align="center">
-  <a href="https://cowagent.ai/">🌐 官网</a> &nbsp;·&nbsp;
-  <a href="https://docs.cowagent.ai/">📖 文档中心</a> &nbsp;·&nbsp;
-  <a href="https://docs.cowagent.ai/guide/quick-start">🚀 快速开始</a> &nbsp;·&nbsp;
-  <a href="https://skills.cowagent.ai/">🧩 技能广场</a> &nbsp;·&nbsp;
-  <a href="https://link-ai.tech/cowagent/create">☁️ 在线体验</a>
+  <a href="https://cowagent.ai/">🌐 Website</a> &nbsp;·&nbsp;
+  <a href="https://docs.cowagent.ai/en/intro/index">📖 Docs</a> &nbsp;·&nbsp;
+  <a href="https://docs.cowagent.ai/en/guide/quick-start">🚀 Quick Start</a> &nbsp;·&nbsp;
+  <a href="https://skills.cowagent.ai/">🧩 Skill Hub</a> &nbsp;·&nbsp;
+  <a href="https://link-ai.tech/cowagent/create">☁️ Try Online</a>
 </p>
 
+<br/>
 
-# 简介
+## 🌟 Highlights
 
-> 该项目既是一个可以开箱即用的超级 AI 助理，也是一个支持高扩展的 Agent 框架，可以通过为项目扩展大模型接口、接入渠道、内置工具、Skills 系统来灵活实现各种定制需求。核心能力如下：
-
--  ✅  **自主任务规划**：能够理解复杂任务并自主规划执行，持续思考和调用工具直到完成目标
--  ✅  **长期记忆：** 自动将对话记忆持久化至本地文件和数据库中，包括核心记忆、日级记忆和梦境蒸馏，支持关键词及向量检索
--  ✅  **个人知识库：** 自动整理结构化知识，通过交叉引用构建知识图谱，支持通过对话管理和可视化浏览知识库
--  ✅  **技能系统：** Skills 安装和运行的引擎，支持从 [Skill Hub](https://skills.cowagent.ai/)、GitHub 等一键安装技能，或通过对话创造 Skills
--  ✅  **工具系统：** 内置文件读写、终端执行、浏览器操作、定时任务等工具，支持 MCP 协议，通过 Agent 自主调用完成复杂任务
--  ✅  **CLI系统：** 提供终端命令和对话命令，支持进程管理、技能安装、配置修改等操作
--  ✅  **多模态消息：** 支持对文本、图片、语音、文件等多类型消息进行解析、处理、生成、发送等操作
--  ✅  **多模型支持：** 支持 DeepSeek、MiniMax、Claude、Gemini、OpenAI、GLM、Qwen、Doubao、Kimi 等国内外主流模型厂商
--  ✅  **多通道接入：** 支持运行在本地计算机或服务器，可集成到微信、飞书、钉钉、企业微信、QQ、微信公众号、网页中使用
-
-## 声明
-
-1. 本项目遵循 [MIT 开源协议](/LICENSE)，主要用于技术研究和学习，使用本项目时需遵守所在地法律法规、相关政策以及企业章程，禁止用于任何违法或侵犯他人权益的行为。任何个人、团队和企业，无论以何种方式使用该项目、对何对象提供服务，所产生的一切后果，本项目均不承担任何责任。
-2. 成本与安全：Agent 模式下 Token 使用量高于普通对话模式，请根据效果及成本综合选择模型。Agent 具有访问所在操作系统的能力，请谨慎选择项目部署环境。同时项目也会持续升级安全机制、并降低模型消耗成本。
-3. CowAgent 项目专注于开源技术开发，不会参与、授权或发行任何加密货币。
-
-## 演示
-
-- 使用说明( Agent 模式)：[CowAgent 介绍](https://docs.cowagent.ai/intro/features)
-
-- 免部署在线体验：[CowAgent](https://link-ai.tech/cowagent/create)
-
-- DEMO 视频(对话模式)：https://cdn.link-ai.tech/doc/cow_demo.mp4
-
-## 社区
-
-添加小助手微信加入开源项目交流群：
-
-<img width="140" src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/open-community.png">
+| Capability | Description |
+| :--- | :--- |
+| [Planning](https://docs.cowagent.ai/en/intro/architecture) | Decomposes complex tasks and executes them step by step, looping over tools until the goal is reached |
+| [Memory](https://docs.cowagent.ai/en/memory/index) | Three-tier architecture (context → daily → core), automatic Deep Dream distillation, hybrid keyword + vector retrieval |
+| [Knowledge](https://docs.cowagent.ai/en/knowledge/index) | Auto-curates structured knowledge into a Markdown wiki and builds an evolving knowledge graph with visual browsing |
+| [Skills](https://docs.cowagent.ai/en/skills/index) | One-click install from [Skill Hub](https://skills.cowagent.ai/), GitHub, or ClawHub; or create custom skills via natural-language conversation |
+| [Tools](https://docs.cowagent.ai/en/tools/index) | Built-in file I/O, terminal, browser, scheduler, memory retrieval, web search, and 10+ more tools — with native MCP integration |
+| [Channels](https://docs.cowagent.ai/en/channels/index) | Integrates with Web, WeChat, Feishu, DingTalk, WeCom, QQ, Official Accounts, Telegram, and Slack |
+| Multimodal | First-class support for text, images, voice, and files — recognition, generation, and delivery |
+| [Models](https://docs.cowagent.ai/en/models/index) | Claude, GPT, Gemini, DeepSeek, Qwen, GLM, Kimi, MiniMax, Doubao, and more — swap providers from the Web console with one click |
+| [Deploy](https://docs.cowagent.ai/en/guide/quick-start) | One-line installer, unified Web console, multiple deployment modes (local, Docker, server) |
 
 <br/>
 
-# 企业服务
+## 🏗️ Architecture
 
-<a href="https://link-ai.tech" target="_blank"><img width="650" src="https://cdn.link-ai.tech/image/link-ai-intro.jpg"></a>
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/architecture/en/architecture.jpg" alt="CowAgent Architecture" width="750"/>
 
-> [LinkAI](https://link-ai.tech/) 是面向企业和个人的一站式 AI 智能体平台，聚合多模态大模型、知识库、技能、工作流等能力，支持一键接入主流平台并管理，支持 SaaS、私有化部署等多种模式，可免部署在线运行[CowAgent 助理](https://link-ai.tech/cowagent/create)。
->
-> LinkAI 目前已在智能客服、私域运营、企业效率助手等场景积累了丰富的 AI 解决方案，在消费、健康、文教、科技制造等各行业沉淀了大模型落地应用的最佳实践，致力于帮助更多企业和开发者拥抱 AI 生产力。
+CowAgent is a complete **Agent Harness**: messages flow in through **Channels**; the **Agent Core** plans and reasons over memory, knowledge, and the available tools and skills; **Models** generate the response, which is sent back through the originating channel. Every layer is decoupled and independently extensible.
 
-**产品咨询和企业服务** 可联系产品客服：
-
-<img width="150" src="https://cdn.link-ai.tech/portal/linkai-customer-service.png">
+Read more in [Architecture](https://docs.cowagent.ai/en/intro/architecture).
 
 <br/>
 
-# 🏷 更新日志
+## 🚀 Quick Start
 
->**2026.05.06：** [2.0.8版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.8)，飞书渠道全面升级（语音、流式输出和Markdown、一键扫码接入）、新模型支持（DeepSeek V4、百度千帆）、定时任务工具增强等
+A one-line installer takes care of dependencies, configuration, and startup:
 
->**2026.04.22：** [2.0.7版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.7)，图像生成内置技能（GPT Image 2、Nano Banana 等）、新模型支持（Kimi K2.6、Claude Opus 4.7、GLM 5.1）、知识库和记忆增强、Web 控制台优化
+**Linux / macOS:**
 
->**2026.04.14：** [2.0.6版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6)，知识库系统、梦境记忆模块、上下文智能压缩、Web 控制台多会话及多项优化。
-
->**2026.04.01：** [2.0.5版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5)，Cow CLI 命令系统、Skill Hub 开源、浏览器工具、企微扫码创建、多项优化和修复。
-
->**2026.03.22：** [2.0.4版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.4)，新增个人微信通道（微信扫码即用）、新增 MiniMax-M2.7 和 GLM-5-Turbo 模型、run.sh 脚本重构、日文文档及多项修复。
-
->**2026.03.18：** [2.0.3版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.3)，新增企微智能机器人和 QQ 通道、支持 Coding Plan、新增多个模型、Web 端文件处理、记忆系统升级。
-
->**2026.02.27：** [2.0.2版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.2)，Web 控制台全面升级（流式对话、模型/技能/记忆/通道/定时任务/日志管理）、支持多通道同时运行、会话持久化存储、新增多个模型。
-
->**2026.02.13：** [2.0.1版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.1)，内置 Web Search 工具、智能上下文裁剪策略、运行时信息动态更新、Windows 兼容性适配，修复定时任务记忆丢失、飞书连接等多项问题。
-
->**2026.02.03：** [2.0.0版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.0)，正式升级为超级 Agent 助理，支持多轮任务决策、具备长期记忆、实现多种系统工具、支持 Skills 框架，新增多种模型并优化了接入渠道。
-
-更多更新历史请查看: [更新日志](https://docs.cowagent.ai/releases)
-
-<br/>
-
-# 🚀 快速开始
-
-项目提供了一键安装、配置、启动、管理程序的脚本，推荐使用脚本快速运行，也可以根据下文中的详细指引一步步安装运行。
-
-在终端执行以下命令：
-
-**Linux / macOS：**
 ```bash
 bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
 ```
 
-**Windows（PowerShell）：**
+**Windows (PowerShell):**
+
 ```powershell
 irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
 ```
 
-脚本使用说明：[一键运行脚本](https://docs.cowagent.ai/guide/quick-start)。安装后可使用 `cow start`、`cow stop` 等 [CLI 命令](https://docs.cowagent.ai/cli/index) 管理服务。
-
-
-## 一、准备
-
-### 1. 模型API
-
-项目支持国内外主流厂商的模型接口，可选模型及配置说明参考：[模型说明](#模型说明)。
-
-> 注：Agent 模式下推荐使用以下模型，可根据效果及成本综合选择：deepseek-v4-flash、MiniMax-M2.7、glm-5.1、kimi-k2.6、qwen3.5-plus、claude-sonnet-4-6、gemini-3.1-pro-preview、gpt-5.4、gpt-5.4-mini、ernie-5.1
-
-同时支持使用 **LinkAI 平台** 接口，支持上述全部模型，并支持知识库、工作流、插件等 Agent 技能，参考 [接口文档](https://docs.link-ai.tech/platform/api)。
-
-### 2.环境安装
-
-支持 Linux、MacOS、Windows 操作系统，可在个人计算机及服务器上运行，需安装 `Python`，Python 版本需在 3.7 ~ 3.13 之间。
-
-> 注意：Agent 模式推荐使用源码运行，若选择 Docker 部署则无需安装 python 环境和下载源码，可直接快进到下一节。
-
-**(1) 克隆项目代码：**
-
-```bash
-git clone https://github.com/zhayujie/CowAgent
-cd CowAgent/
-```
-
-若遇到网络问题可使用国内仓库地址：https://gitee.com/zhayujie/CowAgent
-
-**(2) 安装核心依赖 (必选)：**
-
-```bash
-pip3 install -r requirements.txt
-```
-
-**(3) 拓展依赖 (可选，建议安装)：**
-
-```bash
-pip3 install -r requirements-optional.txt
-```
-
-> 国内网络可使用镜像源加速：`pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple`
-
-如果某项依赖安装失败可注释掉对应的行后重试。
-
-**(4) 安装 Cow CLI (推荐)：**
-
-```bash
-pip3 install -e .
-```
-
-安装后可使用 `cow` 命令管理服务（启动、停止、更新等）和技能，详见 [命令文档](https://docs.cowagent.ai/cli/index)。
-
-**(5) 安装浏览器工具 (可选)：**
-
-如果需要 Agent 操作浏览器（如访问网页、填写表单等），需要额外安装浏览器依赖：
-
-```bash
-cow install-browser
-```
-
-该命令会自动安装 `playwright` 和 Chromium 浏览器，国内网络自动使用镜像加速。详见 [浏览器工具文档](https://docs.cowagent.ai/tools/browser)。
-
-## 二、配置
-
-配置文件的模板在根目录的 `config-template.json` 中，需复制该模板创建最终生效的 `config.json` 文件：
-
-```bash
-  cp config-template.json config.json
-```
-
-然后在 `config.json` 中填入配置，以下是对默认配置的说明，可根据需要进行自定义修改（注意实际使用时请去掉注释，保证 JSON 格式的规范）：
-
-```bash
-# config.json 文件内容示例
-{
-  "channel_type": "weixin",                                   # 接入渠道类型，默认为 weixin, 支持修改为 feishu,dingtalk,wecom_bot,qq,wechatcom_app,wechatmp_service,wechatmp,terminal
-  "model": "deepseek-v4-flash",                                # 模型名称
-  "deepseek_api_key": "",                                      # DeepSeek API Key
-  "deepseek_api_base": "https://api.deepseek.com/v1",         # DeepSeek API 地址
-  "minimax_api_key": "",                                      # MiniMax API Key
-  "zhipu_ai_api_key": "",                                     # 智谱 GLM API Key
-  "moonshot_api_key": "",                                     # Kimi/Moonshot API Key
-  "ark_api_key": "",                                          # 豆包(火山方舟) API Key
-  "dashscope_api_key": "",                                    # 百炼(通义千问) API Key
-  "claude_api_key": "",                                       # Claude API Key
-  "claude_api_base": "https://api.anthropic.com/v1",          # Claude API 地址，修改可接入三方代理平台
-  "gemini_api_key": "",                                       # Gemini API Key
-  "gemini_api_base": "https://generativelanguage.googleapis.com", # Gemini API 地址
-  "open_ai_api_key": "",                                      # OpenAI API Key
-  "open_ai_api_base": "https://api.openai.com/v1",            # OpenAI API 地址
-  "linkai_api_key": "",                                       # LinkAI API Key
-  "proxy": "",                                                # 代理客户端的 ip 和端口，国内环境需要开启代理的可填写该项，如 "127.0.0.1:7890"
-  "speech_recognition": false,                                # 是否开启语音识别
-  "group_speech_recognition": false,                          # 是否开启群组语音识别
-  "voice_reply_voice": false,                                 # 是否使用语音回复语音
-  "use_linkai": false,                                        # 是否使用 LinkAI 接口，默认关闭，设置为 true 后可对接 LinkAI 平台模型
-  "web_password": "",                                         # Web 控制台访问密码，留空则不启用密码保护（监听 0.0.0.0 时务必设置）
-  "agent": true,                                              # 是否启用 Agent 模式，启用后拥有多轮工具决策、长期记忆、Skills 能力等
-  "agent_workspace": "~/cow",                                 # Agent 的工作空间路径，用于存储 memory、skills、系统设定等
-  "agent_max_context_tokens": 50000,                          # Agent 模式下最大上下文 tokens，超出将自动智能压缩处理
-  "agent_max_context_turns": 20,                              # Agent 模式下最大上下文记忆轮次，一问一答为一轮，超出后智能压缩处理
-  "agent_max_steps": 20,                                      # Agent 模式下单次任务的最大决策步数，超出后将停止继续调用工具
-  "enable_thinking": false                                    # 是否启用深度思考模式
-}
-```
-
-**配置补充说明:** 
-
-<details>
-<summary>1. 语音配置</summary>
-
-+ 添加 `"speech_recognition": true` 将开启语音识别，默认使用 openai 的 whisper 模型识别为文字，同时以文字回复，该参数仅支持私聊 (注意由于语音消息无法匹配前缀，一旦开启将对所有语音自动回复，支持语音触发画图)；
-+ 添加 `"group_speech_recognition": true` 将开启群组语音识别，默认使用 openai 的 whisper 模型识别为文字，同时以文字回复，参数仅支持群聊 (会匹配 group_chat_prefix 和 group_chat_keyword, 支持语音触发画图)；
-+ 添加 `"voice_reply_voice": true` 将开启语音回复语音（同时作用于私聊和群聊）
-+ 使用 MiniMax TTS：设置 `"text_to_voice": "minimax"`，并配置 `minimax_api_key`；可通过 `"tts_voice_id"` 指定发音人（如 `English_Graceful_Lady`），`"text_to_voice_model"` 指定模型（如 `speech-2.8-hd`、`speech-2.8-turbo`）
-</details>
-
-<details>
-<summary>2. 其他配置</summary>
-
-+ `model`: 模型名称，Agent 模式下推荐使用 `deepseek-v4-flash`、`MiniMax-M2.7`、`glm-5.1`、`kimi-k2.6`、`qwen3.6-plus`、`claude-sonnet-4-6`、`gemini-3.1-pro-preview`，全部模型名称参考[common/const.py](https://github.com/zhayujie/CowAgent/blob/master/common/const.py)文件
-+ `character_desc`：普通对话模式下的机器人系统提示词。在 Agent 模式下该配置不生效，由工作空间中的文件内容构成。
-+ `subscribe_msg`：订阅消息，公众号和企业微信 channel 中请填写，当被订阅时会自动回复， 可使用特殊占位符。目前支持的占位符有{trigger_prefix}，在程序中它会自动替换成 bot 的触发词。
-</details>
-
-<details>
-<summary>3. LinkAI 配置</summary>
-
-+ `use_linkai`: 是否使用 LinkAI 接口，默认关闭，设置为 true 后可对接 LinkAI 平台，使用模型、知识库、工作流、插件等技能, 参考[接口文档](https://docs.link-ai.tech/platform/api/chat)
-+ `linkai_api_key`: LinkAI Api Key，可在 [控制台](https://link-ai.tech/console/interface) 创建
-</details>
-
-注：全部配置项说明可在 [`config.py`](https://github.com/zhayujie/CowAgent/blob/master/config.py) 文件中查看。
-
-## 三、运行
-
-### 1.本地运行
-
-如果是个人计算机 **本地运行**，直接在项目根目录下执行：
-
-```bash
-cow start              # 推荐，需先安装 Cow CLI
-python3 app.py         # 或直接运行，windows 环境下该命令通常为 python app.py
-```
-
-运行后默认会启动 web 服务，可通过访问 `http://localhost:9899/chat` 在网页端对话。
-
-如果需要接入其他应用通道只需修改 `config.json` 配置文件中的 `channel_type` 参数，详情参考：[通道说明](#通道说明)。
-
-
-### 2.服务器部署
-
-推荐使用 `cow` 命令管理服务：
-
-```bash
-cow start              # 后台启动
-cow stop               # 停止服务
-cow restart            # 重启服务
-cow status             # 查看运行状态
-cow logs               # 查看日志
-cow update             # 拉取最新代码并重启
-```
-
-也可以使用传统方式后台运行：
-
-```bash
-nohup python3 app.py & tail -f nohup.out
-```
-
-此外，项目根目录下的 `run.sh` 脚本也支持一键管理服务，包括 `./run.sh start`、`./run.sh stop`、`./run.sh restart` 等命令，执行 `./run.sh help` 可查看全部用法。
-
-> 如果需要通过浏览器访问 Web 控制台，请确保服务器的 `9899` 端口已在防火墙或安全组中放行，建议仅对指定 IP 开放以保证安全。
-
-### 3.Docker部署
-
-使用 docker 部署无需下载源码和安装依赖，只需要获取 `docker-compose.yml` 配置文件并启动容器即可。Agent 模式下更推荐使用源码进行部署，以获得更多系统访问能力。
-
-> 前提是需要安装好 `docker` 及 `docker-compose`，安装成功后执行 `docker -v` 和 `docker-compose version` (或 `docker compose version`) 可查看到版本号。安装地址为 [docker官网](https://docs.docker.com/engine/install/) 。
-
-**(1) 下载 docker-compose.yml 文件**
+**Docker:**
 
 ```bash
 curl -O https://cdn.link-ai.tech/code/cow/docker-compose.yml
+docker compose up -d
 ```
 
-下载完成后打开 `docker-compose.yml` 填写所需配置，例如 `CHANNEL_TYPE`、`OPEN_AI_API_KEY` 和等配置。
+Once started, open `http://localhost:9899` to access the **Web console** — your one-stop hub to chat with the Agent, configure models, connect channels, and install skills.
 
-**(2) 启动容器**
+> Deploying on a server? Set `web_host` to `0.0.0.0` in `config.json` to make the console reachable from outside, and set `web_password` to protect it. Don't forget to open port `9899` in your firewall or security group.
 
-在 `docker-compose.yml` 所在目录下执行以下命令启动容器：
+> 📖 Detailed guides: [Quick Start](https://docs.cowagent.ai/en/guide/quick-start) · [Install from Source](https://docs.cowagent.ai/en/guide/manual-install) · [Upgrade](https://docs.cowagent.ai/en/guide/upgrade)
+
+After installation, manage the service with the [cow CLI](https://docs.cowagent.ai/en/cli/index):
 
 ```bash
-sudo docker compose up -d         # 若docker-compose为 1.X 版本，则执行 `sudo  docker-compose up -d`
+cow start                          # start the service (also: cow stop, cow restart)
+cow status                         # show running state (cow logs shows the logs)
+cow update                         # pull latest code and restart
+cow skill install <name>           # install a skill
+cow install-browser                # install browser automation
 ```
 
-运行命令后，会自动取 [docker hub](https://hub.docker.com/r/zhayujie/chatgpt-on-wechat) 拉取最新 release 版本的镜像。当执行 `sudo docker ps` 能查看到 NAMES 为 chatgpt-on-wechat 的容器即表示运行成功。最后执行以下命令可查看容器的运行日志：
-
-```bash
-sudo docker logs -f chatgpt-on-wechat
-```
-
-> 如果需要通过浏览器访问 Web 控制台，请确保服务器的 `9899` 端口已在防火墙或安全组中放行，建议仅对指定 IP 开放以保证安全。
-
-## 模型说明
-
-推荐通过 Web 控制台在线管理模型配置，无需手动编辑文件，详见 [模型文档](https://docs.cowagent.ai/models)。以下是手动修改 `config.json` 配置模型的说明：
-
-<details>
-<summary>DeepSeek</summary>
-
-1. API Key 创建：在 [DeepSeek 平台](https://platform.deepseek.com/api_keys) 创建 API Key
-
-2. 填写配置
-
-方式一：官方接入（推荐）：
-
-```json
-{
-    "model": "deepseek-v4-flash",
-    "deepseek_api_key": "sk-xxxxxxxxxxx"
-}
-```
-
- - `model`: 推荐填写 `deepseek-v4-flash`、`deepseek-v4-pro`
- - `deepseek_api_key`: DeepSeek 平台的 API Key
- - `deepseek_api_base`: 可选，默认为 `https://api.deepseek.com/v1`，可修改为第三方代理地址
-
-方式二：OpenAI 兼容方式接入：
-
-```json
-{
-    "model": "deepseek-v4-flash",
-    "bot_type": "openai",
-    "open_ai_api_key": "sk-xxxxxxxxxxx",
-    "open_ai_api_base": "https://api.deepseek.com/v1"
-}
-```
-
-</details>
-
-<details>
-<summary>MiniMax</summary>
-
-方式一：官方接入，配置如下(推荐)：
-
-```json
-{
-    "model": "MiniMax-M2.7",
-    "minimax_api_key": ""
-}
-```
- - `model`: 可填写 `MiniMax-M2.7、MiniMax-M2.7-highspeed、MiniMax-M2.5、MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2、abab6.5-chat` 等
- - `minimax_api_key`：MiniMax 平台的 API-KEY，在 [控制台](https://platform.minimaxi.com/user-center/basic-information/interface-key) 创建
-
-方式二：OpenAI 兼容方式接入，配置如下：
-```json
-{
-  "bot_type": "openai",
-  "model": "MiniMax-M2.7",
-  "open_ai_api_base": "https://api.minimaxi.com/v1",
-  "open_ai_api_key": ""
-}
-```
-- `bot_type`: OpenAI 兼容方式
-- `model`: 可填 `MiniMax-M2.7、MiniMax-M2.7-highspeed、MiniMax-M2.5、MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2`，参考[API文档](https://platform.minimaxi.com/document/%E5%AF%B9%E8%AF%9D?key=66701d281d57f38758d581d0#QklxsNSbaf6kM4j6wjO5eEek)
-- `open_ai_api_base`: MiniMax 平台 API 的 BASE URL
-- `open_ai_api_key`: MiniMax 平台的 API-KEY
-</details>
-
-<details>
-<summary>Claude</summary>
-
-1. API Key 创建：在 [Claude控制台](https://console.anthropic.com/settings/keys) 创建 API Key
-
-2. 填写配置
-
-```json
-{
-    "model": "claude-sonnet-4-6",
-    "claude_api_key": "YOUR_API_KEY"
-}
-```
- - `model`: 参考 [官方模型ID](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases) ，支持 `claude-sonnet-4-6、claude-opus-4-7、claude-opus-4-6、claude-sonnet-4-5、claude-sonnet-4-0、claude-opus-4-0、claude-3-5-sonnet-latest` 等
-</details>
-
-<details>
-<summary>Gemini</summary>
-
-API Key 创建：在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn) 创建 API Key ，配置如下
-```json
-{
-    "model": "gemini-3.1-flash-lite-preview",
-    "gemini_api_key": ""
-}
-```
- - `model`: 参考[官方文档-模型列表](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn)，支持 `gemini-3.1-flash-lite-preview、gemini-3.1-pro-preview、gemini-3-flash-preview、gemini-3-pro-preview` 等
-</details>
-
-<details>
-<summary>OpenAI</summary>
-
-1. API Key 创建：在 [OpenAI平台](https://platform.openai.com/api-keys) 创建 API Key
-
-2. 填写配置
-
-```json
-{
-    "model": "gpt-5.4",
-    "open_ai_api_key": "YOUR_API_KEY",
-    "open_ai_api_base": "https://api.openai.com/v1",
-    "bot_type": "openai"
-}
-```
-
- - `model`: 与 OpenAI 接口的 [model参数](https://platform.openai.com/docs/models) 一致，支持包括 gpt-5.4、gpt-5.4-mini、gpt-5.4-nano、o 系列、gpt-4.1 等模型，Agent 模式推荐使用  `gpt-5.4`、`gpt-5.4-mini`
- - `open_ai_api_base`: 如果需要接入第三方代理接口，可通过修改该参数进行接入
- - `bot_type`: 使用 OpenAI 相关模型时无需填写。当使用第三方代理接口接入 Claude 等非 OpenAI 官方模型时，该参数设为 `openai`
-</details>
-
-<details>
-<summary>智谱AI (GLM)</summary>
-
-方式一：官方接入，配置如下(推荐)：
-
-```json
-{
-  "model": "glm-5.1",
-  "zhipu_ai_api_key": ""
-}
-```
- - `model`: 可填 `glm-5.1、glm-5-turbo、glm-5、glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等, 参考 [glm 系列模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4)
- - `zhipu_ai_api_key`: 智谱AI 平台的 API KEY，在 [控制台](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) 创建
-
-方式二：OpenAI 兼容方式接入，配置如下：
-```json
-{
-  "bot_type": "openai",
-  "model": "glm-5.1",
-  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
-  "open_ai_api_key": ""
-}
-```
-- `bot_type`: OpenAI 兼容方式
-- `model`: 可填 `glm-5.1、glm-5-turbo、glm-5、glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等
-- `open_ai_api_base`: 智谱AI 平台的 BASE URL
-- `open_ai_api_key`: 智谱AI 平台的 API KEY
-</details>
-
-<details>
-<summary>通义千问 (Qwen)</summary>
-
-方式一：官方 SDK 接入，配置如下(推荐)：
-
-```json
-{
-    "model": "qwen3.6-plus",
-    "dashscope_api_key": "sk-qVxxxxG"
-}
-```
- - `model`: 可填写 `qwen3.6-plus、qwen3.5-plus、qwen3-max、qwen-max、qwen-plus、qwen-turbo、qwen-long、qwq-plus` 等
- - `dashscope_api_key`: 通义千问的 API-KEY，参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) ，在 [百炼控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建
-
-方式二：OpenAI 兼容方式接入，配置如下：
-```json
-{
-  "bot_type": "openai",
-  "model": "qwen3.6-plus",
-  "open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
-  "open_ai_api_key": "sk-qVxxxxG"
-}
-```
-- `bot_type`: OpenAI 兼容方式
-- `model`: 支持官方所有模型，参考[模型列表](https://help.aliyun.com/zh/model-studio/models?spm=a2c4g.11186623.0.0.78d84823Kth5on#9f8890ce29g5u)
-- `open_ai_api_base`: 通义千问 API 的 BASE URL
-- `open_ai_api_key`: 通义千问的 API-KEY
-</details>
-
-<details>
-<summary>豆包 (Doubao)</summary>
-
-1. API Key 创建：在 [火山方舟控制台](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) 创建API Key
-
-2. 填写配置
-
-```json
-{
-    "model": "doubao-seed-2-0-code-preview-260215",
-    "ark_api_key": "YOUR_API_KEY"
-}
-```
- - `model`: 可填写 `doubao-seed-2-0-code-preview-260215、doubao-seed-2-0-pro-260215、doubao-seed-2-0-lite-260215、doubao-seed-2-0-mini-260215` 等
- - `ark_api_key`: 火山方舟平台的 API Key，在 [控制台](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) 创建
- - `ark_base_url`: 可选，默认为 `https://ark.cn-beijing.volces.com/api/v3`
-</details>
-
-<details>
-<summary>Kimi (Moonshot)</summary>
-
-方式一：官方接入，配置如下：
-
-```json
-{
-    "model": "kimi-k2.6",
-    "moonshot_api_key": ""
-}
-```
- - `model`: 可填写 `kimi-k2.6、kimi-k2.5、kimi-k2、moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- - `moonshot_api_key`: Moonshot 的 API-KEY，在 [控制台](https://platform.moonshot.cn/console/api-keys) 创建
-
-方式二：OpenAI 兼容方式接入，配置如下：
-```json
-{
-  "bot_type": "openai",
-  "model": "kimi-k2.6",
-  "open_ai_api_base": "https://api.moonshot.cn/v1",
-  "open_ai_api_key": ""
-}
-```
-- `bot_type`: OpenAI 兼容方式
-- `model`: 可填写 `kimi-k2.6、kimi-k2.5、kimi-k2、moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
-- `open_ai_api_base`: Moonshot 的 BASE URL
-- `open_ai_api_key`: Moonshot 的 API-KEY
-</details>
-
-<details>
-<summary>ModelScope</summary>
-
-```json
-{
-  "bot_type": "modelscope",
-  "model": "Qwen/QwQ-32B",
-  "modelscope_api_key": "your_api_key",
-  "modelscope_base_url": "https://api-inference.modelscope.cn/v1/chat/completions",
-  "text_to_image": "MusePublic/489_ckpt_FLUX_1"
-}
-```
-
-- `bot_type`: modelscope 接口格式
-- `model`: 参考[模型列表](https://www.modelscope.cn/models?filter=inference_type&page=1)
-- `modelscope_api_key`: 参考 [官方文档-访问令牌](https://modelscope.cn/docs/accounts/token) ，在 [控制台](https://modelscope.cn/my/myaccesstoken)
-- `modelscope_base_url`: modelscope 平台的 BASE URL
-- `text_to_image`: 图像生成模型，参考[模型列表](https://www.modelscope.cn/models?filter=inference_type&page=1)
-</details>
-
-<details>
-<summary>LinkAI</summary>
-
-1. API Key 创建：在 [LinkAI平台](https://link-ai.tech/console/interface) 创建 API Key
-
-2. 填写配置
-
-```json
-{
-    "model": "gpt-5.4-mini",
-    "use_linkai": true,
-    "linkai_api_key": "YOUR API KEY"
-}
-```
-
-+ `use_linkai`: 是否使用 LinkAI 接口，默认关闭，设置为 true 后可对接 LinkAI 平台的模型，并使用知识库、工作流、数据库、插件等丰富的 Agent 技能
-+ `linkai_api_key`: LinkAI 平台的 API Key，可在 [控制台](https://link-ai.tech/console/interface) 中创建
-+ `model`: [模型列表](https://link-ai.tech/console/models)中的全部模型均可使用
-</details>
-
-<details>
-<summary>Azure</summary>
-
-1. API Key 创建：在 [Azure平台](https://oai.azure.com/) 创建 API Key 
-
-2. 填写配置
-
-```json
-{
-  "model": "",
-  "use_azure_chatgpt": true,
-  "open_ai_api_key": "",
-  "open_ai_api_base": "",
-  "azure_deployment_id": "",
-  "azure_api_version": "2025-01-01-preview"
-}
-```
-
- - `model`: 留空即可
- - `use_azure_chatgpt`: 设为 true 
- - `open_ai_api_key`: Azure 平台的密钥
- - `open_ai_api_base`: Azure 平台的 BASE URL
- - `azure_deployment_id`: Azure 平台部署的模型名称
- - `azure_api_version`: api 版本以及以上参数可以在部署的 [模型配置](https://oai.azure.com/resource/deployments) 界面查看
-</details>
-
-<details>
-<summary>百度千帆 / ERNIE</summary>
-
-方式一：官方接入（推荐），配置如下：
-
-```json
-{
-  "model": "ernie-5.1",
-  "qianfan_api_key": "",
-  "qianfan_api_base": "https://qianfan.baidubce.com/v2"
-}
-```
-
- - `model`: 默认推荐填写 `ernie-5.1`（多模态，可直接识图），也可填写 `ernie-5.0`、`ernie-x1.1`、`ernie-4.5-turbo-128k`、`ernie-4.5-turbo-32k`；当主模型为纯文本 ERNIE 时，Vision 工具会自动 fallback 到 `ernie-4.5-turbo-vl`
- - `qianfan_api_key`: 百度千帆 API Key，通常以 `bce-v3/` 开头，可在百度智能云控制台创建
- - `qianfan_api_base`: 可选，默认为 `https://qianfan.baidubce.com/v2`
-
-方式二：OpenAI 兼容方式接入，配置如下：
-```json
-{
-  "bot_type": "openai",
-  "model": "ernie-5.1",
-  "open_ai_api_base": "https://qianfan.baidubce.com/v2",
-  "open_ai_api_key": ""
-}
-```
-- `bot_type`: OpenAI 兼容方式
-- `model`: 支持千帆平台上的 ERNIE 模型
-- `open_ai_api_base`: 百度千帆 OpenAI 兼容 API 的 BASE URL
-- `open_ai_api_key`: 百度千帆 API Key
-
-</details>
-
-<details>
-<summary>讯飞星火</summary>
-
-方式一：官方接入，配置如下：
-参考 [官方文档-快速指引](https://www.xfyun.cn/doc/platform/quickguide.html#%E7%AC%AC%E4%BA%8C%E6%AD%A5-%E5%88%9B%E5%BB%BA%E6%82%A8%E7%9A%84%E7%AC%AC%E4%B8%80%E4%B8%AA%E5%BA%94%E7%94%A8-%E5%BC%80%E5%A7%8B%E4%BD%BF%E7%94%A8%E6%9C%8D%E5%8A%A1) 获取 `APPID、 APISecret、 APIKey` 三个参数
-
-```json
-{
-  "model": "xunfei",
-  "xunfei_app_id": "",
-  "xunfei_api_key": "",
-  "xunfei_api_secret": "",
-  "xunfei_domain": "4.0Ultra",
-  "xunfei_spark_url": "wss://spark-api.xf-yun.com/v4.0/chat"
-}
-```
- - `model`: 填 `xunfei`
- - `xunfei_domain`: 可填写 `4.0Ultra、generalv3.5、max-32k、generalv3、pro-128k、lite`
- - `xunfei_spark_url`: 填写参考 [官方文档-请求地址](https://www.xfyun.cn/doc/spark/Web.html#_1-1-%E8%AF%B7%E6%B1%82%E5%9C%B0%E5%9D%80) 的说明
- 
-方式二：OpenAI 兼容方式接入，配置如下：
-```json
-{
-  "bot_type": "openai",
-  "model": "4.0Ultra",
-  "open_ai_api_base": "https://spark-api-open.xf-yun.com/v1",
-  "open_ai_api_key": ""
-}
-```
-- `bot_type`: OpenAI 兼容方式
-- `model`: 可填写 `4.0Ultra、generalv3.5、max-32k、generalv3、pro-128k、lite`
-- `open_ai_api_base`: 讯飞星火平台的 BASE URL
-- `open_ai_api_key`: 讯飞星火平台的[APIPassword](https://console.xfyun.cn/services/bm3) ，因模型而已
-</details>
-
-<details>
-<summary>Coding Plan</summary>
-
-Coding Plan 是各厂商推出的编程包月套餐，所有厂商均可通过 OpenAI 兼容方式接入：
-
-```json
-{
-  "bot_type": "openai",
-  "model": "模型名称",
-  "open_ai_api_base": "厂商 Coding Plan API Base",
-  "open_ai_api_key": "YOUR_API_KEY"
-}
-```
-
-目前支持阿里云、MiniMax、智谱 GLM、Kimi、火山引擎等厂商，各厂商详细配置请参考 [Coding Plan 文档](https://docs.cowagent.ai/models/coding-plan)。
-</details>
-
-
-## 通道说明
-
-推荐通过 Web 控制台在线管理通道配置，无需手动编辑文件，详见 [通道文档](https://docs.cowagent.ai/channels/weixin)。以下为手动修改 `config.json` 配置通道的说明：
-
-支持同时可接入多个通道，配置时可通过逗号进行分割，例如 `"channel_type": "feishu,dingtalk"`。
-
-<details>
-<summary>1. Weixin - 微信</summary>
-
-接入个人微信，扫码登录即可使用，支持文本、图片、语音、文件等消息收发。
-
-```json
-{
-    "channel_type": "weixin"
-}
-```
-
-启动后终端会显示二维码，使用微信扫码授权即可，也可以在 Web 控制台的「通道」页面中扫码接入。登录凭证会自动保存至 `~/.weixin_cow_credentials.json`，下次启动无需重新扫码，如需重新登录删除该文件后重启即可。
-
-详细步骤和参数说明参考 [微信接入](https://docs.cowagent.ai/channels/weixin)
-
-</details>
-
-<details>
-<summary>2. Web</summary>
-
-项目启动后会默认运行 Web 控制台，配置如下：
-
-```json
-{
-    "channel_type": "web",
-    "web_host": "0.0.0.0",
-    "web_password": "YOUR PASSWORD",
-    "web_port": 9899
-}
-```
-
-- `web_host`: 监听地址，默认 `127.0.0.1`（仅本机），如需公网访问请改为 `0.0.0.0` 并设置密码
-- `web_port`: 默认为 9899，可按需更改，需要服务器防火墙和安全组放行该端口
-- `web_password`: 访问密码，留空则不启用密码保护。部署在公网环境时请务必设置
-- 如本地运行，启动后请访问 `http://localhost:9899` ；如服务器运行，请访问 `http://YOUR_IP:9899`
-> 注：请将上述 url 中的 ip 或者 port 替换为实际的值
-</details>
-
-<details>
-<summary>3. Feishu - 飞书</summary>
-
-飞书使用 WebSocket 长连接模式，无需公网 IP。详细步骤参考 [飞书接入](https://docs.cowagent.ai/channels/feishu)。
-
-**方式一：扫码一键创建（推荐）**
-
-启动 Cow 后打开 Web 控制台，**通道** → **接入通道** → 选择 **飞书** → 扫码创建。也支持 CLI 启动时在终端打印二维码。
-
-**方式二：手动配置**
-
-在飞书开放平台创建自建应用并配置权限后，将凭据填入 `config.json`：
-
-```json
-{
-    "channel_type": "feishu",
-    "feishu_app_id": "APP_ID",
-    "feishu_app_secret": "APP_SECRET",
-    "feishu_stream_reply": true
-}
-```
-
-- `feishu_stream_reply`：是否开启流式打字机回复，默认开启（需 `cardkit:card:write` 权限 + 飞书客户端 ≥ 7.20）
-
-</details>
-
-<details>
-<summary>4. DingTalk - 钉钉</summary>
-
-钉钉需要在开放平台创建智能机器人应用，将以下配置填入 `config.json`：
-
-```json
-{
-    "channel_type": "dingtalk",
-    "dingtalk_client_id": "CLIENT_ID",
-    "dingtalk_client_secret": "CLIENT_SECRET"
-}
-```
-详细步骤和参数说明参考 [钉钉接入](https://docs.cowagent.ai/channels/dingtalk)
-</details>
-
-<details>
-<summary>5. WeCom Bot - 企微智能机器人</summary>
-
-企微智能机器人使用 WebSocket 长连接模式，无需公网 IP 和域名。详细步骤参考 [企微智能机器人接入](https://docs.cowagent.ai/channels/wecom-bot)。
-
-**方式一：扫码一键创建（推荐）**
-
-启动 Cow 后打开 Web 控制台，**通道** → **接入通道** → 选择 **企微智能机器人** → 使用企业微信扫码创建。
-
-**方式二：手动配置**
-
-在企业微信中创建智能机器人并选择**长连接模式**，记录 Bot ID 和 Secret 后填入 `config.json`：
-
-```json
-{
-    "channel_type": "wecom_bot",
-    "wecom_bot_id": "YOUR_BOT_ID",
-    "wecom_bot_secret": "YOUR_SECRET"
-}
-```
-
-</details>
-
-<details>
-<summary>6. QQ - QQ 机器人</summary>
-
-QQ 机器人使用 WebSocket 长连接模式，无需公网 IP 和域名，支持 QQ 单聊、群聊和频道消息：
-
-```json
-{
-    "channel_type": "qq",
-    "qq_app_id": "YOUR_APP_ID",
-    "qq_app_secret": "YOUR_APP_SECRET"
-}
-```
-详细步骤和参数说明参考 [QQ 机器人接入](https://docs.cowagent.ai/channels/qq)
-
-</details>
-
-<details>
-<summary>7. WeCom App - 企业微信应用</summary>
-
-企业微信自建应用接入需在后台创建应用并启用消息回调，配置示例：
-
-```json
-{
-    "channel_type": "wechatcom_app",
-    "wechatcom_corp_id": "CORPID",
-    "wechatcomapp_token": "TOKEN",
-    "wechatcomapp_port": 9898,
-    "wechatcomapp_secret": "SECRET",
-    "wechatcomapp_agent_id": "AGENTID",
-    "wechatcomapp_aes_key": "AESKEY"
-}
-```
-详细步骤和参数说明参考 [企微自建应用接入](https://docs.cowagent.ai/channels/wecom)
-
-</details>
-
-<details>
-<summary>8. WeChat MP - 微信公众号</summary>
-
-本项目支持订阅号和服务号两种公众号，通过服务号（`wechatmp_service`）体验更佳。
-
-**个人订阅号（wechatmp）**
-
-```json
-{
-    "channel_type": "wechatmp",
-    "wechatmp_token": "TOKEN",
-    "wechatmp_port": 80,
-    "wechatmp_app_id": "APPID",
-    "wechatmp_app_secret": "APPSECRET",
-    "wechatmp_aes_key": ""
-}
-```
-
-**企业服务号（wechatmp_service）**
-
-```json
-{
-    "channel_type": "wechatmp_service",
-    "wechatmp_token": "TOKEN",
-    "wechatmp_port": 80,
-    "wechatmp_app_id": "APPID",
-    "wechatmp_app_secret": "APPSECRET",
-    "wechatmp_aes_key": ""
-}
-```
-
-详细步骤和参数说明参考 [微信公众号接入](https://docs.cowagent.ai/channels/wechatmp)
-
-</details>
-
-<details>
-<summary>9. Terminal - 终端</summary>
-
-修改 `config.json` 中的 `channel_type` 字段：
-
-```json
-{
-    "channel_type": "terminal"
-}
-```
-
-运行后可在终端与机器人进行对话。
-
-</details>
-
 <br/>
 
-# 🔗 相关项目
+## 🤖 Models
 
-- [Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub)：开源的 AI Agent 技能广场，浏览、搜索、安装和发布技能，支持 CowAgent、OpenClaw、Claude Code 等多种 Agent。
-- [bot-on-anything](https://github.com/zhayujie/bot-on-anything)：轻量和高可扩展的大模型应用框架，支持接入 Slack, Telegram, Discord, Gmail 等海外平台，可作为本项目的补充使用。
-- [AgentMesh](https://github.com/MinimalFuture/AgentMesh)：开源的多智能体( Multi-Agent )框架，可以通过多智能体团队的协同来解决复杂问题。
+CowAgent supports all mainstream LLM providers. **Chat, vision, image generation, ASR/TTS, and embeddings** can each be routed to a different vendor. Providers are configured directly in the Web console — no manual file editing required.
 
+| Provider | Featured Models | Chat | Vision | Image Gen | ASR | TTS | Embedding |
+| --- | --- | :-: | :-: | :-: | :-: | :-: | :-: |
+| [Claude](https://docs.cowagent.ai/en/models/claude) | claude-opus-4-8 | ✅ | ✅ | | | | |
+| [OpenAI](https://docs.cowagent.ai/en/models/openai) | gpt-5.5, o-series | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Gemini](https://docs.cowagent.ai/en/models/gemini) | gemini-3.5-flash | ✅ | ✅ | ✅ | | | |
+| [DeepSeek](https://docs.cowagent.ai/en/models/deepseek) | deepseek-v4-flash / pro | ✅ | | | | | |
+| [Qwen](https://docs.cowagent.ai/en/models/qwen) | qwen3.7-max | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [GLM](https://docs.cowagent.ai/en/models/glm) | glm-5.1, glm-5v-turbo | ✅ | ✅ | | ✅ | | ✅ |
+| [Doubao](https://docs.cowagent.ai/en/models/doubao) | doubao-seed-2.0 series | ✅ | ✅ | ✅ | | | ✅ |
+| [Kimi](https://docs.cowagent.ai/en/models/kimi) | kimi-k2.6 | ✅ | ✅ | | | | |
+| [MiniMax](https://docs.cowagent.ai/en/models/minimax) | MiniMax-M2.7 | ✅ | ✅ | ✅ | | ✅ | |
+| [ERNIE](https://docs.cowagent.ai/en/models/qianfan) | ernie-5.1 | ✅ | ✅ | | | | |
+| [MiMo](https://docs.cowagent.ai/en/models/mimo) | mimo-v2.5 / pro | ✅ | ✅ | | | ✅ | |
+| [LinkAI](https://docs.cowagent.ai/en/models/linkai) | One key for 100+ models | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Custom](https://docs.cowagent.ai/en/models/custom) | Local models / third-party proxy | ✅ | | | | | |
 
+> For details on each provider, see the [Models overview](https://docs.cowagent.ai/en/models/index).
 
+<br/>
 
-# 🔎 常见问题
+## 💬 Channels
 
-FAQs： <https://github.com/zhayujie/CowAgent/wiki/FAQs>
+A single Agent instance can serve multiple channels in parallel. Most channels can be onboarded right from the Web console.
 
-或直接在线咨询 [项目小助手](https://link-ai.tech/app/Kv2fXJcH)  (知识库持续完善中，回复供参考)
+| Channel | Text | Image | File | Voice | Group |
+| --- | :-: | :-: | :-: | :-: | :-: |
+| [Web Console](https://docs.cowagent.ai/en/channels/web) (default) | ✅ | ✅ | ✅ | ✅ | |
+| [WeChat](https://docs.cowagent.ai/en/channels/weixin) | ✅ | ✅ | ✅ | ✅ | |
+| [Feishu / Lark](https://docs.cowagent.ai/en/channels/feishu) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [DingTalk](https://docs.cowagent.ai/en/channels/dingtalk) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [WeCom Bot](https://docs.cowagent.ai/en/channels/wecom-bot) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [QQ](https://docs.cowagent.ai/en/channels/qq) | ✅ | ✅ | ✅ | | ✅ |
+| [WeCom App](https://docs.cowagent.ai/en/channels/wecom) | ✅ | ✅ | ✅ | ✅ | |
+| [WeChat Official Account](https://docs.cowagent.ai/en/channels/wechatmp) | ✅ | ✅ | | ✅ | |
+| [Telegram](https://docs.cowagent.ai/en/channels/telegram) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Slack](https://docs.cowagent.ai/en/channels/slack) | ✅ | ✅ | ✅ | | ✅ |
 
-# 🛠️ 开发
+> See the [Channels overview](https://docs.cowagent.ai/en/channels/index) for setup details.
 
-欢迎接入更多应用通道，参考 [飞书通道](https://github.com/zhayujie/CowAgent/blob/master/channel/feishu/feishu_channel.py) 新增自定义通道，实现接收和发送消息逻辑即可完成接入。同时欢迎贡献新的 Skills，向 [Skill Hub](https://skills.cowagent.ai/submit) 提交技能。
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-chat.png" alt="CowAgent Web Console" width="800"/>
 
-# ✉ 联系
+*The Web console is the default channel and the unified entry point to configure models, channels, skills, memory, and more.*
 
-欢迎提交PR、Issues进行反馈，以及通过 🌟Star 支持并关注项目更新。项目运行遇到问题可以查看 [常见问题列表](https://github.com/zhayujie/CowAgent/wiki/FAQs) ，以及前往 [Issues](https://github.com/zhayujie/CowAgent/issues) 中搜索。个人开发者可加入开源交流群参与更多讨论，企业用户可联系[产品客服](https://cdn.link-ai.tech/portal/linkai-customer-service.png)咨询。
+<br/>
 
-# 🌟 贡献者
+## 🧠 Memory & Knowledge Base
+
+**Long-term memory** uses a three-tier architecture: conversation context (short-term) → daily memory (mid-term) → MEMORY.md (long-term). A nightly **Deep Dream** pass distills scattered memories into refined long-term entries and a narrative journal. See [Long-term Memory](https://docs.cowagent.ai/en/memory/index) · [Deep Dream](https://docs.cowagent.ai/en/memory/deep-dream).
+
+**Personal knowledge base** complements the time-ordered memory by organizing structured knowledge **by topic**. The Agent automatically curates valuable information from conversations and maintains cross-references and indexes, while the Web console offers an interactive knowledge-graph view. See [Personal Knowledge Base](https://docs.cowagent.ai/en/knowledge/index).
+
+<table>
+  <tr>
+    <td width="50%">
+      <img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-memory.png" alt="Long-term Memory" />
+      <p align="center"><em>Long-term Memory · Three-tier architecture + Deep Dream</em></p>
+    </td>
+    <td width="50%">
+      <img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-knowledge.png" alt="Personal Knowledge Base" />
+      <p align="center"><em>Knowledge Base · Auto-curated Markdown wiki</em></p>
+    </td>
+  </tr>
+</table>
+
+<br/>
+
+## 🔧 Tools & Skills
+
+**Tools** are atomic capabilities the Agent uses to interact with system resources. **Skills** are higher-level workflows defined by a manifest file that compose multiple tools to accomplish complex tasks.
+
+### Tool System
+
+**Built-in tools** cover file I/O (`read` / `write` / `edit` / `ls`), terminal (`bash`), file sending (`send`), memory retrieval (`memory`), environment variables (`env_config`), web fetching (`web_fetch`), scheduling (`scheduler`), web search (`web_search`), vision (`vision`), and browser automation (`browser`).
+
+**MCP protocol** integrates the open ecosystem of [Model Context Protocol](https://modelcontextprotocol.io) servers. A single `mcp.json` file is all it takes, with support for stdio / SSE transports, hot reload, and zero-code integration.
+
+Learn more: [Tools overview](https://docs.cowagent.ai/en/tools/index) · [MCP integration](https://docs.cowagent.ai/en/tools/mcp).
+
+### Skills System
+
+- **[Skill Hub](https://skills.cowagent.ai/)** — open skill marketplace: browse, search, install in one click
+- **GitHub / ClawHub / URL and more** — install skills from any source
+- **Conversational authoring** — generate custom skills through dialogue with `skill-creator`; turn any workflow or third-party API into a reusable skill
+
+```bash
+/skill list                    # list installed skills
+/skill search <keyword>        # search the marketplace
+/skill install <name>          # one-click install
+```
+
+Learn more: [Skills overview](https://docs.cowagent.ai/en/skills/index) · [Creating Skills](https://docs.cowagent.ai/en/skills/create).
+
+<br/>
+
+## 🏷 Changelog
+
+> **2026.05.22:** [v2.0.9](https://github.com/zhayujie/CowAgent/releases/tag/2.0.9) — Model management, MCP protocol support, persistent browser sessions, new models (gpt-5.5, gemini-3.5-flash, qwen3.7-max), deployment hardening.
+
+> **2026.05.06:** [v2.0.8](https://github.com/zhayujie/CowAgent/releases/tag/2.0.8) — Feishu channel overhaul (voice, streaming, QR onboarding), DeepSeek V4 and Baidu Qianfan support, scheduler tool upgrades.
+
+> **2026.04.22:** [v2.0.7](https://github.com/zhayujie/CowAgent/releases/tag/2.0.7) — Built-in image generation (GPT Image 2, Nano Banana), new models (Kimi K2.6, Claude Opus 4.7, GLM 5.1), memory and knowledge enhancements.
+
+> **2026.04.14:** [v2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6) — Knowledge base, Deep Dream memory distillation, smart context compression, multi-session Web console.
+
+> **2026.04.01:** [v2.0.5](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5) — Cow CLI, Skill Hub open source, browser tool, WeCom Bot QR onboarding.
+
+> **2026.02.03:** [v2.0.0](https://github.com/zhayujie/CowAgent/releases/tag/2.0.0) — Major upgrade to a super Agent assistant with multi-step task planning, long-term memory, and the Skills framework.
+
+Full history: [Release Notes](https://docs.cowagent.ai/en/releases/overview)
+
+<br/>
+
+## 🤝 Community & Support
+
+[File an issue](https://github.com/zhayujie/CowAgent/issues) on GitHub, or scan the QR code below to join our WeChat community:
+
+<img width="130" src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/open-community.png">
+
+<br/>
+
+## 🔗 Related Projects
+
+- **[Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub)** — open skill marketplace for AI Agents; works with CowAgent, OpenClaw, Claude Code, and more
+- **[bot-on-anything](https://github.com/zhayujie/bot-on-anything)** — lightweight LLM application framework with integrations for Slack, Telegram, Discord, Gmail, and more
+- **[AgentMesh](https://github.com/MinimalFuture/AgentMesh)** — open-source multi-agent framework for solving complex problems through team collaboration
+
+<br/>
+
+## 🏢 Enterprise Services
+
+[**LinkAI**](https://link-ai.tech/) is an all-in-one AI Agent platform for enterprises and developers, offering managed hosting and enterprise-grade support for CowAgent:
+
+- **🚀 Zero-deployment hosted runtime** — spin up a [CowAgent online assistant](https://link-ai.tech/cowagent/create) in under a minute, no server required
+- **🧠 Agent infrastructure** — unified access to LLMs, knowledge bases, databases, skills, and workflows; plug-and-play building blocks that extend what CowAgent can do
+- **🏢 Team & enterprise features** — workspaces, role-based access, audit logs, and private deployment for production use cases
+
+For enterprise inquiries: sales@simple-future.tech or [scan the QR code](https://cdn.link-ai.tech/consultant.jpg) to reach our team on WeChat.
+
+<br/>
+
+## 🛠️ Development & Contributing
+
+Contributions are welcome — add a new channel by following the [Feishu channel reference](https://github.com/zhayujie/CowAgent/blob/master/channel/feishu/feishu_channel.py), or contribute new skills to [Skill Hub](https://skills.cowagent.ai/submit).
+
+⭐ Star the project to follow updates, and feel free to open PRs and Issues.
+
+## 🌟 Contributors
 
 ![cow contributors](https://contrib.rocks/image?repo=zhayujie/CowAgent&max=1000)
 
-# 📌 项目更名说明
+<br/>
 
-本项目原名 `chatgpt-on-wechat`（GitHub 原地址：https://github.com/zhayujie/chatgpt-on-wechat ），
-于 2026.04.13 正式更名为 **CowAgent**。GitHub 已自动设置重定向，原有链接仍可正常访问。
+## ⚠️ Disclaimer
 
-如需更新本地仓库的远程地址（可选）：
-```bash
-git remote set-url origin https://github.com/zhayujie/CowAgent.git
-```
+1. This project is licensed under the [MIT License](/LICENSE) and is intended for technical research and learning. You are responsible for complying with applicable laws and regulations in your jurisdiction; the maintainers assume no liability for any consequences arising from use of this project.
+2. **Cost & safety:** Agent mode consumes substantially more tokens than regular chat — pick models that balance quality and cost. The Agent has access to your local operating system, so only deploy it in trusted environments.
+3. CowAgent is a pure open-source project and does not participate in, authorize, or issue any cryptocurrency.
+
+<br/>
+
+## 📌 Project Renaming Notice
+
+This project was previously named `chatgpt-on-wechat` and is now officially **CowAgent**. The old GitHub URL redirects automatically; existing users may optionally run `git remote set-url origin https://github.com/zhayujie/CowAgent.git` to update the local remote.
diff --git a/agent/memory/conversation_store.py b/agent/memory/conversation_store.py
index c5d215bf..48148f61 100644
--- a/agent/memory/conversation_store.py
+++ b/agent/memory/conversation_store.py
@@ -44,6 +44,7 @@ CREATE TABLE IF NOT EXISTS messages (
     role         TEXT    NOT NULL,
     content      TEXT    NOT NULL,
     created_at   INTEGER NOT NULL,
+    extras       TEXT    NOT NULL DEFAULT '',
     UNIQUE (session_id, seq)
 );
 
@@ -67,6 +68,12 @@ _MIGRATION_ADD_CONTEXT_START_SEQ = """
 ALTER TABLE sessions ADD COLUMN context_start_seq INTEGER NOT NULL DEFAULT 0;
 """
 
+# Generic JSON sidecar for per-message attachments (TTS audio URL, future use).
+# Always optional — readers must tolerate missing column / empty / invalid JSON.
+_MIGRATION_ADD_MSG_EXTRAS = """
+ALTER TABLE messages ADD COLUMN extras TEXT NOT NULL DEFAULT '';
+"""
+
 DEFAULT_MAX_AGE_DAYS: int = 30
 
 
@@ -169,20 +176,26 @@ def _group_into_display_turns(
     cur_rest: List[tuple] = []
     started = False
 
-    for role, raw_content, created_at in rows:
+    for role, raw_content, created_at, raw_extras in rows:
         try:
             content = json.loads(raw_content)
         except Exception:
             content = raw_content
+        try:
+            extras = json.loads(raw_extras) if raw_extras else {}
+            if not isinstance(extras, dict):
+                extras = {}
+        except Exception:
+            extras = {}
 
         if role == "user" and _is_visible_user_message(content):
             if started:
                 groups.append((cur_user, cur_rest))
-            cur_user = (content, created_at)
+            cur_user = (content, created_at, extras)
             cur_rest = []
             started = True
         else:
-            cur_rest.append((role, content, created_at))
+            cur_rest.append((role, content, created_at, extras))
 
     if started:
         groups.append((cur_user, cur_rest))
@@ -195,7 +208,7 @@ def _group_into_display_turns(
     for user_row, rest in groups:
         # User turn
         if user_row:
-            content, created_at = user_row
+            content, created_at, _u_extras = user_row
             text = _extract_display_text(content)
             if text:
                 turns.append({"role": "user", "content": text, "created_at": created_at})
@@ -206,8 +219,11 @@ def _group_into_display_turns(
         tool_results: Dict[str, str] = {}
         final_text = ""
         final_ts: Optional[int] = None
+        merged_extras: Dict[str, Any] = {}
 
-        for role, content, created_at in rest:
+        for role, content, created_at, extras in rest:
+            if role == "assistant" and isinstance(extras, dict):
+                merged_extras.update(extras)
             if role == "user":
                 tool_results.update(_extract_tool_results(content))
             elif role == "assistant":
@@ -256,6 +272,8 @@ def _group_into_display_turns(
                 "steps": steps,
                 "created_at": final_ts or (user_row[1] if user_row else 0),
             }
+            if merged_extras:
+                turn["extras"] = merged_extras
             turns.append(turn)
 
     return turns
@@ -411,13 +429,15 @@ class ConversationStore:
                         content = json.dumps(
                             msg.get("content", ""), ensure_ascii=False
                         )
+                        extras_obj = msg.get("extras") or {}
+                        extras = json.dumps(extras_obj, ensure_ascii=False) if extras_obj else ""
                         conn.execute(
                             """
                             INSERT OR IGNORE INTO messages
-                                (session_id, seq, role, content, created_at)
-                            VALUES (?, ?, ?, ?, ?)
+                                (session_id, seq, role, content, created_at, extras)
+                            VALUES (?, ?, ?, ?, ?, ?)
                             """,
-                            (session_id, next_seq, role, content, now),
+                            (session_id, next_seq, role, content, now, extras),
                         )
                         next_seq += 1
 
@@ -651,6 +671,55 @@ class ConversationStore:
             logger.info(f"[ConversationStore] Pruned {deleted} expired sessions")
         return deleted
 
+    def attach_extras_to_last_assistant(
+        self,
+        session_id: str,
+        extras: Dict[str, Any],
+    ) -> Optional[int]:
+        """
+        Merge ``extras`` into the latest assistant message of a session.
+
+        Used by post-processing (e.g. TTS) that needs to annotate an already
+        persisted bot reply with attachments such as audio URLs.
+
+        Returns the message seq that was updated, or ``None`` if no assistant
+        message exists or the update could not be applied.
+        """
+        if not extras:
+            return None
+        with self._lock:
+            conn = self._connect()
+            try:
+                row = conn.execute(
+                    """
+                    SELECT seq, extras FROM messages
+                    WHERE session_id = ? AND role = 'assistant'
+                    ORDER BY seq DESC LIMIT 1
+                    """,
+                    (session_id,),
+                ).fetchone()
+                if not row:
+                    return None
+                seq, raw = row
+                try:
+                    cur = json.loads(raw) if raw else {}
+                    if not isinstance(cur, dict):
+                        cur = {}
+                except Exception:
+                    cur = {}
+                cur.update(extras)
+                conn.execute(
+                    "UPDATE messages SET extras = ? WHERE session_id = ? AND seq = ?",
+                    (json.dumps(cur, ensure_ascii=False), session_id, seq),
+                )
+                conn.commit()
+                return seq
+            except Exception as e:
+                logger.warning(f"[ConversationStore] attach_extras failed: {e}")
+                return None
+            finally:
+                conn.close()
+
     def load_history_page(
         self,
         session_id: str,
@@ -698,15 +767,31 @@ class ConversationStore:
                 ).fetchone()
                 ctx_start = ctx_row[0] if ctx_row else 0
 
-                rows = conn.execute(
-                    """
-                    SELECT seq, role, content, created_at
-                    FROM messages
-                    WHERE session_id = ?
-                    ORDER BY seq ASC
-                    """,
-                    (session_id,),
-                ).fetchall()
+                # The extras column is added by migration; tolerate older DBs that
+                # lack it by falling back to an empty string for every row.
+                try:
+                    rows = conn.execute(
+                        """
+                        SELECT seq, role, content, created_at, extras
+                        FROM messages
+                        WHERE session_id = ?
+                        ORDER BY seq ASC
+                        """,
+                        (session_id,),
+                    ).fetchall()
+                except sqlite3.OperationalError:
+                    rows = [
+                        (seq, role, content, created_at, "")
+                        for (seq, role, content, created_at) in conn.execute(
+                            """
+                            SELECT seq, role, content, created_at
+                            FROM messages
+                            WHERE session_id = ?
+                            ORDER BY seq ASC
+                            """,
+                            (session_id,),
+                        ).fetchall()
+                    ]
             finally:
                 conn.close()
 
@@ -719,13 +804,16 @@ class ConversationStore:
             include_thinking = False
 
         # Strip seq for display grouping, but record max seq per visible user group
-        plain_rows = [(role, content, created_at) for _seq, role, content, created_at in rows]
+        plain_rows = [
+            (role, content, created_at, extras_raw)
+            for _seq, role, content, created_at, extras_raw in rows
+        ]
         visible = _group_into_display_turns(plain_rows, include_thinking=include_thinking)
 
         # Build a mapping: find the seq of each visible user message to annotate context boundary.
         # Walk through rows to find visible user message seqs in order.
         visible_user_seqs: List[int] = []
-        for seq, role, raw_content, _ts in rows:
+        for seq, role, raw_content, _ts, _extras in rows:
             if role != "user":
                 continue
             try:
@@ -911,6 +999,18 @@ class ConversationStore:
             except Exception as e:
                 logger.warning(f"[ConversationStore] Migration (context_start_seq) failed: {e}")
 
+        msg_cols = {
+            row[1]
+            for row in conn.execute("PRAGMA table_info(messages)").fetchall()
+        }
+        if "extras" not in msg_cols:
+            try:
+                conn.execute(_MIGRATION_ADD_MSG_EXTRAS)
+                conn.commit()
+                logger.info("[ConversationStore] Migrated: added messages.extras column")
+            except Exception as e:
+                logger.warning(f"[ConversationStore] Migration (extras) failed: {e}")
+
     def _connect(self) -> sqlite3.Connection:
         conn = sqlite3.connect(str(self._db_path), timeout=10)
         conn.execute("PRAGMA journal_mode=WAL")
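The `extras` handling in the `conversation_store.py` diff above can be sketched in isolation. This is a minimal illustration, not the project's code: it assumes a simplified `messages` table, and the helper names `parse_extras` and `attach_extras` as well as the example URL are hypothetical. It shows the two ideas the diff relies on: a reader that treats empty, invalid, or non-object JSON as `{}`, and a read-merge-write update of the latest assistant row.

```python
import json
import sqlite3
from typing import Any, Dict, Optional


def parse_extras(raw: Optional[str]) -> Dict[str, Any]:
    """Decode an extras cell; anything other than a JSON object becomes {}."""
    if not raw:
        return {}
    try:
        value = json.loads(raw)
    except (TypeError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return {}
    return value if isinstance(value, dict) else {}


def attach_extras(
    conn: sqlite3.Connection, session_id: str, extras: Dict[str, Any]
) -> Optional[int]:
    """Merge extras into the newest assistant message; return its seq or None."""
    row = conn.execute(
        "SELECT seq, extras FROM messages "
        "WHERE session_id = ? AND role = 'assistant' "
        "ORDER BY seq DESC LIMIT 1",
        (session_id,),
    ).fetchone()
    if row is None:
        return None
    seq, raw = row
    merged = parse_extras(raw)
    merged.update(extras)  # new keys win over stored ones
    conn.execute(
        "UPDATE messages SET extras = ? WHERE session_id = ? AND seq = ?",
        (json.dumps(merged, ensure_ascii=False), session_id, seq),
    )
    conn.commit()
    return seq


# Simplified schema (assumption): only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages ("
    " session_id TEXT NOT NULL, seq INTEGER NOT NULL, role TEXT NOT NULL,"
    " content TEXT NOT NULL, extras TEXT NOT NULL DEFAULT '',"
    " UNIQUE (session_id, seq))"
)
conn.executemany(
    "INSERT INTO messages (session_id, seq, role, content, extras) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", 0, "user", '"hi"', ""),
        ("s1", 1, "assistant", '"hello"', "not json"),  # corrupt sidecar
    ],
)

updated_seq = attach_extras(conn, "s1", {"audio_url": "https://example.invalid/a.mp3"})
stored = parse_extras(
    conn.execute(
        "SELECT extras FROM messages WHERE session_id = 's1' AND seq = 1"
    ).fetchone()[0]
)
```

Because the reader never raises, a corrupt sidecar value is silently replaced on the next merge instead of blocking the update, which matches the "always optional" contract stated in the migration comment.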
diff --git a/agent/memory/embedding/state.py b/agent/memory/embedding/state.py
index 3fb60b23..5efffef2 100644
--- a/agent/memory/embedding/state.py
+++ b/agent/memory/embedding/state.py
@@ -31,9 +31,13 @@ def detect_index_dim(storage) -> Optional[int]:
     if not row or not row["embedding"]:
         return None
     try:
-        emb = json.loads(row["embedding"])
+        raw = row["embedding"]
+        if isinstance(raw, (bytes, bytearray)):
+            # New BLOB format: 4 bytes per float32
+            return len(raw) // 4
+        emb = json.loads(raw)
         return len(emb) if isinstance(emb, list) else None
-    except (json.JSONDecodeError, TypeError):
+    except Exception:  # also covers json.JSONDecodeError and TypeError
         return None
 
 
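The dual-format dimension check in the `state.py` diff above can be tried out with a small stand-alone sketch. The function name `embedding_dim` is hypothetical, and packing the vector as little-endian float32 with `struct` is an assumption made for the demo; the diff itself only states that the BLOB format uses 4 bytes per float32.

```python
import json
import struct
from typing import Optional, Union


def embedding_dim(raw: Union[bytes, bytearray, str, None]) -> Optional[int]:
    """Return the vector length for a BLOB or legacy JSON embedding cell."""
    if not raw:
        return None
    if isinstance(raw, (bytes, bytearray)):
        # BLOB format: 4 bytes per float32 component.
        return len(raw) // 4
    try:
        emb = json.loads(raw)
    except (TypeError, ValueError):
        return None
    return len(emb) if isinstance(emb, list) else None


vec = [0.1, 0.2, 0.3]
blob = struct.pack(f"<{len(vec)}f", *vec)  # assumed layout: little-endian float32
legacy = json.dumps(vec)                   # legacy JSON text format
```

Checking the type before decoding keeps legacy JSON rows readable while new rows avoid the JSON parse entirely.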
diff --git a/agent/memory/manager.py b/agent/memory/manager.py
index 6aaac767..5ec2ade7 100644
--- a/agent/memory/manager.py
+++ b/agent/memory/manager.py
@@ -13,7 +13,7 @@ from datetime import datetime, timedelta
 from agent.memory.config import MemoryConfig, get_default_memory_config
 from agent.memory.storage import MemoryStorage, MemoryChunk, SearchResult
 from agent.memory.chunker import TextChunker
-from agent.memory.embedding import EmbeddingProvider
+from agent.memory.embedding import EmbeddingProvider, EmbeddingCache
 from agent.memory.summarizer import MemoryFlushManager, create_memory_files_if_needed
 
 
@@ -61,7 +61,11 @@ class MemoryManager:
             logger.info(
                 "[MemoryManager] No embedding provider; memory will use keyword search only"
             )
-        
+
+        # Cache for query embeddings (avoids redundant API calls within a session)
+        self._embedding_cache = EmbeddingCache()
+
+
         # Initialize memory flush manager
         workspace_dir = self.config.get_workspace()
         self.flush_manager = MemoryFlushManager(
@@ -128,7 +132,14 @@ class MemoryManager:
         vector_results = []
         if self.embedding_provider:
             try:
-                query_embedding = self.embedding_provider.embed_query(query)
+                provider_name = type(self.embedding_provider).__name__
+                model_name = getattr(self.embedding_provider, 'model', '')
+                cached = self._embedding_cache.get(query, provider_name, model_name)
+                if cached is not None:
+                    query_embedding = cached
+                else:
+                    query_embedding = self.embedding_provider.embed_query(query)
+                    self._embedding_cache.put(query, provider_name, model_name, query_embedding)
                 vector_results = self.storage.search_vector(
                     query_embedding=query_embedding,
                     user_id=user_id,
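The cache lookup added to `manager.py` above follows a get-or-compute pattern keyed by query text, provider class name, and model name. The sketch below is illustrative only: `SketchEmbeddingCache`, `FakeProvider`, and `embed_with_cache` are hypothetical stand-ins, and the real `EmbeddingCache` may differ in storage, eviction, and key format. Only the `get(query, provider, model)` and `put(query, provider, model, embedding)` call shapes are taken from the diff.

```python
import hashlib
from typing import Dict, List, Optional


class SketchEmbeddingCache:
    """Tiny in-memory cache; evicts the oldest entry once max_size is reached."""

    def __init__(self, max_size: int = 256) -> None:
        self._max_size = max_size
        self._items: Dict[str, List[float]] = {}

    @staticmethod
    def _key(text: str, provider: str, model: str) -> str:
        # Provider and model are part of the key so vectors never mix spaces.
        payload = f"{provider}\x00{model}\x00{text}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, text: str, provider: str, model: str) -> Optional[List[float]]:
        return self._items.get(self._key(text, provider, model))

    def put(self, text: str, provider: str, model: str, emb: List[float]) -> None:
        key = self._key(text, provider, model)
        if key not in self._items and len(self._items) >= self._max_size:
            # dicts preserve insertion order, so the first key is the oldest
            self._items.pop(next(iter(self._items)))
        self._items[key] = emb


class FakeProvider:
    """Stand-in provider that counts how often it is actually called."""

    model = "fake-model"

    def __init__(self) -> None:
        self.calls = 0

    def embed_query(self, text: str) -> List[float]:
        self.calls += 1
        return [float(len(text)), 1.0]


def embed_with_cache(provider, cache: SketchEmbeddingCache, query: str) -> List[float]:
    provider_name = type(provider).__name__
    model_name = getattr(provider, "model", "")
    cached = cache.get(query, provider_name, model_name)
    if cached is not None:
        return cached
    emb = provider.embed_query(query)
    cache.put(query, provider_name, model_name, emb)
    return emb


provider = FakeProvider()
cache = SketchEmbeddingCache()
first = embed_with_cache(provider, cache, "hello")
second = embed_with_cache(provider, cache, "hello")  # served from the cache
```

Including the provider and model in the key means that switching embedding models mid-session cannot return a vector of the wrong dimension.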
diff --git a/agent/memory/storage.py b/agent/memory/storage.py
index 0a4e6edb..683b083f 100644
--- a/agent/memory/storage.py
+++ b/agent/memory/storage.py
@@ -5,12 +5,42 @@ Provides vector and keyword search capabilities
 """
 
 from __future__ import annotations
+import re
 import sqlite3
 import json
 import hashlib
+import threading
 from typing import List, Dict, Optional, Any
 from pathlib import Path
 from dataclasses import dataclass
+try:
+    import numpy as np
+    _HAS_NUMPY = True
+except ImportError:
+    _HAS_NUMPY = False
+    np = None  # type: ignore[assignment]
+
+# UPSERT (INSERT … ON CONFLICT DO UPDATE) requires SQLite ≥ 3.24.0 (2018).
+# Older systems (e.g. CentOS 7 ships SQLite 3.7) fall back to INSERT OR REPLACE,
+# which risks FTS5 rowid drift on chunk updates (see save_chunk docstring).
+_HAS_UPSERT = sqlite3.sqlite_version_info >= (3, 24, 0)
+
+# ---------------------------------------------------------------------------
+# CJK character ranges, compiled once at module load.
+# Covers: CJK Symbols/Punctuation, Japanese kana (hiragana + katakana),
+#         CJK Unified Ideographs + Extension A, Korean syllables (Hangul),
+#         CJK Compatibility Ideographs, and CJK Extension B–F.
+# ---------------------------------------------------------------------------
+_CJK_RANGES = (
+    r'\u3000-\u30ff'          # CJK Symbols/Punctuation + Japanese kana
+    r'\u3400-\u9fff'          # CJK Unified Ideographs (incl. Extension A)
+    r'\uac00-\ud7af'          # Korean syllables (Hangul)
+    r'\uf900-\ufaff'          # CJK Compatibility Ideographs
+    r'\U00020000-\U0002fa1f'  # CJK Extension B–F
+)
+_RE_CONTAINS_CJK   = re.compile(f'[{_CJK_RANGES}]')
+_RE_CJK_WORDS      = re.compile(f'[{_CJK_RANGES}]+')
+_RE_TRIGRAM_TOKENS = re.compile(f'[{_CJK_RANGES}]+|[A-Za-z0-9_]+')
 
 
 @dataclass
@@ -48,6 +78,10 @@ class MemoryStorage:
         self.db_path = db_path
         self.conn: Optional[sqlite3.Connection] = None
         self.fts5_available = False  # Track FTS5 availability
+        # RLock protects concurrent writes from the same process.
+        # SQLite WAL mode handles read/write concurrency at the file level,
+        # but same-process concurrent writes still need a Python-level lock.
+        self._lock = threading.RLock()
         self._init_db()
     
     def _check_fts5_support(self) -> bool:
@@ -69,6 +103,14 @@ class MemoryStorage:
             
             # Check FTS5 support
             self.fts5_available = self._check_fts5_support()
+            if not _HAS_UPSERT:
+                from common.log import logger
+                logger.warning(
+                    "[MemoryStorage] SQLite %s < 3.24 — UPSERT unavailable. "
+                    "Falling back to INSERT OR REPLACE; FTS5 rowid may drift on "
+                    "chunk updates (rebuild index periodically to recover).",
+                    sqlite3.sqlite_version,
+                )
             if not self.fts5_available:
                 from common.log import logger
                 logger.debug("[MemoryStorage] FTS5 not available, using LIKE-based keyword search")
@@ -175,6 +217,75 @@ class MemoryStorage:
                 )
                 self._rebuild_fts5_from_chunks()
 
+        # Internal key-value store for persistent flags (e.g. backfill tracking)
+        self.conn.execute("""
+            CREATE TABLE IF NOT EXISTS _meta (
+                key TEXT PRIMARY KEY,
+                value TEXT NOT NULL
+            )
+        """)
+
+        # Create trigram FTS5 table for CJK / mixed-language search
+        self.trigram_fts5_available = False
+        if self.fts5_available:
+            try:
+                self.conn.execute("""
+                    CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts_trigram USING fts5(
+                        text,
+                        id UNINDEXED,
+                        user_id UNINDEXED,
+                        path UNINDEXED,
+                        source UNINDEXED,
+                        scope UNINDEXED,
+                        content='chunks',
+                        content_rowid='rowid',
+                        tokenize='trigram case_sensitive 0'
+                    )
+                """)
+                self.conn.execute("""
+                    CREATE TRIGGER IF NOT EXISTS chunks_trigram_ai
+                    AFTER INSERT ON chunks BEGIN
+                        INSERT INTO chunks_fts_trigram(rowid, text, id, user_id, path, source, scope)
+                        VALUES (new.rowid, new.text, new.id, new.user_id, new.path, new.source, new.scope);
+                    END
+                """)
+                self.conn.execute("""
+                    CREATE TRIGGER IF NOT EXISTS chunks_trigram_ad
+                    AFTER DELETE ON chunks BEGIN
+                        DELETE FROM chunks_fts_trigram WHERE rowid = old.rowid;
+                    END
+                """)
+                self.conn.execute("""
+                    CREATE TRIGGER IF NOT EXISTS chunks_trigram_au
+                    AFTER UPDATE ON chunks BEGIN
+                        UPDATE chunks_fts_trigram
+                        SET text=new.text, id=new.id, user_id=new.user_id,
+                            path=new.path, source=new.source, scope=new.scope
+                        WHERE rowid = new.rowid;
+                    END
+                """)
+                # One-time backfill for existing rows.
+                # NOTE: COUNT(*) on an external-content FTS5 table does not reflect what
+                # is indexed, so we use a persistent flag in _meta instead of counting rows.
+                backfill_done = self.conn.execute(
+                    "SELECT 1 FROM _meta WHERE key = 'trigram_backfill_done'"
+                ).fetchone()
+                chunks_count = self.conn.execute(
+                    "SELECT COUNT(*) as c FROM chunks"
+                ).fetchone()['c']
+                if chunks_count > 0 and not backfill_done:
+                    self.conn.execute(
+                        "INSERT INTO chunks_fts_trigram(chunks_fts_trigram) VALUES('rebuild')"
+                    )
+                    self.conn.execute(
+                        "INSERT OR REPLACE INTO _meta(key, value) VALUES('trigram_backfill_done', '1')"
+                    )
+                self.trigram_fts5_available = True
+            except Exception:
+                from common.log import logger
+                logger.warning("[MemoryStorage] trigram FTS5 unavailable, CJK search will use LIKE fallback", exc_info=True)
+                self.trigram_fts5_available = False
+
         # Create files metadata table
         self.conn.execute("""
             CREATE TABLE IF NOT EXISTS files (
@@ -186,7 +297,7 @@ class MemoryStorage:
                 updated_at INTEGER DEFAULT (strftime('%s', 'now'))
             )
         """)
-        
+
         self.conn.commit()
 
     def _fts5_state_inconsistent(self) -> bool:
@@ -299,43 +410,98 @@ class MemoryStorage:
         self.conn.commit()
 
     def save_chunk(self, chunk: MemoryChunk):
-        """Save a memory chunk"""
-        self.conn.execute("""
-            INSERT OR REPLACE INTO chunks 
-            (id, user_id, scope, source, path, start_line, end_line, text, embedding, hash, metadata, updated_at)
-            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, strftime('%s', 'now'))
-        """, (
-            chunk.id,
-            chunk.user_id,
-            chunk.scope,
-            chunk.source,
-            chunk.path,
-            chunk.start_line,
-            chunk.end_line,
-            chunk.text,
-            json.dumps(chunk.embedding) if chunk.embedding else None,
+        """Save a memory chunk (insert or update by id).
+
+        Uses SQLite UPSERT (INSERT … ON CONFLICT DO UPDATE) when available
+        (_HAS_UPSERT), else INSERT OR REPLACE.  INSERT OR REPLACE internally
+        does DELETE+INSERT, which changes the row's rowid.  Because both FTS5
+        tables use content_rowid='rowid', a new rowid would leave the old FTS
+        index entries pointing at a non-existent rowid and trigger
+        "fts5: missing row N from content table" errors.
+        ON CONFLICT DO UPDATE fires the AFTER UPDATE trigger (chunks_au /
+        chunks_trigram_au) and keeps the original rowid intact.
+        """
+        if _HAS_UPSERT:
+            _SQL = """
+                INSERT INTO chunks
+                (id, user_id, scope, source, path, start_line, end_line,
+                 text, embedding, hash, metadata, updated_at)
+                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, strftime('%s', 'now'))
+                ON CONFLICT(id) DO UPDATE SET
+                    user_id     = excluded.user_id,
+                    scope       = excluded.scope,
+                    source      = excluded.source,
+                    path        = excluded.path,
+                    start_line  = excluded.start_line,
+                    end_line    = excluded.end_line,
+                    text        = excluded.text,
+                    embedding   = excluded.embedding,
+                    hash        = excluded.hash,
+                    metadata    = excluded.metadata,
+                    updated_at  = strftime('%s', 'now')
+            """
+        else:
+            _SQL = """
+                INSERT OR REPLACE INTO chunks
+                (id, user_id, scope, source, path, start_line, end_line,
+                 text, embedding, hash, metadata, updated_at)
+                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, strftime('%s', 'now'))
+            """
+        params = (
+            chunk.id, chunk.user_id, chunk.scope, chunk.source, chunk.path,
+            chunk.start_line, chunk.end_line, chunk.text,
+            self._encode_embedding(chunk.embedding),
             chunk.hash,
-            json.dumps(chunk.metadata) if chunk.metadata else None
-        ))
-        self.conn.commit()
-    
+            json.dumps(chunk.metadata) if chunk.metadata else None,
+        )
+        with self._lock:
+            self.conn.execute(_SQL, params)
+            self.conn.commit()
+
     def save_chunks_batch(self, chunks: List[MemoryChunk]):
-        """Save multiple chunks in a batch"""
-        self.conn.executemany("""
-            INSERT OR REPLACE INTO chunks 
-            (id, user_id, scope, source, path, start_line, end_line, text, embedding, hash, metadata, updated_at)
-            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, strftime('%s', 'now'))
-        """, [
+        """Save multiple chunks in a batch (insert or update by id).
+
+        See save_chunk for why UPSERT is used instead of INSERT OR REPLACE.
+        """
+        if _HAS_UPSERT:
+            _SQL = """
+                INSERT INTO chunks
+                (id, user_id, scope, source, path, start_line, end_line,
+                 text, embedding, hash, metadata, updated_at)
+                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, strftime('%s', 'now'))
+                ON CONFLICT(id) DO UPDATE SET
+                    user_id     = excluded.user_id,
+                    scope       = excluded.scope,
+                    source      = excluded.source,
+                    path        = excluded.path,
+                    start_line  = excluded.start_line,
+                    end_line    = excluded.end_line,
+                    text        = excluded.text,
+                    embedding   = excluded.embedding,
+                    hash        = excluded.hash,
+                    metadata    = excluded.metadata,
+                    updated_at  = strftime('%s', 'now')
+            """
+        else:
+            _SQL = """
+                INSERT OR REPLACE INTO chunks
+                (id, user_id, scope, source, path, start_line, end_line,
+                 text, embedding, hash, metadata, updated_at)
+                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, strftime('%s', 'now'))
+            """
+        params_list = [
             (
                 c.id, c.user_id, c.scope, c.source, c.path,
                 c.start_line, c.end_line, c.text,
-                json.dumps(c.embedding) if c.embedding else None,
+                self._encode_embedding(c.embedding),
                 c.hash,
-                json.dumps(c.metadata) if c.metadata else None
+                json.dumps(c.metadata) if c.metadata else None,
             )
             for c in chunks
-        ])
-        self.conn.commit()
+        ]
+        with self._lock:
+            self.conn.executemany(_SQL, params_list)
+            self.conn.commit()
     
     def get_chunk(self, chunk_id: str) -> Optional[MemoryChunk]:
         """Get a chunk by ID"""
@@ -356,21 +522,21 @@ class MemoryStorage:
         limit: int = 10
     ) -> List[SearchResult]:
         """
-        Vector similarity search using in-memory cosine similarity
-        (sqlite-vec can be added later for better performance)
+        Vector similarity search using numpy-vectorized cosine similarity.
+        All embeddings are loaded then scored in a single BLAS matrix-vector
+        multiply, which is ~100x faster than the pure-Python per-row loop.
         """
         if scopes is None:
             scopes = ["shared"]
             if user_id:
                 scopes.append("user")
-        
-        # Build query
+
         scope_placeholders = ','.join('?' * len(scopes))
-        params = scopes
-        
+        params = list(scopes)
+
         if user_id:
             query = f"""
-                SELECT * FROM chunks 
+                SELECT * FROM chunks
                 WHERE scope IN ({scope_placeholders})
                 AND (scope = 'shared' OR user_id = ?)
                 AND embedding IS NOT NULL
@@ -378,51 +544,95 @@ class MemoryStorage:
             params.append(user_id)
         else:
             query = f"""
-                SELECT * FROM chunks 
+                SELECT * FROM chunks
                 WHERE scope IN ({scope_placeholders})
                 AND embedding IS NOT NULL
             """
-        
+
         rows = self.conn.execute(query, params).fetchall()
+        if not rows:
+            return []
 
-        # Calculate cosine similarity. We probe the first row's dim to fail
-        # loudly on a query/index dim mismatch — otherwise every doc would
-        # score 0 silently, leaving the user wondering why search broke.
-        results = []
-        query_dim = len(query_embedding)
-        if rows:
-            first = json.loads(rows[0]['embedding'])
-            if isinstance(first, list) and len(first) != query_dim:
-                raise ValueError(
-                    f"Embedding dim mismatch: query is {query_dim}-dim but "
-                    f"index stores {len(first)}-dim vectors. The configured "
-                    f"embedding model differs from the one that built the "
-                    f"index — run /memory rebuild-index to re-embed."
-                )
-
+        # Parse embeddings and build a (N, D) matrix in one pass.
+        # New rows store BLOB bytes (np.frombuffer); legacy rows fall back to JSON.
+        # Skip rows whose embedding dimension differs from the query:
+        # ragged input cannot be packed into a float32 (N, D) matrix, so
+        # np.array() would raise ValueError before any scoring happens.
+        expected_dim = len(query_embedding)
+        valid_rows = []
+        vectors = []
         for row in rows:
-            embedding = json.loads(row['embedding'])
-            similarity = self._cosine_similarity(query_embedding, embedding)
+            vec = self._decode_embedding(row['embedding'])
+            if not vec:
+                continue
+            if len(vec) != expected_dim:
+                from common.log import logger
+                logger.warning(
+                    "[MemoryStorage] Skipping chunk %s: embedding dim %d != query dim %d",
+                    row['id'], len(vec), expected_dim
+                )
+                continue
+            valid_rows.append(row)
+            vectors.append(vec)
 
-            if similarity > 0:
-                results.append((similarity, row))
-        
-        # Sort by similarity and limit
-        results.sort(key=lambda x: x[0], reverse=True)
-        results = results[:limit]
-        
-        return [
-            SearchResult(
-                path=row['path'],
-                start_line=row['start_line'],
-                end_line=row['end_line'],
-                score=score,
-                snippet=self._truncate_text(row['text'], 500),
-                source=row['source'],
-                user_id=row['user_id']
-            )
-            for score, row in results
-        ]
+        if not vectors:
+            return []
+
+        if _HAS_NUMPY:
+            matrix = np.array(vectors, dtype=np.float32)        # (N, D)
+            q_vec = np.array(query_embedding, dtype=np.float32)  # (D,)
+
+            # Vectorized cosine similarity: dot(matrix, q) / (||matrix|| * ||q||)
+            dots = matrix @ q_vec                                # (N,)
+            row_norms = np.linalg.norm(matrix, axis=1)           # (N,)
+            q_norm = float(np.linalg.norm(q_vec))
+            denominators = row_norms * q_norm
+            np.maximum(denominators, 1e-10, out=denominators)    # avoid div-by-zero
+            sims = dots / denominators                           # (N,)
+
+            # Select TopK using argpartition (O(N) average), then sort only those K
+            k = min(limit, len(valid_rows))
+            top_idx = np.argpartition(sims, -k)[-k:]
+            top_idx = top_idx[np.argsort(sims[top_idx])[::-1]]
+
+            return [
+                SearchResult(
+                    path=valid_rows[i]['path'],
+                    start_line=valid_rows[i]['start_line'],
+                    end_line=valid_rows[i]['end_line'],
+                    score=float(sims[i]),
+                    snippet=self._truncate_text(valid_rows[i]['text'], 500),
+                    source=valid_rows[i]['source'],
+                    user_id=valid_rows[i]['user_id']
+                )
+                for i in top_idx
+                if sims[i] > 0
+            ]
+        else:
+            # Pure-Python cosine similarity fallback (numpy not installed)
+            import math
+            q = query_embedding
+            q_norm = math.sqrt(sum(x * x for x in q)) or 1e-10
+            scored = []
+            for i, vec in enumerate(vectors):
+                dot = sum(a * b for a, b in zip(vec, q))
+                v_norm = math.sqrt(sum(x * x for x in vec)) or 1e-10
+                sim = dot / (v_norm * q_norm)
+                if sim > 0:
+                    scored.append((sim, valid_rows[i]))
+            scored.sort(key=lambda x: x[0], reverse=True)
+            return [
+                SearchResult(
+                    path=row['path'],
+                    start_line=row['start_line'],
+                    end_line=row['end_line'],
+                    score=sim,
+                    snippet=self._truncate_text(row['text'], 500),
+                    source=row['source'],
+                    user_id=row['user_id']
+                )
+                for sim, row in scored[:limit]
+            ]
     
     def search_keyword(
         self,
@@ -445,12 +655,37 @@ class MemoryStorage:
             if user_id:
                 scopes.append("user")
 
-        if self.fts5_available:
+        # Step 1: Standard FTS5 (unicode61), pure ASCII queries only.
+        # Skipped when query contains any CJK characters: unicode61 does not
+        # segment CJK text into words, and _build_fts_query keeps only ASCII
+        # tokens, so it would match only the ASCII portion of a mixed query
+        # (e.g. "Python" from "Python教程") and silently discard the CJK part.
+        # Those queries go directly to Step 2 (trigram), which handles both.
+        fts1_attempted = False
+        if (self.fts5_available
+                and not MemoryStorage._contains_cjk(query)
+                and MemoryStorage._build_fts_query(query)):
+            fts1_attempted = True
             fts_results = self._search_fts5(query, user_id, scopes, limit)
             if fts_results:
                 return fts_results
 
-        return self._search_like(query, user_id, scopes, limit)
+        # Step 2: Trigram FTS5 — CJK/mixed queries, plus fallback when unicode61
+        # returned nothing (trigram indexes all scripts with 3-char sliding windows,
+        # so it can catch terms that unicode61 tokenisation misses).
+        if self.trigram_fts5_available and (
+            MemoryStorage._contains_cjk(query) or fts1_attempted
+        ):
+            trigram_results = self._search_fts5_trigram(query, user_id, scopes, limit)
+            if trigram_results:
+                return trigram_results
+
+        # Step 3: LIKE fallback — last resort (FTS5 unavailable, or CJK tokens
+        # shorter than 3 characters that trigram cannot match, e.g. a single-char query).
+        if not self.fts5_available or MemoryStorage._contains_cjk(query):
+            return self._search_like(query, user_id, scopes, limit)
+
+        return []
     
     def _search_fts5(
         self,
@@ -471,7 +706,7 @@ class MemoryStorage:
             sql_query = f"""
                 SELECT chunks.*, bm25(chunks_fts) as rank
                 FROM chunks_fts
-                JOIN chunks ON chunks.id = chunks_fts.id
+                JOIN chunks ON chunks.rowid = chunks_fts.rowid
                 WHERE chunks_fts MATCH ? 
                 AND chunks.scope IN ({scope_placeholders})
                 AND (chunks.scope = 'shared' OR chunks.user_id = ?)
@@ -483,7 +718,7 @@ class MemoryStorage:
             sql_query = f"""
                 SELECT chunks.*, bm25(chunks_fts) as rank
                 FROM chunks_fts
-                JOIN chunks ON chunks.id = chunks_fts.id
+                JOIN chunks ON chunks.rowid = chunks_fts.rowid
                 WHERE chunks_fts MATCH ? 
                 AND chunks.scope IN ({scope_placeholders})
                 ORDER BY rank
@@ -505,13 +740,11 @@ class MemoryStorage:
                 )
                 for row in rows
             ]
-        except Exception as e:
+        except Exception:
             from common.log import logger
-            logger.error(
-                f"[MemoryStorage] FTS5 search failed (caller will fall back to LIKE): {e}"
-            )
+            logger.warning("[MemoryStorage] _search_fts5 failed, returning empty", exc_info=True)
             return []
-    
+
     def _search_like(
         self,
         query: str,
@@ -522,12 +755,11 @@ class MemoryStorage:
         """LIKE-based search.
 
         Used as the keyword-search fallback when FTS5 is unavailable, fails,
-        or returns empty. Supports both CJK runs and ASCII word tokens so it
-        can serve as a true safety net for any query.
+        or returns empty. Supports both CJK runs (1+ chars) and ASCII word
+        tokens (3+ chars) so it can serve as a true safety net for any query.
         """
-        import re
-        # CJK runs (2+ chars) + ASCII word tokens (3+ chars to avoid noise)
-        cjk_words = re.findall(r'[\u4e00-\u9fff]{2,}', query)
+        # CJK runs (1+ chars, wide Unicode range) + ASCII words (3+ chars to avoid noise)
+        cjk_words = _RE_CJK_WORDS.findall(query)
         ascii_words = [t for t in re.findall(r'[A-Za-z0-9_]+', query) if len(t) >= 3]
         words = cjk_words + ascii_words
         if not words:
@@ -565,44 +797,54 @@ class MemoryStorage:
         
         try:
             rows = self.conn.execute(sql_query, params).fetchall()
-            return [
-                SearchResult(
+            results = []
+            for row in rows:
+                # Dynamic score: reward chunks that contain more of the query words.
+                # Use all tokens (CJK + ASCII) so pure-ASCII queries are not skipped.
+                # matched_count is always ≥1 because the WHERE clause uses OR, but
+                # guard defensively so unexpected zero-match rows are never surfaced.
+                text_lower = row['text'].lower()
+                matched_count = sum(1 for w in words if w.lower() in text_lower)
+                if matched_count == 0:
+                    continue
+                score = min(0.85, 0.3 + 0.15 * matched_count)
+                results.append(SearchResult(
                     path=row['path'],
                     start_line=row['start_line'],
                     end_line=row['end_line'],
-                    score=0.5,  # Fixed score for LIKE search
+                    score=score,
                     snippet=self._truncate_text(row['text'], 500),
                     source=row['source'],
                     user_id=row['user_id']
-                )
-                for row in rows
-            ]
-        except Exception as e:
+                ))
+            results.sort(key=lambda r: r.score, reverse=True)
+            return results
+        except Exception:
             from common.log import logger
-            logger.error(f"[MemoryStorage] LIKE search failed: {e}")
+            logger.warning("[MemoryStorage] _search_like failed, returning empty", exc_info=True)
             return []
-    
+
     def delete_by_path(self, path: str):
         """Delete all chunks from a file"""
-        self.conn.execute("""
-            DELETE FROM chunks WHERE path = ?
-        """, (path,))
-        self.conn.commit()
-    
+        with self._lock:
+            self.conn.execute("DELETE FROM chunks WHERE path = ?", (path,))
+            self.conn.commit()
+
     def get_file_hash(self, path: str) -> Optional[str]:
         """Get stored file hash"""
         row = self.conn.execute("""
             SELECT hash FROM files WHERE path = ?
         """, (path,)).fetchone()
         return row['hash'] if row else None
-    
+
     def update_file_metadata(self, path: str, source: str, file_hash: str, mtime: int, size: int):
         """Update file metadata"""
-        self.conn.execute("""
-            INSERT OR REPLACE INTO files (path, source, hash, mtime, size, updated_at)
-            VALUES (?, ?, ?, ?, ?, strftime('%s', 'now'))
-        """, (path, source, file_hash, mtime, size))
-        self.conn.commit()
+        with self._lock:
+            self.conn.execute("""
+                INSERT OR REPLACE INTO files (path, source, hash, mtime, size, updated_at)
+                VALUES (?, ?, ?, ?, ?, strftime('%s', 'now'))
+            """, (path, source, file_hash, mtime, size))
+            self.conn.commit()
     
     def get_stats(self) -> Dict[str, int]:
         """Get storage statistics"""
@@ -632,7 +874,8 @@ class MemoryStorage:
                 self.conn.close()
                 self.conn = None  # Mark as closed
             except Exception as e:
-                print(f"⚠️  Error closing database connection: {e}")
+                from common.log import logger
+                logger.warning("[MemoryStorage] Error closing database connection: %s", e)
     
     def __del__(self):
         """Destructor to ensure connection is closed"""
@@ -642,7 +885,33 @@ class MemoryStorage:
             pass  # Ignore errors during cleanup
     
     # Helper methods
-    
+
+    @staticmethod
+    def _encode_embedding(embedding: Optional[List[float]]) -> Optional[bytes]:
+        """Encode embedding as float32 BLOB bytes (~6x smaller and faster than JSON).
+        Falls back to struct.pack when numpy is unavailable."""
+        if embedding is None:
+            return None
+        if _HAS_NUMPY:
+            return np.array(embedding, dtype=np.float32).tobytes()
+        import struct
+        return struct.pack(f'{len(embedding)}f', *embedding)
+
+    @staticmethod
+    def _decode_embedding(raw) -> Optional[List[float]]:
+        """Decode embedding from BLOB bytes or legacy JSON string.
+        Handles both numpy and numpy-free environments."""
+        if raw is None:
+            return None
+        if isinstance(raw, (bytes, bytearray)):
+            if _HAS_NUMPY:
+                return np.frombuffer(raw, dtype=np.float32).tolist()
+            import struct
+            n = len(raw) // 4
+            return list(struct.unpack(f'{n}f', raw))
+        # Legacy JSON format written by older versions
+        return json.loads(raw)
+
     def _row_to_chunk(self, row) -> MemoryChunk:
         """Convert database row to MemoryChunk"""
         return MemoryChunk(
@@ -654,32 +923,89 @@ class MemoryStorage:
             start_line=row['start_line'],
             end_line=row['end_line'],
             text=row['text'],
-            embedding=json.loads(row['embedding']) if row['embedding'] else None,
+            embedding=self._decode_embedding(row['embedding']),
             hash=row['hash'],
             metadata=json.loads(row['metadata']) if row['metadata'] else None
         )
     
     @staticmethod
-    def _cosine_similarity(vec1: List[float], vec2: List[float]) -> float:
-        """Calculate cosine similarity between two vectors"""
-        if len(vec1) != len(vec2):
-            return 0.0
-        
-        dot_product = sum(a * b for a, b in zip(vec1, vec2))
-        norm1 = sum(a * a for a in vec1) ** 0.5
-        norm2 = sum(b * b for b in vec2) ** 0.5
-        
-        if norm1 == 0 or norm2 == 0:
-            return 0.0
-        
-        return dot_product / (norm1 * norm2)
+    def _contains_cjk(text: str) -> bool:
+        """Check if text contains CJK or related characters (Chinese, Japanese, Korean)."""
+        return bool(_RE_CONTAINS_CJK.search(text))
     
     @staticmethod
-    def _contains_cjk(text: str) -> bool:
-        """Check if text contains CJK (Chinese/Japanese/Korean) characters"""
-        import re
-        return bool(re.search(r'[\u4e00-\u9fff]', text))
-    
+    def _build_trigram_query(raw_query: str) -> Optional[str]:
+        """
+        Build FTS5 MATCH query for the trigram tokenizer.
+        Extracts CJK sequences (including single characters) and ASCII words,
+        joining them with AND so all terms must appear in the matched chunk.
+        """
+        tokens = _RE_TRIGRAM_TOKENS.findall(raw_query)
+        tokens = [t for t in tokens if t]
+        if not tokens:
+            return None
+        # Escape embedded double-quotes (FTS5 uses "" inside quoted phrases)
+        quoted = [f'"{t.replace(chr(34), chr(34)*2)}"' for t in tokens]
+        return ' AND '.join(quoted)
+
+    def _search_fts5_trigram(
+        self,
+        query: str,
+        user_id: Optional[str],
+        scopes: List[str],
+        limit: int
+    ) -> List[SearchResult]:
+        """Trigram FTS5 search — handles CJK and mixed queries with BM25 ranking."""
+        trigram_query = self._build_trigram_query(query)
+        if not trigram_query:
+            return []
+
+        scope_placeholders = ','.join('?' * len(scopes))
+        params = [trigram_query] + list(scopes)
+
+        if user_id:
+            sql = f"""
+                SELECT chunks.*, bm25(chunks_fts_trigram) as rank
+                FROM chunks_fts_trigram
+                JOIN chunks ON chunks.rowid = chunks_fts_trigram.rowid
+                WHERE chunks_fts_trigram MATCH ?
+                AND chunks.scope IN ({scope_placeholders})
+                AND (chunks.scope = 'shared' OR chunks.user_id = ?)
+                ORDER BY rank
+                LIMIT ?
+            """
+            params.extend([user_id, limit])
+        else:
+            sql = f"""
+                SELECT chunks.*, bm25(chunks_fts_trigram) as rank
+                FROM chunks_fts_trigram
+                JOIN chunks ON chunks.rowid = chunks_fts_trigram.rowid
+                WHERE chunks_fts_trigram MATCH ?
+                AND chunks.scope IN ({scope_placeholders})
+                ORDER BY rank
+                LIMIT ?
+            """
+            params.append(limit)
+
+        try:
+            rows = self.conn.execute(sql, params).fetchall()
+            return [
+                SearchResult(
+                    path=row['path'],
+                    start_line=row['start_line'],
+                    end_line=row['end_line'],
+                    score=self._bm25_rank_to_score(row['rank']),
+                    snippet=self._truncate_text(row['text'], 500),
+                    source=row['source'],
+                    user_id=row['user_id']
+                )
+                for row in rows
+            ]
+        except Exception:
+            from common.log import logger
+            logger.warning("[MemoryStorage] _search_fts5_trigram failed, returning empty", exc_info=True)
+            return []
+
     @staticmethod
     def _build_fts_query(raw_query: str) -> Optional[str]:
         """
@@ -688,7 +1014,6 @@ class MemoryStorage:
         Works best for English and word-based languages.
         For CJK characters, LIKE search will be used as fallback.
         """
-        import re
         # Extract words (primarily English words and numbers)
         tokens = re.findall(r'[A-Za-z0-9_]+', raw_query)
         if not tokens:
@@ -701,9 +1026,22 @@ class MemoryStorage:
     
     @staticmethod
     def _bm25_rank_to_score(rank: float) -> float:
-        """Convert BM25 rank to 0-1 score"""
-        normalized = max(0, rank) if rank is not None else 999
-        return 1 / (1 + normalized)
+        """Convert SQLite BM25 rank to a relevance score in [0.3, 0.99).
+
+        SQLite's bm25() returns a non-positive float (0 or negative).
+        More negative = more relevant.  The previous max(0, rank) clipped
+        every negative value to 0, making every score 1/(1+0) = 1.0 and
+        destroying all ranking information.
+
+        abs(rank) / (1 + abs(rank)) maps |rank| to [0, 1); it is then scaled
+        into [0.3, 0.99).  A missing rank (None) scores 0.0.
+        """
+        if rank is None:
+            return 0.0
+        # Add a floor of 0.3 so any FTS5 match always exceeds typical
+        # min_score thresholds (default 0.1).  Small-corpus ranks close to
+        # 0 would otherwise produce score≈0 and be filtered out downstream.
+        return 0.3 + 0.69 * (abs(rank) / (1.0 + abs(rank)))
     
     @staticmethod
     def _truncate_text(text: str, max_chars: int) -> str:
diff --git a/agent/protocol/__init__.py b/agent/protocol/__init__.py
index a9fe5a3e..f0a7a4e2 100644
--- a/agent/protocol/__init__.py
+++ b/agent/protocol/__init__.py
@@ -3,6 +3,11 @@ from .agent_stream import AgentStreamExecutor
 from .task import Task, TaskType, TaskStatus
 from .result import AgentResult, AgentAction, AgentActionType, ToolResult
 from .models import LLMModel, LLMRequest, ModelFactory
+from .cancel import (
+    AgentCancelledError,
+    CancelTokenRegistry,
+    get_cancel_registry,
+)
 
 __all__ = [
     'Agent', 
@@ -16,5 +21,8 @@ __all__ = [
     'ToolResult',
     'LLMModel',
     'LLMRequest', 
-    'ModelFactory'
-]
\ No newline at end of file
+    'ModelFactory',
+    'AgentCancelledError',
+    'CancelTokenRegistry',
+    'get_cancel_registry',
+]
diff --git a/agent/protocol/agent.py b/agent/protocol/agent.py
index 285a9732..d944660b 100644
--- a/agent/protocol/agent.py
+++ b/agent/protocol/agent.py
@@ -365,7 +365,8 @@ class Agent:
 
         return action
 
-    def run_stream(self, user_message: str, on_event=None, clear_history: bool = False, skill_filter=None) -> str:
+    def run_stream(self, user_message: str, on_event=None, clear_history: bool = False,
+                   skill_filter=None, cancel_event=None) -> str:
         """
         Execute single agent task with streaming (based on tool-call)
 
@@ -374,6 +375,7 @@ class Agent:
         - Multi-turn reasoning based on tool-call
         - Event callbacks
         - Persistent conversation history across calls
+        - User-initiated cancellation via ``cancel_event``
 
         Args:
             user_message: User message
@@ -381,6 +383,11 @@ class Agent:
                      event = {"type": str, "timestamp": float, "data": dict}
             clear_history: If True, clear conversation history before this call (default: False)
             skill_filter: Optional list of skill names to include in this run
+            cancel_event: Optional threading.Event polled at agent checkpoints.
+                When set, the loop exits at the next safe point, injects a
+                "[Interrupted by user]" assistant note, and returns the
+                partial response. ``messages`` stays in a valid state
+                (tool_use/tool_result pairs preserved).
 
         Returns:
             Final response text
@@ -424,7 +431,8 @@ class Agent:
             max_turns=self.max_steps,
             on_event=on_event,
             messages=messages_copy,  # Pass copied message history
-            max_context_turns=max_context_turns
+            max_context_turns=max_context_turns,
+            cancel_event=cancel_event,
         )
 
         # Execute
diff --git a/agent/protocol/agent_stream.py b/agent/protocol/agent_stream.py
index 75b4f4ff..e3be20b8 100644
--- a/agent/protocol/agent_stream.py
+++ b/agent/protocol/agent_stream.py
@@ -7,11 +7,19 @@ import json
 import time
 from typing import List, Dict, Any, Optional, Callable, Tuple
 
+from agent.protocol.cancel import AgentCancelledError
 from agent.protocol.models import LLMRequest, LLMModel
 from agent.protocol.message_utils import sanitize_claude_messages, compress_turn_to_text_only
 from agent.tools.base_tool import BaseTool, ToolResult
 from common.log import logger
 
+# Optional: repair malformed JSON args from non-strict providers (e.g. unescaped quotes in long content).
+try:
+    from json_repair import repair_json as _repair_json
+    _HAS_JSON_REPAIR = True
+except ImportError:
+    _HAS_JSON_REPAIR = False
+
 
 # Maximum number of characters of model "reasoning / thinking" content to persist
 # in conversation history. The full reasoning is still streamed to the UI in real
@@ -44,6 +52,30 @@ def _truncate_reasoning_for_storage(text: str) -> str:
     return head + _REASONING_TRUNCATE_MARKER.format(omitted=omitted) + tail
 
 
+def _parse_tool_args(args_str: str, finish_reason: Optional[str]) -> Tuple[dict, Optional[str]]:
+    """Parse tool args JSON. Returns (args, error_msg); error_msg is None on success.
+
+    On JSONDecodeError: detect truncation first (skip repair, surface max_tokens hint);
+    otherwise try json-repair for escape issues; finally fall back to the raw decoder error.
+    """
+    if not args_str:
+        return {}, None
+    try:
+        return json.loads(args_str), None
+    except json.JSONDecodeError as e:
+        if finish_reason in ("length", "max_tokens") or not args_str.rstrip().endswith("}"):
+            return {}, "Output truncated (max_tokens reached). Split content into smaller chunks across multiple tool calls."
+        if _HAS_JSON_REPAIR:
+            try:
+                repaired = _repair_json(args_str, return_objects=True)
+                if isinstance(repaired, dict):
+                    logger.warning(f"Tool args JSON repaired ({len(args_str)} chars)")
+                    return repaired, None
+            except Exception:
+                pass
+        return {}, f"Invalid JSON in tool arguments: {e.msg}"
+
+
 class AgentStreamExecutor:
     """
     Agent Stream Executor
@@ -64,7 +96,8 @@ class AgentStreamExecutor:
             max_turns: int = 50,
             on_event: Optional[Callable] = None,
             messages: Optional[List[Dict]] = None,
-            max_context_turns: int = 30
+            max_context_turns: int = 30,
+            cancel_event=None,
     ):
         """
         Initialize stream executor
@@ -78,6 +111,10 @@ class AgentStreamExecutor:
             on_event: Event callback function
             messages: Optional existing message history (for persistent conversations)
             max_context_turns: Maximum number of conversation turns to keep in context
+            cancel_event: Optional threading.Event used to signal user cancel.
+                Checked at every safe point (turn boundary, before tool execution,
+                during LLM streaming). When set, raises AgentCancelledError which
+                run_stream catches to gracefully wind down.
         """
         self.agent = agent
         self.model = model
@@ -87,6 +124,7 @@ class AgentStreamExecutor:
         self.max_turns = max_turns
         self.on_event = on_event
         self.max_context_turns = max_context_turns
+        self.cancel_event = cancel_event
 
         # Message history - use provided messages or create new list
         self.messages = messages if messages is not None else []
@@ -97,6 +135,73 @@ class AgentStreamExecutor:
         # Track files to send (populated by read tool)
         self.files_to_send = []  # List of file metadata dicts
 
+    def _check_cancelled(self) -> None:
+        """Raise AgentCancelledError if the user requested cancellation.
+
+        Called at safe points (turn start, between tool calls, between LLM
+        chunks). Cheap to call: just an Event.is_set() probe.
+        """
+        if self.cancel_event is not None and self.cancel_event.is_set():
+            raise AgentCancelledError("agent cancelled by user")
+
+    def _handle_cancelled(self, partial_response: str) -> None:
+        """Wind down ``self.messages`` after a user-initiated cancel.
+
+        The messages list may be in any of these states when we get here:
+          (a) Last message is an assistant message containing tool_use
+              blocks but the matching tool_result has not been appended yet.
+          (b) Last message is an assistant text-only reply (cancel happened
+              right before the next turn started).
+          (c) Last message is a user tool_result message and we cancelled
+              between turns.
+
+        For (a) we MUST synthesise tool_result blocks, otherwise the next
+        request will fail Claude/OpenAI's strict pairing validation. For
+        (b)/(c) the state is already valid and we just append a small
+        cancellation note so the user/LLM both see the boundary clearly.
+        """
+        try:
+            # Step 1: close any orphaned tool_use in the trailing assistant
+            # message by injecting matching tool_result blocks.
+            if self.messages and isinstance(self.messages[-1], dict) \
+                    and self.messages[-1].get("role") == "assistant":
+                last = self.messages[-1]
+                content = last.get("content")
+                if isinstance(content, list):
+                    pending_tool_use_ids = [
+                        block.get("id")
+                        for block in content
+                        if isinstance(block, dict) and block.get("type") == "tool_use"
+                    ]
+                    pending_tool_use_ids = [tid for tid in pending_tool_use_ids if tid]
+                    if pending_tool_use_ids:
+                        tool_result_blocks = [
+                            {
+                                "type": "tool_result",
+                                "tool_use_id": tid,
+                                "content": "Cancelled by user before this tool finished.",
+                                "is_error": True,
+                            }
+                            for tid in pending_tool_use_ids
+                        ]
+                        self.messages.append({
+                            "role": "user",
+                            "content": tool_result_blocks,
+                        })
+                        logger.info(
+                            f"[Agent] Injected {len(tool_result_blocks)} cancellation "
+                            f"tool_result blocks to keep message history valid"
+                        )
+
+            # Step 2: append a stable "interrupted" marker so the LLM sees a
+            # clear stop boundary on the next turn.
+            self.messages.append({
+                "role": "assistant",
+                "content": [{"type": "text", "text": "_(Cancelled by user)_"}],
+            })
+        except Exception as e:
+            logger.warning(f"[Agent] _handle_cancelled cleanup failed: {e}")
+
     def _emit_event(self, event_type: str, data: dict = None):
         """Emit event"""
         if self.on_event:
@@ -270,8 +375,13 @@ class AgentStreamExecutor:
         final_response = ""
         turn = 0
 
+        cancelled = False
         try:
             while turn < self.max_turns:
+                # Check at the very top of every turn so a cancel arriving
+                # between turns short-circuits cleanly.
+                self._check_cancelled()
+
                 turn += 1
                 logger.info(f"[Agent] 第 {turn} 轮")
                 self._emit_event("turn_start", {"turn": turn})
@@ -375,6 +485,8 @@ class AgentStreamExecutor:
 
                 try:
                     for tool_call in tool_calls:
+                        # Honour cancel between tool invocations within the same turn
+                        self._check_cancelled()
                         result = self._execute_tool(tool_call)
                         tool_results.append(result)
                         
@@ -557,6 +669,15 @@ class AgentStreamExecutor:
                         self.messages.pop(prompt_insert_idx)
                         logger.debug("[Agent] Removed injected max-steps prompt from message history")
 
+        except AgentCancelledError:
+            # User-initiated stop: wind down message history cleanly so the
+            # next turn is unaffected; channels emit a "cancelled" UI event.
+            cancelled = True
+            logger.info(f"[Agent] 🛑 Cancelled by user (turn {turn})")
+            self._handle_cancelled(final_response)
+            if not final_response or not final_response.strip():
+                final_response = "_(Cancelled)_"
+
         except Exception as e:
             logger.error(f"❌ Agent执行错误: {e}")
             self._emit_event("error", {"error": str(e)})
@@ -564,8 +685,11 @@ class AgentStreamExecutor:
 
         finally:
             final_response = final_response.strip() if final_response else final_response
-            logger.info(f"[Agent] 🏁 完成 ({turn}轮)")
-            self._emit_event("agent_end", {"final_response": final_response})
+            if cancelled:
+                # Emit before agent_end so channels can mark UI as cancelled
+                self._emit_event("agent_cancelled", {"final_response": final_response})
+            logger.info(f"[Agent] 🏁 Done ({turn} turns)" + (" [cancelled]" if cancelled else ""))
+            self._emit_event("agent_end", {"final_response": final_response, "cancelled": cancelled})
 
         return final_response
 
@@ -603,15 +727,24 @@ class AgentStreamExecutor:
         except Exception as e:
             logger.debug(f"[Agent] MCP sync skipped: {e}")
 
-        # Prepare tool definitions (OpenAI/Claude format)
+        # Prepare tool definitions. Prefer get_json_schema() when it yields
+        # real properties (lets tools augment schema at runtime), otherwise
+        # fall back to the static `tool.params` (MCP tools rely on this).
         tools_schema = None
         if self.tools:
             tools_schema = []
             for tool in self.tools.values():
+                input_schema = tool.params
+                try:
+                    dynamic = (tool.get_json_schema() or {}).get("parameters") or {}
+                    if dynamic.get("properties"):
+                        input_schema = dynamic
+                except Exception:
+                    pass
                 tools_schema.append({
                     "name": tool.name,
                     "description": tool.description,
-                    "input_schema": tool.params  # Claude uses input_schema
+                    "input_schema": input_schema,
                 })
 
         # Create request
@@ -635,7 +768,32 @@ class AgentStreamExecutor:
         try:
             stream = self.model.call_stream(request)
 
+            # Probe cancel every N chunks to bound reaction time without
+            # checking on every token.
+            _cancel_probe_counter = 0
+            _CANCEL_PROBE_EVERY = 8
+
             for chunk in stream:
+                _cancel_probe_counter += 1
+                if _cancel_probe_counter >= _CANCEL_PROBE_EVERY:
+                    _cancel_probe_counter = 0
+                    if self.cancel_event is not None and self.cancel_event.is_set():
+                        # Persist partial text only; tool_use args may be
+                        # truncated mid-stream and would fail validation.
+                        logger.info("[Agent] cancel detected mid-stream, aborting LLM call")
+                        if full_content:
+                            partial_msg = {
+                                "role": "assistant",
+                                "content": [{"type": "text", "text": full_content}],
+                            }
+                            self.messages.append(partial_msg)
+                        self._emit_event("message_end", {
+                            "content": full_content,
+                            "tool_calls": [],
+                            "cancelled": True,
+                        })
+                        raise AgentCancelledError("cancelled during LLM streaming")
+
                 # Check for errors
                 if isinstance(chunk, dict) and chunk.get("error"):
                     # Extract error message from nested structure
@@ -729,6 +887,10 @@ class AgentStreamExecutor:
                     elif isinstance(choice, dict) and choice.get("_gemini_raw_parts"):
                         gemini_raw_parts = choice["_gemini_raw_parts"]
 
+        except AgentCancelledError:
+            # Must propagate untouched; never treat as a retryable error.
+            raise
+
         except Exception as e:
             error_str = str(e)
             error_str_lower = error_str.lower()
@@ -842,26 +1004,17 @@ class AgentStreamExecutor:
                 import uuid
                 tool_id = f"call_{uuid.uuid4().hex[:24]}"
 
-            try:
-                # Safely get arguments, handle None case
-                args_str = tc.get("arguments") or ""
-                arguments = json.loads(args_str) if args_str else {}
-            except json.JSONDecodeError as e:
-                # Handle None or invalid arguments safely
-                args_str = tc.get('arguments') or ""
-                args_preview = args_str[:200] if len(args_str) > 200 else args_str
-                logger.error(f"Failed to parse tool arguments for {tc['name']}")
-                logger.error(f"Arguments length: {len(args_str)} chars")
-                logger.error(f"Arguments preview: {args_preview}...")
-                logger.error(f"JSON decode error: {e}")
-
-                # Return a clear error message to the LLM instead of empty dict
-                # This helps the LLM understand what went wrong
+            args_str = tc.get("arguments") or ""
+            arguments, parse_err = _parse_tool_args(args_str, stop_reason)
+            if parse_err:
+                logger.error(
+                    f"Tool args parse failed for {tc['name']} ({len(args_str)} chars): {parse_err}"
+                )
                 tool_calls.append({
                     "id": tool_id,
                     "name": tc["name"],
                     "arguments": {},
-                    "_parse_error": f"Invalid JSON in tool arguments: {args_preview}... Error: {str(e)}. Tip: For large content, consider splitting into smaller chunks or using a different approach."
+                    "_parse_error": parse_err,
                 })
                 continue
 
@@ -949,14 +1102,11 @@ class AgentStreamExecutor:
         tool_id = tool_call["id"]
         arguments = tool_call["arguments"]
 
-        # Check if there was a JSON parse error
         if "_parse_error" in tool_call:
-            parse_error = tool_call["_parse_error"]
-            logger.error(f"Skipping tool execution due to parse error: {parse_error}")
             result = {
                 "status": "error",
-                "result": f"Failed to parse tool arguments. {parse_error}. Please ensure your tool call uses valid JSON format with all required parameters.",
-                "execution_time": 0
+                "result": tool_call["_parse_error"],
+                "execution_time": 0,
             }
             self._record_tool_result(tool_name, arguments, False)
             return result
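Reviewer note: the parse order introduced by `_parse_tool_args` (plain `json.loads`, then a truncation check, then the raw decoder error) can be sketched in isolation. This is a simplified stand-in, not the function from the patch: the name `parse_tool_args` is hypothetical, the json-repair branch is left out, and the error strings are shortened.

```python
import json
from typing import Optional, Tuple


def parse_tool_args(args_str: str, finish_reason: Optional[str]) -> Tuple[dict, Optional[str]]:
    """Hypothetical stand-in for _parse_tool_args, without the json-repair branch."""
    if not args_str:
        # No arguments at all is treated as an empty, valid call.
        return {}, None
    try:
        return json.loads(args_str), None
    except json.JSONDecodeError as e:
        # Truncation is checked first: either the provider reported a length stop,
        # or the payload does not end with a closing brace.
        truncated = finish_reason in ("length", "max_tokens") or not args_str.rstrip().endswith("}")
        if truncated:
            return {}, "Output truncated."
        # Otherwise surface the decoder's own message.
        return {}, f"Invalid JSON in tool arguments: {e.msg}"


if __name__ == "__main__":
    print(parse_tool_args('{"path": "a.txt"}', "stop"))
    print(parse_tool_args('{"path": "a.tx', "length"))
    print(parse_tool_args('{"path": a.txt}', "stop"))
```

Note that the brace heuristic classifies any malformed payload that does not end in `}` as truncated, even when `finish_reason` is a normal stop.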
diff --git a/agent/protocol/cancel.py b/agent/protocol/cancel.py
new file mode 100644
index 00000000..6354cd38
--- /dev/null
+++ b/agent/protocol/cancel.py
@@ -0,0 +1,121 @@
+"""
+Cancel token registry for aborting in-flight agent runs.
+
+A user cancel (web Cancel button, /cancel command) sets a threading.Event
+that the agent loop polls at safe checkpoints. Tokens are keyed by
+request_id (preferred) and tracked under session_id as a fallback. Entries
+are released after the run completes to keep the registry bounded.
+
+No project deps — importable from any layer without circular imports.
+"""
+
+from __future__ import annotations
+
+import threading
+from typing import Dict, Optional
+
+
+class AgentCancelledError(Exception):
+    """Raised inside the agent loop when a stop has been requested.
+
+    The agent stream executor catches this, injects a "_(Cancelled by user)_" note
+    into the message history (preserving tool_use/tool_result integrity)
+    and returns a partial response to the caller.
+    """
+
+
+class _CancelEntry:
+    __slots__ = ("event", "session_id")
+
+    def __init__(self, session_id: Optional[str]):
+        self.event = threading.Event()
+        self.session_id = session_id
+
+
+class CancelTokenRegistry:
+    """In-process registry mapping request_id -> cancel Event.
+
+    Thread-safe. Singleton via module-level ``_registry``.
+    """
+
+    def __init__(self):
+        self._lock = threading.Lock()
+        self._by_request: Dict[str, _CancelEntry] = {}
+        # session_id -> set of request_ids currently in flight (usually 1).
+        self._by_session: Dict[str, set] = {}
+
+    def register(self, request_id: str, session_id: Optional[str] = None) -> threading.Event:
+        """Create (or return existing) cancel event for a request.
+
+        Returns the threading.Event the caller should poll via ``is_set()``.
+        """
+        if not request_id:
+            return threading.Event()
+        with self._lock:
+            entry = self._by_request.get(request_id)
+            if entry is None:
+                entry = _CancelEntry(session_id)
+                self._by_request[request_id] = entry
+                if session_id:
+                    self._by_session.setdefault(session_id, set()).add(request_id)
+            return entry.event
+
+    def get_event(self, request_id: str) -> Optional[threading.Event]:
+        if not request_id:
+            return None
+        with self._lock:
+            entry = self._by_request.get(request_id)
+            return entry.event if entry else None
+
+    def cancel_request(self, request_id: str) -> bool:
+        """Trigger cancel for a specific request. Returns True when matched."""
+        if not request_id:
+            return False
+        with self._lock:
+            entry = self._by_request.get(request_id)
+        if entry is None:
+            return False
+        entry.event.set()
+        return True
+
+    def cancel_session(self, session_id: str) -> int:
+        """Trigger cancel for every in-flight request of a session.
+
+        Returns the number of requests cancelled (0 when nothing was running).
+        """
+        if not session_id:
+            return 0
+        with self._lock:
+            request_ids = list(self._by_session.get(session_id, ()))
+            entries = [self._by_request[r] for r in request_ids if r in self._by_request]
+        for entry in entries:
+            entry.event.set()
+        return len(entries)
+
+    def unregister(self, request_id: str) -> None:
+        """Remove an entry once the agent run is done. Safe to call twice."""
+        if not request_id:
+            return
+        with self._lock:
+            entry = self._by_request.pop(request_id, None)
+            if entry and entry.session_id:
+                bucket = self._by_session.get(entry.session_id)
+                if bucket is not None:
+                    bucket.discard(request_id)
+                    if not bucket:
+                        self._by_session.pop(entry.session_id, None)
+
+    def has_active(self, session_id: str) -> bool:
+        if not session_id:
+            return False
+        with self._lock:
+            bucket = self._by_session.get(session_id)
+            return bool(bucket)
+
+
+_registry = CancelTokenRegistry()
+
+
+def get_cancel_registry() -> CancelTokenRegistry:
+    """Module-level accessor for the singleton registry."""
+    return _registry
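Reviewer note: the cooperative cancel pattern this module supports can be sketched with a bare `threading.Event`. The sketch assumes nothing from the project: `CancelledError` and `run_turns` are hypothetical stand-ins for `AgentCancelledError` and the agent loop, and the event would normally come from `register()` rather than being created directly.

```python
import threading
import time


class CancelledError(Exception):
    """Hypothetical stand-in for AgentCancelledError."""


def run_turns(cancel_event: threading.Event, max_turns: int = 50) -> str:
    """Poll the event at the top of every turn, as the agent loop does."""
    completed = 0
    try:
        for _ in range(max_turns):
            if cancel_event.is_set():
                raise CancelledError("cancelled by user")
            time.sleep(0.01)  # placeholder for one turn of real work
            completed += 1
    except CancelledError:
        return f"cancelled after {completed} turns"
    return f"finished {completed} turns"


if __name__ == "__main__":
    event = threading.Event()
    result = {}
    worker = threading.Thread(target=lambda: result.update(text=run_turns(event)))
    worker.start()
    time.sleep(0.05)
    event.set()  # the effect cancel_request() has on a matching entry
    worker.join()
    print(result["text"])
```

Because the loop only polls at safe points, cancellation latency is bounded by the length of one unit of work, not by the moment the event is set.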
diff --git a/agent/tools/browser/browser_service.py b/agent/tools/browser/browser_service.py
index 69ec0e06..f499fb29 100644
--- a/agent/tools/browser/browser_service.py
+++ b/agent/tools/browser/browser_service.py
@@ -15,7 +15,7 @@ import threading
 from typing import Optional, Dict, Any, List, Callable
 
 from common.log import logger
-from common.utils import expand_path
+from common.utils import expand_path, is_cloud_deployment
 
 
 _DEFAULT_USER_DATA_DIR = "~/.cow/browser_profile"
@@ -436,6 +436,20 @@ class BrowserService:
         if self._headless:
             launch_args.append("--no-sandbox")
 
+        if is_cloud_deployment():
+            launch_args.extend([
+                "--disable-gpu",
+                "--disable-software-rasterizer",
+                "--disable-extensions",
+                "--disable-background-networking",
+                "--disable-background-timer-throttling",
+                "--disable-renderer-backgrounding",
+                "--disable-features=site-per-process,TranslateUI,IsolateOrigins",
+                "--no-zygote",
+                "--js-flags=--max-old-space-size=384",
+                "--memory-pressure-off",
+            ])
+
         extra_args = self._config.get("launch_args", [])
         if extra_args:
             launch_args.extend(extra_args)
diff --git a/agent/tools/browser/browser_tool.py b/agent/tools/browser/browser_tool.py
index c5139812..c91be26c 100644
--- a/agent/tools/browser/browser_tool.py
+++ b/agent/tools/browser/browser_tool.py
@@ -145,7 +145,8 @@ class BrowserTool(BaseTool):
         url = args.get("url", "").strip()
         if not url:
             return ToolResult.fail("Error: 'url' is required for navigate action")
-        if not url.startswith(("http://", "https://")):
+        # Only auto-prepend https:// for bare hosts; preserve file://, about:, data:, etc.
+        if "://" not in url and not url.startswith(("about:", "data:")):
             url = "https://" + url
         timeout = args.get("timeout", 30000)
         service = self._get_service()
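Reviewer note: the new scheme check can be exercised on its own. The helper name `normalize_url` below is hypothetical; the condition is copied from the added line in this hunk.

```python
def normalize_url(url: str) -> str:
    """Prefix https:// only when the URL carries no scheme at all."""
    url = url.strip()
    # Same condition as the patched navigate action.
    if "://" not in url and not url.startswith(("about:", "data:")):
        url = "https://" + url
    return url


if __name__ == "__main__":
    for candidate in ("example.com", "file:///tmp/a.html", "about:blank", "example.com/?next=http://a"):
        print(candidate, "->", normalize_url(candidate))
```

One edge case worth a look: a bare host whose path or query contains `://` (the last candidate above) is left without a scheme, because the check is a substring test and not a prefix test.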
diff --git a/agent/tools/mcp/mcp_client.py b/agent/tools/mcp/mcp_client.py
index 694a0c46..be93c716 100644
--- a/agent/tools/mcp/mcp_client.py
+++ b/agent/tools/mcp/mcp_client.py
@@ -1,8 +1,8 @@
 """
 MCP (Model Context Protocol) client module.
 
-Implements JSON-RPC 2.0 over stdio and SSE transports without any external
-MCP SDK dependency.
+Implements JSON-RPC 2.0 over stdio, SSE and Streamable HTTP transports
+without any external MCP SDK dependency.
 """
 
 import json
@@ -17,18 +17,29 @@ from typing import Optional
 from common.log import logger
 
 
+# Aliases accepted for the Streamable HTTP transport type
+_STREAMABLE_HTTP_ALIASES = {"streamable-http", "streamable_http", "streamablehttp", "http"}
+
+
 class McpClient:
-    """Single MCP Server client supporting stdio and SSE transports."""
+    """Single MCP Server client supporting stdio, SSE and Streamable HTTP transports."""
 
     def __init__(self, config: dict):
         """
         config examples:
-          stdio: {"name": "filesystem", "type": "stdio", "command": "npx", "args": [...]}
-          SSE:   {"name": "my-api",    "type": "sse",   "url": "http://localhost:8000/sse"}
+          stdio:           {"name": "filesystem", "type": "stdio", "command": "npx", "args": [...]}
+          SSE:             {"name": "my-api",    "type": "sse",   "url": "http://localhost:8000/sse"}
+          streamable-http: {"name": "pubmed",    "type": "streamable-http", "url": "https://x/mcp"}
         """
         self.config = config
         self.name: str = config.get("name", "unknown")
-        self.transport: str = config.get("type", "stdio")
+        raw_transport: str = config.get("type", "stdio")
+        # Normalize streamable-http aliases to a single internal key
+        self.transport: str = (
+            "streamable-http"
+            if raw_transport.lower() in _STREAMABLE_HTTP_ALIASES
+            else raw_transport
+        )
 
         # stdio state
         self._proc: Optional[subprocess.Popen] = None
@@ -37,6 +48,11 @@ class McpClient:
         self._sse_url: Optional[str] = None
         self._post_url: Optional[str] = None  # endpoint for sending messages (resolved from SSE)
 
+        # Streamable HTTP state
+        self._http_url: Optional[str] = None
+        self._http_headers: dict = {}  # extra headers from user config (e.g. Authorization)
+        self._http_session_id: Optional[str] = None  # Mcp-Session-Id assigned by the server
+
         # Shared state
         self._next_id = 1
         self._id_lock = threading.Lock()
@@ -54,6 +70,8 @@ class McpClient:
                 return self._init_stdio()
             elif self.transport == "sse":
                 return self._init_sse()
+            elif self.transport == "streamable-http":
+                return self._init_streamable_http()
             else:
                 logger.warning(f"[MCP:{self.name}] Unknown transport type: {self.transport!r}")
                 return False
@@ -109,6 +127,21 @@ class McpClient:
                     pass
             self._proc = None
             logger.debug(f"[MCP:{self.name}] stdio process terminated")
+
+        # Best-effort streamable-http session termination
+        if self.transport == "streamable-http" and self._http_session_id and self._http_url:
+            try:
+                req = urllib.request.Request(
+                    self._http_url,
+                    method="DELETE",
+                    headers={"Mcp-Session-Id": self._http_session_id, **self._http_headers},
+                )
+                with urllib.request.urlopen(req, timeout=5):
+                    pass
+            except Exception:
+                pass
+            self._http_session_id = None
+
         self._initialized = False
 
     # ------------------------------------------------------------------
@@ -234,6 +267,120 @@ class McpClient:
             raw = resp.read().decode("utf-8")
             return json.loads(raw)
 
+    # ------------------------------------------------------------------
+    # Streamable HTTP transport (MCP spec 2025-03-26)
+    # ------------------------------------------------------------------
+
+    def _init_streamable_http(self) -> bool:
+        url = self.config.get("url")
+        if not url:
+            logger.warning(f"[MCP:{self.name}] streamable-http config missing 'url'")
+            return False
+
+        self._http_url = url
+        # Allow user-provided headers (e.g. {"Authorization": "Bearer xxx"})
+        extra_headers = self.config.get("headers") or {}
+        if isinstance(extra_headers, dict):
+            self._http_headers = {str(k): str(v) for k, v in extra_headers.items()}
+
+        return self._handshake()
+
+    def _streamable_http_send(self, message: dict) -> dict:
+        """POST a JSON-RPC request and return the response (JSON or SSE-wrapped)."""
+        return self._streamable_http_post(message, expect_response=True)
+
+    def _streamable_http_post(self, message: dict, expect_response: bool) -> dict:
+        """
+        POST a JSON-RPC message over Streamable HTTP.
+
+        Per the spec, the response Content-Type can be either:
+          - application/json   -> single JSON-RPC response in body
+          - text/event-stream  -> SSE stream; we read until we get a matching response
+        """
+        body = json.dumps(message).encode("utf-8")
+        headers = {
+            "Content-Type": "application/json",
+            "Accept": "application/json, text/event-stream",
+        }
+        if self._http_session_id:
+            headers["Mcp-Session-Id"] = self._http_session_id
+        headers.update(self._http_headers)
+
+        req = urllib.request.Request(
+            self._http_url,
+            data=body,
+            method="POST",
+            headers=headers,
+        )
+
+        try:
+            resp = urllib.request.urlopen(req, timeout=30)
+        except urllib.error.HTTPError as e:
+            # Surface the server-provided error body for easier debugging
+            detail = ""
+            try:
+                detail = e.read().decode("utf-8", errors="ignore")
+            except Exception:
+                pass
+            raise IOError(
+                f"[MCP:{self.name}] streamable-http HTTP {e.code}: {detail[:200]}"
+            )
+
+        with resp:
+            # Capture session id assigned by the server (if any)
+            session_id = resp.headers.get("Mcp-Session-Id")
+            if session_id and not self._http_session_id:
+                self._http_session_id = session_id
+
+            status = resp.status if hasattr(resp, "status") else resp.getcode()
+
+            # Notifications: server may reply with 202 Accepted and no body
+            if not expect_response or status == 202:
+                try:
+                    resp.read()
+                except Exception:
+                    pass
+                return {}
+
+            content_type = (resp.headers.get("Content-Type") or "").lower()
+            expected_id = message.get("id")
+
+            if "text/event-stream" in content_type:
+                return self._read_sse_response(resp, expected_id)
+
+            raw = resp.read().decode("utf-8")
+            if not raw:
+                return {}
+            return json.loads(raw)
+
+    def _read_sse_response(self, resp, expected_id) -> dict:
+        """Read an SSE stream and return the first JSON-RPC response with matching id."""
+        data_buf: list = []
+        for raw_line in resp:
+            line = raw_line.decode("utf-8").rstrip("\n\r")
+            if line == "":
+                # End of an SSE event, attempt to parse accumulated data
+                if data_buf:
+                    payload = "\n".join(data_buf)
+                    data_buf = []
+                    try:
+                        msg = json.loads(payload)
+                    except json.JSONDecodeError:
+                        continue
+                    # Skip notifications / mismatched ids
+                    if "id" not in msg:
+                        continue
+                    if expected_id is None or msg.get("id") == expected_id:
+                        return msg
+                continue
+            if line.startswith(":"):
+                continue  # SSE comment / keepalive
+            if line.startswith("data:"):
+                data_buf.append(line[len("data:"):].lstrip())
+            # Ignore 'event:' / 'id:' lines; we only care about JSON-RPC payloads
+
+        raise IOError(f"[MCP:{self.name}] streamable-http SSE stream closed before response")
+
     # ------------------------------------------------------------------
     # Common JSON-RPC helpers
     # ------------------------------------------------------------------
@@ -267,6 +414,8 @@ class McpClient:
                 return self._stdio_send(message)
             elif self.transport == "sse":
                 return self._sse_send(message)
+            elif self.transport == "streamable-http":
+                return self._streamable_http_send(message)
             else:
                 raise ValueError(f"[MCP:{self.name}] Unsupported transport: {self.transport}")
 
@@ -291,6 +440,11 @@ class McpClient:
                     pass
             except Exception:
                 pass  # notifications are fire-and-forget
+        elif self.transport == "streamable-http":
+            try:
+                self._streamable_http_post(notification, expect_response=False)
+            except Exception:
+                pass  # notifications are fire-and-forget
 
     def _handshake(self) -> bool:
         """Perform the MCP initialize / notifications/initialized handshake."""
diff --git a/agent/tools/scheduler/integration.py b/agent/tools/scheduler/integration.py
index 9e559a43..7421a525 100644
--- a/agent/tools/scheduler/integration.py
+++ b/agent/tools/scheduler/integration.py
@@ -57,34 +57,44 @@ def init_scheduler(agent_bridge) -> bool:
                 _task_store = TaskStore(store_path)
                 logger.debug(f"[Scheduler] Task store initialized: {store_path}")
 
-            # Create execute callback
+            # Create execute callback. Returns True on success, False to ask
+            # the scheduler to retry on the next tick (e.g. channel not yet
+            # ready right after process start).
             def execute_task_callback(task: dict):
-                """Callback to execute a scheduled task"""
                 try:
                     action = task.get("action", {})
                     action_type = action.get("type")
+                    channel_type = action.get("channel_type", "unknown")
+                    receiver = action.get("receiver", "")
+
+                    if not _is_channel_ready(channel_type, receiver):
+                        logger.warning(
+                            f"[Scheduler] Task {task.get('id')}: channel "
+                            f"'{channel_type}' not ready for receiver={receiver} "
+                            f"(no inbound msg cached since restart?); deferring"
+                        )
+                        return False
 
                     if action_type == "agent_task":
-                        _execute_agent_task(task, agent_bridge)
+                        return _execute_agent_task(task, agent_bridge)
                     elif action_type == "send_message":
-                        # Legacy support for old tasks
-                        _execute_send_message(task, agent_bridge)
+                        return _execute_send_message(task, agent_bridge)
                     elif action_type == "tool_call":
-                        # Legacy support for old tasks
-                        _execute_tool_call(task, agent_bridge)
+                        return _execute_tool_call(task, agent_bridge)
                     elif action_type == "skill_call":
-                        # Legacy support for old tasks
-                        _execute_skill_call(task, agent_bridge)
+                        return _execute_skill_call(task, agent_bridge)
                     else:
                         logger.warning(f"[Scheduler] Unknown action type: {action_type}")
+                        return True
                 except Exception as e:
                     logger.error(f"[Scheduler] Error executing task {task.get('id')}: {e}")
+                    return False
 
             # Create scheduler service
             _scheduler_service = SchedulerService(_task_store, execute_task_callback)
             _scheduler_service.start()
 
-            logger.debug("[Scheduler] Scheduler service initialized and started")
+            logger.info("[Scheduler] Service initialized and started")
             return True
 
         except Exception as e:
@@ -92,6 +102,40 @@ def init_scheduler(agent_bridge) -> bool:
             return False
 
 
+def _is_channel_ready(channel_type: str, receiver: str) -> bool:
+    """Best-effort readiness probe for outbound channels.
+
+    Returns False when we know the send will drop (e.g. weixin not yet
+    logged in, web session has no polling queue), so the scheduler can
+    defer instead of consuming the task. Unknown channels return True
+    to preserve previous behaviour.
+    """
+    if not channel_type or channel_type == "unknown":
+        return True
+    try:
+        from channel.channel_factory import create_channel
+        channel = create_channel(channel_type)
+        if channel is None:
+            return False
+
+        if channel_type == "weixin":
+            tokens = getattr(channel, "_context_tokens", None)
+            if not tokens or receiver not in tokens:
+                return False
+            return True
+
+        if channel_type == "web":
+            queues = getattr(channel, "session_queues", None)
+            if not queues or receiver not in queues:
+                return False
+            return True
+
+        return True
+    except Exception as e:
+        logger.warning(f"[Scheduler] Channel readiness check failed for {channel_type}: {e}")
+        return True
+
+
 def get_task_store():
     """Get the global task store instance"""
     return _task_store
@@ -145,13 +189,10 @@ def _remember_delivered_output(
         )
 
 
-def _execute_agent_task(task: dict, agent_bridge):
+def _execute_agent_task(task: dict, agent_bridge) -> bool:
     """
-    Execute an agent_task action - let Agent handle the task
-    
-    Args:
-        task: Task dictionary
-        agent_bridge: AgentBridge instance
+    Execute an agent_task action - let Agent handle the task.
+    Returns False to retry next tick; True once delivered or when the task is not retryable.
     """
     try:
         action = task.get("action", {})
@@ -162,11 +203,11 @@ def _execute_agent_task(task: dict, agent_bridge):
         
         if not task_description:
             logger.error(f"[Scheduler] Task {task['id']}: No task_description specified")
-            return
+            return True  # malformed task, don't loop forever
         
         if not receiver:
             logger.error(f"[Scheduler] Task {task['id']}: No receiver specified")
-            return
+            return True
         
         # Check for unsupported channels
         if channel_type == "dingtalk":
@@ -209,51 +250,47 @@ def _execute_agent_task(task: dict, agent_bridge):
         try:
             # Don't clear history - scheduler tasks use isolated session_id so they won't pollute user conversations
             reply = agent_bridge.agent_reply(task_description, context=context, on_event=None, clear_history=False)
-            
-            if reply and reply.content:
-                # Send the reply via channel
-                from channel.channel_factory import create_channel
-                
-                try:
-                    channel = create_channel(channel_type)
-                    if channel:
-                        # For web channel, register request_id
-                        if channel_type == "web" and hasattr(channel, 'request_to_session'):
-                            request_id = context.get("request_id")
-                            if request_id:
-                                channel.request_to_session[request_id] = receiver
-                                logger.debug(f"[Scheduler] Registered request_id {request_id} -> session {receiver}")
-                        
-                        # Send the reply
-                        channel.send(reply, context)
-                        _remember_delivered_output(agent_bridge, task, channel_type, reply.content)
-                        logger.info(f"[Scheduler] Task {task['id']} executed successfully, result sent to {receiver}")
-                    else:
-                        logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
-                except Exception as e:
-                    logger.error(f"[Scheduler] Failed to send result: {e}")
-            else:
+
+            if not (reply and reply.content):
                 logger.error(f"[Scheduler] Task {task['id']}: No result from agent execution")
-                
+                return True  # agent ran but produced nothing; don't loop
+
+            from channel.channel_factory import create_channel
+            channel = create_channel(channel_type)
+            if not channel:
+                logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
+                return False
+
+            if channel_type == "web" and hasattr(channel, 'request_to_session'):
+                request_id = context.get("request_id")
+                if request_id:
+                    channel.request_to_session[request_id] = receiver
+
+            try:
+                channel.send(reply, context)
+            except Exception as e:
+                logger.error(f"[Scheduler] Failed to send result: {e}")
+                return False
+
+            _remember_delivered_output(agent_bridge, task, channel_type, reply.content)
+            logger.info(f"[Scheduler] Task {task['id']} executed successfully, result sent to {receiver}")
+            return True
+
         except Exception as e:
             logger.error(f"[Scheduler] Failed to execute task via Agent: {e}")
             import traceback
             logger.error(f"[Scheduler] Traceback: {traceback.format_exc()}")
-            
+            return False
+
     except Exception as e:
         logger.error(f"[Scheduler] Error in _execute_agent_task: {e}")
         import traceback
         logger.error(f"[Scheduler] Traceback: {traceback.format_exc()}")
+        return False
 
 
-def _execute_send_message(task: dict, agent_bridge):
-    """
-    Execute a send_message action
-    
-    Args:
-        task: Task dictionary
-        agent_bridge: AgentBridge instance
-    """
+def _execute_send_message(task: dict, agent_bridge) -> bool:
+    """Execute a send_message action. Returns False to retry next tick, True otherwise."""
     try:
         action = task.get("action", {})
         content = action.get("content", "")
@@ -263,7 +300,7 @@ def _execute_send_message(task: dict, agent_bridge):
         
         if not receiver:
             logger.error(f"[Scheduler] Task {task['id']}: No receiver specified")
-            return
+            return True
         
         # Create context for sending message
         context = Context(ContextType.TEXT, content)
@@ -308,169 +345,135 @@ def _execute_send_message(task: dict, agent_bridge):
         # Get channel and send
         from channel.channel_factory import create_channel
         
+        channel = create_channel(channel_type)
+        if not channel:
+            logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
+            return False
+
+        if channel_type == "web" and hasattr(channel, 'request_to_session'):
+            channel.request_to_session[request_id] = receiver
+
         try:
-            channel = create_channel(channel_type)
-            if channel:
-                # For web channel, register the request_id to session mapping
-                if channel_type == "web" and hasattr(channel, 'request_to_session'):
-                    channel.request_to_session[request_id] = receiver
-                    logger.debug(f"[Scheduler] Registered request_id {request_id} -> session {receiver}")
-                
-                channel.send(reply, context)
-                _remember_delivered_output(agent_bridge, task, channel_type, content)
-                logger.info(f"[Scheduler] Task {task['id']} executed: sent message to {receiver}")
-            else:
-                logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
+            channel.send(reply, context)
         except Exception as e:
             logger.error(f"[Scheduler] Failed to send message: {e}")
-            import traceback
-            logger.error(f"[Scheduler] Traceback: {traceback.format_exc()}")
-            
+            return False
+
+        _remember_delivered_output(agent_bridge, task, channel_type, content)
+        logger.info(f"[Scheduler] Task {task['id']} executed: sent message to {receiver}")
+        return True
+
     except Exception as e:
         logger.error(f"[Scheduler] Error in _execute_send_message: {e}")
         import traceback
         logger.error(f"[Scheduler] Traceback: {traceback.format_exc()}")
+        return False
 
 
-def _execute_tool_call(task: dict, agent_bridge):
-    """
-    Execute a tool_call action
-    
-    Args:
-        task: Task dictionary
-        agent_bridge: AgentBridge instance
-    """
+def _execute_tool_call(task: dict, agent_bridge) -> bool:
+    """Execute a tool_call action. Returns False to retry next tick, True otherwise."""
     try:
         action = task.get("action", {})
-        # Support both old and new field names
         tool_name = action.get("call_name") or action.get("tool_name")
         tool_params = action.get("call_params") or action.get("tool_params", {})
         result_prefix = action.get("result_prefix", "")
         receiver = action.get("receiver")
         is_group = action.get("is_group", False)
         channel_type = action.get("channel_type", "unknown")
-        
+
         if not tool_name:
             logger.error(f"[Scheduler] Task {task['id']}: No tool_name specified")
-            return
-        
+            return True
         if not receiver:
             logger.error(f"[Scheduler] Task {task['id']}: No receiver specified")
-            return
-        
-        # Get tool manager and create tool instance
+            return True
+
         from agent.tools.tool_manager import ToolManager
-        tool_manager = ToolManager()
-        tool = tool_manager.create_tool(tool_name)
-        
+        tool = ToolManager().create_tool(tool_name)
         if not tool:
             logger.error(f"[Scheduler] Task {task['id']}: Tool '{tool_name}' not found")
-            return
-        
-        # Execute tool
+            return True
+
         logger.info(f"[Scheduler] Task {task['id']}: Executing tool '{tool_name}' with params {tool_params}")
         result = tool.execute(tool_params)
-        
-        # Get result content
-        if hasattr(result, 'result'):
-            content = result.result
-        else:
-            content = str(result)
-        
-        # Add prefix if specified
+        content = result.result if hasattr(result, 'result') else str(result)
         if result_prefix:
             content = f"{result_prefix}\n\n{content}"
-        
-        # Send result as message
+
         context = Context(ContextType.TEXT, content)
         context["receiver"] = receiver
         context["isgroup"] = is_group
         context["session_id"] = receiver
-        
-        # Channel-specific context setup
+
+        request_id = None
         if channel_type == "web":
-            # Web channel needs request_id
             import uuid
             request_id = f"scheduler_{task['id']}_{uuid.uuid4().hex[:8]}"
             context["request_id"] = request_id
-            logger.debug(f"[Scheduler] Generated request_id for web channel: {request_id}")
         elif channel_type == "feishu":
             context["receive_id_type"] = "chat_id" if is_group else "open_id"
             context["msg"] = None
-            logger.debug(f"[Scheduler] Feishu: receive_id_type={context['receive_id_type']}, is_group={is_group}, receiver={receiver}")
         elif channel_type == "wecom_bot":
             context["msg"] = None
 
         reply = Reply(ReplyType.TEXT, content)
 
-        # Get channel and send
         from channel.channel_factory import create_channel
+        channel = create_channel(channel_type)
+        if not channel:
+            logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
+            return False
+
+        if channel_type == "web" and request_id and hasattr(channel, 'request_to_session'):
+            channel.request_to_session[request_id] = receiver
 
         try:
-            channel = create_channel(channel_type)
-            if channel:
-                if channel_type == "web" and hasattr(channel, 'request_to_session'):
-                    channel.request_to_session[request_id] = receiver
-                    logger.debug(f"[Scheduler] Registered request_id {request_id} -> session {receiver}")
-
-                channel.send(reply, context)
-                _remember_delivered_output(agent_bridge, task, channel_type, content)
-                logger.info(f"[Scheduler] Task {task['id']} executed: sent tool result to {receiver}")
-            else:
-                logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
+            channel.send(reply, context)
         except Exception as e:
             logger.error(f"[Scheduler] Failed to send tool result: {e}")
+            return False
+
+        _remember_delivered_output(agent_bridge, task, channel_type, content)
+        logger.info(f"[Scheduler] Task {task['id']} executed: sent tool result to {receiver}")
+        return True
 
     except Exception as e:
         logger.error(f"[Scheduler] Error in _execute_tool_call: {e}")
+        return False
 
 
-def _execute_skill_call(task: dict, agent_bridge):
-    """
-    Execute a skill_call action by asking Agent to run the skill
-    
-    Args:
-        task: Task dictionary
-        agent_bridge: AgentBridge instance
-    """
+def _execute_skill_call(task: dict, agent_bridge) -> bool:
+    """Execute a skill_call action by asking the Agent to run the skill.
+    Returns False to retry next tick, True otherwise."""
     try:
         action = task.get("action", {})
-        # Support both old and new field names
         skill_name = action.get("call_name") or action.get("skill_name")
         skill_params = action.get("call_params") or action.get("skill_params", {})
         result_prefix = action.get("result_prefix", "")
         receiver = action.get("receiver")
         is_group = action.get("isgroup", False)
         channel_type = action.get("channel_type", "unknown")
-        
+
         if not skill_name:
             logger.error(f"[Scheduler] Task {task['id']}: No skill_name specified")
-            return
-        
+            return True
         if not receiver:
             logger.error(f"[Scheduler] Task {task['id']}: No receiver specified")
-            return
-        
+            return True
+
         logger.info(f"[Scheduler] Task {task['id']}: Executing skill '{skill_name}' with params {skill_params}")
-        
-        # Create a unique session_id for this scheduled task to avoid polluting user's conversation
-        # Format: scheduler_<receiver>_<task_id> to ensure isolation
+
         scheduler_session_id = f"scheduler_{receiver}_{task['id']}"
-        
-        # Build a natural language query for the Agent to execute the skill
-        # Format: "Use skill-name to do something with params"
         param_str = ", ".join([f"{k}={v}" for k, v in skill_params.items()])
         query = f"Use {skill_name} skill"
         if param_str:
             query += f" with {param_str}"
-        
-        # Create context for Agent
+
         context = Context(ContextType.TEXT, query)
         context["receiver"] = receiver
         context["isgroup"] = is_group
         context["session_id"] = scheduler_session_id
-        
-        # Channel-specific setup
+
         if channel_type == "web":
             import uuid
             request_id = f"scheduler_{task['id']}_{uuid.uuid4().hex[:8]}"
@@ -481,49 +484,48 @@ def _execute_skill_call(task: dict, agent_bridge):
         elif channel_type == "wecom_bot":
             context["msg"] = None
 
-        # Use Agent to execute the skill
         try:
-            # Don't clear history - scheduler tasks use isolated session_id so they won't pollute user conversations
             reply = agent_bridge.agent_reply(query, context=context, on_event=None, clear_history=False)
-            
-            if reply and reply.content:
-                content = reply.content
-                
-                # Add prefix if specified
-                if result_prefix:
-                    content = f"{result_prefix}\n\n{content}"
-                
-                # Send the result via channel
-                from channel.channel_factory import create_channel
-                
-                try:
-                    channel = create_channel(channel_type)
-                    if channel:
-                        # For web channel, register request_id
-                        if channel_type == "web" and hasattr(channel, 'request_to_session'):
-                            req_id = context.get("request_id")
-                            if req_id:
-                                channel.request_to_session[req_id] = receiver
-                                logger.debug(f"[Scheduler] Registered request_id {req_id} -> session {receiver}")
-                        
-                        channel.send(Reply(ReplyType.TEXT, content), context)
-                        _remember_delivered_output(agent_bridge, task, channel_type, content)
-                except Exception as e:
-                    logger.error(f"[Scheduler] Failed to send skill result: {e}")
-                
-                logger.info(f"[Scheduler] Task {task['id']} executed: skill result sent to {receiver}")
-            else:
-                logger.error(f"[Scheduler] Task {task['id']}: No result from skill execution")
-                
         except Exception as e:
             logger.error(f"[Scheduler] Failed to execute skill via Agent: {e}")
             import traceback
             logger.error(f"[Scheduler] Traceback: {traceback.format_exc()}")
-            
+            return False
+
+        if not (reply and reply.content):
+            logger.error(f"[Scheduler] Task {task['id']}: No result from skill execution")
+            return True
+
+        content = reply.content
+        if result_prefix:
+            content = f"{result_prefix}\n\n{content}"
+
+        from channel.channel_factory import create_channel
+        channel = create_channel(channel_type)
+        if not channel:
+            logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
+            return False
+
+        if channel_type == "web" and hasattr(channel, 'request_to_session'):
+            req_id = context.get("request_id")
+            if req_id:
+                channel.request_to_session[req_id] = receiver
+
+        try:
+            channel.send(Reply(ReplyType.TEXT, content), context)
+        except Exception as e:
+            logger.error(f"[Scheduler] Failed to send skill result: {e}")
+            return False
+
+        _remember_delivered_output(agent_bridge, task, channel_type, content)
+        logger.info(f"[Scheduler] Task {task['id']} executed: skill result sent to {receiver}")
+        return True
+
     except Exception as e:
         logger.error(f"[Scheduler] Error in _execute_skill_call: {e}")
         import traceback
         logger.error(f"[Scheduler] Traceback: {traceback.format_exc()}")
+        return False
 
 
 def attach_scheduler_to_tool(tool, context: Context = None):
diff --git a/agent/tools/scheduler/scheduler_service.py b/agent/tools/scheduler/scheduler_service.py
index dd5369cb..1f4bc6fb 100644
--- a/agent/tools/scheduler/scheduler_service.py
+++ b/agent/tools/scheduler/scheduler_service.py
@@ -52,7 +52,6 @@ class SchedulerService:
             self.running = True
             self.thread = threading.Thread(target=self._run_loop, daemon=True)
             self.thread.start()
-            logger.debug("[Scheduler] Service started")
     
     def stop(self):
         """Stop the scheduler service"""
@@ -67,7 +66,7 @@ class SchedulerService:
     
     def _run_loop(self):
         """Main scheduler loop"""
-        logger.debug("[Scheduler] Scheduler loop started")
+        logger.info("[Scheduler] Scheduler loop started")
         
         while self.running:
             try:
@@ -84,12 +83,18 @@ class SchedulerService:
         
         for task in tasks:
             try:
-                # Check if task is due
                 if self._is_task_due(task, now):
                     logger.info(f"[Scheduler] Executing task: {task['id']} - {task['name']}")
-                    self._execute_task(task)
-                    
-                    # Update next run time
+                    ok = self._execute_task(task)
+                    if not ok:
+                        # Leave next_run_at as-is so the next loop retries.
+                        # Cron tasks within the catch-up window will keep
+                        # firing; beyond it _is_task_due will reschedule.
+                        logger.warning(
+                            f"[Scheduler] Task {task['id']} delivery failed, will retry next tick"
+                        )
+                        continue
+
                     next_run = self._calculate_next_run(task, now)
                     if next_run:
                         self.task_store.update_task(task['id'], {
@@ -97,7 +102,6 @@ class SchedulerService:
                             "last_run_at": now.isoformat()
                         })
                     else:
-                        # One-time task completed, remove it
                         self.task_store.delete_task(task['id'])
                         logger.info(f"[Scheduler] One-time task completed and removed: {task['id']}")
             except Exception as e:
@@ -128,30 +132,35 @@ class SchedulerService:
         try:
             next_run = _parse_naive_local(next_run_str)
 
-            # Check if task is overdue (e.g., service restart)
             if next_run < now:
                 time_diff = (now - next_run).total_seconds()
-                
-                # If overdue by more than 5 minutes, skip this run and schedule next
-                if time_diff > 300:  # 5 minutes
-                    logger.warning(f"[Scheduler] Task {task['id']} is overdue by {int(time_diff)}s, skipping and scheduling next run")
-                    
-                    # For one-time tasks, remove them directly
-                    schedule = task.get("schedule", {})
-                    if schedule.get("type") == "once":
-                        self.task_store.delete_task(task['id'])
-                        logger.info(f"[Scheduler] One-time task {task['id']} expired, removed")
-                        return False
-                    
-                    # For recurring tasks, calculate next run from now
-                    next_next_run = self._calculate_next_run(task, now)
-                    if next_next_run:
-                        self.task_store.update_task(task['id'], {
-                            "next_run_at": next_next_run.isoformat()
-                        })
-                        logger.info(f"[Scheduler] Rescheduled task {task['id']} to {next_next_run}")
+                schedule = task.get("schedule", {})
+                schedule_type = schedule.get("type")
+
+                # Catch-up window: fire if we're within 10 minutes of the
+                # scheduled tick. Beyond that we'd rather skip than push a
+                # stale daily report to the user.
+                if time_diff <= 600:
+                    return True
+
+                logger.warning(
+                    f"[Scheduler] Task {task['id']} is overdue by {int(time_diff)}s, "
+                    f"skipping and scheduling next run"
+                )
+
+                if schedule_type == "once":
+                    self.task_store.delete_task(task['id'])
+                    logger.info(f"[Scheduler] One-time task {task['id']} expired, removed")
                     return False
-            
+
+                next_next_run = self._calculate_next_run(task, now)
+                if next_next_run:
+                    self.task_store.update_task(task['id'], {
+                        "next_run_at": next_next_run.isoformat()
+                    })
+                    logger.info(f"[Scheduler] Rescheduled task {task['id']} to {next_next_run}")
+                return False
+
             return now >= next_run
         except Exception as e:
             logger.error(
@@ -213,20 +222,22 @@ class SchedulerService:
         
         return None
     
-    def _execute_task(self, task: dict):
+    def _execute_task(self, task: dict) -> bool:
         """
-        Execute a task
-        
-        Args:
-            task: Task dictionary
+        Execute a task.
+
+        Returns True if delivery succeeded (caller should advance state),
+        False if it failed (caller should keep next_run_at so the next
+        loop iteration retries). A callback that returns None (legacy
+        behaviour) is treated as success.
         """
         try:
-            # Call the execute callback
-            self.execute_callback(task)
+            result = self.execute_callback(task)
+            return result is not False
         except Exception as e:
             logger.error(f"[Scheduler] Error executing task {task['id']}: {e}")
-            # Update task with error
             self.task_store.update_task(task['id'], {
                 "last_error": str(e),
                 "last_error_at": datetime.now().isoformat()
             })
+            return False
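Reviewer note: the retry contract introduced in scheduler_service.py above can be summarised with a minimal sketch. It assumes a one-time task and replaces the real service with two hypothetical helpers, `is_due` and `tick`, which are not part of the patch. `CATCH_UP_SECONDS` mirrors the 600-second window in `_is_task_due`. The rescheduling and deletion side effects of the real code are deliberately left out.

```python
from datetime import datetime, timedelta

# Assumption: mirrors the 10-minute catch-up window used by _is_task_due.
CATCH_UP_SECONDS = 600


def is_due(next_run: datetime, now: datetime) -> bool:
    """Hypothetical stand-in for _is_task_due, without its side effects."""
    if next_run < now:
        # Overdue tasks still fire while inside the catch-up window.
        return (now - next_run).total_seconds() <= CATCH_UP_SECONDS
    return now >= next_run


def tick(task: dict, now: datetime, callback) -> str:
    """One simplified loop iteration; returns a label for what happened."""
    if not is_due(task["next_run_at"], now):
        return "skipped"
    # Only an explicit False means "retry"; None from a legacy callback
    # counts as success, matching _execute_task in the patch.
    if callback(task) is False:
        return "retry"  # next_run_at is left untouched for the next tick
    return "done"
```

In this sketch a failed delivery 30 seconds after the scheduled time yields "retry", and the same task succeeds on a later tick as long as it is still within the window. Once the window has passed, the task is no longer due, which is the point where the real `_is_task_due` reschedules or removes it.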
diff --git a/agent/tools/vision/vision.py b/agent/tools/vision/vision.py
index a1c3265f..498f3cd8 100644
--- a/agent/tools/vision/vision.py
+++ b/agent/tools/vision/vision.py
@@ -3,7 +3,7 @@ Vision tool - Analyze images using Vision API.
 Supports local files (auto base64-encoded) and HTTP URLs.
 
 Provider resolution:
-  - tool.vision.model (if set) means "prefer this model first; fall back to
+  - tools.vision.model (if set) means "prefer this model first; fall back to
     other configured providers if it fails". The model name is mapped to its
     native provider (e.g. doubao-* → Doubao, kimi-* → Moonshot, gpt-* →
     OpenAI/LinkAI). That provider is tried first, then the standard auto
@@ -53,14 +53,15 @@ _DISCOVERABLE_MODELS = [
     ("ark_api_key", const.DOUBAO, const.DOUBAO_SEED_2_PRO, "Doubao"),
     ("dashscope_api_key", const.QWEN_DASHSCOPE, const.QWEN36_PLUS, "DashScope"),
     ("claude_api_key", const.CLAUDEAPI, const.CLAUDE_4_6_SONNET, "Claude"),
-    ("gemini_api_key", const.GEMINI, const.GEMINI_31_FLASH_LITE_PRE, "Gemini"),
+    ("gemini_api_key", const.GEMINI, const.GEMINI_35_FLASH, "Gemini"),
     ("qianfan_api_key", const.QIANFAN, const.ERNIE_45_TURBO_VL, "Qianfan"),
     ("zhipu_ai_api_key", const.ZHIPU_AI, const.GLM_4_7, "ZhipuAI"),
     ("minimax_api_key", const.MiniMax, const.MINIMAX_M2_7, "MiniMax"),
+    ("mimo_api_key", const.MIMO, const.MIMO_V2_5_PRO, "MiMo"),
 ]
 
 # Model name prefix → discoverable provider display_name.
-# Used to auto-route tool.vision.model to its native provider.
+# Used to auto-route tools.vision.model to its native provider.
 # Matched case-insensitively; longest prefix wins.
 _MODEL_PREFIX_TO_PROVIDER = [
     ("doubao-", "Doubao"),
@@ -73,11 +74,29 @@ _MODEL_PREFIX_TO_PROVIDER = [
     ("glm-", "ZhipuAI"),
     ("minimax-", "MiniMax"),
     ("abab", "MiniMax"),
+    ("mimo-", "MiMo"),
 ]
 
 # Model prefixes that natively belong to OpenAI / LinkAI (raw HTTP providers).
 _OPENAI_MODEL_PREFIXES = ("gpt-", "o1-", "o3-", "o4-", "chatgpt-")
 
+# Maps the UI provider id (persisted in tools.vision.provider) to the internal
+# display name used in VisionProvider.name. Keep in sync with _DISCOVERABLE_MODELS
+# and the openai/linkai branches in _route_by_model_name.
+_PROVIDER_ID_TO_DISPLAY = {
+    "openai": "OpenAI",
+    "linkai": "LinkAI",
+    "moonshot": "Moonshot",
+    "doubao": "Doubao",
+    "dashscope": "DashScope",
+    "claudeAPI": "Claude",
+    "gemini": "Gemini",
+    "qianfan": "Qianfan",
+    "zhipu": "ZhipuAI",
+    "minimax": "MiniMax",
+    "mimo": "MiMo",
+}
+
 
 @dataclass
 class VisionProvider:
@@ -154,7 +173,7 @@ class Vision(BaseTool):
 
         # Default model is only used as a last-resort placeholder for providers
         # whose VisionProvider.model_override is None (e.g. raw OpenAI provider
-        # when the user did not configure tool.vision.model).
+        # when the user did not configure tools.vision.model).
         return self._call_with_fallback(providers, DEFAULT_MODEL, question, image_content)
 
     def _call_with_fallback(self, providers: List[VisionProvider], model: str,
@@ -193,12 +212,12 @@ class Vision(BaseTool):
         """
         Build an ordered list of providers to try.
 
-        Semantics of `tool.vision.model`:
+        Semantics of `tools.vision.model`:
           "Prefer this model first; fall back to other configured providers
            if it fails."
 
         Order:
-          1. The provider that natively serves `tool.vision.model` (if any
+          1. The provider that natively serves `tools.vision.model` (if any
              and its API key is configured) — using the user-specified model
              name verbatim.
           2. Auto-discovery chain as fallback:
@@ -211,13 +230,19 @@ class Vision(BaseTool):
         are de-duplicated to avoid retrying the same endpoint twice.
         """
         user_model = self._resolve_user_vision_model()
+        user_provider = self._resolve_user_vision_provider()
         providers: List[VisionProvider] = []
 
-        # Step 1: preferred provider derived from tool.vision.model
-        if user_model:
+        # Step 1: preferred provider — explicit `tools.vision.provider`
+        # wins so custom model names can still be routed correctly. Falls
+        # through to model-name prefix inference when provider is unset.
+        preferred = None
+        if user_provider and user_model:
+            preferred = self._route_by_provider_id(user_provider, user_model)
+        if not preferred and user_model:
             preferred = self._route_by_model_name(user_model)
-            if preferred:
-                providers.extend(preferred)
+        if preferred:
+            providers.extend(preferred)
 
         # Step 2: auto-discovery chain as fallback
         existing = {p.name for p in providers}
@@ -251,11 +276,11 @@ class Vision(BaseTool):
 
     @staticmethod
     def _resolve_user_vision_model() -> Optional[str]:
-        """Read tool.vision.model from config; return None if unset/blank."""
-        tool_conf = conf().get("tool", {})
-        if not isinstance(tool_conf, dict):
+        """Read tools.vision.model (singular ``tool`` kept as runtime fallback)."""
+        tools_conf = conf().get("tools") or conf().get("tool") or {}
+        if not isinstance(tools_conf, dict):
             return None
-        vision_conf = tool_conf.get("vision", {})
+        vision_conf = tools_conf.get("vision", {})
         if not isinstance(vision_conf, dict):
             return None
         m = vision_conf.get("model")
@@ -263,6 +288,24 @@ class Vision(BaseTool):
             return m.strip()
         return None
 
+    @staticmethod
+    def _resolve_user_vision_provider() -> Optional[str]:
+        """Read tools.vision.provider — the UI-persisted vendor id.
+
+        Lets users pin a vendor for custom model names that prefix-inference
+        can't recognize. Returns None when unset/blank.
+        """
+        tools_conf = conf().get("tools") or conf().get("tool") or {}
+        if not isinstance(tools_conf, dict):
+            return None
+        vision_conf = tools_conf.get("vision", {})
+        if not isinstance(vision_conf, dict):
+            return None
+        p = vision_conf.get("provider")
+        if isinstance(p, str) and p.strip():
+            return p.strip()
+        return None
+
     @staticmethod
     def _infer_provider_from_model(model_name: str) -> Optional[str]:
         """
@@ -279,6 +322,54 @@ class Vision(BaseTool):
                 return display_name
         return None
 
+    def _route_by_provider_id(self, provider_id: str, user_model: str) -> Optional[List[VisionProvider]]:
+        """Route by the UI-persisted provider id.
+
+        Returns:
+          - [provider] : provider id is known and its key is configured.
+          - None       : unknown provider id, or the bot can't be created.
+                         Caller falls through to model-name-based routing.
+        """
+        display_name = _PROVIDER_ID_TO_DISPLAY.get(provider_id)
+        if not display_name:
+            return None
+
+        # OpenAI / LinkAI use raw HTTP providers, not the discoverable bot path.
+        if provider_id == "openai":
+            p = self._build_openai_provider(user_model)
+            return [p] if p else None
+        if provider_id == "linkai":
+            p = self._build_linkai_provider(user_model)
+            return [p] if p else None
+
+        # Discoverable bot-backed providers.
+        for config_key, bot_type, _default_model, name in _DISCOVERABLE_MODELS:
+            if name != display_name:
+                continue
+            api_key = conf().get(config_key, "")
+            if not api_key or not api_key.strip():
+                logger.warning(f"[Vision] tools.vision.provider='{provider_id}' "
+                               f"but '{config_key}' is not configured. Falling back.")
+                return None
+            try:
+                from models.bot_factory import create_bot
+                bot = create_bot(bot_type)
+                if not hasattr(bot, 'call_vision'):
+                    logger.warning(f"[Vision] '{display_name}' bot does not implement call_vision.")
+                    return None
+            except Exception as e:
+                logger.warning(f"[Vision] Failed to create '{display_name}' bot: {e}")
+                return None
+            return [VisionProvider(
+                name=display_name,
+                api_key="",
+                api_base="",
+                model_override=user_model,
+                use_bot=True,
+                fallback_bot=bot,
+            )]
+        return None
+
     def _route_by_model_name(self, user_model: str) -> Optional[List[VisionProvider]]:
         """
         Try to build a provider list using the user-specified model name.
@@ -303,7 +394,7 @@ class Vision(BaseTool):
                 self._append_provider(providers, lambda: self._build_linkai_provider(user_model))
             if providers:
                 return providers
-            logger.warning(f"[Vision] tool.vision.model='{user_model}' looks like an OpenAI "
+            logger.warning(f"[Vision] tools.vision.model='{user_model}' looks like an OpenAI "
                            f"model but neither OPENAI_API_KEY nor LINKAI_API_KEY is configured.")
             return None  # fall through to auto
 
@@ -317,7 +408,7 @@ class Vision(BaseTool):
                 continue
             api_key = conf().get(config_key, "")
             if not api_key or not api_key.strip():
-                logger.warning(f"[Vision] tool.vision.model='{user_model}' routes to "
+                logger.warning(f"[Vision] tools.vision.model='{user_model}' routes to "
                                f"'{display_name}' but '{config_key}' is not configured. "
                                f"Falling back to auto-discovery.")
                 return None  # fall through to auto
@@ -452,8 +543,8 @@ class Vision(BaseTool):
         if not self._main_bot_supports_vision(bot):
             return None
 
-        # Use the configured main model name; do NOT inject tool.vision.model
-        # here, because by the time we reach this branch the tool.vision.model
+        # Use the configured main model name; do NOT inject tools.vision.model
+        # here, because by the time we reach this branch the tools.vision.model
         # routing has already been attempted (and either matched the main bot
         # or failed to find a provider).
         main_model_name = conf().get("model") or None
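
Reviewer note on the Vision routing change above: the new precedence is "provider id first, model-name inference second". The sketch below is a minimal, standalone illustration of that ordering only. `pick_preferred` and the two router callables are hypothetical stand-ins for `_route_by_provider_id` and `_route_by_model_name`, and providers are modelled as plain strings rather than `VisionProvider` objects.

```python
from typing import Callable, List, Optional

# Hypothetical router signatures; the real methods return VisionProvider lists.
ByProviderId = Callable[[str, str], Optional[List[str]]]
ByModelName = Callable[[str], Optional[List[str]]]


def pick_preferred(
    user_provider: Optional[str],
    user_model: Optional[str],
    by_provider_id: ByProviderId,
    by_model_name: ByModelName,
) -> List[str]:
    """Return the preferred provider list, or [] to leave it to auto-discovery."""
    preferred = None
    # An explicit provider id wins, so custom model names can still be routed.
    if user_provider and user_model:
        preferred = by_provider_id(user_provider, user_model)
    # Fall through to model-name inference when the id is unset or unusable.
    if not preferred and user_model:
        preferred = by_model_name(user_model)
    return preferred or []


def demo_by_id(provider_id: str, model: str) -> Optional[List[str]]:
    # Assumption for the demo: only "known-vendor" has a configured key.
    return [f"{provider_id}:{model}"] if provider_id == "known-vendor" else None


def demo_by_name(model: str) -> Optional[List[str]]:
    # Assumption for the demo: only "gpt-" prefixed names can be inferred.
    return [f"inferred:{model}"] if model.startswith("gpt-") else None
```

With these stand-ins, a pinned `known-vendor` routes a custom model name directly, an unknown vendor id falls back to name inference, and a model that neither path recognizes yields an empty list so the auto-discovery chain takes over.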
diff --git a/agent/tools/web_search/web_search.py b/agent/tools/web_search/web_search.py
index 4c6d1e45..ca56567d 100644
--- a/agent/tools/web_search/web_search.py
+++ b/agent/tools/web_search/web_search.py
@@ -1,13 +1,27 @@
-"""
-Web Search tool - Search the web using Bocha or LinkAI search API.
-Supports two backends with unified response format:
-  1. Bocha Search (primary, requires BOCHA_API_KEY)
-  2. LinkAI Search (fallback, requires LINKAI_API_KEY)
+"""Web Search tool. Supports four backends with a unified response format:
+  - bocha   (https://open.bochaai.com)
+  - zhipu   (https://docs.bigmodel.cn/cn/guide/tools/web-search)
+  - qianfan (https://cloud.baidu.com/doc/qianfan/s/2mh4su4uy)
+  - linkai  (https://link-ai.tech, fallback)
+
+Provider selection
+  - strategy 'auto' (default): pick the first configured provider in the
+    canonical order [bocha, qianfan, zhipu, linkai]. When the caller passes
+    an explicit `provider` it overrides the pick; an invalid/unconfigured
+    one silently falls back to the auto order.
+  - strategy 'fixed': use the configured provider; if its credential is
+    missing at call time, silently fall back to auto order (no card hint).
+
+Credentials
+  - bocha   : tools.web_search.bocha_api_key  ->  env BOCHA_API_KEY
+  - zhipu   : conf.zhipu_ai_api_key            ->  env ZHIPUAI_API_KEY
+  - qianfan : conf.qianfan_api_key             ->  env QIANFAN_API_KEY
+  - linkai  : conf.linkai_api_key              ->  env LINKAI_API_KEY
 """
 
-import os
 import json
-from typing import Dict, Any, Optional
+import os
+from typing import Any, Dict, List, Optional
 
 import requests
 
@@ -16,12 +30,63 @@ from common.log import logger
 from config import conf
 
 
-# Default timeout for API requests (seconds)
 DEFAULT_TIMEOUT = 30
 
+# Canonical fallback order. Empirically ordered by Chinese real-time
+# quality + relevance: bocha (best overall), qianfan (best for hot news),
+# zhipu (strong on long-form articles), linkai (cloud aggregator, last
+# resort).
+PROVIDER_ORDER = ("bocha", "qianfan", "zhipu", "linkai")
+
+PROVIDER_LABELS = {
+    "bocha":   "Bocha",
+    "zhipu":   "Zhipu",
+    "qianfan": "Baidu Qianfan",
+    "linkai":  "LinkAI",
+}
+
+
+def _tools_web_search_conf() -> dict:
+    """Return the tools.web_search config block (dict-like)."""
+    tools_cfg = conf().get("tools") or {}
+    if not isinstance(tools_cfg, dict):
+        return {}
+    block = tools_cfg.get("web_search") or {}
+    return block if isinstance(block, dict) else {}
+
+
+def _get_api_key(provider: str) -> str:
+    """Resolve API key for a provider, with conf -> env fallback."""
+    if provider == "bocha":
+        key = (_tools_web_search_conf().get("bocha_api_key") or "").strip()
+        return key or os.environ.get("BOCHA_API_KEY", "").strip()
+    if provider == "zhipu":
+        key = (conf().get("zhipu_ai_api_key") or "").strip()
+        return key or os.environ.get("ZHIPUAI_API_KEY", "").strip()
+    if provider == "qianfan":
+        key = (conf().get("qianfan_api_key") or "").strip()
+        return key or os.environ.get("QIANFAN_API_KEY", "").strip()
+    if provider == "linkai":
+        key = (conf().get("linkai_api_key") or "").strip()
+        return key or os.environ.get("LINKAI_API_KEY", "").strip()
+    return ""
+
+
+def configured_providers() -> List[str]:
+    """Return configured providers in canonical order."""
+    return [p for p in PROVIDER_ORDER if _get_api_key(p)]
+
+
+def _configured_strategy() -> str:
+    return (_tools_web_search_conf().get("strategy") or "auto").strip().lower()
+
+
+def _configured_provider() -> str:
+    return (_tools_web_search_conf().get("provider") or "").strip().lower()
+
 
 class WebSearch(BaseTool):
-    """Tool for searching the web using Bocha or LinkAI search API"""
+    """Tool for searching the web across multiple providers."""
 
     name: str = "web_search"
     description: str = "Search the web for real-time information. Returns titles, URLs, and snippets."
@@ -55,264 +120,368 @@ class WebSearch(BaseTool):
 
     def __init__(self, config: dict = None):
         self.config = config or {}
-        self._backend = None  # Will be resolved on first execute
 
     @staticmethod
     def is_available() -> bool:
-        """Check if web search is available (at least one API key is configured)"""
-        return bool(os.environ.get("BOCHA_API_KEY") or os.environ.get("LINKAI_API_KEY"))
+        """Tool is offered to the agent when at least one provider has a key."""
+        return bool(configured_providers())
 
-    def _resolve_backend(self) -> Optional[str]:
-        """
-        Determine which search backend to use.
-        Priority: Bocha > LinkAI
+    @classmethod
+    def get_json_schema(cls) -> dict:
+        """Augment the static schema with a `provider` field — only when the
+        user has ≥2 providers configured AND strategy is 'auto'. Otherwise
+        the backend picks silently and exposing the field would only waste
+        the agent's tokens."""
+        schema = {
+            "name": cls.name,
+            "description": cls.description,
+            "parameters": json.loads(json.dumps(cls.params)),  # deep copy
+        }
+        if _configured_strategy() != "auto":
+            return schema
+        available = configured_providers()
+        if len(available) < 2:
+            return schema
 
-        :return: 'bocha', 'linkai', or None
+        schema["parameters"]["properties"]["provider"] = {
+            "type": "string",
+            "enum": available,
+            "description": "Optional. Specifies the search backend. You may switch between providers when the user wants results from a particular source or from multiple sources.",
+        }
+        return schema
+
+    # ------------------------------------------------------------------
+    # Provider resolution
+    # ------------------------------------------------------------------
+
+    def _resolve_provider(self, requested: Optional[str]) -> Optional[str]:
+        """Pick a provider for this call.
+
+        Priority: caller-supplied (if configured) > fixed strategy (if
+        configured) > first configured in PROVIDER_ORDER. Silent fallback
+        when the desired one has no key.
         """
-        if os.environ.get("BOCHA_API_KEY"):
-            return "bocha"
-        if os.environ.get("LINKAI_API_KEY"):
-            return "linkai"
-        return None
+        available = configured_providers()
+        if not available:
+            return None
+
+        if requested:
+            req = requested.strip().lower()
+            if req in available:
+                return req
+            logger.warning(f"[WebSearch] requested provider '{requested}' unavailable, falling back")
+
+        if _configured_strategy() == "fixed":
+            pinned = _configured_provider()
+            if pinned in available:
+                return pinned
+            if pinned:
+                logger.warning(f"[WebSearch] pinned provider '{pinned}' unavailable, falling back to auto")
+
+        return available[0]
+
+    @staticmethod
+    def _resolution_reason(requested: Optional[str], chosen: str) -> str:
+        """Human-readable explanation for why `chosen` won the resolver."""
+        if requested and requested.strip().lower() == chosen:
+            return "caller-requested"
+        strategy = _configured_strategy()
+        if strategy == "fixed" and _configured_provider() == chosen:
+            return "fixed-strategy"
+        return "auto-fallback"
+
+    # ------------------------------------------------------------------
+    # Entry point
+    # ------------------------------------------------------------------
 
     def execute(self, args: Dict[str, Any]) -> ToolResult:
-        """
-        Execute web search
-
-        :param args: Search parameters (query, count, freshness, summary)
-        :return: Search results
-        """
-        query = args.get("query", "").strip()
+        query = (args.get("query") or "").strip()
         if not query:
             return ToolResult.fail("Error: 'query' parameter is required")
 
         count = args.get("count", 10)
         freshness = args.get("freshness", "noLimit")
         summary = args.get("summary", False)
-
-        # Validate count
         if not isinstance(count, int) or count < 1 or count > 50:
             count = 10
 
-        # Resolve backend
-        backend = self._resolve_backend()
-        if not backend:
+        requested = args.get("provider")
+        provider = self._resolve_provider(requested)
+        if not provider:
             return ToolResult.fail(
-                "Error: No search API key configured. "
-                "Please set BOCHA_API_KEY or LINKAI_API_KEY using env_config tool.\n"
-                "  - Bocha Search: https://open.bocha.cn\n"
-                "  - LinkAI Search: https://link-ai.tech"
+                "Error: No search provider configured. "
+                "Configure one of BOCHA_API_KEY / zhipu_ai_api_key / qianfan_api_key / linkai_api_key."
             )
 
+        # Always log the routing decision so multi-provider deployments can
+        # tell at a glance which backend served any given query.
+        available = configured_providers()
+        reason = self._resolution_reason(requested, provider)
+        q_preview = query if len(query) <= 60 else (query[:57] + "...")
+        logger.info(
+            f"[WebSearch] provider={provider} reason={reason} "
+            f"available={list(available)} query={q_preview!r} count={count} freshness={freshness}"
+        )
+
         try:
-            if backend == "bocha":
+            if provider == "bocha":
                 return self._search_bocha(query, count, freshness, summary)
-            else:
+            if provider == "zhipu":
+                return self._search_zhipu(query, count, freshness)
+            if provider == "qianfan":
+                return self._search_qianfan(query, count, freshness)
+            if provider == "linkai":
                 return self._search_linkai(query, count, freshness)
+            return ToolResult.fail(f"Error: Unknown provider '{provider}'")
         except requests.Timeout:
             return ToolResult.fail(f"Error: Search request timed out after {DEFAULT_TIMEOUT}s")
         except requests.ConnectionError:
             return ToolResult.fail("Error: Failed to connect to search API")
         except Exception as e:
-            logger.error(f"[WebSearch] Unexpected error: {e}", exc_info=True)
+            logger.error(f"[WebSearch] Unexpected error ({provider}): {e}", exc_info=True)
             return ToolResult.fail(f"Error: Search failed - {str(e)}")
 
+    # ------------------------------------------------------------------
+    # Bocha
+    # ------------------------------------------------------------------
+
     def _search_bocha(self, query: str, count: int, freshness: str, summary: bool) -> ToolResult:
-        """
-        Search using Bocha API
-
-        :param query: Search query
-        :param count: Number of results
-        :param freshness: Time range filter
-        :param summary: Whether to include summary
-        :return: Formatted search results
-        """
-        api_key = os.environ.get("BOCHA_API_KEY", "")
-        url = "https://api.bocha.cn/v1/web-search"
-
+        api_key = _get_api_key("bocha")
+        url = "https://api.bochaai.com/v1/web-search"
         headers = {
             "Authorization": f"Bearer {api_key}",
             "Content-Type": "application/json",
-            "Accept": "application/json"
+            "Accept": "application/json",
         }
+        payload = {"query": query, "count": count, "freshness": freshness, "summary": summary}
 
-        payload = {
-            "query": query,
-            "count": count,
-            "freshness": freshness,
-            "summary": summary
-        }
+        logger.debug(f"[WebSearch] bocha: query='{query}', count={count}")
+        resp = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
 
-        logger.debug(f"[WebSearch] Bocha search: query='{query}', count={count}")
+        if resp.status_code == 401:
+            return ToolResult.fail("Error: Invalid bocha API key.")
+        if resp.status_code == 403:
+            return ToolResult.fail("Error: bocha API — insufficient balance. Top up at https://open.bochaai.com")
+        if resp.status_code == 429:
+            return ToolResult.fail("Error: bocha API rate limit reached.")
+        if resp.status_code != 200:
+            return ToolResult.fail(f"Error: bocha API returned HTTP {resp.status_code}")
 
-        response = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
-
-        if response.status_code == 401:
-            return ToolResult.fail("Error: Invalid BOCHA_API_KEY. Please check your API key.")
-        if response.status_code == 403:
-            return ToolResult.fail("Error: Bocha API - insufficient balance. Please top up at https://open.bocha.cn")
-        if response.status_code == 429:
-            return ToolResult.fail("Error: Bocha API rate limit reached. Please try again later.")
-        if response.status_code != 200:
-            return ToolResult.fail(f"Error: Bocha API returned HTTP {response.status_code}")
-
-        data = response.json()
-
-        # Check API-level error code
+        data = resp.json()
         api_code = data.get("code")
         if api_code is not None and api_code != 200:
             msg = data.get("msg") or "Unknown error"
-            return ToolResult.fail(f"Error: Bocha API error (code={api_code}): {msg}")
-
-        # Extract and format results
-        return self._format_bocha_results(data, query)
-
-    def _format_bocha_results(self, data: dict, query: str) -> ToolResult:
-        """
-        Format Bocha API response into unified result structure
-
-        :param data: Raw API response
-        :param query: Original query
-        :return: Formatted ToolResult
-        """
-        search_data = data.get("data", {})
-        web_pages = search_data.get("webPages", {})
-        pages = web_pages.get("value", [])
-
-        if not pages:
-            return ToolResult.success({
-                "query": query,
-                "backend": "bocha",
-                "total": 0,
-                "results": [],
-                "message": "No results found"
-            })
+            return ToolResult.fail(f"Error: bocha API error (code={api_code}): {msg}")
 
+        pages = ((data.get("data") or {}).get("webPages") or {}).get("value") or []
         results = []
-        for page in pages:
-            result = {
-                "title": page.get("name", ""),
-                "url": page.get("url", ""),
-                "snippet": page.get("snippet", ""),
-                "siteName": page.get("siteName", ""),
-                "datePublished": page.get("datePublished") or page.get("dateLastCrawled", ""),
+        for p in pages:
+            item = {
+                "title": p.get("name", ""),
+                "url": p.get("url", ""),
+                "snippet": p.get("snippet", ""),
+                "siteName": p.get("siteName", ""),
+                "datePublished": p.get("datePublished") or p.get("dateLastCrawled", ""),
             }
-            # Include summary only if present
-            if page.get("summary"):
-                result["summary"] = page["summary"]
-            results.append(result)
-
-        total = web_pages.get("totalEstimatedMatches", len(results))
-
+            if p.get("summary"):
+                item["summary"] = p["summary"]
+            results.append(item)
+        total = ((data.get("data") or {}).get("webPages") or {}).get("totalEstimatedMatches", len(results))
         return ToolResult.success({
-            "query": query,
-            "backend": "bocha",
-            "total": total,
-            "count": len(results),
-            "results": results
+            "query": query, "backend": "bocha",
+            "total": total, "count": len(results), "results": results,
         })
 
-    def _search_linkai(self, query: str, count: int, freshness: str) -> ToolResult:
-        """
-        Search using LinkAI plugin API
+    # ------------------------------------------------------------------
+    # Zhipu
+    # ------------------------------------------------------------------
 
-        :param query: Search query
-        :param count: Number of results
-        :param freshness: Time range filter
-        :return: Formatted search results
-        """
-        api_key = os.environ.get("LINKAI_API_KEY", "")
-        api_base = conf().get("linkai_api_base", "https://api.link-ai.tech")
-        url = f"{api_base.rstrip('/')}/v1/plugin/execute"
+    def _search_zhipu(self, query: str, count: int, freshness: str) -> ToolResult:
+        api_key = _get_api_key("zhipu")
+        api_base = (conf().get("zhipu_ai_api_base") or "https://open.bigmodel.cn/api/paas/v4").rstrip("/")
+        url = f"{api_base}/web_search"
+        headers = {
+            "Authorization": f"Bearer {api_key}",
+            "Content-Type": "application/json",
+        }
+
+        # Zhipu Web Search expects `search_query` <= 70 chars; truncate
+        # gracefully so a long agent-supplied query doesn't get rejected.
+        trimmed_query = (query or "")[:70]
+        engine = (_tools_web_search_conf().get("zhipu_search_engine") or "search_pro").strip().lower()
+        if engine not in ("search_std", "search_pro", "search_pro_sogou", "search_pro_quark"):
+            engine = "search_pro"
+
+        payload: Dict[str, Any] = {
+            "search_engine": engine,
+            "search_query": trimmed_query,
+            "search_intent": False,
+            "count": max(1, min(int(count or 10), 50)),
+            "search_recency_filter": freshness if freshness in (
+                "oneDay", "oneWeek", "oneMonth", "oneYear", "noLimit"
+            ) else "noLimit",
+        }
+        content_size = (_tools_web_search_conf().get("zhipu_content_size") or "").strip().lower()
+        if content_size in ("medium", "high"):
+            payload["content_size"] = content_size
+
+        logger.debug(f"[WebSearch] zhipu: query='{trimmed_query}', count={payload['count']}, engine={engine}")
+        resp = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
+
+        if resp.status_code == 401:
+            return ToolResult.fail("Error: Invalid Zhipu API key.")
+        if resp.status_code != 200:
+            return ToolResult.fail(f"Error: Zhipu API returned HTTP {resp.status_code}: {resp.text[:200]}")
+
+        data = resp.json()
+        # Business-level errors (1701/1702/1703 etc.) come back as
+        # {"error": {"code","message"}} even on HTTP 200.
+        if isinstance(data, dict) and data.get("error"):
+            err = data["error"] or {}
+            return ToolResult.fail(f"Error: Zhipu returned {err.get('code')}: {err.get('message','')}")
+
+        items = data.get("search_result") or (data.get("data") or {}).get("search_result") or []
+        results = []
+        for it in items:
+            results.append({
+                "title": it.get("title", ""),
+                "url": it.get("link") or it.get("url", ""),
+                "snippet": it.get("content") or it.get("snippet", ""),
+                "siteName": it.get("media") or it.get("siteName", ""),
+                "datePublished": it.get("publish_date") or it.get("datePublished", ""),
+            })
+        return ToolResult.success({
+            "query": query, "backend": "zhipu",
+            "total": len(results), "count": len(results), "results": results,
+        })
+
+    # ------------------------------------------------------------------
+    # Qianfan (Baidu)
+    # ------------------------------------------------------------------
+
+    def _search_qianfan(self, query: str, count: int, freshness: str) -> ToolResult:
+        api_key = _get_api_key("qianfan")
+        api_base = (conf().get("qianfan_api_base") or "https://qianfan.baidubce.com/v2").rstrip("/")
+        url = f"{api_base}/ai_search/web_search"
+        headers = {
+            "Authorization": f"Bearer {api_key}",
+            "Content-Type": "application/json",
+            "X-Appbuilder-From": "cow",
+        }
+
+        count = max(1, min(int(count or 10), 50))
+        payload: Dict[str, Any] = {
+            "messages": [{"role": "user", "content": query}],
+            "search_source": "baidu_search_v2",
+            "resource_type_filter": [{"type": "web", "top_k": count}],
+        }
+
+        # Baidu AI Search expects freshness as a date-range filter, not a
+        # named recency token. Translate our shared vocabulary into the
+        # underlying page_time range expected by the API.
+        search_filter = self._qianfan_build_freshness_filter(freshness)
+        if search_filter:
+            payload["search_filter"] = search_filter
+
+        logger.debug(f"[WebSearch] qianfan: query='{query}', count={count}, freshness={freshness!r}")
+        resp = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
+
+        if resp.status_code == 401:
+            return ToolResult.fail("Error: Invalid Qianfan API key.")
+        if resp.status_code != 200:
+            return ToolResult.fail(f"Error: Qianfan API returned HTTP {resp.status_code}: {resp.text[:200]}")
+
+        data = resp.json()
+        # Even on HTTP 200 Baidu surfaces business errors as {"code","message"}.
+        if isinstance(data, dict) and data.get("code"):
+            return ToolResult.fail(f"Error: Qianfan returned {data.get('code')}: {data.get('message','')}")
+
+        refs = data.get("references") or []
+        results = []
+        for d in refs:
+            results.append({
+                "title": d.get("title", ""),
+                "url": d.get("url", ""),
+                "snippet": (d.get("content") or "")[:200],
+                "siteName": d.get("web_anchor") or d.get("website") or "",
+                "datePublished": d.get("date", ""),
+            })
+        return ToolResult.success({
+            "query": query, "backend": "qianfan",
+            "total": len(results), "count": len(results), "results": results,
+        })
+
+    @staticmethod
+    def _qianfan_build_freshness_filter(freshness: str) -> Optional[Dict[str, Any]]:
+        if not freshness or freshness == "noLimit":
+            return None
+        delta_days = {"oneDay": 1, "oneWeek": 7, "oneMonth": 30, "oneYear": 365}.get(freshness)
+        if not delta_days:
+            return None
+        from datetime import datetime, timedelta
+        now = datetime.now()
+        end_date = (now + timedelta(days=1)).strftime("%Y-%m-%d")
+        start_date = (now - timedelta(days=delta_days)).strftime("%Y-%m-%d")
+        return {"range": {"page_time": {"gte": start_date, "lt": end_date}}}
+
+    # ------------------------------------------------------------------
+    # LinkAI (plugin)
+    # ------------------------------------------------------------------
+
+    def _search_linkai(self, query: str, count: int, freshness: str) -> ToolResult:
+        api_key = _get_api_key("linkai")
+        api_base = (conf().get("linkai_api_base") or "https://api.link-ai.tech").rstrip("/")
+        url = f"{api_base}/v1/plugin/execute"
 
         from common.utils import get_cloud_headers
         headers = get_cloud_headers(api_key)
 
-        payload = {
-            "code": "web-search",
-            "args": {
-                "query": query,
-                "count": count,
-                "freshness": freshness
-            }
-        }
+        payload = {"code": "web-search", "args": {"query": query, "count": count, "freshness": freshness}}
+        logger.debug(f"[WebSearch] linkai: query='{query}', count={count}")
+        resp = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
 
-        logger.debug(f"[WebSearch] LinkAI search: query='{query}', count={count}")
-
-        response = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
-
-        if response.status_code == 401:
-            return ToolResult.fail("Error: Invalid LINKAI_API_KEY. Please check your API key.")
-        if response.status_code != 200:
-            return ToolResult.fail(f"Error: LinkAI API returned HTTP {response.status_code}")
-
-        data = response.json()
+        if resp.status_code == 401:
+            return ToolResult.fail("Error: Invalid LinkAI API key.")
+        if resp.status_code != 200:
+            return ToolResult.fail(f"Error: LinkAI API returned HTTP {resp.status_code}")
 
+        data = resp.json()
         if not data.get("success"):
             msg = data.get("message") or "Unknown error"
             return ToolResult.fail(f"Error: LinkAI search failed: {msg}")
 
-        return self._format_linkai_results(data, query)
-
-    def _format_linkai_results(self, data: dict, query: str) -> ToolResult:
-        """
-        Format LinkAI API response into unified result structure.
-        LinkAI returns the search data in data.data field, which follows
-        the same Bing-compatible format as Bocha.
-
-        :param data: Raw API response
-        :param query: Original query
-        :return: Formatted ToolResult
-        """
-        raw_data = data.get("data", "")
-
-        # LinkAI may return data as a JSON string
-        if isinstance(raw_data, str):
+        raw = data.get("data", "")
+        if isinstance(raw, str):
             try:
-                raw_data = json.loads(raw_data)
+                raw = json.loads(raw)
             except (json.JSONDecodeError, TypeError):
-                # If data is plain text, return it as a single result
                 return ToolResult.success({
-                    "query": query,
-                    "backend": "linkai",
-                    "total": 1,
-                    "count": 1,
-                    "results": [{"content": raw_data}]
+                    "query": query, "backend": "linkai",
+                    "total": 1, "count": 1, "results": [{"content": raw}],
                 })
 
-        # If the response follows Bing-compatible structure
-        if isinstance(raw_data, dict):
-            web_pages = raw_data.get("webPages", {})
-            pages = web_pages.get("value", [])
-
+        if isinstance(raw, dict):
+            pages = (raw.get("webPages") or {}).get("value", []) or []
             if pages:
                 results = []
-                for page in pages:
-                    result = {
-                        "title": page.get("name", ""),
-                        "url": page.get("url", ""),
-                        "snippet": page.get("snippet", ""),
-                        "siteName": page.get("siteName", ""),
-                        "datePublished": page.get("datePublished") or page.get("dateLastCrawled", ""),
+                for p in pages:
+                    item = {
+                        "title": p.get("name", ""),
+                        "url": p.get("url", ""),
+                        "snippet": p.get("snippet", ""),
+                        "siteName": p.get("siteName", ""),
+                        "datePublished": p.get("datePublished") or p.get("dateLastCrawled", ""),
                     }
-                    if page.get("summary"):
-                        result["summary"] = page["summary"]
-                    results.append(result)
-
-                total = web_pages.get("totalEstimatedMatches", len(results))
+                    if p.get("summary"):
+                        item["summary"] = p["summary"]
+                    results.append(item)
+                total = (raw.get("webPages") or {}).get("totalEstimatedMatches", len(results))
                 return ToolResult.success({
-                    "query": query,
-                    "backend": "linkai",
-                    "total": total,
-                    "count": len(results),
-                    "results": results
+                    "query": query, "backend": "linkai",
+                    "total": total, "count": len(results), "results": results,
                 })
 
-        # Fallback: return raw data
         return ToolResult.success({
-            "query": query,
-            "backend": "linkai",
-            "total": 1,
-            "count": 1,
-            "results": [{"content": str(raw_data)}]
+            "query": query, "backend": "linkai",
+            "total": 1, "count": 1, "results": [{"content": str(raw)}],
         })
diff --git a/app.py b/app.py
index ba2ab265..dbb15209 100644
--- a/app.py
+++ b/app.py
@@ -289,6 +289,16 @@ def _warmup_mcp_tools():
         logger.warning(f"[App] MCP warmup failed (non-fatal): {e}")
 
 
+def _warmup_scheduler():
+    """Eager-init AgentBridge so the scheduler thread starts at process
+    boot rather than waiting for the first user message."""
+    try:
+        from bridge.bridge import Bridge
+        Bridge().get_agent_bridge()
+    except Exception as e:
+        logger.warning(f"[App] Scheduler warmup failed: {e}")
+
+
 def _sync_builtin_skills():
     """Sync builtin skills from project skills/ to workspace skills/ on startup."""
     import shutil
@@ -354,6 +364,8 @@ def run():
         # latency isn't dominated by npx package downloads.
         _warmup_mcp_tools()
 
+        _warmup_scheduler()
+
         logger.info(f"[App] Starting channels: {channel_names}")
 
         _channel_mgr = ChannelManager()
diff --git a/bridge/agent_bridge.py b/bridge/agent_bridge.py
index e60ffd9d..a924dab2 100644
--- a/bridge/agent_bridge.py
+++ b/bridge/agent_bridge.py
@@ -5,7 +5,7 @@ Agent Bridge - Integrates Agent system with existing COW bridge
 import os
 from typing import Optional, List
 
-from agent.protocol import Agent, LLMModel, LLMRequest
+from agent.protocol import Agent, LLMModel, LLMRequest, get_cancel_registry
 from bridge.agent_event_handler import AgentEventHandler
 from bridge.agent_initializer import AgentInitializer
 from bridge.bridge import Bridge
@@ -285,6 +285,15 @@ class AgentBridge:
         
         # Create helper instances
         self.initializer = AgentInitializer(bridge, self)
+
+        # Eager-start the scheduler so cron tasks fire without waiting
+        # for the first user message. init_scheduler is idempotent.
+        try:
+            from agent.tools.scheduler.integration import init_scheduler
+            if init_scheduler(self):
+                self.scheduler_initialized = True
+        except Exception as e:
+            logger.warning(f"[AgentBridge] Eager scheduler init failed: {e}")
     def create_agent(self, system_prompt: str, tools: List = None, **kwargs) -> Agent:
         """
         Create the super agent with COW integration
@@ -390,11 +399,22 @@ class AgentBridge:
         """
         session_id = None
         agent = None
+        request_id = None
+        cancel_event = None
         try:
             # Extract session_id from context for user isolation
             if context:
                 session_id = context.kwargs.get("session_id") or context.get("session_id")
-            
+                request_id = context.kwargs.get("request_id") or context.get("request_id")
+
+            # Register a cancel token. Prefer per-turn request_id (web),
+            # fall back to session_id (IM channels). The Event is polled by
+            # AgentStreamExecutor at safe checkpoints.
+            registry = get_cancel_registry()
+            token_key = request_id or session_id
+            if token_key:
+                cancel_event = registry.register(token_key, session_id=session_id)
+
             # Get agent for this session (will auto-initialize if needed)
             agent = self.get_agent(session_id=session_id)
             if not agent:
@@ -449,7 +469,8 @@ class AgentBridge:
                 response = agent.run_stream(
                     user_message=query,
                     on_event=event_handler.handle_event,
-                    clear_history=clear_history
+                    clear_history=clear_history,
+                    cancel_event=cancel_event,
                 )
             finally:
                 # Restore original tools
@@ -459,6 +480,13 @@ class AgentBridge:
                 # Log execution summary
                 event_handler.log_summary()
 
+                # Release cancel token; keep registry bounded.
+                if token_key:
+                    try:
+                        registry.unregister(token_key)
+                    except Exception:
+                        pass
+
             # Persist new messages generated during this run
             if session_id:
                 channel_type = (context.get("channel_type") or "") if context else ""
@@ -512,6 +540,12 @@ class AgentBridge:
                         logger.info(f"[AgentBridge] Cleared DB for session after error: {session_id}")
                 except Exception as db_err:
                     logger.warning(f"[AgentBridge] Failed to clear DB after error: {db_err}")
+            # Release cancel token on error path too (idempotent).
+            if cancel_event is not None and (request_id or session_id):
+                try:
+                    get_cancel_registry().unregister(request_id or session_id)
+                except Exception:
+                    pass
             return Reply(ReplyType.ERROR, f"Agent error: {str(e)}")
     
     def _schedule_mcp_hot_reload(self, agent):
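Review note: the hunks above call `get_cancel_registry()` with `register(key, session_id=...)`, `unregister(key)`, and (from `chat_channel.py`) `cancel_session(session_id)`, but the registry itself is not part of this section of the diff. The sketch below is one way such a registry could behave, assuming each token is a `threading.Event` and that unregistering is idempotent, as the comments in the diff suggest. The class name `CancelRegistry` and its internals are hypothetical and are not taken from `agent.protocol`.

```python
import threading
from typing import Dict, Optional, Set


class CancelRegistry:
    """Hypothetical registry: maps a token key to a threading.Event."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._events: Dict[str, threading.Event] = {}
        self._session_of: Dict[str, str] = {}
        self._by_session: Dict[str, Set[str]] = {}

    def _drop(self, key: str) -> None:
        # Caller must hold self._lock. Safe to call for unknown keys.
        self._events.pop(key, None)
        session_id = self._session_of.pop(key, None)
        if session_id is not None:
            keys = self._by_session.get(session_id)
            if keys is not None:
                keys.discard(key)
                if not keys:
                    del self._by_session[session_id]

    def register(self, key: str, session_id: Optional[str] = None) -> threading.Event:
        event = threading.Event()
        with self._lock:
            self._drop(key)  # re-registering replaces any stale token
            self._events[key] = event
            if session_id:
                self._session_of[key] = session_id
                self._by_session.setdefault(session_id, set()).add(key)
        return event

    def unregister(self, key: str) -> None:
        # Idempotent: both the finally block and the error path may call it.
        with self._lock:
            self._drop(key)

    def cancel_session(self, session_id: str) -> int:
        # Set every not-yet-cancelled event of the session; return how many.
        with self._lock:
            keys = self._by_session.get(session_id, set())
            events = [self._events[k] for k in keys if k in self._events]
        cancelled = 0
        for event in events:
            if not event.is_set():
                event.set()
                cancelled += 1
        return cancelled
```

With a shape like this, the double release in the diff (once in `finally`, once on the error path) is harmless, and `/cancel` only affects runs registered under the same `session_id`.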
diff --git a/bridge/agent_event_handler.py b/bridge/agent_event_handler.py
index 50826235..35173730 100644
--- a/bridge/agent_event_handler.py
+++ b/bridge/agent_event_handler.py
@@ -2,44 +2,40 @@
 Agent Event Handler - Handles agent events and thinking process output
 """
 
+from common import const
 from common.log import logger
 
+# Cap intermediate thinking messages on weixin to stay within send quota.
+WEIXIN_THINKING_INSTANT_MAX = 7
+
 
 class AgentEventHandler:
     """
     Handles agent events and optionally sends intermediate messages to channel
     """
-    
+
     def __init__(self, context=None, original_callback=None):
-        """
-        Initialize event handler
-        
-        Args:
-            context: COW context (for accessing channel)
-            original_callback: Original event callback to chain
-        """
         self.context = context
         self.original_callback = original_callback
-        
-        # Get channel for sending intermediate messages
+
         self.channel = None
         if context:
             self.channel = context.kwargs.get("channel") if hasattr(context, "kwargs") else None
-        
+
         self.current_content = ""
         self.turn_number = 0
-    
+
+        channel_type = ""
+        if context and hasattr(context, "kwargs"):
+            channel_type = context.kwargs.get("channel_type", "") or ""
+        self._is_weixin = channel_type == const.WEIXIN
+        self._thinking_sent_count = 0
+        self._merged_buf: list[str] = []
+
     def handle_event(self, event):
-        """
-        Main event handler
-        
-        Args:
-            event: Event dict with type and data
-        """
         event_type = event.get("type")
         data = event.get("data", {})
-        
-        # Dispatch to specific handlers
+
         if event_type == "turn_start":
             self._handle_turn_start(data)
         elif event_type == "message_update":
@@ -52,25 +48,23 @@ class AgentEventHandler:
             self._handle_tool_execution_start(data)
         elif event_type == "tool_execution_end":
             self._handle_tool_execution_end(data)
-        
-        # Call original callback if provided
+        elif event_type == "agent_end":
+            self._handle_agent_end(data)
+
         if self.original_callback:
             self.original_callback(event)
-    
+
     def _handle_turn_start(self, data):
-        """Handle turn start event"""
         self.turn_number = data.get("turn", 0)
         self.current_content = ""
-    
+
     def _handle_message_update(self, data):
-        """Handle message update event (streaming content text)"""
         delta = data.get("delta", "")
         self.current_content += delta
-    
+
     def _handle_message_end(self, data):
-        """Handle message end event"""
         tool_calls = data.get("tool_calls", [])
-        
+
         if tool_calls:
             if self.current_content.strip():
                 logger.info(f"💭 {self.current_content.strip()[:200]}{'...' if len(self.current_content) > 200 else ''}")
@@ -78,35 +72,54 @@ class AgentEventHandler:
         else:
             if self.current_content.strip():
                 logger.debug(f"💬 {self.current_content.strip()[:200]}{'...' if len(self.current_content) > 200 else ''}")
-        
+            # Drain weixin buffer before final reply leaves chat_channel
+            self._flush_merged_now()
+
         self.current_content = ""
-    
+
+    def _handle_agent_end(self, data):
+        self._flush_merged_now()
+
     def _handle_tool_execution_start(self, data):
-        """Handle tool execution start event - logged by agent_stream.py"""
         pass
-    
+
     def _handle_tool_execution_end(self, data):
-        """Handle tool execution end event - logged by agent_stream.py"""
         pass
-    
+
     def _send_to_channel(self, message):
-        """
-        Try to send intermediate message to channel.
-        Skipped in SSE mode because thinking text is already streamed via on_event.
-        """
         if self.context and self.context.get("on_event"):
             return
+        if not self.channel:
+            return
+
+        if not self._is_weixin:
+            self._do_send(message)
+            return
+
+        if self._thinking_sent_count < WEIXIN_THINKING_INSTANT_MAX:
+            self._do_send(message)
+            self._thinking_sent_count += 1
+            return
+
+        self._merged_buf.append(message)
+
+    def _flush_merged_now(self):
+        if not self._merged_buf:
+            return
+        merged = "\n\n".join(self._merged_buf)
+        count = len(self._merged_buf)
+        self._merged_buf = []
+        logger.debug(f"[AgentEventHandler] Flushing {count} merged thinking msgs, len={len(merged)}")
+        self._do_send(merged)
+        self._thinking_sent_count += 1
+
+    def _do_send(self, message):
+        try:
+            from bridge.reply import Reply, ReplyType
+            reply = Reply(ReplyType.TEXT, message)
+            self.channel._send(reply, self.context)
+        except Exception as e:
+            logger.debug(f"[AgentEventHandler] Failed to send to channel: {e}")
 
-        if self.channel:
-            try:
-                from bridge.reply import Reply, ReplyType
-                reply = Reply(ReplyType.TEXT, message)
-                self.channel._send(reply, self.context)
-            except Exception as e:
-                logger.debug(f"[AgentEventHandler] Failed to send to channel: {e}")
-    
     def log_summary(self):
-        """Log execution summary - simplified"""
-        # Summary removed as per user request
-        # Real-time logging during execution is sufficient
         pass
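Review note: the weixin throttling added to `AgentEventHandler` can be read in isolation as "send the first N messages immediately, buffer the rest, then flush the buffer as one merged message". The sketch below restates that logic outside the handler, assuming a plain callable as the send function. `ThrottledSender`, `push`, and `flush` are illustrative names only and do not exist in the patch.

```python
from typing import Callable, List


class ThrottledSender:
    """Send up to `instant_max` messages directly, then merge the overflow."""

    def __init__(self, send: Callable[[str], None], instant_max: int = 7) -> None:
        self._send = send
        self._instant_max = instant_max
        self._sent = 0
        self._buf: List[str] = []

    def push(self, message: str) -> None:
        # Under the cap: deliver right away and count it against the quota.
        if self._sent < self._instant_max:
            self._send(message)
            self._sent += 1
            return
        # Over the cap: hold the message until flush() is called.
        self._buf.append(message)

    def flush(self) -> None:
        # Deliver everything buffered as a single message; no-op when empty.
        if not self._buf:
            return
        merged = "\n\n".join(self._buf)
        self._buf = []
        self._send(merged)
        self._sent += 1
```

In the patch the flush happens on the final `message_end` (no tool calls) and again on `agent_end`; the second call is a no-op when the buffer is already empty.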
diff --git a/bridge/agent_initializer.py b/bridge/agent_initializer.py
index d17dcb0c..7d5afb4a 100644
--- a/bridge/agent_initializer.py
+++ b/bridge/agent_initializer.py
@@ -521,7 +521,7 @@ class AgentInitializer:
                 if tool_name == "web_search":
                     from agent.tools.web_search.web_search import WebSearch
                     if not WebSearch.is_available():
-                        logger.debug("[AgentInitializer] WebSearch skipped - no BOCHA_API_KEY or LINKAI_API_KEY")
+                        logger.debug("[AgentInitializer] WebSearch skipped - no search provider configured")
                         continue
 
                 # Special handling for EnvConfig tool
diff --git a/bridge/bridge.py b/bridge/bridge.py
index 753e394a..6eeb0887 100644
--- a/bridge/bridge.py
+++ b/bridge/bridge.py
@@ -14,7 +14,9 @@ class Bridge(object):
     def __init__(self):
         self.btype = {
             "chat": const.OPENAI,
-            "voice_to_text": conf().get("voice_to_text", "openai"),
+            # Empty `voice_to_text` (the default in new configs) triggers
+            # the auto-pick below — see _auto_pick_voice_to_text for order.
+            "voice_to_text": conf().get("voice_to_text") or self._auto_pick_voice_to_text(),
             "text_to_voice": conf().get("text_to_voice", "google"),
             "translate": conf().get("translate", "baidu"),
         }
@@ -61,6 +63,10 @@ class Bridge(object):
             if model_type and model_type.startswith("deepseek"):
                 self.btype["chat"] = const.DEEPSEEK
 
+            # Xiaomi MiMo model family; every model name starts with "mimo-"
+            if model_type and model_type.startswith("mimo-"):
+                self.btype["chat"] = const.MIMO
+
             if model_type and isinstance(model_type, str):
                 lowered_model_type = model_type.lower()
                 if lowered_model_type == const.QIANFAN or lowered_model_type.startswith("ernie"):
@@ -84,6 +90,46 @@ class Bridge(object):
         self.chat_bots = {}
         self._agent_bridge = None
 
+    def refresh_voice(self):
+        """Re-read voice_to_text / text_to_voice from config and drop the
+        cached voice bots so the next call picks up the new provider.
+        Used by the web console after the user edits voice settings.
+        Does NOT touch the agent_bridge / agent state.
+        """
+        new_v2t = conf().get("voice_to_text") or self._auto_pick_voice_to_text()
+        new_t2v = conf().get("text_to_voice", "google")
+        if conf().get("use_linkai") and conf().get("linkai_api_key"):
+            if not conf().get("voice_to_text") or conf().get("voice_to_text") in ["openai"]:
+                new_v2t = const.LINKAI
+            if not conf().get("text_to_voice") or conf().get("text_to_voice") in ["openai", const.TTS_1, const.TTS_1_HD]:
+                new_t2v = const.LINKAI
+        self.btype["voice_to_text"] = new_v2t
+        self.btype["text_to_voice"] = new_t2v
+        self.bots.pop("voice_to_text", None)
+        self.bots.pop("text_to_voice", None)
+        logger.info(f"[Bridge] voice refreshed: voice_to_text={new_v2t}, text_to_voice={new_t2v}")
+
+    @staticmethod
+    def _auto_pick_voice_to_text() -> str:
+        """Pick an ASR provider by configured api keys when voice_to_text is
+        unset. Order matches the web console: openai → dashscope → zhipu →
+        linkai. Falls back to 'openai' when nothing is configured so the
+        original "missing key" error is preserved.
+        """
+        def has(k: str) -> bool:
+            v = (conf().get(k) or "").strip()
+            return v != "" and v not in ("YOUR API KEY", "YOUR_API_KEY")
+
+        for key, provider in (
+            ("open_ai_api_key", "openai"),
+            ("dashscope_api_key", "dashscope"),
+            ("zhipu_ai_api_key", "zhipu"),
+            ("linkai_api_key", "linkai"),
+        ):
+            if has(key):
+                return provider
+        return "openai"
+
     # 模型对应的接口
     def get_bot(self, typename):
         if self.bots.get(typename) is None:
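Review note: the provider selection in `_auto_pick_voice_to_text` is easier to check against a plain mapping than against the global `conf()`. The sketch below mirrors the order and placeholder handling from the hunk above, assuming the config is passed in as a dict. The function name `pick_voice_to_text` and the module-level constants are hypothetical helpers for illustration.

```python
from typing import Mapping, Optional

# Placeholder values that should be treated as "no key configured".
PLACEHOLDERS = ("YOUR API KEY", "YOUR_API_KEY")

# Same order as the patch: openai, dashscope, zhipu, linkai.
PROVIDER_ORDER = (
    ("open_ai_api_key", "openai"),
    ("dashscope_api_key", "dashscope"),
    ("zhipu_ai_api_key", "zhipu"),
    ("linkai_api_key", "linkai"),
)


def pick_voice_to_text(config: Mapping[str, Optional[str]]) -> str:
    # An explicit, non-empty voice_to_text setting always wins.
    explicit = config.get("voice_to_text")
    if explicit:
        return explicit
    # Otherwise pick the first provider whose API key looks real.
    for key, provider in PROVIDER_ORDER:
        value = (config.get(key) or "").strip()
        if value and value not in PLACEHOLDERS:
            return provider
    # Nothing configured: keep "openai" so the usual missing-key error surfaces.
    return "openai"
```

For example, `pick_voice_to_text({"dashscope_api_key": "sk-x"})` selects `"dashscope"`, while an empty mapping falls back to `"openai"`.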
diff --git a/channel/channel_factory.py b/channel/channel_factory.py
index 10000226..2645945e 100644
--- a/channel/channel_factory.py
+++ b/channel/channel_factory.py
@@ -42,6 +42,12 @@ def create_channel(channel_type) -> Channel:
     elif channel_type == const.QQ:
         from channel.qq.qq_channel import QQChannel
         ch = QQChannel()
+    elif channel_type == const.TELEGRAM:
+        from channel.telegram.telegram_channel import TelegramChannel
+        ch = TelegramChannel()
+    elif channel_type == const.SLACK:
+        from channel.slack.slack_channel import SlackChannel
+        ch = SlackChannel()
     elif channel_type in (const.WEIXIN, "wx"):
         from channel.weixin.weixin_channel import WeixinChannel
         ch = WeixinChannel()
diff --git a/channel/chat_channel.py b/channel/chat_channel.py
index 3251c286..6a9a1952 100644
--- a/channel/chat_channel.py
+++ b/channel/chat_channel.py
@@ -171,7 +171,13 @@ class ChatChannel(Channel):
             if "desire_rtype" not in context and conf().get("always_reply_voice") and ReplyType.VOICE not in self.NOT_SUPPORT_REPLYTYPE:
                 context["desire_rtype"] = ReplyType.VOICE
         elif context.type == ContextType.VOICE:
-            if "desire_rtype" not in context and conf().get("voice_reply_voice") and ReplyType.VOICE not in self.NOT_SUPPORT_REPLYTYPE:
+            # Voice input replies with voice when either voice_reply_voice
+            # (mirror voice) or the global always_reply_voice toggle is on.
+            if (
+                "desire_rtype" not in context
+                and (conf().get("voice_reply_voice") or conf().get("always_reply_voice"))
+                and ReplyType.VOICE not in self.NOT_SUPPORT_REPLYTYPE
+            ):
                 context["desire_rtype"] = ReplyType.VOICE
         return context
 
@@ -264,6 +270,8 @@ class ChatChannel(Channel):
                 if reply.type == ReplyType.TEXT:
                     reply_text = reply.content
                     if desire_rtype == ReplyType.VOICE and ReplyType.VOICE not in self.NOT_SUPPORT_REPLYTYPE:
+                        # Preserve original text for the "text-then-voice" pattern in _send_reply.
+                        context["voice_reply_text"] = reply.content
                         reply = super().build_text_to_voice(reply.content)
                         return self._decorate_reply(context, reply)
                     if context.get("isgroup", False):
@@ -311,6 +319,15 @@ class ChatChannel(Channel):
                     # 短暂延迟后发送图片
                     time.sleep(0.3)
                     self._send(reply, context)
+                # Send text bubble before voice, unless channel already streamed
+                # the text (feishu) or natively renders STT under the voice (wechatcom).
+                elif reply.type == ReplyType.VOICE and context.get("voice_reply_text") \
+                        and not context.get("feishu_streamed") \
+                        and context.get("channel_type") not in ("wechatcom_app",):
+                    text_reply = Reply(ReplyType.TEXT, context.get("voice_reply_text"))
+                    self._send(text_reply, context)
+                    time.sleep(0.3)
+                    self._send(reply, context)
                 else:
                     self._send(reply, context)
     
@@ -421,8 +438,21 @@ class ChatChannel(Channel):
 
         return func
 
+    # Chat commands that must bypass the per-session serial queue,
+    # otherwise /cancel would queue behind the task it tries to cancel.
+    # Use /cancel (not /stop) to avoid colliding with `cow stop` CLI.
+    _BYPASS_QUEUE_COMMANDS = ("/cancel",)
+
     def produce(self, context: Context):
         session_id = context["session_id"]
+
+        # Fast path: /cancel must not enter the queue.
+        if context.type == ContextType.TEXT and context.content:
+            stripped = context.content.strip().lower()
+            if stripped in self._BYPASS_QUEUE_COMMANDS:
+                self._handle_cancel_command(context, session_id)
+                return
+
         with self.lock:
             if session_id not in self.sessions:
                 self.sessions[session_id] = [
@@ -434,6 +464,29 @@ class ChatChannel(Channel):
             else:
                 self.sessions[session_id][0].put(context)
 
+    def _handle_cancel_command(self, context: Context, session_id: str) -> None:
+        """Cancel any in-flight agent run for *session_id* and reply inline.
+
+        Runs synchronously on the caller's thread. Reply is sent through
+        _send_reply so plugins (e.g. logging) still observe it.
+        """
+        try:
+            from agent.protocol import get_cancel_registry
+            from bridge.reply import Reply, ReplyType
+
+            cancelled = get_cancel_registry().cancel_session(session_id)
+            text = (
+                "🛑 已中止"
+                if cancelled > 0
+                else "当前没有可中止的任务。"
+            )
+            logger.info(
+                f"[chat_channel] /cancel fast-path: session={session_id}, cancelled={cancelled}"
+            )
+            self._send_reply(context, Reply(ReplyType.TEXT, text))
+        except Exception as e:
+            logger.warning(f"[chat_channel] /cancel fast-path failed: {e}")
+
     # 消费者函数，单独线程，用于从消息队列中取出消息并处理
     def consume(self):
         while True:
diff --git a/channel/dingtalk/dingtalk_channel.py b/channel/dingtalk/dingtalk_channel.py
index d572e35d..b1ae86c2 100644
--- a/channel/dingtalk/dingtalk_channel.py
+++ b/channel/dingtalk/dingtalk_channel.py
@@ -86,6 +86,8 @@ def _check(func):
 
 @singleton
 class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
+    NOT_SUPPORT_REPLYTYPE = []
+
     dingtalk_client_id = conf().get('dingtalk_client_id')
     dingtalk_client_secret = conf().get('dingtalk_client_secret')
 
@@ -870,6 +872,48 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
                     self.reply_text("抱歉，文件上传失败", incoming_message)
             return
         
+        # Native sampleAudio. Upload only accepts ogg/amr, so convert TTS mp3/wav to amr.
+        elif reply.type == ReplyType.VOICE:
+            logger.info(f"[DingTalk] Sending voice: {reply.content}")
+            access_token = self.get_access_token()
+            if not access_token:
+                logger.error("[DingTalk] Cannot get access token for voice")
+                self.reply_text("抱歉，语音发送失败（无法获取token）", incoming_message)
+                return
+
+            voice_path = reply.content
+            if voice_path.startswith("file://"):
+                voice_path = voice_path[7:]
+
+            amr_path = voice_path
+            duration_ms = 0
+            if not voice_path.lower().endswith((".amr", ".ogg")):
+                try:
+                    from voice.audio_convert import any_to_amr
+                    amr_path = os.path.splitext(voice_path)[0] + ".amr"
+                    duration_ms = int(any_to_amr(voice_path, amr_path) or 0)
+                except Exception as e:
+                    logger.error(f"[DingTalk] Failed to convert voice to amr: {e}")
+                    self.reply_text("抱歉，语音转码失败", incoming_message)
+                    return
+
+            media_id = self.upload_media(amr_path, media_type="voice")
+            if not media_id:
+                logger.error("[DingTalk] Failed to upload voice media")
+                self.reply_text("抱歉，语音上传失败", incoming_message)
+                return
+
+            msg_param = {
+                "mediaId": media_id,
+                "duration": str(duration_ms or 1000),
+            }
+            success = self._send_file_message(
+                access_token, incoming_message, "sampleAudio", msg_param, isgroup
+            )
+            if not success:
+                self.reply_text("抱歉，语音发送失败", incoming_message)
+            return
+
         # 处理文本消息
         elif reply.type == ReplyType.TEXT:
             logger.info(f"[DingTalk] Sending text message, length={len(reply.content)}")
diff --git a/channel/feishu/feishu_channel.py b/channel/feishu/feishu_channel.py
index f479394a..9a9f3307 100644
--- a/channel/feishu/feishu_channel.py
+++ b/channel/feishu/feishu_channel.py
@@ -752,6 +752,9 @@ class FeiShuChanel(ChatChannel):
         init_in_flight = [False]
         # 一旦初始化失败就长期标记为 disabled，本次回复不再尝试任何流式调用
         disabled = [False]
+        # True after agent_cancelled: agent_end stops rewriting the card
+        # with stale final_response and just finalizes current content.
+        cancelled = [False]
         lock = threading.Lock()
 
         # ---- 异步推送队列 ----------------------------------------------------
@@ -1076,18 +1079,42 @@ class FeiShuChanel(ChatChannel):
                     message_id[0] = None
                     sequence[0] = 0
 
+            elif event_type == "agent_cancelled":
+                # Lock channel into "no-rewrite" mode: the subsequent
+                # agent_end's final_response is from the last *completed*
+                # turn (the user already saw it), so rewriting the card
+                # would duplicate it visually.
+                with lock:
+                    cancelled[0] = True
+
             elif event_type == "agent_end":
                 # 最终回复：用 final_response 覆盖当前流式卡片，然后关闭流式模式。
                 final_response = data.get("final_response", "")
-                if not final_response:
-                    return
-                final_text = str(final_response)
                 # 标记 streamed 让 chat_channel 跳过 send()
                 context["feishu_streamed"] = True
 
                 with lock:
+                    was_cancelled = cancelled[0]
                     has_card = card_id[0] is not None
                     init_busy = init_in_flight[0]
+                    pending_text = current_text[0]
+
+                if was_cancelled:
+                    # Cancelled path: finalize the in-flight card with
+                    # partial output (or a short marker if empty); drop
+                    # stale final_response to avoid duplicating last turn.
+                    if has_card:
+                        _drain_push_queue()
+                        partial = (pending_text or "").rstrip()
+                        final_text = partial or "_(已中止)_"
+                        _stream_update_text(final_text)
+                        _close_streaming_mode(final_text)
+                    push_queue.put(None)
+                    return
+
+                if not final_response:
+                    return
+                final_text = str(final_response)
 
                 # 罕见情况：agent_end 触发时还没创建过卡片（极快返回 / 没有
                 # message_update），主动创建一张承载 final_text。
@@ -1515,10 +1542,16 @@ class FeiShuChanel(ChatChannel):
             else:
                 context.type = ContextType.TEXT
             context.content = content.strip()
+            # Text input opts into voice replies only when the always-on toggle is set.
+            if "desire_rtype" not in context and conf().get("always_reply_voice"):
+                context["desire_rtype"] = ReplyType.VOICE
 
         elif context.type == ContextType.VOICE:
-            # 2.语音请求
-            if "desire_rtype" not in context and conf().get("voice_reply_voice"):
+            # 2. Voice request: voice input replies with voice if either
+            # voice_reply_voice (mirror reply) or always_reply_voice is on.
+            if "desire_rtype" not in context and (
+                conf().get("voice_reply_voice") or conf().get("always_reply_voice")
+            ):
                 context["desire_rtype"] = ReplyType.VOICE
 
         return context
diff --git a/channel/slack/__init__.py b/channel/slack/__init__.py
new file mode 100644
index 00000000..8b137891
--- /dev/null
+++ b/channel/slack/__init__.py
@@ -0,0 +1 @@
+
diff --git a/channel/slack/slack_channel.py b/channel/slack/slack_channel.py
new file mode 100644
index 00000000..8e82fcc5
--- /dev/null
+++ b/channel/slack/slack_channel.py
@@ -0,0 +1,506 @@
+"""
+Slack channel via Bolt for Python (Socket Mode).
+
+Features:
+- Direct message & channel chat (text / image / file)
+- Channel trigger: @mention or reply in a thread the bot is in (configurable)
+- /cancel fast-path matches Web channel behaviour
+- Socket Mode: no public IP / callback URL required, works behind NAT
+
+Implementation note:
+    slack_bolt's SocketModeHandler is blocking and runs its own background
+    threads. We start it in a dedicated thread so the rest of cow (sync) stays
+    untouched. Inbound events are dispatched onto cow's existing sync
+    ChatChannel.produce() pipeline; outbound send() calls the Slack Web API
+    client directly (it is sync-safe).
+"""
+
+import os
+import re
+import threading
+
+import requests
+
+from bridge.context import Context, ContextType
+from bridge.reply import Reply, ReplyType
+from channel.chat_channel import ChatChannel, check_prefix
+from channel.slack.slack_message import SlackMessage
+from common.expired_dict import ExpiredDict
+from common.log import logger
+from common.singleton import singleton
+from config import conf
+
+
+@singleton
+class SlackChannel(ChatChannel):
+    NOT_SUPPORT_REPLYTYPE = []
+
+    def __init__(self):
+        super().__init__()
+        self.bot_token = ""
+        self.app_token = ""
+        self.bot_user_id = ""  # used to strip @mention and ignore self messages
+        self._app = None
+        self._handler = None
+        self._client = None
+        self._loop_thread = None
+        # Idempotent dedup; Slack retries event delivery on slow ack
+        self._received_msgs = ExpiredDict(60 * 60 * 1)
+
+        # Disable group whitelist / prefix checks (we handle triggering ourselves
+        # in _should_reply_in_channel), aligned with telegram / feishu channels.
+        conf()["group_name_white_list"] = ["ALL_GROUP"]
+        conf()["single_chat_prefix"] = [""]
+
+    # ------------------------------------------------------------------
+    # Lifecycle
+    # ------------------------------------------------------------------
+
+    def startup(self):
+        self.bot_token = conf().get("slack_bot_token", "")
+        self.app_token = conf().get("slack_app_token", "")
+        if not self.bot_token or not self.app_token:
+            err = "[Slack] slack_bot_token and slack_app_token are both required"
+            logger.error(err)
+            self.report_startup_error(err)
+            return
+
+        # Guard against the common mistake of swapping the two tokens:
+        # bot token must start with xoxb-, app-level token with xapp-.
+        if not self.bot_token.startswith("xoxb-") or not self.app_token.startswith("xapp-"):
+            err = (
+                "[Slack] token type mismatch: slack_bot_token must start with 'xoxb-' "
+                "and slack_app_token must start with 'xapp-' (they look swapped)"
+            )
+            logger.error(err)
+            self.report_startup_error(err)
+            return
+
+        try:
+            from slack_bolt import App
+            from slack_bolt.adapter.socket_mode import SocketModeHandler
+        except ImportError:
+            err = (
+                "[Slack] slack_bolt is not installed. "
+                "Run: pip install slack_bolt"
+            )
+            logger.error(err)
+            self.report_startup_error(err)
+            return
+
+        try:
+            self._app = App(token=self.bot_token)
+            self._client = self._app.client
+
+            # Resolve our own bot user id (needed for @mention strip / self-ignore)
+            auth = self._client.auth_test()
+            self.bot_user_id = auth.get("user_id", "")
+            self.name = self.bot_user_id  # ChatChannel uses self.name to strip @-mention
+            logger.info(f"[Slack] Bot logged in as user_id={self.bot_user_id}, team={auth.get('team')}")
+        except Exception as e:
+            err = f"[Slack] auth_test failed: {e}"
+            logger.error(err)
+            self.report_startup_error(err)
+            return
+
+        self._register_handlers()
+
+        self._handler = SocketModeHandler(self._app, self.app_token)
+
+        def _run():
+            try:
+                logger.info("[Slack] Starting Socket Mode connection...")
+                self.report_startup_success()
+                logger.info("[Slack] ✅ Slack bot ready, listening for events")
+                self._handler.start()
+            except Exception as e:
+                logger.error(f"[Slack] socket mode crashed: {e}", exc_info=True)
+                self.report_startup_error(str(e))
+            finally:
+                logger.info("[Slack] socket mode exited")
+
+        self._loop_thread = threading.Thread(target=_run, daemon=True, name="slack-socket")
+        self._loop_thread.start()
+        # Block startup() until the handler thread exits, matching other channels'
+        # behaviour (startup is a blocking call).
+        self._loop_thread.join()
+
+    def _register_handlers(self):
+        app = self._app
+
+        # app_mention: bot is @-mentioned in a channel
+        @app.event("app_mention")
+        def _on_app_mention(event, ack):
+            ack()
+            self._handle_event(event, is_group=True)
+
+        # message: DMs and channel messages (including thread replies)
+        @app.event("message")
+        def _on_message(event, ack):
+            ack()
+            self._handle_message_event(event)
+
+    def stop(self):
+        logger.info("[Slack] stop() called")
+        try:
+            if self._handler is not None:
+                self._handler.close()
+        except Exception as e:
+            logger.warning(f"[Slack] handler close error: {e}")
+        if self._loop_thread and self._loop_thread.is_alive():
+            try:
+                self._loop_thread.join(timeout=10)
+            except Exception:
+                pass
+        logger.info("[Slack] stop() completed")
+
+    # ------------------------------------------------------------------
+    # Inbound: slack event -> ChatMessage -> ChatChannel.produce
+    # ------------------------------------------------------------------
+
+    def _handle_message_event(self, event: dict):
+        """Route a raw `message` event: skip bot/system noise, decide grouping."""
+        try:
+            logger.debug(
+                f"[Slack] message event: channel_type={event.get('channel_type')}, "
+                f"subtype={event.get('subtype')}, user={event.get('user')}, "
+                f"ts={event.get('ts')}, thread_ts={event.get('thread_ts')}"
+            )
+            # Ignore bot messages (including our own) and message edits/deletes
+            if event.get("bot_id") or event.get("subtype") in ("bot_message", "message_changed", "message_deleted"):
+                return
+            if event.get("user") == self.bot_user_id:
+                return
+
+            channel_type = event.get("channel_type", "")
+            # DM (im) is single chat; channel/group is group chat. app_mention
+            # already covers channel @-mentions, so for plain channel messages we
+            # only react when configured / thread-following.
+            is_group = channel_type in ("channel", "group", "mpim")
+            if is_group:
+                # app_mention handler covers explicit @bot; here we only handle
+                # plain messages allowed by slack_group_trigger (see _should_reply_in_channel).
+                if not self._should_reply_in_channel(event):
+                    return
+            self._handle_event(event, is_group=is_group)
+        except Exception as e:
+            logger.error(f"[Slack] _handle_message_event error: {e}", exc_info=True)
+
+    def _handle_event(self, event: dict, is_group: bool):
+        """Parse event -> build SlackMessage -> produce()."""
+        try:
+            channel_id = event.get("channel", "")
+            ts = event.get("ts", "")
+            if not channel_id:
+                return
+
+            # Idempotent dedup
+            msg_uid = f"{channel_id}:{ts}"
+            if self._received_msgs.get(msg_uid):
+                return
+            self._received_msgs[msg_uid] = True
+
+            # Parse type + download media if needed.
+            ctype, content, caption = self._parse_event(event)
+            if ctype is None:
+                logger.debug(f"[Slack] unsupported message type, skip. event={event}")
+                return
+
+            # Strip <@bot_user_id> mention from channel text
+            if is_group and self.bot_user_id:
+                if ctype == ContextType.TEXT and content:
+                    content = self._strip_at_mention(content)
+                if caption:
+                    caption = self._strip_at_mention(caption)
+
+            slack_msg = SlackMessage(
+                event,
+                is_group=is_group,
+                bot_user_id=self.bot_user_id,
+                ctype=ctype,
+                content=content,
+            )
+            slack_msg.is_at = is_group  # channel messages that passed the trigger gate are treated as addressed to the bot
+
+            from channel.file_cache import get_file_cache
+            file_cache = get_file_cache()
+            session_id = self._compute_session_id(event, is_group)
+
+            # Media + caption together: treat as a complete query and bypass the cache
+            if ctype in (ContextType.IMAGE, ContextType.FILE) and caption:
+                tag = "image" if ctype == ContextType.IMAGE else "file"
+                merged_text = f"{caption}\n[{tag}: {content}]"
+                slack_msg.ctype = ContextType.TEXT
+                slack_msg.content = merged_text
+                ctype = ContextType.TEXT
+                logger.info(f"[Slack] Media+caption merged for session {session_id}")
+                # fallthrough to the TEXT branch below
+
+            elif ctype == ContextType.IMAGE:
+                file_cache.add(session_id, content, file_type="image")
+                logger.info(f"[Slack] Image cached for session {session_id}, waiting for query...")
+                return
+            elif ctype == ContextType.FILE:
+                file_cache.add(session_id, content, file_type="file")
+                logger.info(f"[Slack] File cached for session {session_id}: {content}")
+                return
+
+            if ctype == ContextType.TEXT:
+                # Fast-path: /cancel mirrors Web channel behaviour
+                if (content or "").strip().lower() in ("/cancel", "cancel"):
+                    self._do_cancel(session_id, channel_id, event)
+                    return
+
+                cached_files = file_cache.get(session_id)
+                if cached_files:
+                    refs = []
+                    for fi in cached_files:
+                        ftype = fi["type"]
+                        tag = ftype if ftype in ("image", "video") else "file"
+                        refs.append(f"[{tag}: {fi['path']}]")
+                    slack_msg.content = (slack_msg.content or "") + "\n" + "\n".join(refs)
+                    file_cache.clear(session_id)
+                    logger.info(f"[Slack] Attached {len(cached_files)} cached file(s) to query")
+
+            # Reply in the originating thread when present, else start one on this msg
+            thread_ts = event.get("thread_ts") or ts
+
+            context = self._compose_context(
+                slack_msg.ctype,
+                slack_msg.content,
+                isgroup=is_group,
+                msg=slack_msg,
+                # Replies go back into the thread, no manual @mention needed
+                no_need_at=True,
+            )
+            if context:
+                context["session_id"] = session_id
+                context["receiver"] = channel_id
+                context["slack_channel"] = channel_id
+                context["slack_thread_ts"] = thread_ts if is_group else None
+                self.produce(context)
+            logger.debug(f"[Slack] received: type={ctype}, content={str(slack_msg.content)[:80]}")
+        except Exception as e:
+            logger.error(f"[Slack] _handle_event error: {e}", exc_info=True)
+
+    def _do_cancel(self, session_id: str, channel_id: str, event: dict):
+        """Fast-path: /cancel calls cancel_session directly without going through agent."""
+        try:
+            from agent.protocol import get_cancel_registry
+            cancelled = get_cancel_registry().cancel_session(session_id)
+            text = "Current task cancelled." if cancelled else "No running task to cancel."
+            thread_ts = event.get("thread_ts") or event.get("ts")
+            self._client.chat_postMessage(channel=channel_id, text=text, thread_ts=thread_ts)
+            logger.info(f"[Slack] /cancel session={session_id}, cancelled={cancelled}")
+        except Exception as e:
+            logger.error(f"[Slack] /cancel error: {e}", exc_info=True)
+
+    def _parse_event(self, event: dict):
+        """Parse a slack event and return (ctype, content, caption).
+
+        - content is text for ContextType.TEXT, otherwise the local file path
+        - caption is the optional text accompanying a file; empty for plain text
+        """
+        text = (event.get("text") or "").strip()
+        files = event.get("files") or []
+
+        if files:
+            # Handle the first attachment; caption is the accompanying message text
+            f = files[0]
+            mimetype = (f.get("mimetype") or "").lower()
+            url = f.get("url_private_download") or f.get("url_private")
+            name = f.get("name") or f.get("id") or "file"
+            if not url:
+                return (None, None, "")
+            path = self._download_file(url, name)
+            if not path:
+                return (None, None, "")
+            if mimetype.startswith("image/"):
+                return (ContextType.IMAGE, path, text)
+            return (ContextType.FILE, path, text)
+
+        if text:
+            return (ContextType.TEXT, text, "")
+
+        return (None, None, "")
+
+    def _download_file(self, url: str, name: str):
+        """Download a Slack private file (requires bot token auth) to local tmp dir."""
+        try:
+            headers = {"Authorization": f"Bearer {self.bot_token}"}
+            resp = requests.get(url, headers=headers, timeout=60, stream=True)
+            resp.raise_for_status()
+            tmp_dir = SlackMessage.get_tmp_dir()
+            # Sanitize the name and keep it unique-ish via the url tail
+            safe_name = re.sub(r"[^\w.\-]", "_", name)
+            local_path = os.path.join(tmp_dir, safe_name)
+            with open(local_path, "wb") as fp:
+                for chunk in resp.iter_content(chunk_size=8192):
+                    if chunk:
+                        fp.write(chunk)
+            logger.debug(f"[Slack] downloaded {name} -> {local_path}")
+            return local_path
+        except Exception as e:
+            logger.error(f"[Slack] download_file failed ({name}): {e}")
+            return None
+
+    # ------------------------------------------------------------------
+    # Channel trigger logic
+    # ------------------------------------------------------------------
+
+    def _should_reply_in_channel(self, event: dict) -> bool:
+        """Decide whether to reply to a plain channel message (no @mention).
+
+        app_mention already handles explicit @bot, so here we only deal with
+        other messages. `all` replies to every message; `mention_only` never
+        replies here; `mention_or_reply` replies to any message inside a thread.
+        """
+        mode = conf().get("slack_group_trigger", "mention_or_reply")
+        if mode == "all":
+            return True
+        if mode == "mention_only":
+            return False
+        # mention_or_reply: follow up only within an existing thread
+        return bool(event.get("thread_ts"))
+
+    def _strip_at_mention(self, content: str) -> str:
+        """Strip <@BOT_USER_ID> from channel text."""
+        if not content or not self.bot_user_id:
+            return content
+        pattern = re.compile(r"<@" + re.escape(self.bot_user_id) + r">", re.IGNORECASE)
+        return pattern.sub("", content).strip()
+
+    @staticmethod
+    def _compute_session_id(event: dict, is_group: bool) -> str:
+        channel_id = event.get("channel", "")
+        user_id = event.get("user", "")
+        if is_group:
+            if conf().get("group_shared_session", True):
+                return f"slack_channel_{channel_id}"
+            return f"slack_channel_{channel_id}_{user_id}"
+        return f"slack_user_{user_id}"
+
+    # ------------------------------------------------------------------
+    # Override _compose_context: skip the parent's group whitelist/at checks
+    # (already handled via _should_reply_in_channel). Same idea as telegram.
+    # ------------------------------------------------------------------
+
+    def _compose_context(self, ctype: ContextType, content, **kwargs):
+        context = Context(ctype, content)
+        context.kwargs = kwargs
+        if "channel_type" not in context:
+            context["channel_type"] = self.channel_type
+        if "origin_ctype" not in context:
+            context["origin_ctype"] = ctype
+
+        cmsg = context["msg"]
+        if cmsg.is_group:
+            if conf().get("group_shared_session", True):
+                context["session_id"] = cmsg.other_user_id
+            else:
+                context["session_id"] = f"{cmsg.from_user_id}:{cmsg.other_user_id}"
+        else:
+            context["session_id"] = cmsg.from_user_id
+        context["receiver"] = cmsg.other_user_id
+
+        if ctype == ContextType.TEXT:
+            img_match_prefix = check_prefix(content, conf().get("image_create_prefix"))
+            if img_match_prefix:
+                content = content.replace(img_match_prefix, "", 1)
+                context.type = ContextType.IMAGE_CREATE
+            else:
+                context.type = ContextType.TEXT
+            context.content = (content or "").strip()
+            if "desire_rtype" not in context and conf().get("always_reply_voice"):
+                context["desire_rtype"] = ReplyType.VOICE
+        elif ctype == ContextType.VOICE:
+            if "desire_rtype" not in context and (
+                conf().get("voice_reply_voice") or conf().get("always_reply_voice")
+            ):
+                context["desire_rtype"] = ReplyType.VOICE
+
+        return context
+
+    # ------------------------------------------------------------------
+    # Outbound: ChatChannel.send -> Slack Web API
+    # ------------------------------------------------------------------
+
+    def send(self, reply: Reply, context: Context):
+        """Called from cow's sync main thread; Slack Web client is sync-safe."""
+        if self._client is None:
+            logger.warning("[Slack] client not ready, drop reply")
+            return
+
+        channel_id = context.get("slack_channel")
+        thread_ts = context.get("slack_thread_ts")
+        if not channel_id:
+            logger.warning("[Slack] no slack_channel in context, drop reply")
+            return
+
+        try:
+            self._do_send(reply, channel_id, thread_ts)
+            logger.info(f"[Slack] sent reply (type={reply.type}, channel={channel_id})")
+        except Exception as e:
+            logger.error(f"[Slack] send failed: {e}", exc_info=True)
+
+    def _do_send(self, reply: Reply, channel_id: str, thread_ts):
+        rtype = reply.type
+        content = reply.content
+
+        if rtype in (ReplyType.TEXT, ReplyType.INFO, ReplyType.ERROR):
+            text = str(content) if content is not None else ""
+            if not text:
+                return
+            # Slack caps a message around 40k chars; split conservatively
+            for chunk in _split_text(text, 3500):
+                self._client.chat_postMessage(channel=channel_id, text=chunk, thread_ts=thread_ts)
+
+        elif rtype == ReplyType.IMAGE:
+            # Already a local BytesIO; upload it directly
+            content.seek(0)
+            self._client.files_upload_v2(
+                channel=channel_id, file=content, filename="image.png", thread_ts=thread_ts,
+            )
+
+        elif rtype == ReplyType.IMAGE_URL:
+            url = str(content)
+            if url.startswith("file://"):
+                local = url[7:]
+                self._client.files_upload_v2(
+                    channel=channel_id, file=local, thread_ts=thread_ts,
+                )
+            else:
+                # Post the URL as text; Slack will unfurl it as an image preview
+                self._client.chat_postMessage(channel=channel_id, text=url, thread_ts=thread_ts)
+
+        elif rtype in (ReplyType.VOICE, ReplyType.FILE):
+            local = content[7:] if isinstance(content, str) and content.startswith("file://") else content
+            caption = getattr(reply, "text_content", None) or None
+            self._client.files_upload_v2(
+                channel=channel_id, file=local, initial_comment=caption, thread_ts=thread_ts,
+            )
+
+        else:
+            # Fallback: send as plain text
+            self._client.chat_postMessage(channel=channel_id, text=str(content), thread_ts=thread_ts)
+
+
+def _split_text(text: str, limit: int):
+    """Split long text preferring line breaks to keep markdown structure intact."""
+    if len(text) <= limit:
+        yield text
+        return
+    buf = []
+    size = 0
+    for line in text.splitlines(keepends=True):
+        if size + len(line) > limit and buf:
+            yield "".join(buf)
+            buf, size = [], 0
+        # Hard-split single lines that exceed the limit
+        while len(line) > limit:
+            yield line[:limit]
+            line = line[limit:]
+        buf.append(line)
+        size += len(line)
+    if buf:
+        yield "".join(buf)
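The chunking behaviour of `_split_text` above can be checked in isolation. The sketch below is a standalone copy of that function under the hypothetical name `split_text`; the sample string and the limit of 10 are made up purely to keep the chunks small, and are not values used by the channel (which passes 3500).

```python
def split_text(text: str, limit: int):
    """Standalone copy of _split_text: split on line breaks, hard-split long lines."""
    if len(text) <= limit:
        yield text
        return
    buf, size = [], 0
    for line in text.splitlines(keepends=True):
        # Flush the buffer before it would grow past the limit.
        if size + len(line) > limit and buf:
            yield "".join(buf)
            buf, size = [], 0
        # A single line longer than the limit is cut into limit-sized pieces.
        while len(line) > limit:
            yield line[:limit]
            line = line[limit:]
        buf.append(line)
        size += len(line)
    if buf:
        yield "".join(buf)


# Hypothetical input: two short lines, one 25-character line, one short line.
sample = "alpha\nbeta\n" + "x" * 25 + "\ngamma\n"
chunks = list(split_text(sample, 10))

for chunk in chunks:
    print(repr(chunk))
```

Two properties are worth noting: joining the chunks reproduces the input exactly, and no chunk exceeds the limit, so each piece can be posted as its own message without losing text.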
diff --git a/channel/slack/slack_message.py b/channel/slack/slack_message.py
new file mode 100644
index 00000000..39f215bd
--- /dev/null
+++ b/channel/slack/slack_message.py
@@ -0,0 +1,60 @@
+"""
+Slack message adapter.
+
+Convert a Slack event payload into cow's unified ChatMessage.
+File downloads are NOT performed here; the channel layer downloads files
+on demand because it needs the bot token for authenticated download URLs.
+"""
+import os
+
+from bridge.context import ContextType
+from channel.chat_message import ChatMessage
+from common.utils import expand_path
+from config import conf
+
+
+class SlackMessage(ChatMessage):
+    """Wrap a Slack event into the unified ChatMessage."""
+
+    def __init__(self, event: dict, is_group: bool = False, bot_user_id: str = "",
+                 ctype: ContextType = ContextType.TEXT, content: str = ""):
+        super().__init__(event)
+        # Basic fields
+        self.msg_id = event.get("client_msg_id") or event.get("ts") or ""
+        try:
+            self.create_time = int(float(event.get("ts", 0)))
+        except (TypeError, ValueError):
+            self.create_time = 0
+        self.ctype = ctype
+        self.content = content
+
+        # Sender / chat info
+        from_user_id = event.get("user", "unknown")
+        channel_id = event.get("channel", "")
+        self.from_user_id = from_user_id
+        self.from_user_nickname = from_user_id
+        self.to_user_id = bot_user_id or "slack_bot"
+        self.to_user_nickname = bot_user_id or "slack_bot"
+
+        self.is_group = is_group
+        if is_group:
+            # Channel chat: other_user_id = channel_id, actual_user_id = sender id
+            self.other_user_id = channel_id
+            self.other_user_nickname = channel_id
+            self.actual_user_id = from_user_id
+            self.actual_user_nickname = from_user_id
+        else:
+            # DM: use channel_id so replies go back to the same DM channel
+            self.other_user_id = channel_id or from_user_id
+            self.other_user_nickname = from_user_id
+
+        # Whether the bot was triggered by @-mention (set by channel layer)
+        self.is_at = False
+
+    @staticmethod
+    def get_tmp_dir() -> str:
+        """Local download directory, aligned with other channels (agent_workspace/tmp)."""
+        workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
+        tmp_dir = os.path.join(workspace_root, "tmp")
+        os.makedirs(tmp_dir, exist_ok=True)
+        return tmp_dir
diff --git a/channel/telegram/__init__.py b/channel/telegram/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/channel/telegram/telegram_channel.py b/channel/telegram/telegram_channel.py
new file mode 100644
index 00000000..9e40c59f
--- /dev/null
+++ b/channel/telegram/telegram_channel.py
@@ -0,0 +1,719 @@
+"""
+Telegram channel via Bot API (long polling mode).
+
+Features:
+- Single chat & group chat (text / photo / voice / video / document)
+- Group trigger: @mention or reply-to-bot (configurable)
+- /cancel fast-path matches Web channel behaviour
+- Auto-register bot commands menu on startup (mirrors Web slash menu)
+- Optional HTTP/SOCKS5 proxy support for restricted networks
+
+Implementation note:
+    python-telegram-bot is async-first. We run the bot inside a dedicated
+    thread with its own asyncio loop so the rest of cow (which is sync)
+    stays untouched. Inbound updates are dispatched onto cow's existing
+    sync ChatChannel.produce() pipeline; outbound send() schedules
+    coroutines back onto that loop via asyncio.run_coroutine_threadsafe.
+"""
+
+import asyncio
+import os
+import re
+import threading
+
+from bridge.context import Context, ContextType
+from bridge.reply import Reply, ReplyType
+from channel.chat_channel import ChatChannel, check_prefix
+from channel.telegram.telegram_message import TelegramMessage
+from common.expired_dict import ExpiredDict
+from common.log import logger
+from common.singleton import singleton
+from config import conf
+
+# Bot command menu, aligned with Web slash commands.
+# Top-level commands only; sub-commands are entered with a space (e.g. "/skill list").
+TELEGRAM_BOT_COMMANDS = [
+    ("help", "Show command help"),
+    ("status", "Show running status"),
+    ("context", "View/clear conversation context (sub: clear)"),
+    ("skill", "Manage skills (list/search/install/...)"),
+    ("memory", "Manage memory (sub: dream)"),
+    ("knowledge", "Manage knowledge base (list/on/off)"),
+    ("config", "Show current config"),
+    ("cancel", "Cancel running agent task"),
+    ("logs", "Show recent logs"),
+    ("version", "Show version"),
+]
+
+
+@singleton
+class TelegramChannel(ChatChannel):
+    NOT_SUPPORT_REPLYTYPE = []
+
+    def __init__(self):
+        super().__init__()
+        self.bot_token = ""
+        self.bot_username = ""  # used for @-mention matching
+        self._bot = None
+        self._application = None
+        self._loop = None
+        self._loop_thread = None
+        self._stop_event = threading.Event()
+        # Idempotent dedup; TG occasionally redelivers the same update on flaky networks
+        self._received_msgs = ExpiredDict(60 * 60 * 1)
+
+        # Disable group whitelist / prefix checks (we handle triggering ourselves
+        # in _should_reply_in_group), aligned with feishu / wecom_bot channels.
+        conf()["group_name_white_list"] = ["ALL_GROUP"]
+        conf()["single_chat_prefix"] = [""]
+
+    # ------------------------------------------------------------------
+    # Lifecycle
+    # ------------------------------------------------------------------
+
+    def startup(self):
+        self.bot_token = conf().get("telegram_token", "")
+        if not self.bot_token:
+            err = "[Telegram] telegram_token is required"
+            logger.error(err)
+            self.report_startup_error(err)
+            return
+
+        try:
+            from telegram.ext import (
+                Application,
+                MessageHandler,
+                CommandHandler,
+                filters,
+            )
+        except ImportError:
+            err = (
+                "[Telegram] python-telegram-bot is not installed. "
+                "Run: pip install python-telegram-bot"
+            )
+            logger.error(err)
+            self.report_startup_error(err)
+            return
+
+        # Run the asyncio event loop in a dedicated thread so the sync cow body
+        # is untouched.
+        self._loop = asyncio.new_event_loop()
+
+        def _run_loop():
+            asyncio.set_event_loop(self._loop)
+            try:
+                self._loop.run_until_complete(self._async_main(Application, MessageHandler, CommandHandler, filters))
+            except Exception as e:
+                logger.error(f"[Telegram] event loop crashed: {e}", exc_info=True)
+                self.report_startup_error(str(e))
+            finally:
+                try:
+                    self._loop.close()
+                except Exception:
+                    pass
+                logger.info("[Telegram] event loop exited")
+
+        self._loop_thread = threading.Thread(target=_run_loop, daemon=True, name="telegram-loop")
+        self._loop_thread.start()
+        # Block startup() until the loop thread exits, matching other channels'
+        # behaviour (startup is a blocking call).
+        self._loop_thread.join()
+
+    async def _async_main(self, Application, MessageHandler, CommandHandler, filters):
+        """Build Application, register handlers, and run polling."""
+        builder = Application.builder().token(self.bot_token)
+
+        # Proxy: prefer telegram_proxy config, fall back to HTTPS_PROXY env var
+        proxy_url = conf().get("telegram_proxy", "") or os.environ.get("HTTPS_PROXY", "")
+        if proxy_url:
+            try:
+                builder = builder.proxy(proxy_url).get_updates_proxy(proxy_url)
+                logger.info(f"[Telegram] using proxy: {proxy_url}")
+            except Exception as e:
+                logger.warning(f"[Telegram] proxy config failed, fallback to direct: {e}")
+
+        # Media uploads (photo/voice/video/document) over a proxy can be slow,
+        # bump read/write/connect/pool timeouts.
+        builder = (
+            builder
+            .read_timeout(60)
+            .write_timeout(120)
+            .connect_timeout(30)
+            .pool_timeout(30)
+        )
+
+        application = builder.build()
+        self._application = application
+        self._bot = application.bot
+
+        # Fetch our own username (needed for @-mention matching in groups)
+        try:
+            me = await self._bot.get_me()
+            self.bot_username = me.username or ""
+            self.name = self.bot_username  # ChatChannel uses self.name to strip @-mention
+            logger.info(f"[Telegram] Bot logged in as @{self.bot_username} (id={me.id})")
+        except Exception as e:
+            err = f"[Telegram] get_me failed: {e}"
+            logger.error(err)
+            self.report_startup_error(err)
+            return
+
+        # Register the command menu (failure is non-fatal)
+        if conf().get("telegram_register_commands", True):
+            try:
+                from telegram import BotCommand
+                cmds = [BotCommand(name, desc) for name, desc in TELEGRAM_BOT_COMMANDS]
+                await self._bot.set_my_commands(cmds)
+                logger.info(f"[Telegram] Registered {len(cmds)} bot commands")
+            except Exception as e:
+                logger.warning(f"[Telegram] set_my_commands failed: {e}")
+
+        # Handlers:
+        # 1) /cancel uses the fast-path
+        application.add_handler(CommandHandler("cancel", self._on_cancel))
+        # 2) Normal messages (text + media)
+        application.add_handler(MessageHandler(filters.ALL & ~filters.COMMAND, self._on_message))
+        # 3) Other slash commands are forwarded as plain text for the agent to handle
+        application.add_handler(MessageHandler(filters.COMMAND, self._on_command_passthrough))
+
+        # Start polling. drop_pending_updates avoids replaying backlog after restart.
+        # Transient "Server disconnected" / RemoteProtocolError during get_updates
+        # are common over proxies/flaky networks; PTB's network loop auto-retries,
+        # so we only need to keep the noise down (see _quiet_polling_network_errors).
+        self._quiet_polling_network_errors()
+        logger.info("[Telegram] Starting long polling...")
+        await application.initialize()
+        await application.start()
+        await application.updater.start_polling(
+            drop_pending_updates=True,
+            # Long-poll hold time on the server side; smaller value = reconnect more
+            # often but each hung connection fails faster.
+            timeout=30,
+            # Retry the bootstrap phase indefinitely on network errors instead of giving up.
+            bootstrap_retries=-1,
+        )
+        self.report_startup_success()
+        logger.info("[Telegram] ✅ Telegram bot ready, polling for updates")
+
+        # Block until stop()
+        try:
+            while not self._stop_event.is_set():
+                await asyncio.sleep(0.5)
+        finally:
+            try:
+                await application.updater.stop()
+                await application.stop()
+                await application.shutdown()
+            except Exception as e:
+                logger.warning(f"[Telegram] shutdown error: {e}")
+
+    @staticmethod
+    def _quiet_polling_network_errors():
+        """Downgrade PTB's noisy 'Exception happened while polling for updates' logs.
+
+        These transient get_updates errors (RemoteProtocolError / NetworkError /
+        TimedOut, typically over a proxy) are auto-retried by PTB's network loop,
+        so logging the full traceback at ERROR is just noise. We attach a filter
+        that drops these specific records while leaving real errors untouched.
+        """
+        import logging
+
+        class _PollingNoiseFilter(logging.Filter):
+            _NEEDLES = (
+                "Exception happened while polling for updates",
+                "Server disconnected without sending a response",
+            )
+
+            def filter(self, record: logging.LogRecord) -> bool:
+                try:
+                    msg = record.getMessage()
+                except Exception:
+                    return True
+                if any(n in msg for n in self._NEEDLES):
+                    # Keep a single-line breadcrumb at DEBUG, drop the traceback.
+                    logger.debug(f"[Telegram] transient polling network error (auto-retrying): {msg.splitlines()[0]}")
+                    return False
+                return True
+
+        noise_filter = _PollingNoiseFilter()
+        for name in ("telegram.ext.Updater", "telegram.ext._updater", "telegram.ext"):
+            logging.getLogger(name).addFilter(noise_filter)
+
+    def stop(self):
+        logger.info("[Telegram] stop() called")
+        self._stop_event.set()
+        if self._loop_thread and self._loop_thread.is_alive():
+            try:
+                self._loop_thread.join(timeout=10)
+            except Exception:
+                pass
+        logger.info("[Telegram] stop() completed")
+
+    # ------------------------------------------------------------------
+    # Inbound: telegram update -> ChatMessage -> ChatChannel.produce
+    # ------------------------------------------------------------------
+
+    async def _on_cancel(self, update, _context):
+        """Fast-path: /cancel calls cancel_session directly without going through agent."""
+        try:
+            from agent.protocol import get_cancel_registry
+            session_id = self._compute_session_id(update)
+            cancelled = get_cancel_registry().cancel_session(session_id)
+            text = "Current task cancelled." if cancelled else "No running task to cancel."
+            await update.effective_message.reply_text(text)
+            logger.info(f"[Telegram] /cancel session={session_id}, cancelled={cancelled}")
+        except Exception as e:
+            logger.error(f"[Telegram] /cancel error: {e}", exc_info=True)
+            try:
+                await update.effective_message.reply_text(f"⚠️ /cancel failed: {e}")
+            except Exception:
+                pass
+
+    async def _on_command_passthrough(self, update, _context):
+        """All non-/cancel commands fall through to plain message handling."""
+        await self._on_message(update, _context)
+
+    async def _on_message(self, update, _context):
+        """Telegram update entry: parse message -> build ChatMessage -> produce()."""
+        try:
+            message = update.effective_message
+            chat = update.effective_chat
+            if not message or not chat:
+                return
+
+            # Idempotent dedup
+            msg_uid = f"{chat.id}:{message.message_id}"
+            if self._received_msgs.get(msg_uid):
+                return
+            self._received_msgs[msg_uid] = True
+
+            is_group = chat.type in ("group", "supergroup")
+
+            # Debug log: helpful when group messages are silently dropped
+            if is_group:
+                logger.debug(
+                    f"[Telegram] group update received: chat_id={chat.id}, "
+                    f"text={(message.text or message.caption or '')[:40]!r}, "
+                    f"reply_to_bot={bool(message.reply_to_message and message.reply_to_message.from_user and message.reply_to_message.from_user.username == self.bot_username)}"
+                )
+
+            # Group trigger gate (silently drop if not triggered)
+            if is_group and not self._should_reply_in_group(update):
+                logger.debug(f"[Telegram] group message not triggered (need @{self.bot_username} or reply), skip")
+                return
+
+            # Parse message type + download media if needed.
+            # Media messages with caption return both the local path and the caption text.
+            ctype, content, caption = await self._parse_message(message)
+            if ctype is None:
+                logger.debug(f"[Telegram] unsupported message type, skip. msg={message}")
+                return
+
+            # Strip @bot mention for group text/caption
+            if is_group and self.bot_username:
+                if ctype == ContextType.TEXT and content:
+                    content = self._strip_at_mention(content)
+                if caption:
+                    caption = self._strip_at_mention(caption)
+
+            tg_msg = TelegramMessage(
+                update,
+                is_group=is_group,
+                bot_username=self.bot_username,
+                ctype=ctype,
+                content=content,
+            )
+            tg_msg.is_at = is_group  # Reaching here in a group means the trigger gate passed (mention, reply, or "all" mode)
+
+            # File cache: standalone media goes into cache, the next text query attaches them
+            from channel.file_cache import get_file_cache
+            file_cache = get_file_cache()
+            session_id = self._compute_session_id(update)
+
+            # Media + caption together: treat as a complete query and bypass the cache
+            if ctype in (ContextType.IMAGE, ContextType.FILE) and caption:
+                tag = "image" if ctype == ContextType.IMAGE else "file"
+                merged_text = f"{caption}\n[{tag}: {content}]"
+                tg_msg.ctype = ContextType.TEXT
+                tg_msg.content = merged_text
+                ctype = ContextType.TEXT
+                logger.info(f"[Telegram] Media+caption merged for session {session_id}")
+                # fallthrough to the TEXT branch below
+
+            elif ctype == ContextType.IMAGE:
+                file_cache.add(session_id, content, file_type="image")
+                logger.info(f"[Telegram] Image cached for session {session_id}, waiting for query...")
+                return
+            elif ctype == ContextType.FILE:
+                file_cache.add(session_id, content, file_type="file")
+                logger.info(f"[Telegram] File cached for session {session_id}: {content}")
+                return
+
+            if ctype == ContextType.TEXT:
+                cached_files = file_cache.get(session_id)
+                if cached_files:
+                    refs = []
+                    for fi in cached_files:
+                        ftype = fi["type"]
+                        tag = ftype if ftype in ("image", "video") else "file"
+                        refs.append(f"[{tag}: {fi['path']}]")
+                    tg_msg.content = (tg_msg.content or "") + "\n" + "\n".join(refs)
+                    file_cache.clear(session_id)
+                    logger.info(f"[Telegram] Attached {len(cached_files)} cached file(s) to query")
+
+            # Dispatch to cow main pipeline (reuses ChatChannel._compose_context routing)
+            context = self._compose_context(
+                tg_msg.ctype,
+                tg_msg.content,
+                isgroup=is_group,
+                msg=tg_msg,
+            )
+            if context:
+                context["session_id"] = session_id
+                context["receiver"] = str(chat.id)
+                context["telegram_chat_id"] = chat.id
+                context["telegram_reply_to_msg_id"] = message.message_id if is_group else None
+                self.produce(context)
+            logger.debug(f"[Telegram] received: type={ctype}, content={str(tg_msg.content)[:80]}")
+
+        except Exception as e:
+            logger.error(f"[Telegram] _on_message error: {e}", exc_info=True)
+
+    async def _parse_message(self, message):
+        """Parse a telegram message and return (ctype, content, caption).
+
+        - content is text for ContextType.TEXT, otherwise the local file path
+        - caption is the optional text accompanying a media message; empty for plain text
+        """
+        caption = (message.caption or "").strip()
+
+        if message.photo:
+            largest = message.photo[-1]
+            path = await self._download_file(largest.file_id, suffix=".jpg")
+            return (ContextType.IMAGE, path, caption) if path else (None, None, "")
+
+        if message.voice or message.audio:
+            audio_obj = message.voice or message.audio
+            suffix = ".ogg" if message.voice else (
+                "." + (audio_obj.mime_type.split("/")[-1] if getattr(audio_obj, "mime_type", "") else "mp3")
+            )
+            path = await self._download_file(audio_obj.file_id, suffix=suffix)
+            return (ContextType.VOICE, path, caption) if path else (None, None, "")
+
+        if message.video or message.video_note:
+            video_obj = message.video or message.video_note
+            path = await self._download_file(video_obj.file_id, suffix=".mp4")
+            return (ContextType.FILE, path, caption) if path else (None, None, "")
+
+        if message.document:
+            doc = message.document
+            ext = ""
+            if doc.file_name and "." in doc.file_name:
+                ext = "." + doc.file_name.rsplit(".", 1)[-1]
+            path = await self._download_file(doc.file_id, suffix=ext, original_name=doc.file_name)
+            if not path:
+                return (None, None, "")
+            # Image-typed documents (user picked "send as file") are treated as images
+            mime = (doc.mime_type or "").lower()
+            if mime.startswith("image/"):
+                return (ContextType.IMAGE, path, caption)
+            return (ContextType.FILE, path, caption)
+
+        if message.text:
+            return (ContextType.TEXT, message.text.strip(), "")
+
+        return (None, None, "")
+
+    async def _download_file(self, file_id: str, suffix: str = "", original_name: str = ""):
+        """Download via bot.get_file into the local tmp dir; return path or None on failure."""
+        try:
+            f = await self._bot.get_file(file_id)
+            tmp_dir = TelegramMessage.get_tmp_dir()
+            base = os.path.basename(original_name or "") or f"{file_id}{suffix or ''}"
+            # Strip directory components from the sender-supplied name; prefix with file_id to avoid collisions
+            safe_name = f"{file_id}_{base}" if original_name else base
+            local_path = os.path.join(tmp_dir, safe_name)
+            await f.download_to_drive(custom_path=local_path)
+            logger.debug(f"[Telegram] downloaded file_id={file_id} -> {local_path}")
+            return local_path
+        except Exception as e:
+            logger.error(f"[Telegram] download_file failed (file_id={file_id}): {e}")
+            return None
+
+    # ------------------------------------------------------------------
+    # Group trigger logic
+    # ------------------------------------------------------------------
+
+    def _should_reply_in_group(self, update) -> bool:
+        """Decide whether to reply to a group message based on configuration."""
+        mode = conf().get("telegram_group_trigger", "mention_or_reply")
+        if mode == "all":
+            return True
+
+        message = update.effective_message
+        if not message:
+            return False
+
+        # 1) Mentioned
+        if self.bot_username and self._is_mentioned(message, self.bot_username):
+            return True
+
+        # 2) Reply to a bot message
+        if mode == "mention_or_reply":
+            reply = message.reply_to_message
+            if reply and reply.from_user and reply.from_user.username == self.bot_username:
+                return True
+
+        return False
+
+    @staticmethod
+    def _is_mentioned(message, bot_username: str) -> bool:
+        """Check whether the message text or caption contains an @mention of the bot."""
+        bot_at = "@" + bot_username.lower()
+        text = (message.text or message.caption or "").lower()
+        if bot_at in text:
+            return True
+        # Also check "mention" entities explicitly (text_mention entities are not handled here)
+        for ent in list(message.entities or ()) + list(message.caption_entities or ()):
+            if ent.type == "mention":
+                src = message.text or message.caption or ""
+                if src[ent.offset: ent.offset + ent.length].lower() == bot_at:
+                    return True
+        return False
+
+    def _strip_at_mention(self, content: str) -> str:
+        """Strip @bot_username from group text (case-insensitive)."""
+        if not content or not self.bot_username:
+            return content
+        pattern = re.compile(r"@" + re.escape(self.bot_username), re.IGNORECASE)
+        return pattern.sub("", content).strip()
+
+    @staticmethod
+    def _compute_session_id(update) -> str:
+        chat = update.effective_chat
+        user = update.effective_user
+        is_group = chat.type in ("group", "supergroup")
+        if is_group:
+            if conf().get("group_shared_session", True):
+                return f"tg_group_{chat.id}"
+            return f"tg_group_{chat.id}_{user.id}"
+        return f"tg_user_{user.id}"
+
+    # ------------------------------------------------------------------
+    # Override _compose_context: skip the parent's group whitelist/at checks
+    # (already handled in _on_message via _should_reply_in_group). Same idea
+    # as the feishu channel.
+    # ------------------------------------------------------------------
+
+    def _compose_context(self, ctype: ContextType, content, **kwargs):
+        context = Context(ctype, content)
+        context.kwargs = kwargs
+        if "channel_type" not in context:
+            context["channel_type"] = self.channel_type
+        if "origin_ctype" not in context:
+            context["origin_ctype"] = ctype
+
+        cmsg = context["msg"]
+        if cmsg.is_group:
+            if conf().get("group_shared_session", True):
+                context["session_id"] = cmsg.other_user_id
+            else:
+                context["session_id"] = f"{cmsg.from_user_id}:{cmsg.other_user_id}"
+        else:
+            context["session_id"] = cmsg.from_user_id
+        context["receiver"] = cmsg.other_user_id
+
+        if ctype == ContextType.TEXT:
+            img_match_prefix = check_prefix(content, conf().get("image_create_prefix"))
+            if img_match_prefix:
+                content = content.replace(img_match_prefix, "", 1)
+                context.type = ContextType.IMAGE_CREATE
+            else:
+                context.type = ContextType.TEXT
+            context.content = (content or "").strip()
+            if "desire_rtype" not in context and conf().get("always_reply_voice"):
+                context["desire_rtype"] = ReplyType.VOICE
+        elif ctype == ContextType.VOICE:
+            if "desire_rtype" not in context and (
+                conf().get("voice_reply_voice") or conf().get("always_reply_voice")
+            ):
+                context["desire_rtype"] = ReplyType.VOICE
+
+        return context
+
+    # ------------------------------------------------------------------
+    # Outbound: ChatChannel.send -> Telegram API
+    # ------------------------------------------------------------------
+
+    def send(self, reply: Reply, context: Context):
+        """Called from cow's sync main thread; we marshal the coroutine onto the loop thread."""
+        if self._loop is None or self._bot is None:
+            logger.warning("[Telegram] bot not ready, drop reply")
+            return
+
+        chat_id = context.get("telegram_chat_id")
+        reply_to = context.get("telegram_reply_to_msg_id")
+        if chat_id is None:
+            logger.warning("[Telegram] no telegram_chat_id in context, drop reply")
+            return
+
+        coro = self._async_send(reply, chat_id, reply_to)
+        try:
+            future = asyncio.run_coroutine_threadsafe(coro, self._loop)
+            # Media uploads through a proxy can be slow; let PTB's own timeouts win
+            future.result(timeout=180)
+        except Exception as e:
+            logger.error(f"[Telegram] send failed: {e}")
+
+    # Number of retries for transient network errors (proxy hiccups etc.)
+    _SEND_RETRIES = 2
+    _SEND_RETRY_BACKOFF = 2.0  # seconds
+
+    async def _send_with_retry(self, send_fn, *, label: str):
+        """Run a single Telegram API call with retries for transient network errors."""
+        from telegram.error import NetworkError, TimedOut
+        last_err = None
+        for attempt in range(self._SEND_RETRIES + 1):
+            try:
+                return await send_fn()
+            except (NetworkError, TimedOut) as e:
+                last_err = e
+                if attempt >= self._SEND_RETRIES:
+                    break
+                wait = self._SEND_RETRY_BACKOFF * (attempt + 1)
+                logger.warning(
+                    f"[Telegram] {label} transient error (attempt {attempt + 1}/"
+                    f"{self._SEND_RETRIES + 1}): {e}; retry in {wait}s"
+                )
+                await asyncio.sleep(wait)
+        raise last_err
+
+    async def _async_send(self, reply: Reply, chat_id, reply_to_msg_id):
+        try:
+            rtype = reply.type
+            content = reply.content
+
+            if rtype in (ReplyType.TEXT, ReplyType.INFO, ReplyType.ERROR):
+                # Telegram caps a single text message at 4096 chars; auto-split
+                text = str(content) if content is not None else ""
+                if not text:
+                    return
+                for chunk in _split_text(text, 4000):
+                    await self._send_with_retry(
+                        lambda c=chunk: self._bot.send_message(
+                            chat_id=chat_id,
+                            text=c,
+                            reply_to_message_id=reply_to_msg_id,
+                            # Avoid failing the whole send if reply_to was deleted
+                            allow_sending_without_reply=True,
+                        ),
+                        label="send_message",
+                    )
+
+            elif rtype == ReplyType.IMAGE:
+                # Already a local BytesIO; send it directly
+                content.seek(0)
+                await self._send_with_retry(
+                    lambda: self._bot.send_photo(
+                        chat_id=chat_id,
+                        photo=content,
+                        reply_to_message_id=reply_to_msg_id,
+                        allow_sending_without_reply=True,
+                    ),
+                    label="send_photo",
+                )
+
+            elif rtype == ReplyType.IMAGE_URL:
+                url = str(content)
+                if url.startswith("file://"):
+                    local = url[7:]
+                    # Open inside the coroutine so each retry gets a fresh stream
+                    async def _send_local_photo():
+                        with open(local, "rb") as f:
+                            return await self._bot.send_photo(
+                                chat_id=chat_id, photo=f,
+                                reply_to_message_id=reply_to_msg_id,
+                                allow_sending_without_reply=True,
+                            )
+                    await self._send_with_retry(_send_local_photo, label="send_photo(file)")
+                else:
+                    await self._send_with_retry(
+                        lambda: self._bot.send_photo(
+                            chat_id=chat_id, photo=url,
+                            reply_to_message_id=reply_to_msg_id,
+                            allow_sending_without_reply=True,
+                        ),
+                        label="send_photo(url)",
+                    )
+
+            elif rtype == ReplyType.VOICE:
+                local = content[7:] if isinstance(content, str) and content.startswith("file://") else content
+                async def _send_voice():
+                    with open(local, "rb") as f:
+                        return await self._bot.send_voice(
+                            chat_id=chat_id, voice=f,
+                            reply_to_message_id=reply_to_msg_id,
+                            allow_sending_without_reply=True,
+                        )
+                await self._send_with_retry(_send_voice, label="send_voice")
+
+            elif rtype == ReplyType.FILE:
+                # Videos go through send_video, everything else through send_document
+                local = content[7:] if isinstance(content, str) and content.startswith("file://") else content
+                # File replies may carry an accompanying text caption
+                caption = getattr(reply, "text_content", None) or None
+                is_video = isinstance(local, str) and local.lower().endswith(
+                    (".mp4", ".mov", ".avi", ".mkv", ".webm")
+                )
+
+                async def _send_file():
+                    with open(local, "rb") as f:
+                        if is_video:
+                            return await self._bot.send_video(
+                                chat_id=chat_id, video=f, caption=caption,
+                                reply_to_message_id=reply_to_msg_id,
+                                allow_sending_without_reply=True,
+                            )
+                        return await self._bot.send_document(
+                            chat_id=chat_id, document=f, caption=caption,
+                            reply_to_message_id=reply_to_msg_id,
+                            allow_sending_without_reply=True,
+                        )
+                await self._send_with_retry(_send_file, label="send_video" if is_video else "send_document")
+
+            else:
+                # Fallback: send as plain text
+                await self._send_with_retry(
+                    lambda: self._bot.send_message(
+                        chat_id=chat_id, text=str(content),
+                        reply_to_message_id=reply_to_msg_id,
+                        allow_sending_without_reply=True,
+                    ),
+                    label="send_message(fallback)",
+                )
+
+            logger.info(f"[Telegram] sent reply (type={rtype}, chat_id={chat_id})")
+
+        except Exception as e:
+            logger.error(f"[Telegram] _async_send error: {e}", exc_info=True)
+
+
+def _split_text(text: str, limit: int):
+    """Split long text preferring line breaks to keep markdown structure intact."""
+    if len(text) <= limit:
+        yield text
+        return
+    buf = []
+    size = 0
+    for line in text.splitlines(keepends=True):
+        if size + len(line) > limit and buf:
+            yield "".join(buf)
+            buf, size = [], 0
+        # Hard-split single lines that exceed the limit
+        while len(line) > limit:
+            yield line[:limit]
+            line = line[limit:]
+        buf.append(line)
+        size += len(line)
+    if buf:
+        yield "".join(buf)
diff --git a/channel/telegram/telegram_message.py b/channel/telegram/telegram_message.py
new file mode 100644
index 00000000..c97c6059
--- /dev/null
+++ b/channel/telegram/telegram_message.py
@@ -0,0 +1,62 @@
+"""
+Telegram message adapter.
+
+Convert a python-telegram-bot Update into cow's unified ChatMessage.
+File downloads are NOT performed here; the channel layer triggers
+bot.get_file() on demand because it requires the async event loop.
+"""
+import os
+
+from bridge.context import ContextType
+from channel.chat_message import ChatMessage
+from common.utils import expand_path
+from config import conf
+
+
+class TelegramMessage(ChatMessage):
+    """Wrap a Telegram Update into the unified ChatMessage."""
+
+    def __init__(self, update, is_group: bool = False, bot_username: str = "",
+                 ctype: ContextType = ContextType.TEXT, content: str = ""):
+        super().__init__(update)
+        message = update.effective_message
+        chat = update.effective_chat
+        user = update.effective_user
+
+        # Basic fields
+        self.msg_id = str(message.message_id) if message else ""
+        self.create_time = int(message.date.timestamp()) if message and message.date else 0
+        self.ctype = ctype
+        self.content = content
+
+        # Sender / chat info
+        from_user_id = str(user.id) if user else "unknown"
+        from_user_nick = (
+            user.full_name if user and user.full_name else (user.username if user else "unknown")
+        )
+        self.from_user_id = from_user_id
+        self.from_user_nickname = from_user_nick or from_user_id
+        self.to_user_id = bot_username or "telegram_bot"
+        self.to_user_nickname = bot_username or "telegram_bot"
+
+        self.is_group = is_group
+        if is_group:
+            # Group: other_user_id = group_id, actual_user_id = sender id
+            self.other_user_id = str(chat.id)
+            self.other_user_nickname = chat.title or str(chat.id)
+            self.actual_user_id = from_user_id
+            self.actual_user_nickname = self.from_user_nickname
+        else:
+            self.other_user_id = from_user_id
+            self.other_user_nickname = self.from_user_nickname
+
+        # Whether the bot was triggered by @-mention or reply (set by channel layer)
+        self.is_at = False
+
+    @staticmethod
+    def get_tmp_dir() -> str:
+        """Local download directory, aligned with other channels (agent_workspace/tmp)."""
+        workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
+        tmp_dir = os.path.join(workspace_root, "tmp")
+        os.makedirs(tmp_dir, exist_ok=True)
+        return tmp_dir
diff --git a/channel/web/chat.html b/channel/web/chat.html
index 56ce808f..d90adb15 100644
--- a/channel/web/chat.html
+++ b/channel/web/chat.html
@@ -137,6 +137,11 @@
                             <i class="fas fa-sliders item-icon text-xs w-5 text-center"></i>
                             <span data-i18n="menu_config">配置</span>
                         </a>
+                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
+                           data-view="models">
+                            <i class="fas fa-microchip item-icon text-xs w-5 text-center"></i>
+                            <span data-i18n="menu_models">模型</span>
+                        </a>
                         <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
                            data-view="skills">
                             <i class="fas fa-bolt item-icon text-xs w-5 text-center"></i>
@@ -417,21 +422,30 @@
                                     </button>
                                 </div>
                                 <div id="slash-menu" class="slash-menu hidden"></div>
-                                <textarea id="chat-input"
-                                          class="flex-1 min-w-0 px-4 py-[10px] rounded-xl border border-slate-200 dark:border-slate-600
-                                                 bg-slate-50 dark:bg-white/5 text-slate-800 dark:text-slate-100
-                                                 placeholder:text-slate-400 dark:placeholder:text-slate-500
-                                                 focus:outline-none focus:ring-0 focus:border-primary-600
-                                                 text-sm leading-relaxed"
-                                          rows="1"
-                                          data-i18n-placeholder="input_placeholder"
-                                          placeholder="输入消息，或输入 / 使用指令"></textarea>
+                                <div class="flex-1 min-w-0 relative flex items-center">
+                                    <textarea id="chat-input"
+                                              class="w-full pl-4 pr-11 py-[10px] rounded-xl border border-slate-200 dark:border-slate-600
+                                                     bg-slate-50 dark:bg-white/5 text-slate-800 dark:text-slate-100
+                                                     placeholder:text-slate-400 dark:placeholder:text-slate-500
+                                                     focus:outline-none focus:ring-0 focus:border-primary-600
+                                                     text-sm leading-relaxed"
+                                              rows="1"
+                                              data-i18n-placeholder="input_placeholder"
+                                              placeholder="输入消息，或输入 / 使用指令"></textarea>
+                                    <button id="mic-btn" type="button"
+                                            class="absolute right-2 top-1/2 -translate-y-1/2 w-8 h-8 flex items-center justify-center rounded-lg
+                                                   text-slate-400 hover:text-primary-500 hover:bg-primary-50 dark:hover:bg-primary-900/20
+                                                   cursor-pointer transition-colors duration-150"
+                                            data-i18n-title="mic_idle_title" title="点击录音 / 再按一次结束">
+                                        <i class="fas fa-microphone text-sm"></i>
+                                    </button>
+                                </div>
                                 <button id="send-btn"
                                         class="flex-shrink-0 w-10 h-10 flex items-center justify-center rounded-lg
                                                bg-primary-400 text-white hover:bg-primary-500
                                                disabled:bg-slate-300 dark:disabled:bg-slate-600
                                                disabled:cursor-not-allowed cursor-pointer transition-colors duration-150"
-                                        disabled onclick="sendMessage()">
+                                        disabled>
                                     <i class="fas fa-paper-plane text-sm"></i>
                                 </button>
                             </div>
@@ -460,6 +474,11 @@
                                             <i class="fas fa-microchip text-primary-500 text-sm"></i>
                                         </div>
                                         <h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_model">模型配置</h3>
+                                        <a class="ml-auto text-xs text-slate-500 dark:text-slate-400 hover:text-primary-500 dark:hover:text-primary-400 cursor-pointer transition-colors flex items-center gap-1"
+                                           onclick="navigateTo('models')">
+                                            <span data-i18n="config_model_advanced">高级配置</span>
+                                            <i class="fas fa-arrow-right text-[10px]"></i>
+                                        </a>
                                     </div>
                                     <div class="space-y-5">
                                         <!-- Provider -->
@@ -850,6 +869,41 @@
                     </div>
                 </div>
 
+                <!-- ====================================================== -->
+                <!-- VIEW: Models                                            -->
+                <!-- ====================================================== -->
+                <div id="view-models" class="view">
+                    <!-- Tailwind JIT safelist: capability-card icon colors are
+                         emitted from JS template strings. Listing them here
+                         (display:none) guarantees the CDN-side compiler picks
+                         them up regardless of render timing. -->
+                    <div class="hidden bg-blue-50 dark:bg-blue-900/30 text-blue-500
+                                       bg-orange-50 dark:bg-orange-900/30 text-orange-500
+                                       bg-purple-50 dark:bg-purple-900/30 text-purple-500
+                                       bg-amber-50 dark:bg-amber-900/30 text-amber-500
+                                       bg-primary-50 dark:bg-primary-900/30 text-primary-500"></div>
+                    <div class="flex-1 overflow-y-auto p-6">
+                        <div class="max-w-4xl mx-auto">
+                            <div class="flex items-center justify-between mb-6">
+                                <div>
+                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="models_title">模型管理</h2>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="models_desc">统一管理对话、视觉、语音、向量、图像、搜索能力</p>
+                                </div>
+                                <button id="models-add-vendor-btn" onclick="openVendorModal('')"
+                                        class="flex items-center gap-2 px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600
+                                               text-white text-sm font-medium cursor-pointer transition-colors duration-150">
+                                    <i class="fas fa-plus text-xs"></i>
+                                    <span data-i18n="models_add_vendor">添加厂商</span>
+                                </button>
+                            </div>
+                            <div id="models-loading" class="flex items-center gap-2 py-12 justify-center text-slate-400 dark:text-slate-500 text-sm">
+                                <i class="fas fa-spinner fa-spin text-xs"></i><span>Loading...</span>
+                            </div>
+                            <div id="models-content" class="grid gap-6 hidden"></div>
+                        </div>
+                    </div>
+                </div>
+
                 <!-- ====================================================== -->
                 <!-- VIEW: Channels                                          -->
                 <!-- ====================================================== -->
@@ -959,7 +1013,7 @@
     </div><!-- /app -->
 
     <!-- Confirm Dialog -->
-    <div id="confirm-dialog-overlay" class="fixed inset-0 bg-black/50 z-[100] hidden flex items-center justify-center">
+    <div id="confirm-dialog-overlay" class="fixed inset-0 bg-black/50 z-[200] hidden flex items-center justify-center">
         <div class="bg-white dark:bg-[#1A1A1A] rounded-2xl border border-slate-200 dark:border-white/10 shadow-xl
                     w-full max-w-sm mx-4 overflow-hidden">
             <div class="p-6">
@@ -984,6 +1038,77 @@
         </div>
     </div>
 
+    <!-- Vendor Credentials Modal -->
+    <div id="vendor-modal-overlay" class="fixed inset-0 bg-black/50 z-[100] hidden flex items-center justify-center">
+        <div class="bg-white dark:bg-[#1A1A1A] rounded-2xl border border-slate-200 dark:border-white/10 shadow-xl
+                    w-full max-w-md mx-4">
+            <div class="p-6">
+                <div class="flex items-center gap-3 mb-5">
+                    <div class="w-10 h-10 rounded-xl bg-primary-50 dark:bg-primary-900/20 flex items-center justify-center flex-shrink-0">
+                        <i class="fas fa-key text-primary-500"></i>
+                    </div>
+                    <div class="min-w-0 flex-1">
+                        <h3 id="vendor-modal-title" class="font-semibold text-slate-800 dark:text-slate-100 text-base"></h3>
+                        <p id="vendor-modal-subtitle" class="text-xs text-slate-500 dark:text-slate-400 mt-0.5 font-mono"></p>
+                    </div>
+                </div>
+
+                <!-- Provider selector (only visible when adding via top button) -->
+                <div id="vendor-modal-picker-wrap" class="mb-4 hidden">
+                    <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="models_provider">厂商</label>
+                    <div id="vendor-modal-picker" class="cfg-dropdown" tabindex="0">
+                        <div class="cfg-dropdown-selected">
+                            <span class="cfg-dropdown-text">--</span>
+                            <i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
+                        </div>
+                        <div class="cfg-dropdown-menu"></div>
+                    </div>
+                </div>
+
+                <div class="space-y-4">
+                    <div>
+                        <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">API Key</label>
+                        <input id="vendor-modal-key" type="text" autocomplete="off" data-1p-ignore data-lpignore="true"
+                               class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
+                                      bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
+                                      focus:outline-none focus:border-primary-500 font-mono transition-colors"
+                               placeholder="sk-...">
+                    </div>
+                    <div id="vendor-modal-base-wrap">
+                        <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">API Base</label>
+                        <input id="vendor-modal-base" type="text"
+                               class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
+                                      bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
+                                      focus:outline-none focus:border-primary-500 font-mono transition-colors"
+                               placeholder="https://...../v1">
+                        <p id="vendor-modal-base-hint" class="mt-1.5 text-xs text-slate-400 dark:text-slate-500 hidden">
+                            <i class="fas fa-info-circle mr-1"></i><span data-i18n="models_base_default_hint">留空将使用官方默认地址</span>
+                        </p>
+                    </div>
+                </div>
+            </div>
+            <div class="flex items-center justify-between gap-3 px-6 py-4 border-t border-slate-100 dark:border-white/5 rounded-b-2xl">
+                <button id="vendor-modal-clear"
+                        class="px-3 py-2 rounded-lg text-xs
+                               text-red-500 dark:text-red-400 hover:bg-red-50 dark:hover:bg-red-900/20
+                               cursor-pointer transition-colors duration-150 hidden"
+                        data-i18n="models_clear_credential">清除凭据</button>
+                <span id="vendor-modal-status"
+                      class="flex-1 text-xs text-primary-500 opacity-0 transition-opacity duration-300 text-center"></span>
+                <button id="vendor-modal-cancel"
+                        class="px-4 py-2 rounded-lg border border-slate-200 dark:border-white/10
+                               text-slate-600 dark:text-slate-300 text-sm font-medium
+                               hover:bg-slate-50 dark:hover:bg-white/5
+                               cursor-pointer transition-colors duration-150"
+                        data-i18n="cancel">取消</button>
+                <button id="vendor-modal-save"
+                        class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
+                               cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
+                        data-i18n="save">保存</button>
+            </div>
+        </div>
+    </div>
+
     <script defer src="assets/js/console.js"></script>
 </body>
 </html>
diff --git a/channel/web/static/css/console.css b/channel/web/static/css/console.css
index d5caf5b1..957db3e0 100644
--- a/channel/web/static/css/console.css
+++ b/channel/web/static/css/console.css
@@ -725,6 +725,58 @@
     background: rgba(74, 190, 110, 0.15);
     color: #74E9A4;
 }
+/* When an item carries a hint (e.g. brand alias next to a technical model
+   id), label/hint are split into two spans so the hint sits on the right in
+   a dimmer colour and smaller size. Without a hint the row stays a plain text
+   node and uses the default ellipsis behaviour, so no layout regressions for
+   old call sites. */
+.cfg-dropdown-label {
+    flex: 1 1 auto;
+    min-width: 0;
+    overflow: hidden;
+    text-overflow: ellipsis;
+}
+.cfg-dropdown-hint {
+    flex-shrink: 0;
+    margin-left: auto;
+    padding-left: 12px;
+    color: #94a3b8;
+    font-size: 12px;
+    font-weight: 400;
+}
+.dark .cfg-dropdown-hint {
+    color: #64748b;
+}
+.cfg-dropdown-item.active .cfg-dropdown-hint {
+    /* Tint the hint toward the brand colour on the active row so it doesn't
+       fight with the highlighted label tone. */
+    color: rgba(34, 133, 71, 0.65);
+}
+.dark .cfg-dropdown-item.active .cfg-dropdown-hint {
+    color: rgba(116, 233, 164, 0.6);
+}
+/* The active row gets a trailing brand-green checkmark via a Font Awesome
+   pseudo-element so every dropdown (chat / vision / image / asr / tts / etc.)
+   surfaces "this is what's currently selected" without per-call JS plumbing.
+   When a hint is present, the ✓ sits to its right with a small gap; without
+   a hint, margin-left:auto pushes the ✓ flush against the right edge. */
+.cfg-dropdown-item.active::after {
+    content: '\f00c';                  /* FontAwesome check glyph */
+    font-family: 'Font Awesome 6 Free', 'Font Awesome 5 Free', 'FontAwesome';
+    font-weight: 900;
+    margin-left: auto;
+    padding-left: 12px;
+    color: #4abe6e;
+    font-size: 11px;
+    flex-shrink: 0;
+}
+.cfg-dropdown-item.active:has(.cfg-dropdown-hint)::after {
+    /* When hint occupies the auto-margin slot, the ✓ no longer benefits
+       from `margin-left: auto`; replace it with a small fixed gap so the
+       ✓ trails the hint cleanly. */
+    margin-left: 0;
+    padding-left: 10px;
+}
 
 /* API Key masking via CSS (avoids browser password prompts) */
 .cfg-key-masked {
@@ -732,6 +784,77 @@
     text-security: disc;
 }
 
+/* Provider logo image — vendors flagged as `provider-logo-invert-dark`
+   ship a black wordmark that disappears on the dark canvas; we invert their
+   luminance only in dark mode so the brand stays recognizable without
+   touching multi-color marks like Google/MiniMax. */
+.provider-logo-img {
+    object-fit: contain;
+    object-position: center;
+}
+.dark .provider-logo-invert-dark {
+    filter: invert(1) brightness(1.15);
+}
+
+/* Models page — provider dropdown rows.
+   Configured rows look like ordinary picker entries; the .active row's
+   trailing brand-green ✓ already announces "this is what's selected"
+   (handled globally by .cfg-dropdown-item.active::after above).
+   Unconfigured rows are visually subdued and carry a trailing gear icon
+   as a "click to set up" affordance. */
+.cap-provider-label {
+    flex: 1 1 auto;
+    overflow: hidden;
+    text-overflow: ellipsis;
+}
+.cap-provider-gear {
+    margin-left: auto;
+    padding-left: 12px;
+    color: #94a3b8;
+    font-size: 11px;
+    flex-shrink: 0;
+}
+.cap-provider-item.cap-provider-unconfigured {
+    color: #94a3b8;
+}
+.dark .cap-provider-item.cap-provider-unconfigured {
+    color: #64748b;
+}
+.cap-provider-item.cap-provider-unconfigured:hover {
+    color: #475569;
+}
+.dark .cap-provider-item.cap-provider-unconfigured:hover {
+    color: #cbd5e1;
+}
+.cap-provider-item.cap-provider-unconfigured:hover .cap-provider-gear {
+    color: #475569;
+}
+.dark .cap-provider-item.cap-provider-unconfigured:hover .cap-provider-gear {
+    color: #cbd5e1;
+}
+/* If the active row ever lands on an unconfigured vendor (defensive — the
+   click handler normally diverts to the modal), suppress the global ✓ so
+   the gear remains the sole trailing icon and the row keeps reading as
+   "needs setup" rather than "already selected". */
+.cap-provider-item.cap-provider-unconfigured.active::after {
+    content: none;
+}
+
+/* "Add vendor" modal picker — each configured row carries a static
+   brand-green ✓ via decorateVendorModalPicker so users can see what's set
+   up at a glance. The active row's global ✓ is suppressed here to avoid
+   showing two checks side by side on configured + selected rows. */
+.vendor-picker-item.active::after {
+    content: none;
+}
+.vendor-picker-configured-mark {
+    margin-left: auto;
+    padding-left: 12px;
+    color: #4abe6e;
+    font-size: 11px;
+    flex-shrink: 0;
+}
+
 /* Chat Input */
 #chat-input {
     resize: none; height: 42px; max-height: 180px;
@@ -1171,3 +1294,108 @@
     overflow: hidden;
     min-height: 2.5em;  /* ~2 lines at text-sm leading-relaxed */
 }
+
+/* --------------------------------------------------------------------
+ * Voice pill — compact custom audio player used by mic uploads and TTS
+ * replies. Replaces the bulky native <audio controls> with a play/pause
+ * icon + thin progress bar + duration counter so it blends into chat
+ * bubbles without the chrome-grey browser default look.
+ * ------------------------------------------------------------------ */
+.voice-pill {
+    display: inline-flex;
+    align-items: center;
+    gap: 8px;
+    padding: 6px 10px;
+    border-radius: 999px;
+    background: rgba(15, 23, 42, 0.05);
+    color: rgb(71, 85, 105);
+    font-size: 12px;
+    line-height: 1;
+    max-width: 240px;
+    user-select: none;
+    cursor: default;
+}
+.dark .voice-pill {
+    background: rgba(255, 255, 255, 0.08);
+    color: rgb(203, 213, 225);
+}
+.voice-pill[data-loading="1"] {
+    opacity: 0.65;
+}
+.voice-pill-btn {
+    width: 22px;
+    height: 22px;
+    border-radius: 999px;
+    display: inline-flex;
+    align-items: center;
+    justify-content: center;
+    background: var(--color-primary-500, #2563eb);
+    color: #fff;
+    flex-shrink: 0;
+    cursor: pointer;
+    transition: transform 0.1s ease;
+}
+.voice-pill-btn:hover { transform: scale(1.05); }
+.voice-pill-btn i { font-size: 9px; margin-left: 1px; }
+.voice-pill-btn[data-state="play"] i { margin-left: 2px; }
+.voice-pill-btn[data-state="pause"] i { margin-left: 0; }
+.voice-pill-track {
+    flex: 1;
+    height: 3px;
+    border-radius: 999px;
+    background: rgba(100, 116, 139, 0.25);
+    overflow: hidden;
+    min-width: 70px;
+}
+.dark .voice-pill-track {
+    background: rgba(148, 163, 184, 0.25);
+}
+.voice-pill-fill {
+    height: 100%;
+    width: 0%;
+    background: var(--color-primary-500, #2563eb);
+    border-radius: inherit;
+    transition: width 0.1s linear;
+}
+.voice-pill-time {
+    font-variant-numeric: tabular-nums;
+    font-size: 11px;
+    color: inherit;
+    opacity: 0.75;
+    flex-shrink: 0;
+    min-width: 28px;
+    text-align: right;
+}
+.voice-pill audio { display: none; }
+
+/* Send button toggles into a Stop button while an SSE stream is in flight.
+   Match the look of the disabled send button (light grey block + white
+   glyph) so it reads as the same visual element: idle from a sending
+   perspective, but still clickable to stop the stream. */
+#send-btn.send-btn-cancel {
+    background-color: rgb(203 213 225) !important; /* slate-300, == disabled send-btn */
+    color: white !important;
+}
+#send-btn.send-btn-cancel:hover {
+    background-color: rgb(148 163 184) !important; /* slate-400 */
+}
+#send-btn.send-btn-cancel:disabled {
+    background-color: rgb(226 232 240) !important; /* slate-200, while stop is in flight */
+    color: white !important;
+    cursor: progress;
+}
+.dark #send-btn.send-btn-cancel {
+    background-color: rgb(71 85 105) !important; /* slate-600, == dark disabled send-btn */
+    color: white !important;
+}
+.dark #send-btn.send-btn-cancel:hover {
+    background-color: rgb(100 116 139) !important; /* slate-500 */
+}
+.dark #send-btn.send-btn-cancel:disabled {
+    background-color: rgb(51 65 85) !important; /* slate-700 */
+    color: rgb(203 213 225) !important;
+}
+
+.agent-cancelled-tag {
+    font-style: italic;
+}
diff --git a/channel/web/static/js/console.js b/channel/web/static/js/console.js
index fa094664..6d0a66fc 100644
--- a/channel/web/static/js/console.js
+++ b/channel/web/static/js/console.js
@@ -14,9 +14,70 @@ const I18N = {
     zh: {
         console: '控制台',
         nav_chat: '对话', nav_manage: '管理', nav_monitor: '监控',
-        menu_chat: '对话', menu_config: '配置', menu_skills: '技能',
+        menu_chat: '对话', menu_config: '配置', menu_models: '模型', menu_skills: '技能',
         menu_memory: '记忆', menu_knowledge: '知识', menu_channels: '通道', menu_tasks: '定时',
         menu_logs: '日志',
+        models_title: '模型管理',
+        models_desc: '统一管理对话、图像、语音、向量、搜索能力',
+        models_section_vendors: '厂商凭据',
+        models_section_vendors_desc: '一处配置，多个模型能力共享',
+        models_section_capabilities: '模型能力',
+        models_add_vendor: '添加厂商',
+        models_provider: '厂商',
+        models_model: '模型',
+        models_voice: '音色',
+        models_configured: '已配置',
+        models_not_configured: '未配置',
+        models_pick_to_configure: '选择以配置',
+        models_clear_credential: '清除凭据',
+        models_base_default_hint: '留空将使用官方默认地址',
+        models_base_default: '默认',
+        models_capability_chat: '主模型',
+        models_capability_chat_desc: '用于基础对话和 Agent 推理',
+        models_capability_vision: '图像理解',
+        models_capability_vision_desc: '识别图片内容，用于图像识别工具',
+        models_capability_image: '图像生成',
+        models_capability_image_desc: '生成图片，用于图像生成技能',
+        models_auto_using: '当前优先使用',
+        models_capability_asr: '语音识别',
+        models_capability_asr_desc: '语音转文字',
+        models_capability_tts: '语音合成',
+        models_capability_tts_desc: '文字转语音',
+        models_capability_embedding: '向量',
+        models_capability_embedding_desc: '用于记忆与知识的向量化检索',
+        models_capability_search: '联网搜索',
+        models_capability_search_desc: '实时网页检索能力，用于搜索工具',
+        models_strategy_auto: '自动',
+        models_search_strategy_label: '策略',
+        models_search_strategy_fixed: '指定',
+        models_search_strategy_auto_hint: '从已配置厂商中自动选择',
+        models_search_strategy_fixed_hint: '指定使用搜索厂商',
+        models_pending_config: '待配置',
+        models_search_available_label: '可用搜索厂商：',
+        models_search_none_configured: '暂未启用任何搜索厂商，点击添加',
+        models_search_add_provider: '添加厂商',
+        models_search_add_desc: '选择一个搜索厂商进行配置',
+        models_search_bocha_title: '配置博查 API Key',
+        models_search_bocha_desc: '前往博查开放平台创建 API Key',
+        models_search_edit_hint: '点击修改配置',
+        models_unavailable: '不可用',
+        models_set_via_env: '通过环境变量启用',
+        models_dim_label: '维度',
+        models_save_success: '已保存',
+        models_save_failed: '保存失败',
+        models_cleared: '已清除',
+        models_clear_failed: '清除失败',
+        models_embedding_change_title: '更改向量模型',
+        models_embedding_change_msg: '切换向量模型后，已有索引将失效，需要重建。是否继续？',
+        models_embedding_saved_title: '向量模型已更新',
+        models_embedding_saved_msg: '请在聊天框输入 /memory rebuild-index 重建索引。',
+        models_embedding_saved_ok: '去执行',
+        models_pick_provider: '待选择',
+        models_clear_confirm_title: '清除厂商凭据',
+        models_clear_confirm_msg: '确认清除该厂商的 API Key 与 Base URL 吗？相关能力将不再可用。',
+        cancel: '取消',
+        save: '保存',
+        ok: '确定',
         knowledge_title: '知识库', knowledge_desc: '浏览和探索你的知识库',
         knowledge_tab_docs: '文档', knowledge_tab_graph: '图谱',
         knowledge_loading: '加载知识库中...', knowledge_loading_desc: '知识页面将显示在这里',
@@ -33,6 +94,7 @@ const I18N = {
         input_placeholder: '输入消息，或输入 / 使用指令',
         config_title: '配置管理', config_desc: '管理模型和 Agent 配置',
         config_model: '模型配置', config_agent: 'Agent 配置',
+        config_model_advanced: '高级配置',
         config_channel: '通道配置',
         config_agent_enabled: 'Agent 模式',
         config_max_tokens: '最大上下文 Token', config_max_tokens_hint: '对话中 Agent 能输入的最大 Token 长度，超过后会智能压缩处理',
@@ -44,7 +106,7 @@ const I18N = {
         config_custom_model_hint: '输入自定义模型名称',
         config_save: '保存', config_saved: '已保存',
         config_save_error: '保存失败',
-        config_custom_option: '自定义...',
+        config_custom_option: '自定义',
         config_custom_tip: '接口需遵循 OpenAI API 协议',
         config_security: '安全设置', config_password: '访问密码',
         config_password_hint: '留空则不启用密码保护',
@@ -106,6 +168,17 @@ const I18N = {
         tip_clear_context: '清除上下文',
         tip_attach: '添加附件',
         attach_menu_file: '上传文件',
+        mic_idle_title: '点击录音 / 再按一次结束',
+        mic_recording_title: '录音中，再次点击结束',
+        mic_busy_title: '识别中…',
+        mic_permission_denied: '无法访问麦克风，请检查浏览器权限',
+        mic_too_short: '录音太短，请重试',
+        mic_error: '语音识别失败',
+        speak_msg: '朗读这段回复',
+        voice_reply_mode_label: '语音回复策略',
+        voice_reply_off: '关闭',
+        voice_reply_if_voice: '仅语音问/语音答',
+        voice_reply_always: '总是语音回复',
         attach_menu_folder: '上传文件夹',
         confirm_yes: '确认',
         confirm_cancel: '取消',
@@ -115,9 +188,70 @@ const I18N = {
     en: {
         console: 'Console',
         nav_chat: 'Chat', nav_manage: 'Management', nav_monitor: 'Monitor',
-        menu_chat: 'Chat', menu_config: 'Config', menu_skills: 'Skills',
+        menu_chat: 'Chat', menu_config: 'Config', menu_models: 'Models', menu_skills: 'Skills',
         menu_memory: 'Memory', menu_knowledge: 'Knowledge', menu_channels: 'Channels', menu_tasks: 'Tasks',
         menu_logs: 'Logs',
+        models_title: 'Models',
+        models_desc: 'Manage chat, image, voice, embedding and search capabilities in one place',
+        models_section_vendors: 'Vendor Credentials',
+        models_section_vendors_desc: 'Configured once, shared by multiple model capabilities',
+        models_section_capabilities: 'Capabilities',
+        models_add_vendor: 'Add Vendor',
+        models_provider: 'Provider',
+        models_model: 'Model',
+        models_voice: 'Voice',
+        models_configured: 'configured',
+        models_not_configured: 'not configured',
+        models_pick_to_configure: 'pick to configure',
+        models_clear_credential: 'Clear credentials',
+        models_base_default_hint: 'Leave blank to use the official default base URL',
+        models_base_default: 'Default',
+        models_capability_chat: 'Main Model',
+        models_capability_chat_desc: 'Used for basic chat and agent reasoning',
+        models_capability_vision: 'Image Understanding',
+        models_capability_vision_desc: 'Recognizes image content, used by image recognition tools',
+        models_capability_image: 'Image Generation',
+        models_capability_image_desc: 'Generates images, used by image generation skills',
+        models_auto_using: 'Preferred',
+        models_capability_asr: 'Speech Recognition',
+        models_capability_asr_desc: 'Speech to text',
+        models_capability_tts: 'Speech Synthesis',
+        models_capability_tts_desc: 'Text to speech',
+        models_capability_embedding: 'Embedding',
+        models_capability_embedding_desc: 'Used for vectorized retrieval of memory and knowledge',
+        models_capability_search: 'Web Search',
+        models_capability_search_desc: 'Real-time web retrieval, used by search tools',
+        models_strategy_auto: 'auto',
+        models_search_strategy_label: 'Strategy',
+        models_search_strategy_fixed: 'Pinned',
+        models_search_strategy_auto_hint: 'Auto-pick from configured providers',
+        models_search_strategy_fixed_hint: 'Always use a specific provider',
+        models_pending_config: 'Pending setup',
+        models_search_available_label: 'Available:',
+        models_search_none_configured: 'No search provider enabled yet — click add.',
+        models_search_add_provider: 'Add provider',
+        models_search_add_desc: 'Pick a search provider to configure',
+        models_search_bocha_title: 'Configure Bocha API Key',
+        models_search_bocha_desc: 'Create a key at the Bocha open platform.',
+        models_search_edit_hint: 'Click to edit',
+        models_unavailable: 'unavailable',
+        models_set_via_env: 'enable via environment variable',
+        models_dim_label: 'dim',
+        models_save_success: 'Saved',
+        models_save_failed: 'Save failed',
+        models_cleared: 'Cleared',
+        models_clear_failed: 'Clear failed',
+        models_embedding_change_title: 'Change embedding model',
+        models_embedding_change_msg: 'Switching the embedding model invalidates the existing index — a rebuild will be needed. Continue?',
+        models_embedding_saved_title: 'Embedding model updated',
+        models_embedding_saved_msg: 'Send /memory rebuild-index in the chat to rebuild the index.',
+        models_embedding_saved_ok: 'Go',
+        models_pick_provider: 'Pick a provider',
+        models_clear_confirm_title: 'Clear vendor credentials',
+        models_clear_confirm_msg: 'Remove this vendor\'s API Key and Base URL? Capabilities relying on it will stop working.',
+        cancel: 'Cancel',
+        save: 'Save',
+        ok: 'OK',
         knowledge_title: 'Knowledge', knowledge_desc: 'Browse and explore your knowledge base',
         knowledge_tab_docs: 'Documents', knowledge_tab_graph: 'Graph',
         knowledge_loading: 'Loading knowledge base...', knowledge_loading_desc: 'Knowledge pages will be displayed here',
@@ -134,6 +268,7 @@ const I18N = {
         input_placeholder: 'Type a message, or press / for commands',
         config_title: 'Configuration', config_desc: 'Manage model and agent settings',
         config_model: 'Model Configuration', config_agent: 'Agent Configuration',
+        config_model_advanced: 'Advanced',
         config_channel: 'Channel Configuration',
         config_agent_enabled: 'Agent Mode',
         config_max_tokens: 'Max Context Tokens', config_max_tokens_hint: 'Max tokens the Agent can input per conversation, auto-compressed when exceeded',
@@ -145,7 +280,7 @@ const I18N = {
         config_custom_model_hint: 'Enter custom model name',
         config_save: 'Save', config_saved: 'Saved',
         config_save_error: 'Save failed',
-        config_custom_option: 'Custom...',
+        config_custom_option: 'Custom',
         config_custom_tip: 'API must follow OpenAI protocol.',
         config_security: 'Security', config_password: 'Password',
         config_password_hint: 'Leave empty to disable password protection',
@@ -207,6 +342,17 @@ const I18N = {
         tip_clear_context: 'Clear Context',
         tip_attach: 'Add Attachment',
         attach_menu_file: 'Upload File',
+        mic_idle_title: 'Click to record, click again to stop',
+        mic_recording_title: 'Recording, click to stop',
+        mic_busy_title: 'Transcribing…',
+        mic_permission_denied: 'Cannot access microphone — check browser permissions',
+        mic_too_short: 'Recording too short, please retry',
+        mic_error: 'Speech recognition failed',
+        speak_msg: 'Read this reply aloud',
+        voice_reply_mode_label: 'Voice reply policy',
+        voice_reply_off: 'Off',
+        voice_reply_if_voice: 'Voice reply only for voice input',
+        voice_reply_always: 'Always reply with voice',
         attach_menu_folder: 'Upload Folder',
         confirm_yes: 'Confirm',
         confirm_cancel: 'Cancel',
@@ -221,6 +367,15 @@ function t(key) {
     return (I18N[currentLang] && I18N[currentLang][key]) || (I18N.en[key]) || key;
 }
 
+// Resolve a localized label that may be either a plain string or
+// a {zh, en} object returned by the backend.
+function localizedLabel(label) {
+    if (label && typeof label === 'object') {
+        return label[currentLang] || label.en || label.zh || '';
+    }
+    return label || '';
+}
+
 function applyI18n() {
     document.querySelectorAll('[data-i18n]').forEach(el => {
         el.textContent = t(el.dataset.i18n);
@@ -244,6 +399,18 @@ function toggleLanguage() {
     localStorage.setItem('cow_lang', currentLang);
     applyI18n();
     _applyInputTooltips();
+    // Re-render views whose DOM is built in JS (data-i18n alone does not
+    // cover strings interpolated via t() into innerHTML).
+    try { rerenderDynamicViews(); } catch (e) {}
+}
+
+// Refresh JS-rendered views after a language switch. Each branch uses the
+// lightweight in-memory re-render path (no extra network round-trips).
+function rerenderDynamicViews() {
+    if (currentView === 'models' && typeof renderModelsView === 'function'
+            && modelsState && (modelsState.providers || modelsState.capabilities)) {
+        renderModelsView();
+    }
 }
 
 // Floating tooltip portal for [data-tip-key] elements. Tooltip nodes are
@@ -326,6 +493,7 @@ function toggleTheme() {
 const VIEW_META = {
     chat:     { group: 'nav_chat',    page: 'menu_chat' },
     config:   { group: 'nav_manage',  page: 'menu_config' },
+    models:   { group: 'nav_manage',  page: 'menu_models' },
     skills:   { group: 'nav_manage',  page: 'menu_skills' },
     memory:   { group: 'nav_manage',  page: 'menu_memory' },
     knowledge:{ group: 'nav_manage',  page: 'menu_knowledge' },
@@ -612,6 +780,191 @@ if (!supportsDirectoryUpload && attachFolderOption) {
     attachFolderOption.classList.add('hidden');
 }
 
+// ---------------- Mic button: in-page voice input via the configured ASR provider ----------------
+(function setupMicButton() {
+    const micBtn = document.getElementById('mic-btn');
+    if (!micBtn) return;
+    if (!navigator.mediaDevices || !navigator.mediaDevices.getUserMedia ||
+        typeof window.MediaRecorder === 'undefined') {
+        micBtn.style.display = 'none';
+        return;
+    }
+
+    let mediaRecorder = null;
+    let stream = null;
+    let chunks = [];
+    let recording = false;
+
+    const setIdle = () => {
+        recording = false;
+        micBtn.classList.remove('text-red-500', 'animate-pulse');
+        micBtn.classList.add('text-slate-400');
+        micBtn.querySelector('i').className = 'fas fa-microphone text-sm';
+        micBtn.title = t('mic_idle_title');
+    };
+    const setRecording = () => {
+        recording = true;
+        micBtn.classList.remove('text-slate-400');
+        micBtn.classList.add('text-red-500', 'animate-pulse');
+        micBtn.querySelector('i').className = 'fas fa-stop text-sm';
+        micBtn.title = t('mic_recording_title');
+    };
+    const setBusy = () => {
+        micBtn.classList.remove('text-red-500', 'animate-pulse', 'text-slate-400');
+        micBtn.classList.add('text-primary-500');
+        micBtn.querySelector('i').className = 'fas fa-spinner fa-spin text-sm';
+        micBtn.title = t('mic_busy_title');
+    };
+
+    const pickMimeType = () => {
+        const candidates = [
+            'audio/webm;codecs=opus',
+            'audio/webm',
+            'audio/ogg;codecs=opus',
+            'audio/mp4',
+        ];
+        for (const m of candidates) {
+            if (window.MediaRecorder.isTypeSupported && MediaRecorder.isTypeSupported(m)) {
+                return m;
+            }
+        }
+        return '';
+    };
+
+    const stopStream = () => {
+        if (stream) {
+            stream.getTracks().forEach(track => track.stop());
+            stream = null;
+        }
+    };
+
+    let _micTipTimer = null;
+    const flashError = (msg) => {
+        console.warn('[mic]', msg);
+        // Pop a small bubble above the mic so the user actually notices it.
+        // The mic lives inside a relatively-positioned wrapper around the
+        // textarea (see chat.html), so we hang the tip off that wrapper.
+        const wrapper = micBtn.parentElement;
+        if (!wrapper) return;
+        let tip = wrapper.querySelector('.mic-tip');
+        if (!tip) {
+            tip = document.createElement('div');
+            tip.className = 'mic-tip absolute right-1 bottom-full mb-2 px-2 py-1 rounded-md '
+                + 'text-xs text-white bg-slate-800/90 dark:bg-slate-700/90 shadow-md '
+                + 'pointer-events-none whitespace-nowrap z-10';
+            wrapper.appendChild(tip);
+        }
+        tip.textContent = msg;
+        tip.style.opacity = '1';
+        if (_micTipTimer) clearTimeout(_micTipTimer);
+        _micTipTimer = setTimeout(() => {
+            tip.style.transition = 'opacity 200ms';
+            tip.style.opacity = '0';
+            _micTipTimer = setTimeout(() => tip.remove(), 250);
+        }, 2000);
+    };
+
+    const upload = async (blob, ext) => {
+        setBusy();
+        const fd = new FormData();
+        fd.append('file', blob, `recording.${ext}`);
+        try {
+            const resp = await fetch('/api/voice/asr', { method: 'POST', body: fd });
+            const data = await resp.json();
+            if (data.status === 'success' && data.text) {
+                // Voice-message UX: drop the recording into the conversation
+                // as a playable bubble with the caption underneath, then
+                // dispatch the recognised text through the regular send path.
+                sendVoiceMessage(data.text, data.audio_url);
+            } else {
+                flashError(data.message || t('mic_error'));
+            }
+        } catch (e) {
+            flashError(t('mic_error') + ': ' + e.message);
+        } finally {
+            setIdle();
+        }
+    };
+
+    const start = async () => {
+        try {
+            stream = await navigator.mediaDevices.getUserMedia({ audio: true });
+        } catch (e) {
+            flashError(t('mic_permission_denied'));
+            return;
+        }
+        chunks = [];
+        const mimeType = pickMimeType();
+        try {
+            mediaRecorder = mimeType
+                ? new MediaRecorder(stream, { mimeType })
+                : new MediaRecorder(stream);
+        } catch (e) {
+            stopStream();
+            flashError(t('mic_error') + ': ' + e.message);
+            return;
+        }
+        mediaRecorder.ondataavailable = (ev) => {
+            if (ev.data && ev.data.size > 0) chunks.push(ev.data);
+        };
+        mediaRecorder.onstop = () => {
+            stopStream();
+            const blob = new Blob(chunks, { type: mediaRecorder.mimeType || 'audio/webm' });
+            // Map mime -> extension so the server picks the right file suffix.
+            const mt = (mediaRecorder.mimeType || 'audio/webm').split(';')[0];
+            const extMap = {
+                'audio/webm': 'webm', 'audio/ogg': 'ogg',
+                'audio/mp4': 'm4a',   'audio/mpeg': 'mp3',
+            };
+            const ext = extMap[mt] || 'webm';
+            // 256 bytes ~ container header only, no actual audio. Anything
+            // below that we treat as "tapped by mistake".
+            if (blob.size < 256) {
+                setIdle();
+                flashError(t('mic_too_short'));
+                return;
+            }
+            upload(blob, ext);
+        };
+        // timeslice=250ms: force the recorder to flush a chunk every 250ms.
+        // Without it some browsers wait for stop() before producing any data,
+        // which loses the audio on very short taps.
+        mediaRecorder.start(250);
+        recordStartedAt = Date.now();
+        setRecording();
+    };
+
+    let recordStartedAt = 0;
+
+    const stopWithMinDuration = () => {
+        const elapsed = Date.now() - recordStartedAt;
+        const minMs = 350;
+        if (elapsed < minMs) {
+            // Give the recorder a moment to capture at least one chunk
+            // before we tell it to stop.
+            setTimeout(() => stop(), minMs - elapsed);
+        } else {
+            stop();
+        }
+    };
+
+    const stop = () => {
+        if (mediaRecorder && mediaRecorder.state !== 'inactive') {
+            mediaRecorder.stop();
+        }
+    };
+
+    micBtn.addEventListener('click', () => {
+        if (recording) {
+            stopWithMinDuration();
+        } else {
+            start();
+        }
+    });
+
+    setIdle();
+})();
+
 // Smart auto-scroll: pause when user scrolls up, resume when near bottom
 let _autoScrollEnabled = true;
 const _SCROLL_THRESHOLD = 80; // px from bottom to re-enable auto-scroll
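Review note: the MIME-to-extension lookup in the recorder's `onstop` handler in the hunk above could be isolated as a pure function, which would make it testable without a `MediaRecorder`. The sketch below is only an illustration. `EXT_BY_MIME` and `extForMime` are hypothetical names that do not appear in the patch, and the `trim()`/`toLowerCase()` normalisation is an added assumption rather than the patch's behaviour.

```javascript
// Hypothetical helper (not part of the patch): same table as the onstop handler.
const EXT_BY_MIME = {
    'audio/webm': 'webm', 'audio/ogg': 'ogg',
    'audio/mp4': 'm4a',   'audio/mpeg': 'mp3',
};

function extForMime(mimeType) {
    // MediaRecorder may report parameters, e.g. "audio/webm;codecs=opus",
    // so only the part before the first ';' is used for the lookup.
    // Assumption: trimming and lower-casing are extra hardening added here.
    const base = (mimeType || 'audio/webm').split(';')[0].trim().toLowerCase();
    // Unknown types fall back to 'webm', as in the patch.
    return EXT_BY_MIME[base] || 'webm';
}
```

With such a helper, the handler would reduce to `const ext = extForMime(mediaRecorder.mimeType);`.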
@@ -663,7 +1016,60 @@ const inputHistory = [];
 let historyIdx = -1;
 let historySavedDraft = '';
 
+// While an SSE stream is in flight, the send button morphs into a cancel
+// button. Only one in-flight request is supported at a time.
+let activeRequestId = null;
+let sendBtnMode = 'send'; // 'send' | 'cancel'
+
+function setSendBtnCancelMode(requestId) {
+    activeRequestId = requestId;
+    sendBtnMode = 'cancel';
+    sendBtn.disabled = false;
+    sendBtn.classList.add('send-btn-cancel');
+    sendBtn.title = (currentLang === 'zh' ? '中止' : 'Cancel');
+    sendBtn.innerHTML = '<i class="fas fa-stop text-sm"></i>';
+}
+
+function resetSendBtnSendMode() {
+    activeRequestId = null;
+    sendBtnMode = 'send';
+    sendBtn.classList.remove('send-btn-cancel');
+    sendBtn.title = '';
+    sendBtn.innerHTML = '<i class="fas fa-paper-plane text-sm"></i>';
+    updateSendBtnState();
+}
+
+function requestCancel() {
+    const reqId = activeRequestId;
+    if (!reqId) return;
+    fetch('/cancel', {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify({ request_id: reqId, session_id: sessionId, lang: currentLang }),
+    }).catch(err => {
+        console.warn('[cancel] request failed', err);
+    });
+    // Optimistic UI lock so the click visibly registers before the SSE
+    // "cancelled" event arrives.
+    sendBtn.disabled = true;
+    sendBtn.title = (currentLang === 'zh' ? '已中止' : 'Cancelled');
+}
+
+// Clicking the button is the only way to trigger requestCancel(). Pressing Enter
+// still calls sendMessage(), so "/cancel" is submitted as a regular slash command.
+sendBtn.addEventListener('click', () => {
+    if (sendBtnMode === 'cancel') {
+        requestCancel();
+    } else {
+        sendMessage();
+    }
+});
+
 function updateSendBtnState() {
+    if (sendBtnMode === 'cancel') {
+        // Don't downgrade a Cancel button on input edits.
+        return;
+    }
     sendBtn.disabled = uploadingCount > 0 || (!chatInput.value.trim() && pendingAttachments.length === 0);
 }
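Review note: the send/cancel button logic above amounts to a small two-state machine. The DOM-free model below is a sketch under simplifying assumptions: `createSendButtonModel` and its method names are hypothetical, and a single `hasInput` flag stands in for the real `uploadingCount`, input text and `pendingAttachments` checks.

```javascript
// Hypothetical DOM-free model of the send/cancel button; names are illustrative.
function createSendButtonModel() {
    const state = { mode: 'send', activeRequestId: null, disabled: true };
    return {
        state,
        // Mirrors setSendBtnCancelMode(): a stream started, the button becomes Cancel.
        enterCancel(requestId) {
            state.mode = 'cancel';
            state.activeRequestId = requestId;
            state.disabled = false;
        },
        // Mirrors resetSendBtnSendMode(): stream ended, recompute from the input.
        reset(hasInput) {
            state.mode = 'send';
            state.activeRequestId = null;
            state.disabled = !hasInput;
        },
        // Mirrors updateSendBtnState(): input edits never downgrade a Cancel button.
        onInputChange(hasInput) {
            if (state.mode === 'cancel') return;
            state.disabled = !hasInput;
        },
    };
}
```

The point of the early return in `onInputChange` is the same as in the patch: clearing the textarea while a stream is running must not disable the only control that can stop it.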
 
@@ -911,6 +1317,7 @@ const SLASH_COMMANDS = [
     { cmd: '/knowledge on',        desc: '开启知识库' },
     { cmd: '/knowledge off',       desc: '关闭知识库' },
     { cmd: '/config',              desc: '查看当前配置' },
+    { cmd: '/cancel',              desc: '中止当前正在运行的 Agent 任务' },
     { cmd: '/logs',                desc: '查看最近日志' },
     { cmd: '/version',             desc: '查看版本' },
 ];
@@ -1155,7 +1562,100 @@ document.querySelectorAll('.example-card').forEach(card => {
     });
 });
 
+// Voice-message variant of sendMessage(): renders a playable audio bubble
+// with the ASR caption, then dispatches the recognised text to /message
+// through the same SSE/loading flow as a typed message.
+function sendVoiceMessage(text, audioUrl) {
+    text = (text || '').trim();
+    if (!text) return;
+
+    inputHistory.push(text);
+    historyIdx = -1;
+    historySavedDraft = '';
+
+    const ws = document.getElementById('welcome-screen');
+    const isFirstMessage = !!ws;
+    if (ws) ws.remove();
+
+    const titleInfo = isFirstMessage ? { sid: sessionId, userMsg: text } : null;
+    const timestamp = new Date();
+    addUserVoiceMessage(audioUrl, text, timestamp);
+    const loadingEl = addLoadingIndicator();
+
+    const body = {
+        session_id: sessionId,
+        message: text,
+        stream: true,
+        timestamp: timestamp.toISOString(),
+        is_voice: true,
+        lang: currentLang,
+    };
+
+    const MAX_RETRIES = 2;
+    const RETRY_DELAY_MS = 1000;
+    function postWithRetry(attempt) {
+        fetch('/message', {
+            method: 'POST',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify(body)
+        })
+        .then(r => r.json())
+        .then(data => {
+            if (data.status === 'success') {
+                if (data.inline_reply) {
+                    // Synchronous fast-path reply (e.g. /cancel); skip SSE.
+                    loadingEl.remove();
+                    addBotMessage(data.inline_reply, new Date());
+                } else if (data.stream) {
+                    setSendBtnCancelMode(data.request_id);
+                    startSSE(data.request_id, loadingEl, timestamp, titleInfo);
+                } else {
+                    loadingContainers[data.request_id] = loadingEl;
+                }
+            } else {
+                loadingEl.remove();
+                addBotMessage(t('error_send'), new Date());
+                resetSendBtnSendMode();
+            }
+        })
+        .catch(err => {
+            if (attempt < MAX_RETRIES) {
+                setTimeout(() => postWithRetry(attempt + 1), RETRY_DELAY_MS * (attempt + 1));
+                return;
+            }
+            loadingEl.remove();
+            addBotMessage(t('error_send'), new Date());
+        });
+    }
+    postWithRetry(0);
+}
+
+function addUserVoiceMessage(audioUrl, caption, timestamp) {
+    const el = document.createElement('div');
+    el.className = 'flex justify-end px-4 sm:px-6 py-3';
+    // Voice-message bubble: compact voice pill on top, ASR caption beneath.
+    // The bubble keeps the same primary tint as a normal user message so
+    // it visually slots into the conversation flow.
+    el.innerHTML = `
+        <div class="max-w-[75%] sm:max-w-[60%]">
+            <div class="bg-slate-100 dark:bg-white/10 text-slate-700 dark:text-slate-200 rounded-2xl px-3 py-2 msg-content user-bubble">
+                <div class="user-voice-slot"></div>
+                ${caption ? `<div class="text-xs mt-1.5 leading-snug text-slate-500 dark:text-slate-400 whitespace-pre-wrap break-words">${escapeHtml(caption)}</div>` : ''}
+            </div>
+            <div class="text-xs text-slate-400 dark:text-slate-500 mt-1.5 text-right">${formatTime(timestamp)}</div>
+        </div>
+    `;
+    el.querySelector('.user-voice-slot').appendChild(renderVoicePill(audioUrl));
+    messagesDiv.appendChild(el);
+    _autoScrollEnabled = true;
+    scrollChatToBottom(true);
+}
+
 function sendMessage() {
+    // Do NOT branch on sendBtnMode here: Enter should always send (so
+    // typing "/cancel" submits normally). Cancel is wired only to the
+    // send button's pointer click — see send-btn listener above.
+
     const text = chatInput.value.trim();
     if (!text && pendingAttachments.length === 0) return;
 
@@ -1184,7 +1684,7 @@ function sendMessage() {
     renderAttachmentPreview();
     sendBtn.disabled = true;
 
-    const body = { session_id: sessionId, message: text, stream: true, timestamp: timestamp.toISOString() };
+    const body = { session_id: sessionId, message: text, stream: true, timestamp: timestamp.toISOString(), lang: currentLang };
     if (attachments.length > 0) {
         body.attachments = attachments.map(a => ({
             file_path: a.file_path,
@@ -1206,7 +1706,13 @@ function sendMessage() {
         .then(r => r.json())
         .then(data => {
             if (data.status === 'success') {
-                if (data.stream) {
+                if (data.inline_reply) {
+                    // Channel handled synchronously (e.g. /cancel fast-path);
+                    // render as a bot bubble and skip SSE entirely.
+                    loadingEl.remove();
+                    addBotMessage(data.inline_reply, new Date());
+                } else if (data.stream) {
+                    setSendBtnCancelMode(data.request_id);
                     startSSE(data.request_id, loadingEl, timestamp, titleInfo);
                 } else {
                     loadingContainers[data.request_id] = loadingEl;
@@ -1214,12 +1720,14 @@ function sendMessage() {
             } else {
                 loadingEl.remove();
                 addBotMessage(t('error_send'), new Date());
+                resetSendBtnSendMode();
             }
         })
         .catch(err => {
             if (err.name === 'AbortError') {
                 loadingEl.remove();
                 addBotMessage(t('error_timeout'), new Date());
+                resetSendBtnSendMode();
                 return;
             }
             if (attempt < MAX_RETRIES) {
@@ -1229,6 +1737,7 @@ function sendMessage() {
             }
             loadingEl.remove();
             addBotMessage(t('error_send'), new Date());
+            resetSendBtnSendMode();
         });
     }
 
@@ -1264,12 +1773,16 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                     <div class="agent-steps"></div>
                     <div class="answer-content sse-streaming"></div>
                     <div class="media-content"></div>
+                    <div class="bot-audio-slot"></div>
                 </div>
                 <div class="flex items-center gap-2 mt-1.5">
                     <span class="text-xs text-slate-400 dark:text-slate-500">${formatTime(timestamp)}</span>
                     <button class="copy-msg-btn text-xs text-slate-300 dark:text-slate-600 hover:text-slate-500 dark:hover:text-slate-400 transition-colors cursor-pointer" title="${currentLang === 'zh' ? '复制' : 'Copy'}" style="display:none">
                         <i class="fas fa-copy"></i>
                     </button>
+                    <button class="speak-msg-btn text-xs text-slate-300 dark:text-slate-600 hover:text-slate-500 dark:hover:text-slate-400 transition-colors cursor-pointer" title="${t('speak_msg')}" style="display:none;">
+                        <i class="fas fa-volume-up"></i>
+                    </button>
                 </div>
             </div>
         `;
@@ -1480,13 +1993,33 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                 stepsEl.appendChild(wrap);
                 scrollChatToBottom();
 
-            } else if (item.type === 'done') {
-                done = true;
-                es.close();
-                delete activeStreams[requestId];
+            } else if (item.type === 'cancelled') {
+                // Agent acknowledged the stop; mark the bubble. A trailing
+                // "done" still arrives with the partial answer.
+                ensureBotEl();
+                if (currentReasoningEl) {
+                    finalizeThinking(currentReasoningEl, reasoningStartTime, reasoningText);
+                    currentReasoningEl = null;
+                    reasoningText = '';
+                }
+                if (!botEl.querySelector('.agent-cancelled-tag')) {
+                    const tag = document.createElement('div');
+                    tag.className = 'agent-cancelled-tag text-xs text-amber-600 dark:text-amber-400 mt-1';
+                    tag.textContent = (currentLang === 'zh') ? '已中止' : 'Cancelled';
+                    stepsEl.appendChild(tag);
+                }
+                resetSendBtnSendMode();
 
-                // item.content may be empty when "done" is only a stream-close signal after media.
-                const finalText = item.content || accumulatedText;
+            } else if (item.type === 'done') {
+                // Don't close the stream yet: the backend keeps it open
+                // for a short tail to deliver async attachments such as
+                // TTS audio (`voice_attach`). If none arrives, the backend
+                // ends the stream when the tail expires, which fires onerror.
+                done = true;
+                resetSendBtnSendMode();
+
+                const finalTextRaw = item.content || accumulatedText;
+                const finalText = localizeCancelMarker(finalTextRaw);
 
                 if (!botEl && finalText) {
                     if (loadingEl) { loadingEl.remove(); loadingEl = null; }
@@ -1494,11 +2027,12 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                 } else if (botEl) {
                     contentEl.classList.remove('sse-streaming');
                     if (finalText) contentEl.innerHTML = renderMarkdown(finalText);
-                    contentEl.dataset.rawMd = finalText || '';
+                    contentEl.dataset.rawMd = finalTextRaw || '';
                     const copyBtn = botEl.querySelector('.copy-msg-btn');
                     if (copyBtn && finalText) copyBtn.style.display = '';
                     applyHighlighting(botEl);
                 }
+                renderBotSpeakerButton(botEl, finalText);
                 scrollChatToBottom();
 
                 if (titleInfo) {
@@ -1508,12 +2042,22 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                     loadSessionList();
                 }
 
+            } else if (item.type === 'voice_attach') {
+                // TTS finished — attach a playable audio element to the
+                // current bot bubble. The stream closes right after.
+                if (botEl && item.url) {
+                    attachAudioToBotBubble(botEl, item.url, { autoplay: true });
+                }
+                es.close();
+                delete activeStreams[requestId];
+
             } else if (item.type === 'error') {
                 done = true;
                 es.close();
                 delete activeStreams[requestId];
                 if (loadingEl) { loadingEl.remove(); loadingEl = null; }
                 addBotMessage(t('error_send'), new Date());
+                resetSendBtnSendMode();
             }
         };
 
@@ -1521,7 +2065,10 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
             es.close();
             delete activeStreams[requestId];
 
-            if (done) return;
+            if (done) {
+                // Normal close after the post-done tail expired; nothing to do.
+                return;
+            }
 
             if (currentReasoningEl) {
                 finalizeThinking(currentReasoningEl, reasoningStartTime, reasoningText);
@@ -1547,6 +2094,7 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                 applyHighlighting(botEl);
                 bindChatKnowledgeLinks(botEl);
             }
+            resetSendBtnSendMode();
         };
     }
 
@@ -1785,13 +2333,23 @@ function _renderSentFileFromToolResult(step) {
         `<i class="fas fa-file-download" style="color:#6b7280;"></i> ${escapeHtml(fileName)}</a></div>`;
 }
 
+// Cosmetic translator for cancel markers persisted in history.
+// History keeps the English canonical form for the LLM; only display is localized.
+function localizeCancelMarker(text) {
+    if (!text) return text;
+    if (currentLang !== 'zh') return text;
+    return text
+        .replace(/_\(Cancelled by user\)_/g, '_(用户已中止)_')
+        .replace(/_\(Cancelled\)_/g, '_(已中止)_');
+}
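Review note: a quick way to sanity-check the marker translation above is a pure variant that takes the language as a parameter instead of reading the global `currentLang`. `localizeCancelMarkerFor` is a hypothetical name used only for this sketch; the regular expressions are copied from the patch.

```javascript
// Hypothetical pure variant: language is a parameter instead of the global currentLang.
function localizeCancelMarkerFor(text, lang) {
    // Empty text and non-Chinese UIs keep the canonical English marker.
    if (!text || lang !== 'zh') return text;
    return text
        .replace(/_\(Cancelled by user\)_/g, '_(用户已中止)_')
        .replace(/_\(Cancelled\)_/g, '_(已中止)_');
}

const raw = 'Partial answer _(Cancelled by user)_';
console.log(localizeCancelMarkerFor(raw, 'zh')); // marker shown in Chinese
console.log(localizeCancelMarkerFor(raw, 'en')); // text returned unchanged
```

Because only the displayed string is translated, `dataset.rawMd` can keep the raw English text, which is what the `done` handler does with `finalTextRaw`.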
+
 function createBotMessageEl(content, timestamp, requestId, msg) {
     const el = document.createElement('div');
     el.className = 'flex gap-3 px-4 sm:px-6 py-3';
     if (requestId) el.dataset.requestId = requestId;
 
     let stepsHtml = '';
-    let displayContent = content;
+    let displayContent = localizeCancelMarker(content);
 
     if (msg && msg.steps && msg.steps.length > 0) {
         // New format: ordered steps with interleaved content
@@ -1812,21 +2370,174 @@ function createBotMessageEl(content, timestamp, requestId, msg) {
             <div class="bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-2xl px-4 py-3 text-sm leading-relaxed msg-content text-slate-700 dark:text-slate-200">
                 ${stepsHtml ? `<div class="agent-steps">${stepsHtml}</div>` : ''}
                 <div class="answer-content">${renderMarkdown(displayContent)}</div>
+                <div class="bot-audio-slot"></div>
             </div>
             <div class="flex items-center gap-2 mt-1.5">
                 <span class="text-xs text-slate-400 dark:text-slate-500">${formatTime(timestamp)}</span>
                 <button class="copy-msg-btn text-xs text-slate-300 dark:text-slate-600 hover:text-slate-500 dark:hover:text-slate-400 transition-colors cursor-pointer" title="${currentLang === 'zh' ? '复制' : 'Copy'}">
                     <i class="fas fa-copy"></i>
                 </button>
+                <button class="speak-msg-btn text-xs text-slate-300 dark:text-slate-600 hover:text-slate-500 dark:hover:text-slate-400 transition-colors cursor-pointer" title="${t('speak_msg')}" style="display:none;">
+                    <i class="fas fa-volume-up"></i>
+                </button>
             </div>
         </div>
     `;
     el.querySelector('.answer-content').dataset.rawMd = displayContent;
+    // Existing TTS attachment (history replay): mount the player up-front.
+    const existingAudio = msg && msg.extras && msg.extras.audio && msg.extras.audio.url;
+    if (existingAudio) {
+        attachAudioToBotBubble(el, existingAudio, { autoplay: false });
+    }
+    renderBotSpeakerButton(el, displayContent);
     applyHighlighting(el);
     bindChatKnowledgeLinks(el);
     return el;
 }
 
+// Append (or replace) a small audio player inside a bot bubble's
+// dedicated `.bot-audio-slot`. Used by both live TTS pushes and history
+// replay. Silent failures: never throws.
+function attachAudioToBotBubble(botEl, audioUrl, opts) {
+    try {
+        if (!botEl || !audioUrl) return;
+        const slot = botEl.querySelector('.bot-audio-slot');
+        if (!slot) return;
+        slot.innerHTML = '';
+        slot.style.marginTop = '6px';
+        const pill = renderVoicePill(audioUrl, { autoplay: !!(opts && opts.autoplay) });
+        slot.appendChild(pill);
+        const speakBtn = botEl.querySelector('.speak-msg-btn');
+        if (speakBtn) speakBtn.style.display = 'none';
+    } catch (_) { /* silent */ }
+}
+
+// Build a compact play/pause + progress + duration pill that wraps a
+// hidden <audio>. Returns the root element; safe to embed anywhere.
+function renderVoicePill(audioUrl, opts) {
+    opts = opts || {};
+    const wrap = document.createElement('div');
+    wrap.className = 'voice-pill';
+    wrap.innerHTML = `
+        <button type="button" class="voice-pill-btn" data-state="play" aria-label="play">
+            <i class="fas fa-play"></i>
+        </button>
+        <div class="voice-pill-track"><div class="voice-pill-fill"></div></div>
+        <span class="voice-pill-time">0:00</span>
+        <audio preload="metadata"></audio>
+    `;
+    const btn = wrap.querySelector('.voice-pill-btn');
+    const fill = wrap.querySelector('.voice-pill-fill');
+    const timeEl = wrap.querySelector('.voice-pill-time');
+    const audio = Object.assign(wrap.querySelector('audio'), { src: audioUrl }); // set as a property so the URL is never parsed as HTML
+
+    const fmt = (s) => {
+        if (!isFinite(s) || s < 0) s = 0;
+        const m = Math.floor(s / 60);
+        const r = Math.floor(s % 60);
+        return `${m}:${r < 10 ? '0' : ''}${r}`;
+    };
+    const setIcon = (state) => {
+        btn.dataset.state = state;
+        btn.querySelector('i').className = state === 'pause' ? 'fas fa-pause' : 'fas fa-play';
+        btn.setAttribute('aria-label', state === 'pause' ? 'pause' : 'play');
+    };
+
+    audio.addEventListener('loadedmetadata', () => {
+        if (audio.duration && isFinite(audio.duration)) timeEl.textContent = fmt(audio.duration);
+    });
+    audio.addEventListener('timeupdate', () => {
+        const dur = audio.duration || 0;
+        if (dur > 0) {
+            fill.style.width = `${Math.min(100, (audio.currentTime / dur) * 100)}%`;
+            timeEl.textContent = fmt(dur - audio.currentTime);
+        }
+    });
+    audio.addEventListener('ended', () => {
+        setIcon('play');
+        fill.style.width = '0%';
+        timeEl.textContent = fmt(audio.duration || 0);
+    });
+    audio.addEventListener('play',  () => setIcon('pause'));
+    audio.addEventListener('pause', () => setIcon('play'));
+
+    btn.addEventListener('click', (e) => {
+        e.stopPropagation();
+        if (audio.paused) {
+            audio.play().catch(() => {});
+        } else {
+            audio.pause();
+        }
+    });
+
+    if (opts.autoplay) {
+        // Autoplay may be blocked by the browser; fall back silently and
+        // let the user tap the play button.
+        const tryPlay = () => audio.play().catch(() => {});
+        if (audio.readyState >= 2) tryPlay();
+        else audio.addEventListener('canplay', tryPlay, { once: true });
+    }
+    return wrap;
+}
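Review note: the `fmt` closure inside `renderVoicePill` is easy to check in isolation. The standalone copy below uses the hypothetical name `formatClock`; the body follows the patch. It also shows why the guard matters: a `MediaRecorder` blob can report a non-finite `duration`, which would otherwise render as `NaN:NaN`.

```javascript
// Standalone copy of the pill's time formatter (hypothetical name: formatClock).
function formatClock(seconds) {
    // NaN, Infinity and negative values all collapse to 0:00.
    if (!isFinite(seconds) || seconds < 0) seconds = 0;
    const m = Math.floor(seconds / 60);
    const r = Math.floor(seconds % 60);
    // Zero-pad the seconds so 65 s renders as 1:05 rather than 1:5.
    return `${m}:${r < 10 ? '0' : ''}${r}`;
}

console.log(formatClock(65));       // "1:05"
console.log(formatClock(Infinity)); // "0:00"
```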
+
+// Show the manual "read aloud" button when TTS is configured but the
+// bubble has no audio yet. Lazily probes capability via /api/models so
+// we don't expose the button when nothing can synthesize speech.
+function renderBotSpeakerButton(botEl, text) {
+    if (!botEl || !text || !text.trim()) return;
+    const btn = botEl.querySelector('.speak-msg-btn');
+    if (!btn) return;
+    if (botEl.querySelector('.bot-audio-slot audio')) return;
+    _isTtsReady().then(ready => {
+        if (!ready) return;
+        btn.style.display = '';
+        btn.onclick = () => _triggerManualTts(btn, botEl, text);
+    });
+}
+
+let _ttsReadyPromise = null;
+let _ttsReadyTs = 0;
+function _isTtsReady() {
+    // Cache for 30s to avoid hammering /api/models on every bubble.
+    if (_ttsReadyPromise && Date.now() - _ttsReadyTs < 30000) {
+        return _ttsReadyPromise;
+    }
+    _ttsReadyTs = Date.now();
+    _ttsReadyPromise = fetch('/api/models')
+        .then(r => r.json())
+        .then(data => {
+            const tts = data && data.capabilities && data.capabilities.tts;
+            if (!tts) return false;
+            return Boolean(tts.current_provider || tts.suggested_provider);
+        })
+        .catch(() => false);
+    return _ttsReadyPromise;
+}
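Review note: `_isTtsReady` is an instance of a time-limited promise cache. A generic version might look like the sketch below. `cachedAsync`, `loader`, `ttlMs` and the injectable `now` clock are all hypothetical names introduced for illustration, not part of the patch.

```javascript
// Hypothetical generic helper: cache the promise returned by `loader` for `ttlMs`.
function cachedAsync(loader, ttlMs, now = Date.now) {
    let promise = null;
    let stamp = 0;
    return function get() {
        // Reuse the in-flight or settled promise while it is still fresh.
        if (promise && now() - stamp < ttlMs) return promise;
        stamp = now();
        // Failures resolve to false, matching the probe's "not ready" fallback.
        promise = Promise.resolve().then(loader).catch(() => false);
        return promise;
    };
}
```

Caching the promise rather than the resolved value means that several bubbles rendered at once share one request, which is the effect the 30-second cache in the patch is after. As in the patch, a failed probe is also cached for the full window.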
+
+function _triggerManualTts(btn, botEl, text) {
+    if (btn.dataset.busy === '1') return;
+    btn.dataset.busy = '1';
+    const icon = btn.querySelector('i');
+    const prev = icon ? icon.className : '';
+    if (icon) icon.className = 'fas fa-spinner fa-spin';
+    fetch('/api/voice/tts', {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify({ text, session_id: sessionId }),
+    })
+        .then(r => r.json())
+        .then(data => {
+            if (data && data.status === 'success' && data.audio_url) {
+                attachAudioToBotBubble(botEl, data.audio_url, { autoplay: true });
+            }
+        })
+        .catch(() => {})
+        .finally(() => {
+            btn.dataset.busy = '0';
+            if (icon) icon.className = prev || 'fas fa-volume-up';
+        });
+}
+
 function addUserMessage(content, timestamp, attachments) {
     const el = createUserMessageEl(content, timestamp, attachments);
     messagesDiv.appendChild(el);
@@ -2478,7 +3189,12 @@ let cfgProviderValue = '';
 let cfgModelValue = '';
 
 // --- Custom dropdown helper ---
-function initDropdown(el, options, selectedValue, onChange) {
+function initDropdown(el, options, selectedValue, onChange, opts) {
+    // opts.placeholder: when set AND selectedValue is empty, render that text
+    // in a dim style instead of auto-selecting options[0]. Useful for
+    // "pick or empty" capabilities (asr / embedding) where we want the
+    // user to make an explicit choice.
+    opts = opts || {};
     const textEl = el.querySelector('.cfg-dropdown-text');
     const menuEl = el.querySelector('.cfg-dropdown-menu');
     const selEl = el.querySelector('.cfg-dropdown-selected');
@@ -2491,8 +3207,23 @@ function initDropdown(el, options, selectedValue, onChange) {
         options.forEach(opt => {
             const item = document.createElement('div');
             item.className = 'cfg-dropdown-item' + (opt.value === el._ddValue ? ' active' : '');
-            item.textContent = opt.label;
             item.dataset.value = opt.value;
+            // Hint is an optional dim secondary label rendered on the right
+            // side of the row (e.g. friendly brand name next to a technical
+            // model id). When absent the row degrades to the original
+            // single-string layout.
+            if (opt.hint) {
+                const labelEl = document.createElement('span');
+                labelEl.className = 'cfg-dropdown-label';
+                labelEl.textContent = opt.label;
+                const hintEl = document.createElement('span');
+                hintEl.className = 'cfg-dropdown-hint';
+                hintEl.textContent = opt.hint;
+                item.appendChild(labelEl);
+                item.appendChild(hintEl);
+            } else {
+                item.textContent = opt.label;
+            }
             item.addEventListener('click', (e) => {
                 e.stopPropagation();
                 el._ddValue = opt.value;
@@ -2505,8 +3236,20 @@ function initDropdown(el, options, selectedValue, onChange) {
             menuEl.appendChild(item);
         });
         const sel = options.find(o => o.value === el._ddValue);
-        textEl.textContent = sel ? sel.label : (options[0] ? options[0].label : '--');
-        if (!sel && options[0]) el._ddValue = options[0].value;
+        if (sel) {
+            textEl.textContent = sel.label;
+            textEl.classList.remove('text-slate-400', 'dark:text-slate-500');
+        } else if (opts.placeholder && !el._ddValue) {
+            // No selection yet — show the placeholder in muted style.
+            // Do NOT write a fallback value, so the dropdown stays
+            // "unsaved" until the user explicitly picks.
+            textEl.textContent = opts.placeholder;
+            textEl.classList.add('text-slate-400', 'dark:text-slate-500');
+        } else {
+            textEl.textContent = options[0] ? options[0].label : '--';
+            textEl.classList.remove('text-slate-400', 'dark:text-slate-500');
+            if (options[0]) el._ddValue = options[0].value;
+        }
     }
 
     render();
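Review note: the three-way selection logic added to `render()` above can be expressed as a pure function, which makes the "placeholder never writes a fallback value" rule explicit. `resolveDropdownDisplay` and the shape of its return value are assumptions made for this sketch.

```javascript
// Hypothetical pure version of the selection logic in render().
function resolveDropdownDisplay(options, currentValue, placeholder) {
    const sel = options.find(o => o.value === currentValue);
    if (sel) {
        // A real selection: show its label in the normal style.
        return { text: sel.label, value: currentValue, muted: false };
    }
    if (placeholder && !currentValue) {
        // Nothing chosen yet: show the placeholder, leave the value untouched.
        return { text: placeholder, value: currentValue, muted: true };
    }
    // Legacy behaviour: fall back to the first option, or '--' when there is none.
    const first = options[0];
    return {
        text: first ? first.label : '--',
        value: first ? first.value : currentValue,
        muted: false,
    };
}
```

Note that a stale non-empty value that matches no option still falls through to the first option even when a placeholder is supplied, which mirrors the `!el._ddValue` condition in the patch.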
@@ -2535,7 +3278,7 @@ function initConfigView(data) {
     configCurrentModel = data.model || '';
 
     const providerEl = document.getElementById('cfg-provider');
-    const providerOpts = Object.entries(configProviders).map(([pid, p]) => ({ value: pid, label: p.label }));
+    const providerOpts = Object.entries(configProviders).map(([pid, p]) => ({ value: pid, label: localizedLabel(p.label) }));
 
     // if use_linkai is enabled, always select linkai as the provider
     // Otherwise prefer bot_type from config, fall back to model-based detection
@@ -3113,12 +3856,14 @@ function closeMemoryViewer() {
 // =====================================================================
 // Custom Confirm Dialog
 // =====================================================================
-function showConfirmDialog({ title, message, okText, cancelText, onConfirm }) {
+function showConfirmDialog({ title, message, okText, cancelText, onConfirm, hideCancel }) {
     const overlay = document.getElementById('confirm-dialog-overlay');
     document.getElementById('confirm-dialog-title').textContent = title || '';
     document.getElementById('confirm-dialog-message').textContent = message || '';
     document.getElementById('confirm-dialog-ok').textContent = okText || 'OK';
-    document.getElementById('confirm-dialog-cancel').textContent = cancelText || t('channels_cancel');
+    const cancelBtn = document.getElementById('confirm-dialog-cancel');
+    cancelBtn.textContent = cancelText || t('channels_cancel');
+    cancelBtn.classList.toggle('hidden', !!hideCancel);
 
     function cleanup() {
         overlay.classList.add('hidden');
@@ -3131,13 +3876,1474 @@ function showConfirmDialog({ title, message, okText, cancelText, onConfirm }) {
     function onOverlayClick(e) { if (e.target === overlay) cleanup(); }
 
     const okBtn = document.getElementById('confirm-dialog-ok');
-    const cancelBtn = document.getElementById('confirm-dialog-cancel');
     okBtn.addEventListener('click', onOk);
     cancelBtn.addEventListener('click', onCancel);
     overlay.addEventListener('click', onOverlayClick);
     overlay.classList.remove('hidden');
 }
 
+// =====================================================================
+// Models View
+// =====================================================================
+// Capability cards rendered on the Models page. Order matters — main model
+// comes first because it transitively decides defaults for vision and image.
+// Icon palette is grouped by capability family:
+//   - chat                       → primary (brand green; the "main" capability)
+//   - vision + image             → blue    (everything visual)
+//   - asr + tts                  → amber   (everything audio)
+//   - embedding                  → purple  (vectors)
+//   - search                     → orange  (retrieval)
+// Each card spells out its `iconChip` and `iconGlyph` classes as literal
+// strings so Tailwind's CDN JIT can see the class names; dynamic
+// `bg-${color}-50` strings would not be picked up reliably.
+const MODELS_CAPABILITY_DEFS = [
+    { id: 'chat',      icon: 'fa-microchip',        editable: true,  needsModel: true,  titleKey: 'models_capability_chat',      descKey: 'models_capability_chat_desc',
+      iconChip: 'bg-primary-50 dark:bg-primary-900/30',  iconGlyph: 'text-primary-500' },
+    { id: 'vision',    icon: 'fa-eye',              editable: true,  needsModel: true,  titleKey: 'models_capability_vision',    descKey: 'models_capability_vision_desc',
+      iconChip: 'bg-blue-50 dark:bg-blue-900/30',        iconGlyph: 'text-blue-500' },
+    { id: 'image',     icon: 'fa-image',            editable: true,  needsModel: true,  titleKey: 'models_capability_image',     descKey: 'models_capability_image_desc',
+      iconChip: 'bg-blue-50 dark:bg-blue-900/30',        iconGlyph: 'text-blue-500' },
+    { id: 'asr',       icon: 'fa-microphone',       editable: true,  needsModel: false, titleKey: 'models_capability_asr',       descKey: 'models_capability_asr_desc',
+      iconChip: 'bg-amber-50 dark:bg-amber-900/30',      iconGlyph: 'text-amber-500' },
+    { id: 'tts',       icon: 'fa-volume-high',      editable: true,  needsModel: true,  titleKey: 'models_capability_tts',       descKey: 'models_capability_tts_desc',
+      iconChip: 'bg-amber-50 dark:bg-amber-900/30',      iconGlyph: 'text-amber-500' },
+    { id: 'embedding', icon: 'fa-vector-square',    editable: true,  needsModel: false, titleKey: 'models_capability_embedding', descKey: 'models_capability_embedding_desc',
+      iconChip: 'bg-purple-50 dark:bg-purple-900/30',    iconGlyph: 'text-purple-500' },
+    { id: 'search',    icon: 'fa-magnifying-glass', editable: true,  needsModel: false, titleKey: 'models_capability_search',    descKey: 'models_capability_search_desc',
+      iconChip: 'bg-orange-50 dark:bg-orange-900/30',    iconGlyph: 'text-orange-500' },
+];
+
+// Provider logos: when a real SVG exists under static/logos/<id>.svg
+// (requested via MODELS_PROVIDER_LOGO_PATH) we use it; otherwise we fall
+// back to a neutral monogram chip. SVGs are loaded via <img>, and an onerror
+// handler removes the broken image so layout stays stable when files are
+// absent. Vendors whose mark is rendered in pure (or near-pure) black are
+// listed in MODELS_PROVIDER_LOGO_DARK_INVERT; for those, we apply a CSS
+// invert filter in dark mode so the glyph stays visible against #1A1A1A.
+const MODELS_PROVIDER_LOGO_PATH = 'assets/logos';
+const MODELS_PROVIDER_LOGO_DARK_INVERT = new Set([
+    'openai',     // black wordmark
+    'moonshot',   // dark monogram
+    'zhipu',      // dark monogram
+    'custom',     // single-color slider glyph
+]);
+
+let modelsState = { providers: [], capabilities: {} };
+
+// One-shot: { capabilityId, providerId } stashed before a Models reload,
+// consumed by renderCapabilityBody to preselect a just-configured vendor.
+let pendingCapabilitySelection = null;
+
+// `opts.preserveScroll` keeps the page's vertical scroll position across the
+// refresh. We capture it before unhiding the loading skeleton (which collapses
+// content height to zero) and restore it after the new content is mounted.
+// This matters when the user configures a vendor from inside a capability
+// card's dropdown — without preservation, the post-save reload bounces them
+// back to the top of the page, away from the card they were configuring.
+function loadModelsView(opts) {
+    const loading = document.getElementById('models-loading');
+    const content = document.getElementById('models-content');
+    if (!loading || !content) return;
+    const preserveScroll = !!(opts && opts.preserveScroll);
+    // The Models pane has its own scrollable container; capture its position
+    // (not window.scrollY) so we can put the user back exactly where they were.
+    const scroller = document.querySelector('#view-models .overflow-y-auto');
+    const savedTop = preserveScroll && scroller ? scroller.scrollTop : null;
+
+    loading.classList.remove('hidden');
+    content.classList.add('hidden');
+
+    fetch('/api/models').then(r => r.json()).then(data => {
+        if (data.status !== 'success') {
+            loading.innerHTML = `<span class="text-sm text-red-400">${escapeHtml(data.message || 'Failed to load')}</span>`;
+            return;
+        }
+        modelsState.providers = data.providers || [];
+        modelsState.capabilities = data.capabilities || {};
+        renderModelsView();
+        loading.classList.add('hidden');
+        content.classList.remove('hidden');
+        if (savedTop !== null && scroller) {
+            // Wait one frame for the new layout to settle, otherwise the
+            // restored scrollTop snaps to the previous (smaller) max.
+            requestAnimationFrame(() => { scroller.scrollTop = savedTop; });
+        }
+    }).catch(err => {
+        loading.innerHTML = `<span class="text-sm text-red-400">${escapeHtml(String(err))}</span>`;
+    });
+}
+
+function renderModelsView() {
+    const container = document.getElementById('models-content');
+    container.innerHTML = '';
+    container.appendChild(renderVendorsSection());
+    MODELS_CAPABILITY_DEFS.forEach(def => container.appendChild(renderCapabilityCard(def)));
+}
+
+// ---------- Vendor section (Layer 1) -----------------------------------
+
+function renderVendorsSection() {
+    const wrap = document.createElement('div');
+    wrap.className = 'bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 p-6';
+
+    const configured = modelsState.providers.filter(p => p.configured);
+
+    const header = `
+        <div class="flex items-start gap-3 mb-5">
+            <div class="w-9 h-9 rounded-lg bg-primary-50 dark:bg-primary-900/30 flex items-center justify-center flex-shrink-0">
+                <i class="fas fa-key text-primary-500 text-sm"></i>
+            </div>
+            <div class="flex-1 min-w-0">
+                <h3 class="font-semibold text-slate-800 dark:text-slate-100">${t('models_section_vendors')}</h3>
+                <p class="text-xs text-slate-500 dark:text-slate-400 mt-0.5">${t('models_section_vendors_desc')}</p>
+            </div>
+            <span class="text-xs text-slate-400 dark:text-slate-500 mt-2 flex-shrink-0">${configured.length}/${modelsState.providers.length}</span>
+        </div>`;
+
+    let body;
+    if (configured.length === 0) {
+        body = `
+            <div class="flex flex-col items-center justify-center py-8 px-4 rounded-lg border border-dashed border-slate-200 dark:border-white/10">
+                <p class="text-sm text-slate-500 dark:text-slate-400 text-center">${t('models_not_configured')}</p>
+                <button onclick="openVendorModal('')"
+                        class="mt-3 px-3 py-1.5 rounded-lg text-xs font-medium bg-primary-50 dark:bg-primary-900/30 text-primary-600 dark:text-primary-400 hover:bg-primary-100 dark:hover:bg-primary-900/50 cursor-pointer transition-colors">
+                    <i class="fas fa-plus text-[10px] mr-1"></i>${t('models_add_vendor')}
+                </button>
+            </div>`;
+    } else {
+        body = `<div class="grid grid-cols-1 sm:grid-cols-2 gap-3">
+            ${configured.map(renderVendorChip).join('')}
+        </div>`;
+    }
+
+    wrap.innerHTML = header + body;
+    return wrap;
+}
+
+function renderVendorChip(p) {
+    // The masked API key is intentionally not surfaced here; it is shown
+    // inside the edit modal so the chip stays uncluttered and scannable.
+    return `
+        <button onclick="openVendorModal('${escapeHtml(p.id)}')"
+                class="group flex items-center gap-3 px-3 py-2.5 rounded-lg border border-slate-200 dark:border-white/10
+                       bg-slate-50 dark:bg-white/5 hover:border-primary-300 dark:hover:border-primary-500/50
+                       cursor-pointer transition-colors duration-150 text-left">
+            ${renderProviderLogo(p, 28)}
+            <span class="flex-1 min-w-0 text-sm font-medium text-slate-800 dark:text-slate-100 truncate">${escapeHtml(localizedLabel(p.label))}</span>
+            <i class="fas fa-pen-to-square text-[11px] text-slate-400 dark:text-slate-500 group-hover:text-primary-500 transition-colors"></i>
+        </button>`;
+}
+
+// Render a uniformly-styled logo for a provider. Tries an SVG asset first; if
+// it 404s the <img> swaps itself for a monogram fallback via onerror.
+function renderProviderLogo(p, sizePx) {
+    const initial = (localizedLabel(p.label) || p.id || '?').slice(0, 1).toUpperCase();
+    const sz = sizePx || 32;
+    const url = `${MODELS_PROVIDER_LOGO_PATH}/${encodeURIComponent(p.id)}.svg`;
+    const fallbackId = `pl-${p.id}-${Math.random().toString(36).slice(2, 8)}`;
+    const imgClass = MODELS_PROVIDER_LOGO_DARK_INVERT.has(p.id)
+        ? 'absolute inset-0 m-auto provider-logo-img provider-logo-invert-dark'
+        : 'absolute inset-0 m-auto provider-logo-img';
+    return `
+        <span class="relative flex items-center justify-center rounded-lg bg-slate-100 dark:bg-white/10
+                     text-slate-600 dark:text-slate-300 flex-shrink-0 overflow-hidden"
+              style="width:${sz}px;height:${sz}px;">
+            <span id="${fallbackId}" class="text-xs font-bold">${escapeHtml(initial)}</span>
+            <img src="${url}" alt="" aria-hidden="true"
+                 class="${imgClass}"
+                 style="width:${Math.round(sz * 0.65)}px;height:${Math.round(sz * 0.65)}px;"
+                 onload="(function(el){var f=document.getElementById('${fallbackId}');if(f)f.style.display='none';})(this)"
+                 onerror="this.remove();">
+        </span>`;
+}
+
+// ---------- Capability cards (Layer 2) ---------------------------------
+
+function renderCapabilityCard(def) {
+    const cap = modelsState.capabilities[def.id] || {};
+    const wrap = document.createElement('div');
+    wrap.className = 'bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 p-6';
+    wrap.id = `models-card-${def.id}`;
+
+    const headerRight = renderCapabilityHeaderTag(def, cap);
+
+    wrap.innerHTML = `
+        <div class="flex items-start gap-3 mb-5">
+            <div class="w-9 h-9 rounded-lg ${def.iconChip} flex items-center justify-center flex-shrink-0">
+                <i class="fas ${def.icon} ${def.iconGlyph} text-sm"></i>
+            </div>
+            <div class="flex-1 min-w-0">
+                <h3 class="font-semibold text-slate-800 dark:text-slate-100">${t(def.titleKey)}</h3>
+                <p class="text-xs text-slate-500 dark:text-slate-400 mt-0.5">${t(def.descKey)}</p>
+            </div>
+            ${headerRight}
+        </div>
+        <div class="space-y-4" data-cap-body="${def.id}"></div>`;
+
+    const body = wrap.querySelector(`[data-cap-body="${def.id}"]`);
+    renderCapabilityBody(def, cap, body);
+    return wrap;
+}
+
+function renderCapabilityHeaderTag(def, cap) {
+    return '';
+}
+
+function _searchProviderLabel(cap, providerId) {
+    const list = (cap && cap.providers) || [];
+    const hit = list.find(p => p.id === providerId);
+    return hit ? localizedLabel(hit.label) : providerId;
+}
+
+// Search card body: strategy picker + (when fixed) provider picker + a
+// status row that surfaces which providers are ready and how to add the
+// missing ones. Three of the four backends piggy-back on model-vendor
+// credentials (zhipu / qianfan / linkai); bocha owns its own key under
+// tools.web_search and gets its own minimal credential modal.
+function renderSearchCapability(def, cap, body) {
+    const providers = cap.providers || [];
+    const configuredIds = cap.configured_providers || [];
+    const hasAny = configuredIds.length > 0;
+    const strategy = cap.strategy || 'auto';
+
+    body.innerHTML = `
+        <div>
+            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">${t('models_search_strategy_label')}</label>
+            <div id="cap-search-strategy" class="cfg-dropdown" tabindex="0">
+                <div class="cfg-dropdown-selected">
+                    <span class="cfg-dropdown-text">--</span>
+                    <i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
+                </div>
+                <div class="cfg-dropdown-menu"></div>
+            </div>
+        </div>
+        <div id="cap-search-provider-wrap" class="hidden">
+            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">${t('models_provider')}</label>
+            <div id="cap-search-provider" class="cfg-dropdown" tabindex="0">
+                <div class="cfg-dropdown-selected">
+                    <span class="cfg-dropdown-text">--</span>
+                    <i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
+                </div>
+                <div class="cfg-dropdown-menu"></div>
+            </div>
+        </div>
+        <div id="cap-search-summary"></div>
+        <div class="flex items-center justify-end gap-3 pt-1">
+            <span id="cap-search-status" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
+            <button onclick="saveSearchCapability()"
+                    class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
+                           cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed">
+                ${t('save')}
+            </button>
+        </div>
+    `;
+
+    // Strategy dropdown: when no provider is configured the strategy
+    // value is meaningless, so we show a "pending configuration"
+    // placeholder instead of a default selection. Once any provider is
+    // configured, the saved strategy (or "auto") becomes the active value.
+    initDropdown(
+        body.querySelector('#cap-search-strategy'),
+        [
+            { value: 'auto',  label: t('models_strategy_auto'),         hint: t('models_search_strategy_auto_hint') },
+            { value: 'fixed', label: t('models_search_strategy_fixed'), hint: t('models_search_strategy_fixed_hint') },
+        ],
+        hasAny ? strategy : '',
+        (value) => _onSearchStrategyChange(cap, value, body),
+        hasAny ? null : { placeholder: t('models_pending_config') },
+    );
+
+    // Provider dropdown — populated with configured providers only;
+    // unconfigured ones cannot be pinned (they'd silently fall back).
+    const provOpts = configuredIds.map(id => ({
+        value: id,
+        label: _searchProviderLabel(cap, id),
+    }));
+    if (provOpts.length === 0) provOpts.push({ value: '', label: '--' });
+    initDropdown(
+        body.querySelector('#cap-search-provider'),
+        provOpts,
+        cap.fixed_provider || configuredIds[0] || '',
+        () => {},
+    );
+
+    _renderSearchSummary(body, cap);
+    _setSearchProviderPickerVisible(body, strategy === 'fixed' && hasAny);
+}
+
+function _onSearchStrategyChange(cap, value, body) {
+    const configuredIds = cap.configured_providers || [];
+    _setSearchProviderPickerVisible(body, value === 'fixed' && configuredIds.length > 0);
+}
+
+function _setSearchProviderPickerVisible(body, visible) {
+    const wrap = body.querySelector('#cap-search-provider-wrap');
+    if (!wrap) return;
+    if (visible) wrap.classList.remove('hidden');
+    else wrap.classList.add('hidden');
+}
+
+// Search summary line: just lists configured providers + a trailing "+
+// add" button. Unconfigured backends are hidden — the user picks one from
+// a small chooser when they click add. Empty state surfaces the same add
+// button as a primary CTA.
+function _renderSearchSummary(body, cap) {
+    const host = body.querySelector('#cap-search-summary');
+    if (!host) return;
+    const providers = cap.providers || [];
+    const configured = providers.filter(p => p.configured);
+    const missing = providers.filter(p => !p.configured);
+
+    const addBtn = missing.length
+        ? `<button type="button" id="cap-search-add-btn"
+                  class="inline-flex items-center gap-1 px-2 py-0.5 text-[11px] rounded-md cursor-pointer
+                         bg-slate-100 dark:bg-white/5 text-slate-500 dark:text-slate-400
+                         hover:bg-slate-200 dark:hover:bg-white/10 transition-colors">
+              <i class="fas fa-plus text-[10px]"></i>${t('models_search_add_provider')}
+           </button>`
+        : '';
+
+    if (configured.length === 0) {
+        host.innerHTML = `
+            <div class="flex items-center gap-2 text-xs text-slate-500 dark:text-slate-400">
+                <i class="fas fa-circle-info text-[10px] text-amber-500"></i>
+                <span>${t('models_search_none_configured')}</span>
+                ${addBtn}
+            </div>
+        `;
+    } else {
+        const chips = configured.map(p => `
+            <button type="button" data-search-edit-provider="${escapeHtml(p.id)}"
+                    title="${t('models_search_edit_hint')}"
+                    class="inline-flex items-center gap-1 px-2 py-0.5 text-[11px] rounded-md cursor-pointer
+                           bg-emerald-50 dark:bg-emerald-900/30 text-emerald-600 dark:text-emerald-400
+                           hover:bg-emerald-100 dark:hover:bg-emerald-900/50 transition-colors">
+                <i class="fas fa-check text-[10px]"></i>${escapeHtml(localizedLabel(p.label))}
+            </button>
+        `).join('');
+        host.innerHTML = `
+            <div class="flex items-center flex-wrap gap-2 text-xs text-slate-500 dark:text-slate-400">
+                <span>${t('models_search_available_label')}</span>
+                ${chips}
+                ${addBtn}
+            </div>
+        `;
+    }
+
+    const addBtnEl = host.querySelector('#cap-search-add-btn');
+    if (addBtnEl) {
+        addBtnEl.addEventListener('click', (ev) => {
+            ev.preventDefault();
+            openSearchAddProviderPicker(missing);
+        });
+    }
+    host.querySelectorAll('[data-search-edit-provider]').forEach(el => {
+        el.addEventListener('click', (ev) => {
+            ev.preventDefault();
+            const pid = el.getAttribute('data-search-edit-provider');
+            const meta = (cap.providers || []).find(p => p.id === pid);
+            _launchSearchProviderConfig(pid, meta);
+        });
+    });
+}
+
+// Two-step add flow: click "+ Add vendor" -> chooser dialog -> per-provider
+// credential editor. When only one provider is missing, the chooser is
+// skipped. Bocha lands on the dedicated key modal; the others piggy-back on
+// the existing vendor credential modal.
+function openSearchAddProviderPicker(missingProviders) {
+    if (!missingProviders || missingProviders.length === 0) return;
+    if (missingProviders.length === 1) {
+        _launchSearchProviderConfig(missingProviders[0].id);
+        return;
+    }
+
+    const existing = document.getElementById('search-add-modal');
+    if (existing) existing.remove();
+
+    const rows = missingProviders.map(p => `
+        <button type="button" data-pid="${escapeHtml(p.id)}"
+                class="w-full flex items-center justify-between px-3 py-2.5 rounded-lg cursor-pointer
+                       bg-slate-50 dark:bg-white/5 hover:bg-slate-100 dark:hover:bg-white/10
+                       text-sm text-slate-700 dark:text-slate-200 transition-colors">
+            <span>${escapeHtml(localizedLabel(p.label))}</span>
+            <i class="fas fa-chevron-right text-[10px] text-slate-400"></i>
+        </button>
+    `).join('');
+
+    const modal = document.createElement('div');
+    modal.id = 'search-add-modal';
+    modal.className = 'fixed inset-0 z-50 flex items-center justify-center bg-black/40 backdrop-blur-sm';
+    modal.innerHTML = `
+        <div class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10
+                    w-full max-w-md mx-4 p-6 shadow-xl">
+            <h3 class="text-lg font-semibold text-slate-800 dark:text-slate-100 mb-1">${t('models_search_add_provider')}</h3>
+            <p class="text-xs text-slate-500 dark:text-slate-400 mb-4">${t('models_search_add_desc')}</p>
+            <div class="space-y-2">${rows}</div>
+            <div class="flex items-center justify-end mt-5">
+                <button type="button" onclick="document.getElementById('search-add-modal').remove()"
+                        class="px-3 py-1.5 rounded-md text-sm text-slate-600 dark:text-slate-300
+                               hover:bg-slate-100 dark:hover:bg-white/5 transition-colors">
+                    ${t('cancel')}
+                </button>
+            </div>
+        </div>
+    `;
+    document.body.appendChild(modal);
+    modal.querySelectorAll('[data-pid]').forEach(el => {
+        el.addEventListener('click', () => {
+            const pid = el.getAttribute('data-pid');
+            modal.remove();
+            _launchSearchProviderConfig(pid);
+        });
+    });
+}
+
+function _launchSearchProviderConfig(providerId, providerMeta) {
+    if (providerId === 'bocha') {
+        openSearchBochaModal(providerMeta);
+    } else {
+        openVendorModal(providerId, () => loadModelsView({ preserveScroll: true }));
+    }
+}
+
+function saveSearchCapability() {
+    const strategyDd = document.getElementById('cap-search-strategy');
+    const providerDd = document.getElementById('cap-search-provider');
+    const strategy = strategyDd ? getDropdownValue(strategyDd) : 'auto';
+    const provider = (strategy === 'fixed' && providerDd) ? getDropdownValue(providerDd) : '';
+
+    fetch('/api/models', {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify({
+            action: 'set_capability',
+            capability: 'search',
+            strategy,
+            provider,
+        }),
+    }).then(r => r.json()).then(data => {
+        if (data.status === 'success') {
+            showStatus('cap-search-status', 'models_save_success', false);
+            setTimeout(() => loadModelsView({ preserveScroll: true }), 400);
+        } else {
+            showStatus('cap-search-status', 'models_save_failed', true);
+        }
+    }).catch(() => showStatus('cap-search-status', 'models_save_failed', true));
+}
+
+// Minimal bocha API-key modal. Reusing the existing vendor-modal markup
+// helpers would be nice, but bocha isn't in PROVIDER_MODELS (it's not a
+// model vendor), so we render a tiny dedicated dialog.
+function openSearchBochaModal(providerMeta) {
+    const existing = document.getElementById('search-bocha-modal');
+    if (existing) existing.remove();
+
+    let masked = (providerMeta && providerMeta.api_key_masked) || '';
+    if (!masked) {
+        const searchCap = (modelsState && modelsState.capabilities && modelsState.capabilities.search) || {};
+        const bocha = (searchCap.providers || []).find(p => p.id === 'bocha');
+        if (bocha && bocha.api_key_masked) masked = bocha.api_key_masked;
+    }
+    const hasKey = !!masked;
+    const clearBtnHtml = hasKey
+        ? `<button type="button" id="search-bocha-clear"
+                  class="px-3 py-1.5 rounded-md text-xs text-red-500 dark:text-red-400
+                         hover:bg-red-50 dark:hover:bg-red-900/20 cursor-pointer transition-colors">
+              ${t('models_clear_credential')}
+           </button>`
+        : '';
+
+    const modal = document.createElement('div');
+    modal.id = 'search-bocha-modal';
+    modal.className = 'fixed inset-0 z-50 flex items-center justify-center bg-black/40 backdrop-blur-sm';
+    modal.innerHTML = `
+        <div id="search-bocha-modal-card"
+             class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10
+                    w-full max-w-md mx-4 p-6 shadow-xl">
+            <h3 class="text-lg font-semibold text-slate-800 dark:text-slate-100 mb-1">${t('models_search_bocha_title')}</h3>
+            <p class="text-xs text-slate-500 dark:text-slate-400 mb-4">${t('models_search_bocha_desc')}</p>
+            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">API Key</label>
+            <input id="search-bocha-key" type="text" autocomplete="off" data-1p-ignore data-lpignore="true"
+                   class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
+                          bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
+                          focus:outline-none focus:border-primary-500 font-mono ${hasKey ? 'cfg-key-masked' : ''}"
+                   value="${escapeHtml(masked)}"
+                   data-masked="${hasKey ? '1' : ''}"
+                   placeholder="sk-..." />
+            <div class="flex items-center justify-between gap-3 mt-5">
+                <div>${clearBtnHtml}</div>
+                <div class="flex items-center gap-3">
+                    <button type="button" onclick="document.getElementById('search-bocha-modal').remove()"
+                            class="px-3 py-1.5 rounded-md text-sm text-slate-600 dark:text-slate-300
+                                   hover:bg-slate-100 dark:hover:bg-white/5 transition-colors">
+                        ${t('cancel')}
+                    </button>
+                    <button type="button" onclick="_saveBochaKey()"
+                            class="px-4 py-1.5 rounded-md bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
+                                   cursor-pointer transition-colors">
+                        ${t('save')}
+                    </button>
+                </div>
+            </div>
+        </div>
+    `;
+    document.body.appendChild(modal);
+
+    // Reset masked sentinel as soon as the user starts editing so the save
+    // handler can tell apart "kept the existing key" vs "typed a new one".
+    const input = document.getElementById('search-bocha-key');
+    if (input) {
+        const unmask = () => {
+            if (input.dataset.masked === '1') {
+                input.value = '';
+                input.dataset.masked = '';
+                input.classList.remove('cfg-key-masked');
+            }
+        };
+        input.addEventListener('keydown', (e) => {
+            if (e.key === 'Tab' || e.key === 'Escape') return;
+            unmask();
+        });
+        input.addEventListener('paste', unmask);
+        if (!hasKey) setTimeout(() => input.focus(), 50);
+    }
+    const clearBtn = document.getElementById('search-bocha-clear');
+    if (clearBtn) clearBtn.addEventListener('click', _clearBochaKey);
+
+    modal.addEventListener('mousedown', (e) => {
+        if (e.target === modal) modal.remove();
+    });
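+    const onKey = (e) => {
+        // Drop the listener once the modal is gone, however it was closed
+        // (cancel, save, clear, or backdrop click), so it does not leak.
+        if (!document.body.contains(modal)) {
+            document.removeEventListener('keydown', onKey);
+            return;
+        }
+        if (e.key === 'Escape') {
+            modal.remove();
+            document.removeEventListener('keydown', onKey);
+        }
+    };
+    document.addEventListener('keydown', onKey);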
+}
+
+function _saveBochaKey() {
+    const input = document.getElementById('search-bocha-key');
+    if (!input) return;
+    // Untouched masked value => no change requested; close silently.
+    if (input.dataset.masked === '1') {
+        const modal = document.getElementById('search-bocha-modal');
+        if (modal) modal.remove();
+        return;
+    }
+    const apiKey = input.value.trim();
+    if (!apiKey) {
+        input.focus();
+        return;
+    }
+    fetch('/api/models', {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify({ action: 'set_search_credential', api_key: apiKey }),
+    }).then(r => r.json()).then(data => {
+        if (data.status === 'success') {
+            const modal = document.getElementById('search-bocha-modal');
+            if (modal) modal.remove();
+            loadModelsView({ preserveScroll: true });
+        }
+    });
+}
+
+function _clearBochaKey() {
+    fetch('/api/models', {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify({ action: 'set_search_credential', api_key: '' }),
+    }).then(r => r.json()).then(data => {
+        if (data.status === 'success') {
+            const modal = document.getElementById('search-bocha-modal');
+            if (modal) modal.remove();
+            loadModelsView({ preserveScroll: true });
+        }
+    });
+}
+
+function renderCapabilityBody(def, cap, body) {
+    if (def.id === 'search') {
+        renderSearchCapability(def, cap, body);
+        return;
+    }
+
+    // Editable cards: provider dropdown + (optional) model dropdown + save row
+    const providerOpts = buildCapabilityProviderOptions(def, cap);
+    const providerHtml = `
+        <div>
+            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">${t('models_provider')}</label>
+            <div id="cap-${def.id}-provider" class="cfg-dropdown" tabindex="0">
+                <div class="cfg-dropdown-selected">
+                    <span class="cfg-dropdown-text">--</span>
+                    <i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
+                </div>
+                <div class="cfg-dropdown-menu"></div>
+            </div>
+        </div>`;
+
+    // The model-picker container is emitted whenever the capability needs a
+    // model (def.needsModel) so the provider-change handler can show/hide it;
+    // for `auto` capabilities it starts hidden and gets toggled by
+    // setCapabilityModelPickerVisible.
+    const modelHtml = def.needsModel ? `
+        <div id="cap-${def.id}-model-wrap">
+            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">${t('models_model')}</label>
+            <div id="cap-${def.id}-model" class="cfg-dropdown" tabindex="0">
+                <div class="cfg-dropdown-selected">
+                    <span class="cfg-dropdown-text">--</span>
+                    <i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
+                </div>
+                <div class="cfg-dropdown-menu"></div>
+            </div>
+            <div id="cap-${def.id}-model-custom-wrap" class="mt-2 hidden">
+                <input id="cap-${def.id}-model-custom" type="text"
+                       class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
+                              bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
+                              focus:outline-none focus:border-primary-500 font-mono transition-colors"
+                       placeholder="custom model name">
+            </div>
+        </div>` : '';
+
+    const dimHtml = (def.id === 'embedding' && cap.current_dim) ? `
+        <p class="text-xs text-slate-400 dark:text-slate-500">
+            <i class="fas fa-cube text-[10px] mr-1"></i>${t('models_dim_label')}: <span class="font-mono">${cap.current_dim}</span>
+        </p>` : '';
+
+    // Footer layout: a "hint slot" (filled later by renderCapabilityHints for
+    // auto-mode cards) sits on the left while status + save stay anchored on
+    // the right. Keeping them on the same row means the save button hugs the
+    // inputs above instead of being pushed down by a separate hint line.
+    const footer = `
+        <div class="flex items-center justify-between gap-3 pt-1">
+            <div data-cap-hint="${def.id}" class="flex-1 min-w-0"></div>
+            <div class="flex items-center gap-3 flex-shrink-0">
+                <span id="cap-${def.id}-status" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
+                <button onclick="saveCapability('${def.id}')"
+                        class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
+                               cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed">
+                    ${t('save')}
+                </button>
+            </div>
+        </div>`;
+
+    body.innerHTML = providerHtml + modelHtml + dimHtml + footer;
+
+    // TTS: mount reply-mode above provider; defer off-mode toggle to the end.
+    if (def.id === 'tts') {
+        renderVoiceReplyMode(body, cap.reply_mode || 'off', { skipVisibilityToggle: true });
+        // Voice-timbre picker depends on provider+model; rebuilt by callbacks.
+        const modelWrap = body.querySelector(`#cap-${def.id}-model-wrap`);
+        if (modelWrap) {
+            const voiceWrap = document.createElement('div');
+            voiceWrap.id = `cap-${def.id}-voice-wrap`;
+            voiceWrap.innerHTML = `
+                <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">${t('models_voice')}</label>
+                <div id="cap-${def.id}-voice" class="cfg-dropdown" tabindex="0">
+                    <div class="cfg-dropdown-selected">
+                        <span class="cfg-dropdown-text">--</span>
+                        <i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
+                    </div>
+                    <div class="cfg-dropdown-menu"></div>
+                </div>
+                <div id="cap-${def.id}-voice-custom-wrap" class="hidden mt-2">
+                    <input id="cap-${def.id}-voice-custom" type="text"
+                           class="w-full px-3 py-2 text-sm rounded-md border border-slate-200 dark:border-slate-700
+                                  bg-white dark:bg-slate-800 text-slate-700 dark:text-slate-200
+                                  placeholder:text-slate-400 dark:placeholder:text-slate-500
+                                  focus:outline-none focus:ring-2 focus:ring-primary-500"
+                           placeholder="voice id" />
+                </div>
+            `;
+            modelWrap.parentNode.insertBefore(voiceWrap, modelWrap.nextSibling);
+        }
+    }
+
+    // `body` is still detached from `document`; scope lookups locally.
+    const provDd = body.querySelector(`#cap-${def.id}-provider`);
+    // Strip private fields before handing to the generic initDropdown helper.
+    const ddOpts = providerOpts.map(o => ({ value: o.value, label: o.label }));
+
+    let pendingProvider = null;
+    if (pendingCapabilitySelection
+            && pendingCapabilitySelection.capabilityId === def.id
+            && providerOpts.some(o => o.value === pendingCapabilitySelection.providerId)) {
+        pendingProvider = pendingCapabilitySelection.providerId;
+        pendingCapabilitySelection = null;
+    }
+
+    // Auto strategy => leave empty sentinel selected. `suggested_provider`
+    // is a UI-only preselect (not persisted until the user clicks Save).
+    // No current + no suggestion => leave unselected with a placeholder.
+    //
+    // Pending-config takes priority over both "auto" and "pick provider":
+    // when no real (non-sentinel) configured option exists, surfacing
+    // "auto" or "pick" misleads the user — there's nothing to auto-route
+    // to or pick from. Force a "pending configuration" placeholder instead
+    // so all capabilities behave consistently on a fresh environment.
+    const hasConfiguredOpt = providerOpts.some(o => !o._isAuto && o._configured);
+    const noSelectionAndNoHint = !cap.current_provider && !cap.suggested_provider;
+    let initialProviderValue;
+    let dropdownPlaceholder = null;
+    if (!hasConfiguredOpt) {
+        initialProviderValue = '';
+        dropdownPlaceholder = { placeholder: t('models_pending_config') };
+    } else {
+        const autoSelected = cap.strategy === 'auto' && capabilitySupportsAuto(def.id);
+        if (pendingProvider) {
+            initialProviderValue = pendingProvider;
+        } else if (autoSelected) {
+            initialProviderValue = '';
+        } else {
+            initialProviderValue = cap.current_provider || cap.suggested_provider || '';
+        }
+        if (noSelectionAndNoHint) {
+            dropdownPlaceholder = { placeholder: t('models_pick_provider') };
+        }
+    }
+    initDropdown(
+        provDd,
+        ddOpts,
+        initialProviderValue,
+        (value) => onCapabilityProviderChange(def, value, body),
+        dropdownPlaceholder,
+    );
+    decorateCapabilityProviderDropdown(def, provDd, providerOpts);
+
+    if (def.needsModel) {
+        rebuildCapabilityModelDropdown(def, initialProviderValue, cap.current_model || '', body);
+        // Hide model picker in auto mode — fallback hint below covers it.
+        setCapabilityModelPickerVisible(def, initialProviderValue !== '' || !capabilitySupportsAuto(def.id), body);
+    }
+
+    if (def.id === 'tts') {
+        rebuildCapabilityVoiceDropdown(
+            initialProviderValue,
+            cap.current_voice || '',
+            body,
+            cap.current_model || ''
+        );
+    }
+
+    // Inject auto/router-pending hint banners before the action footer.
+    renderCapabilityHints(def, cap, body, initialProviderValue);
+
+    if (def.id === 'tts') {
+        _setTtsConfigVisible(body, (cap.reply_mode || 'off') !== 'off');
+    }
+}
+
+// TTS reply-policy dropdown (off / voice_if_voice / always). Persists on
+// change. When off, hides the rest of the TTS card.
+function renderVoiceReplyMode(host, currentMode, options) {
+    options = options || {};
+    const opts = [
+        { value: 'off',            label: t('voice_reply_off') },
+        { value: 'voice_if_voice', label: t('voice_reply_if_voice') },
+        { value: 'always',         label: t('voice_reply_always') },
+    ];
+    const wrap = document.createElement('div');
+    wrap.id = 'voice-reply-mode-wrap';
+    wrap.innerHTML = `
+        <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">${t('voice_reply_mode_label')}</label>
+        <div id="voice-reply-mode-dd" class="cfg-dropdown" tabindex="0">
+            <div class="cfg-dropdown-selected">
+                <span class="cfg-dropdown-text">--</span>
+                <i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
+            </div>
+            <div class="cfg-dropdown-menu"></div>
+        </div>
+    `;
+    host.prepend(wrap);
+
+    const dd = wrap.querySelector('#voice-reply-mode-dd');
+    const valid = ['off', 'voice_if_voice', 'always'];
+    const initial = valid.includes(currentMode) ? currentMode : 'off';
+    if (!options.skipVisibilityToggle) _setTtsConfigVisible(host, initial !== 'off');
+    initDropdown(dd, opts, initial, (mode) => {
+        if (!valid.includes(mode)) return;
+        _setTtsConfigVisible(host, mode !== 'off');
+        fetch('/api/models', {
+            method: 'POST',
+            headers: { 'Content-Type': 'application/json' },
+            body: JSON.stringify({ action: 'set_voice_reply_mode', mode }),
+        })
+            .then(r => r.json())
+            .then(data => {
+                if (data && data.status === 'success') {
+                    _ttsReadyPromise = null;  // force re-probe on next bubble
+                }
+            })
+            .catch(() => {});
+    });
+}
+
+// Show/hide everything in the TTS card below the reply-mode dropdown.
+function _setTtsConfigVisible(host, visible) {
+    if (!host) return;
+    Array.from(host.children).forEach((child) => {
+        if (child.id === 'voice-reply-mode-wrap') return;
+        child.classList.toggle('hidden', !visible);
+    });
+}
+
+// Toggle wrapper visibility instead of re-rendering so dropdown state survives.
+function setCapabilityModelPickerVisible(def, visible, scope) {
+    const root = scope || document;
+    const wrap = root.querySelector(`#cap-${def.id}-model-wrap`);
+    if (!wrap) return;
+    wrap.classList.toggle('hidden', !visible);
+}
+
+function renderCapabilityHints(def, cap, body, currentProvider) {
+    // Capabilities that can be in "auto" mode show a fallback hint right
+    // under the inputs so users always know which provider and model would
+    // actually be used. The image card additionally surfaces a "router
+    // pending" warning until the standalone dispatcher lands.
+    // The hint slot is co-located with the save button in the footer row
+    // (see renderCapabilityBody) so the save button stays close to the
+    // inputs above. We just rewrite the slot's innerHTML — emptying it
+    // when the card leaves auto mode, or rendering a one-line hint when
+    // it's in auto mode.
+    const slot = body.querySelector(`[data-cap-hint="${def.id}"]`);
+    if (!slot) return;
+    slot.innerHTML = '';
+
+    if (currentProvider !== '' || !capabilitySupportsAuto(def.id)) return;
+
+    // The hint mirrors what the runtime would actually pick when in auto
+    // mode. fallback_provider/model are pre-computed on the backend (see
+    // _predict_vision_auto, _predict_image_auto) so we can trust them
+    // here without re-implementing the provider chain.
+    const fbProv = cap.fallback_provider || '';
+    const fbModel = cap.fallback_model || '';
+    if (!fbProv && !fbModel) return;
+    // Show the vendor's display label (e.g. "LinkAI") instead of the raw
+    // id ("linkai") when we know it. Falls back to the id when the
+    // provider isn't in our vendor table (rare).
+    const provMeta = modelsState.providers.find(p => p.id === fbProv);
+    const fbProvLabel = (provMeta && localizedLabel(provMeta.label)) || fbProv;
+    const fbText = fbModel ? `${fbProvLabel} / ${fbModel}` : fbProvLabel;
+    slot.innerHTML = `
+        <p class="flex items-center gap-1.5 text-xs text-slate-400 dark:text-slate-500 min-w-0">
+            <i class="fas fa-circle-info text-[10px] flex-shrink-0"></i>
+            <span class="flex-shrink-0">${t('models_auto_using')}</span>
+            <span class="font-mono text-slate-500 dark:text-slate-400 truncate">${escapeHtml(fbText)}</span>
+        </p>`;
+}
+
+function buildCapabilityProviderOptions(def, cap) {
+    // Show ALL vendors in capability dropdowns so users can see at a glance
+    // who's configured (green check) and who isn't (gray dot, click to set
+    // up). The list order puts configured vendors first; clicking an
+    // unconfigured row opens the vendor modal in-place. ASR/TTS engines that
+    // aren't tracked by PROVIDER_MODELS (azure/baidu/google etc.) are treated
+    // as "always available" — no credential gate.
+    const knownProviderMap = {};
+    modelsState.providers.forEach(p => { knownProviderMap[p.id] = p; });
+
+    const explicitList = cap.providers && cap.providers.length ? cap.providers : null;
+    let providerIds = explicitList ? explicitList.slice() : modelsState.providers.map(p => p.id);
+    if (cap.current_provider && !providerIds.includes(cap.current_provider)) {
+        providerIds = [cap.current_provider, ...providerIds];
+    }
+
+    const opts = providerIds.map(pid => {
+        const meta = knownProviderMap[pid];
+        const tracked = !!meta;
+        const configured = !tracked || !!meta.configured;
+        return {
+            value: pid,
+            label: (meta && localizedLabel(meta.label)) || pid,
+            _tracked: tracked,
+            _configured: configured,
+        };
+    });
+
+    opts.sort((a, b) => {
+        if (a._configured === b._configured) return 0;
+        return a._configured ? -1 : 1;
+    });
+
+    // Capabilities with a fallback ("auto") strategy expose it as a sentinel
+    // option pinned to the top of the list. We use empty-string as the auto
+    // value so the existing save handler propagates it untouched to the
+    // backend, which interprets "" as "fall back to the main model".
+    // Skip the sentinel when no real vendor is configured: "auto" would route
+    // to nothing useful, and the renderer shows "pending configuration" instead.
+    const hasAnyConfigured = opts.some(o => o._configured);
+    if ((cap.strategy === 'auto' || cap.strategy === 'specified') && hasAnyConfigured) {
+        if (capabilitySupportsAuto(def.id)) {
+            opts.unshift({
+                value: '',
+                label: t('models_strategy_auto'),
+                _tracked: false,
+                _configured: true,
+                _isAuto: true,
+            });
+        }
+    }
+    return opts;
+}
+
+function capabilitySupportsAuto(capId) {
+    // Embedding is intentionally NOT here: runtime only auto-falls back to
+    // OpenAI/LinkAI, so dressing it up as "auto" hides reality from users.
+    return capId === 'image' || capId === 'vision';
+}
+
+// After initDropdown renders the capability provider menu, decorate each
+// row with the right-aligned configuration cue:
+//   - configured rows: nothing extra — the .active marker (a brand-green ✓)
+//     already comes from initDropdown's selected-state CSS for the row the
+//     user currently picked. Other configured rows show no chrome, mirroring
+//     a plain "switch to this" selector.
+//   - unconfigured rows: a subdued gear icon hints at "click to configure".
+//     The row's whole click handler is swapped to launch the vendor modal
+//     in place rather than selecting an unusable value.
+function decorateCapabilityProviderDropdown(def, ddEl, opts) {
+    if (!ddEl) return;
+    const menu = ddEl.querySelector('.cfg-dropdown-menu');
+    if (!menu) return;
+
+    const optByValue = {};
+    opts.forEach(o => { optByValue[o.value] = o; });
+
+    menu.querySelectorAll('.cfg-dropdown-item').forEach(item => {
+        const value = item.dataset.value;
+        const opt = optByValue[value];
+        if (!opt) return;
+        item.classList.add('cap-provider-item');
+        if (!opt._configured) item.classList.add('cap-provider-unconfigured');
+
+        // Wrap the label so the trailing affordance lines up via flex:auto.
+        const labelText = item.textContent;
+        item.textContent = '';
+        const labelEl = document.createElement('span');
+        labelEl.className = 'cap-provider-label';
+        labelEl.textContent = labelText;
+        item.appendChild(labelEl);
+
+        if (!opt._configured) {
+            // Trailing gear icon as the "configure this vendor" affordance.
+            const gear = document.createElement('i');
+            gear.className = 'fas fa-gear cap-provider-gear';
+            item.appendChild(gear);
+        }
+
+        if (!opt._configured && opt._tracked) {
+            // Hijack the click: open the vendor modal instead of selecting
+            // an unusable value, and remember which capability the user was
+            // configuring so the post-save reload can preselect the vendor.
+            const newItem = item.cloneNode(true);
+            item.replaceWith(newItem);
+            newItem.addEventListener('click', (e) => {
+                e.stopPropagation();
+                ddEl.classList.remove('open');
+                openVendorModal(value, (savedProviderId) => {
+                    pendingCapabilitySelection = {
+                        capabilityId: def.id,
+                        providerId: savedProviderId || value,
+                    };
+                    loadModelsView({ preserveScroll: true });
+                });
+            });
+        }
+    });
+}
+
+// Lightweight decorator for the "add vendor" modal's provider picker:
+// every configured vendor row gets a trailing brand-green ✓ so the user can
+// see at a glance who's already set up, without having to read each row.
+// Unlike decorateCapabilityProviderDropdown we don't hijack clicks here —
+// picking an unconfigured vendor in this modal *is* the intended action.
+function decorateVendorModalPicker(ddEl, opts) {
+    if (!ddEl) return;
+    const menu = ddEl.querySelector('.cfg-dropdown-menu');
+    if (!menu) return;
+
+    const optByValue = {};
+    opts.forEach(o => { optByValue[o.value] = o; });
+
+    menu.querySelectorAll('.cfg-dropdown-item').forEach(item => {
+        const opt = optByValue[item.dataset.value];
+        if (!opt) return;
+        // Tag the row so the global active-row ✓ rule is suppressed in CSS
+        // (otherwise configured AND selected rows would render two checks).
+        item.classList.add('vendor-picker-item');
+        if (!opt._configured) return;
+        const check = document.createElement('i');
+        check.className = 'fas fa-check vendor-picker-configured-mark';
+        item.appendChild(check);
+    });
+}
+
+function rebuildCapabilityModelDropdown(def, providerId, selectedModel, scope) {
+    // `scope` lets the caller (renderCapabilityBody) target a still-detached
+    // subtree. After the card is mounted, callers may pass `document` instead.
+    const root = scope || document;
+    const el = root.querySelector(`#cap-${def.id}-model`);
+    if (!el) return;
+
+    // Prefer the capability-scoped model list when the backend provides one
+    // (vision / image). It reflects the models the runtime can actually
+    // dispatch to for this capability, instead of the vendor's full chat-
+    // model catalog. Fall back to the generic provider.models for chat /
+    // embedding / tts where any vendor model is fair game.
+    //
+    // Entries may be plain strings or {value, hint} objects (image catalog
+    // uses the latter to surface brand aliases like "Nano Banana 2" next to
+    // the technical Gemini model id). We normalize to {value, label, hint}
+    // before handing off to initDropdown.
+    const cap = modelsState.capabilities[def.id] || {};
+    const capModelMap = cap.provider_models || {};
+    let rawList;
+    if (capModelMap[providerId]) {
+        rawList = capModelMap[providerId].slice();
+    } else {
+        const provider = modelsState.providers.find(p => p.id === providerId);
+        rawList = (provider && provider.models) ? provider.models.slice() : [];
+    }
+    const modelValues = [];
+    const opts = rawList.map(entry => {
+        if (typeof entry === 'string') {
+            modelValues.push(entry);
+            return { value: entry, label: entry };
+        }
+        modelValues.push(entry.value);
+        return { value: entry.value, label: entry.label || entry.value, hint: entry.hint || '' };
+    });
+    opts.push({ value: '__custom__', label: currentLang === 'zh' ? '自定义' : 'Custom' });
+
+    let initialValue = selectedModel || '';
+    if (initialValue && !modelValues.includes(initialValue)) {
+        initialValue = '__custom__';
+    }
+    if (!initialValue && opts.length) initialValue = opts[0].value;
+
+    initDropdown(el, opts, initialValue, (value) => {
+        const customWrap = document.getElementById(`cap-${def.id}-model-custom-wrap`);
+        if (customWrap) {
+            if (value === '__custom__') {
+                customWrap.classList.remove('hidden');
+                const input = document.getElementById(`cap-${def.id}-model-custom`);
+                if (input && !input.value) input.value = selectedModel || '';
+            } else {
+                customWrap.classList.add('hidden');
+            }
+        }
+        // TTS voice catalog may be scoped per engine model (aggregating
+        // gateways). Rebuild the voice picker whenever the model changes.
+        if (def.id === 'tts') {
+            const provDd = document.getElementById('cap-tts-provider');
+            const provId = provDd ? getDropdownValue(provDd) : '';
+            rebuildCapabilityVoiceDropdown(provId, '', null, value);
+        }
+    });
+
+    const customWrap = root.querySelector(`#cap-${def.id}-model-custom-wrap`);
+    if (customWrap) {
+        if (initialValue === '__custom__') {
+            customWrap.classList.remove('hidden');
+            const input = root.querySelector(`#cap-${def.id}-model-custom`);
+            if (input) input.value = selectedModel || '';
+        } else {
+            customWrap.classList.add('hidden');
+        }
+    }
+}
+
+// TTS-only: rebuild the voice timbre picker against the provider's
+// curated voice list. Hidden when no provider is picked.
+//
+// Each voice entry may be:
+//   - a bare string  (code = label)
+//   - {value, label, hint?}   so we can show a friendly Chinese name
+//     while persisting the raw API code that the runtime sends.
+function rebuildCapabilityVoiceDropdown(providerId, selectedVoice, scope, modelId) {
+    const root = scope || document;
+    const wrap = root.querySelector(`#cap-tts-voice-wrap`);
+    const el = root.querySelector(`#cap-tts-voice`);
+    if (!wrap || !el) return;
+    const cap = modelsState.capabilities.tts || {};
+    const voicesByProvider = cap.provider_voices || {};
+    let raw = (providerId && voicesByProvider[providerId]) || [];
+    // Some providers (gateways) scope voices by engine model id.
+    if (raw && !Array.isArray(raw) && typeof raw === 'object') {
+        const activeModel = modelId
+            || (root.querySelector(`#cap-tts-model`) ? getDropdownValue(root.querySelector(`#cap-tts-model`)) : '');
+        raw = (activeModel && raw[activeModel]) || [];
+    }
+    if (!raw || raw.length === 0) {
+        wrap.classList.add('hidden');
+        return;
+    }
+    wrap.classList.remove('hidden');
+    // Voice picker: friendly name on the left, raw API code as right-hand
+    // hint. Persisted/sent value is always the raw code.
+    const codes = [];
+    const opts = raw.map(entry => {
+        if (typeof entry === 'string') {
+            codes.push(entry);
+            return { value: entry, label: entry };
+        }
+        codes.push(entry.value);
+        const code = entry.value;
+        const desc = entry.hint || entry.label || code;
+        return {
+            value: code,
+            label: desc,
+            hint: desc === code ? '' : code,
+        };
+    });
+    opts.push({ value: '__custom__', label: currentLang === 'zh' ? '自定义' : 'Custom' });
+
+    // Off-catalog values route through the custom branch.
+    let initial = selectedVoice || '';
+    const isCustom = initial && !codes.includes(initial);
+    if (isCustom) initial = '__custom__';
+    if (!initial) initial = codes[0];
+
+    initDropdown(el, opts, initial, (value) => {
+        const customWrap = root.querySelector(`#cap-tts-voice-custom-wrap`);
+        if (!customWrap) return;
+        if (value === '__custom__') {
+            customWrap.classList.remove('hidden');
+            const input = root.querySelector(`#cap-tts-voice-custom`);
+            if (input && !input.value) input.value = isCustom ? selectedVoice : '';
+        } else {
+            customWrap.classList.add('hidden');
+        }
+    });
+
+    const customWrap = root.querySelector(`#cap-tts-voice-custom-wrap`);
+    if (customWrap) {
+        if (initial === '__custom__') {
+            customWrap.classList.remove('hidden');
+            const input = root.querySelector(`#cap-tts-voice-custom`);
+            if (input) input.value = isCustom ? selectedVoice : '';
+        } else {
+            customWrap.classList.add('hidden');
+        }
+    }
+}
+
+function onCapabilityProviderChange(def, providerId, scope) {
+    if (def.needsModel) {
+        // Empty sentinel hides the model picker (capability is in auto mode).
+        const isAuto = providerId === '' && capabilitySupportsAuto(def.id);
+        if (!isAuto) {
+            rebuildCapabilityModelDropdown(def, providerId, '', scope);
+        }
+        setCapabilityModelPickerVisible(def, !isAuto, scope);
+    }
+    if (def.id === 'tts') {
+        rebuildCapabilityVoiceDropdown(providerId, '', scope);
+    }
+    const body = scope || document.querySelector(`[data-cap-body="${def.id}"]`);
+    if (body) {
+        const cap = modelsState.capabilities[def.id] || {};
+        renderCapabilityHints(def, cap, body, providerId);
+    }
+}
+
+function getCapabilityModelValue(def) {
+    if (!def.needsModel) return '';
+    const dd = document.getElementById(`cap-${def.id}-model`);
+    if (!dd) return '';
+    const v = getDropdownValue(dd);
+    if (v === '__custom__') {
+        const input = document.getElementById(`cap-${def.id}-model-custom`);
+        return input ? input.value.trim() : '';
+    }
+    return v || '';
+}
+
+function saveCapability(capId) {
+    const def = MODELS_CAPABILITY_DEFS.find(d => d.id === capId);
+    if (!def || !def.editable) return;
+    // Search has its own form (strategy + provider, no model picker).
+    if (capId === 'search') { saveSearchCapability(); return; }
+    const provDd = document.getElementById(`cap-${capId}-provider`);
+    const provider = provDd ? getDropdownValue(provDd) : '';
+    // When the user is in auto mode (provider == ""), the model picker is
+    // hidden and any value left in it is stale; persist an empty model so
+    // the backend treats this as "fall back to the runtime chain".
+    const isAuto = provider === '' && capabilitySupportsAuto(capId);
+    const model = isAuto ? '' : getCapabilityModelValue(def);
+    // TTS carries an extra voice timbre (supports free-text custom ids).
+    let voice = '';
+    if (capId === 'tts' && !isAuto) {
+        const voiceDd = document.getElementById(`cap-${capId}-voice`);
+        voice = voiceDd ? getDropdownValue(voiceDd) : '';
+        if (voice === '__custom__') {
+            const input = document.getElementById(`cap-${capId}-voice-custom`);
+            voice = input ? input.value.trim() : '';
+        }
+    }
+
+    // Embedding changes invalidate any pre-existing vector index because
+    // dimensions / vendor differ. Gate the save behind a confirm, and on
+    // success surface a dedicated info dialog telling the user how to
+    // rebuild — both via the in-app custom dialog, not the native alert.
+    if (capId === 'embedding') {
+        const cap = modelsState.capabilities[capId] || {};
+        const before = (cap.current_provider || '').trim();
+        const after = (provider || '').trim();
+        if (before !== after) {
+            showConfirmDialog({
+                title: t('models_embedding_change_title'),
+                message: t('models_embedding_change_msg'),
+                okText: t('save'),
+                cancelText: t('cancel'),
+                onConfirm: () => _persistCapability(capId, provider, model, () => {
+                    showConfirmDialog({
+                        title: t('models_embedding_saved_title'),
+                        message: t('models_embedding_saved_msg'),
+                        okText: t('models_embedding_saved_ok'),
+                        hideCancel: true,
+                        onConfirm: () => {
+                            navigateTo('chat');
+                            // Defer focus + value set: navigateTo may
+                            // re-render the chat panel; setting value before
+                            // the input is mounted would be lost.
+                            setTimeout(() => {
+                                const input = document.getElementById('chat-input');
+                                if (!input) return;
+                                input.value = '/memory rebuild-index';
+                                input.focus();
+                                // Trigger any input listeners (autosize, send-button enable, etc.)
+                                input.dispatchEvent(new Event('input', { bubbles: true }));
+                            }, 60);
+                        },
+                    });
+                }),
+            });
+            return;
+        }
+    }
+    _persistCapability(capId, provider, model, undefined, { voice });
+}
+
+function _persistCapability(capId, provider, model, onAfterSuccess, extras) {
+    const payload = { action: 'set_capability', capability: capId, provider_id: provider, model: model };
+    if (extras && extras.voice !== undefined) payload.voice = extras.voice;
+    fetch('/api/models', {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify(payload),
+    }).then(r => r.json()).then(data => {
+        if (data.status === 'success') {
+            // Flash "Saved" before reload so the status survives the rebuild.
+            showStatus(`cap-${capId}-status`, 'models_save_success', false);
+            setTimeout(() => {
+                loadModelsView({ preserveScroll: true });
+                if (onAfterSuccess) onAfterSuccess();
+            }, 400);
+        } else {
+            showStatus(`cap-${capId}-status`, 'models_save_failed', true);
+        }
+    }).catch(() => showStatus(`cap-${capId}-status`, 'models_save_failed', true));
+}
+
+// ---------- Vendor credential modal ------------------------------------
+
+let vendorModalState = { providerId: '', onSaved: null };
+
+function openVendorModal(providerId, onSaved) {
+    vendorModalState = { providerId: providerId || '', onSaved: onSaved || null };
+
+    const overlay = document.getElementById('vendor-modal-overlay');
+    const pickerWrap = document.getElementById('vendor-modal-picker-wrap');
+    const keyInput = document.getElementById('vendor-modal-key');
+    const clearBtn = document.getElementById('vendor-modal-clear');
+
+    // Reset any leftover status (e.g. previous "Saved" message)
+    const statusEl = document.getElementById('vendor-modal-status');
+    if (statusEl) {
+        statusEl.textContent = '';
+        statusEl.classList.add('opacity-0');
+    }
+
+    if (!providerId) {
+        // Add flow — show provider picker, default to the first unconfigured one.
+        // We render every configured vendor with a trailing green ✓ via the
+        // dropdown decorator, mirroring the visual language used by the
+        // capability provider dropdowns. The .active row already shows the
+        // currently selected vendor via its own background highlight, so we
+        // intentionally suppress the global active-row ✓ for this picker
+        // (see CSS) — otherwise configured + selected rows would show two.
+        const unconfigured = modelsState.providers.filter(p => !p.configured);
+        const defaultId = (unconfigured[0] && unconfigured[0].id) || (modelsState.providers[0] && modelsState.providers[0].id) || '';
+        pickerWrap.classList.remove('hidden');
+        const pickerEl = document.getElementById('vendor-modal-picker');
+        const pickerOpts = modelsState.providers.map(p => ({
+            value: p.id,
+            label: localizedLabel(p.label),
+            _configured: !!p.configured,
+        }));
+        initDropdown(pickerEl, pickerOpts, defaultId, (val) => fillVendorModalForProvider(val));
+        decorateVendorModalPicker(pickerEl, pickerOpts);
+        fillVendorModalForProvider(defaultId);
+    } else {
+        pickerWrap.classList.add('hidden');
+        fillVendorModalForProvider(providerId);
+    }
+
+    overlay.classList.remove('hidden');
+
+    document.getElementById('vendor-modal-cancel').onclick = closeVendorModal;
+    document.getElementById('vendor-modal-save').onclick = saveVendorModal;
+    clearBtn.onclick = clearVendorModal;
+
+    // Once the user edits the masked value, drop the "masked sentinel" dataset
+    // so the save handler treats their input as a real new key. The `input`
+    // event fires after `.value` has updated, so we can compare directly.
+    keyInput.oninput = function () {
+        if (keyInput.dataset.masked === '1' && keyInput.value !== keyInput.dataset.maskedVal) {
+            keyInput.dataset.masked = '';
+        }
+    };
+
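+    // Assign via `onclick` so reopening the modal replaces the backdrop
+    // handler. addEventListener would stack a duplicate listener on every
+    // open whenever the modal was last closed via Cancel or Save.
+    overlay.onclick = function (e) {
+        if (e.target !== overlay) return;
+        closeVendorModal();
+    };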
+    keyInput.focus();
+}
+
+function fillVendorModalForProvider(providerId) {
+    const meta = modelsState.providers.find(p => p.id === providerId);
+    if (!meta) return;
+    document.getElementById('vendor-modal-title').textContent = localizedLabel(meta.label);
+    document.getElementById('vendor-modal-subtitle').textContent = meta.id;
+
+    // ----- API Base -----
+    // Always reflect the *current effective* base as the input value so the
+    // user can see (and edit) what's in use today. Placeholder is reserved
+    // strictly for the "not yet typed anything" state and shows the official
+    // default — never mixed with the actual value.
+    const baseWrap = document.getElementById('vendor-modal-base-wrap');
+    const baseInput = document.getElementById('vendor-modal-base');
+    const baseHint = document.getElementById('vendor-modal-base-hint');
+    if (meta.api_base_field) {
+        baseWrap.classList.remove('hidden');
+        baseInput.placeholder = meta.api_base_default || meta.api_base_placeholder || '';
+        baseInput.value = meta.api_base || '';
+        baseHint.classList.add('hidden');
+    } else {
+        baseWrap.classList.add('hidden');
+        baseInput.value = '';
+    }
+
+    // ----- API Key -----
+    // For configured vendors, surface the masked key as the input *value* so
+    // it shows up in the same dark text as a real entry — making "configured"
+    // visually unambiguous. The masked form (e.g. "sk-r***zRU") is also a
+    // sentinel: the save handler treats untouched masked input as "no change".
+    const keyInput = document.getElementById('vendor-modal-key');
+    if (meta.configured && meta.api_key_masked) {
+        keyInput.value = meta.api_key_masked;
+        keyInput.dataset.masked = '1';
+        keyInput.dataset.maskedVal = meta.api_key_masked;
+        keyInput.placeholder = '';
+    } else {
+        keyInput.value = '';
+        keyInput.dataset.masked = '';
+        keyInput.dataset.maskedVal = '';
+        keyInput.placeholder = 'sk-...';
+    }
+
+    const clearBtn = document.getElementById('vendor-modal-clear');
+    clearBtn.classList.toggle('hidden', !meta.configured);
+
+    vendorModalState.providerId = providerId;
+}
+
+function closeVendorModal() {
+    document.getElementById('vendor-modal-overlay').classList.add('hidden');
+}
+
+function saveVendorModal() {
+    const providerId = vendorModalState.providerId;
+    if (!providerId) return;
+    const keyInput = document.getElementById('vendor-modal-key');
+    const apiBase = document.getElementById('vendor-modal-base').value.trim();
+
+    // Treat "input still equals the masked value we surfaced on open" as "no
+    // change" — the backend uses missing/empty api_key to skip the field.
+    let apiKey = keyInput.value.trim();
+    const masked = keyInput.dataset.masked === '1';
+    const maskedVal = keyInput.dataset.maskedVal || '';
+    if (masked && apiKey === maskedVal) {
+        apiKey = '';
+    }
+
+    if (!apiKey && !masked) {
+        // No key entered (first-time setup, or the masked value was erased) → nudge the user.
+        keyInput.focus();
+        return;
+    }
+
+    const btn = document.getElementById('vendor-modal-save');
+    btn.disabled = true;
+    const payload = { action: 'set_provider', provider_id: providerId, api_base: apiBase };
+    if (apiKey) payload.api_key = apiKey;
+    fetch('/api/models', {
+        method: 'POST',
+        headers: { 'Content-Type': 'application/json' },
+        body: JSON.stringify(payload),
+    }).then(r => r.json()).then(data => {
+        btn.disabled = false;
+        if (data.status === 'success') {
+            closeVendorModal();
+            const onSaved = vendorModalState.onSaved;
+            if (onSaved) {
+                try { onSaved(providerId); } catch (e) { /* noop */ }
+            } else {
+                loadModelsView();
+            }
+        } else {
+            showStatus('vendor-modal-status', 'models_save_failed', true);
+        }
+    }).catch(() => {
+        btn.disabled = false;
+        showStatus('vendor-modal-status', 'models_save_failed', true);
+    });
+}
+
+function clearVendorModal() {
+    const providerId = vendorModalState.providerId;
+    if (!providerId) return;
+    showConfirmDialog({
+        title: t('models_clear_confirm_title'),
+        message: t('models_clear_confirm_msg'),
+        okText: t('models_clear_credential'),
+        cancelText: t('cancel'),
+        onConfirm: () => {
+            fetch('/api/models', {
+                method: 'POST',
+                headers: { 'Content-Type': 'application/json' },
+                body: JSON.stringify({ action: 'delete_provider', provider_id: providerId }),
+            }).then(r => r.json()).then(data => {
+                if (data.status === 'success') {
+                    closeVendorModal();
+                    loadModelsView();
+                } else {
+                    showStatus('vendor-modal-status', 'models_clear_failed', true);
+                }
+            }).catch(() => showStatus('vendor-modal-status', 'models_clear_failed', true));
+        }
+    });
+}
+
 // =====================================================================
 // Channels View
 // =====================================================================
@@ -4283,6 +6489,7 @@ navigateTo = function(viewId) {
 
     // Lazy-load view data
     if (viewId === 'config') loadConfigView();
+    else if (viewId === 'models') loadModelsView();
     else if (viewId === 'skills') loadSkillsView();
     else if (viewId === 'memory') {
         document.getElementById('memory-panel-viewer').classList.add('hidden');
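
Review note on the masked-key sentinel in `saveVendorModal` above: the decision it makes inline can be sketched as a pure function, which may make the three outcomes (keep the stored key, send a new key, refuse to save) easier to check by hand. This is a minimal sketch, not code from the patch. `buildProviderPayload` and its argument names are hypothetical, and `'example'` is a placeholder provider id. Only the payload fields (`action`, `provider_id`, `api_base`, `api_key`) are taken from the diff.

```javascript
// Hypothetical helper (not part of the patch): mirrors the inline logic of
// saveVendorModal so the sentinel handling can be exercised without a DOM.
function buildProviderPayload({ providerId, apiBase, keyValue, masked, maskedVal }) {
    let apiKey = keyValue.trim();

    // Input still equals the masked value surfaced on open: treat as "no change".
    if (masked && apiKey === maskedVal) {
        apiKey = '';
    }

    // No key typed and no untouched masked key to fall back on: nothing to
    // save. The caller would refocus the key input here.
    if (!apiKey && !masked) {
        return null;
    }

    const payload = {
        action: 'set_provider',
        provider_id: providerId,
        api_base: apiBase.trim(),
    };
    // api_key is omitted entirely when unchanged, as in the patch.
    if (apiKey) payload.api_key = apiKey;
    return payload;
}

// Untouched masked key: payload is built, but without api_key.
const unchanged = buildProviderPayload({
    providerId: 'example', apiBase: '', keyValue: 'sk-r***zRU',
    masked: true, maskedVal: 'sk-r***zRU',
});

// Edited key: the oninput handler has cleared the masked flag by now.
const replaced = buildProviderPayload({
    providerId: 'example', apiBase: '', keyValue: 'sk-new-key',
    masked: false, maskedVal: 'sk-r***zRU',
});

// Empty input with no masked flag: refuse to save.
const rejected = buildProviderPayload({
    providerId: 'example', apiBase: '', keyValue: '   ',
    masked: false, maskedVal: '',
});
```

In this sketch `unchanged` carries no `api_key` property, `replaced.api_key` is the typed key, and `rejected` is `null`, matching the early `return` in the patch.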
diff --git a/channel/web/static/logos/claudeAPI.svg b/channel/web/static/logos/claudeAPI.svg
new file mode 100644
index 00000000..e9a401b7
--- /dev/null
+++ b/channel/web/static/logos/claudeAPI.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251656961" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="18432" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M252.8 652.8l167.893333-94.293333 2.773334-8.106667-2.773334-4.48h-8.106666l-28.16-1.706667-96-2.56-83.2-3.413333-80.64-4.266667-20.266667-4.266666L85.333333 504.746667l1.92-12.586667 17.066667-11.52 24.32 2.133333 53.973333 3.626667 81.066667 5.546667 58.666667 3.413333 87.04 9.173333h13.866666l1.92-5.546666-4.693333-3.413334-3.626667-3.413333-83.84-56.746667-90.666666-60.16-47.573334-34.56-25.813333-17.493333-13.013333-16.426667-5.546667-35.84 23.253333-25.813333 31.36 2.133333 7.893334 2.133334 31.786666 24.32 67.84 52.48L401.066667 391.466667l13.013333 10.88 5.12-3.626667 0.64-2.56-5.76-9.813333-48.213333-87.04L314.453333 210.773333l-22.826666-36.693333-5.973334-21.973333a107.861333 107.861333 0 0 1-3.626666-26.026667l26.666666-36.053333L323.413333 85.333333l35.413334 4.693334 14.933333 13.013333 21.973333 50.346667 35.626667 79.36 55.253333 107.733333 16.213334 32 8.746666 29.653333 3.2 9.173334h5.546667v-5.12l4.48-60.8 8.32-74.453334 8.106667-96 2.773333-27.093333 13.44-32.426667 26.666667-17.493333 20.693333 10.026667 17.066667 24.32-2.346667 15.786666-10.24 65.92-19.84 103.253334-13.013333 69.12h7.466666l8.746667-8.746667 34.986667-46.506667 58.666666-73.386666 26.026667-29.226667 30.293333-32.213333 19.413334-15.36h36.693333l27.093333 40.106666-12.16 41.386667-37.76 48-31.36 40.533333-45.013333 60.586667-28.16 48.426667 2.56 3.84 6.613333-0.64 101.546667-21.546667 54.826667-10.026667 65.493333-11.306666 29.653333 13.866666 3.2 14.08-11.733333 28.8-69.973333 17.28-82.133334 16.426667-122.24 29.013333-1.493333 1.066667 1.706667 2.133333 55.04 5.12 23.466666 1.28h57.6l107.306667 7.893334 28.16 18.56 16.853333 22.613333-2.773333 
17.28-43.306667 21.973333-58.24-13.866666-136.106666-32.426667-46.72-11.733333h-6.4v3.84l38.826666 37.973333 71.253334 64.426667 89.173333 82.986666 4.48 20.48-11.52 16.213334-12.16-1.706667-78.506667-58.88-30.293333-26.666667-68.48-57.6h-4.48v5.973334l15.786667 23.04 83.413333 125.226666 4.266667 38.4-5.973334 12.586667-21.546666 7.466667-23.68-4.266667-48.853334-68.48-50.346666-77.226667-40.533334-69.12-4.906666 2.773334-23.893334 258.133333-11.306666 13.226667-26.026667 10.026666-21.546667-16.426666-11.52-26.666667 11.52-52.48 13.866667-68.48 11.306667-54.4 10.24-67.626667 5.973333-22.4-0.426667-1.493333-4.906666 0.64-50.986667 69.973333-77.653333 104.746667-61.44 65.706667-14.72 5.76-25.386667-13.226667 2.346667-23.466667 14.293333-20.906666 84.906667-107.946667 51.2-66.986667 33.066666-38.613333v-5.546667h-2.133333l-225.493333 146.56-40.106667 5.12-17.28-16.213333 2.133333-26.666667 8.106667-8.746666 67.84-46.72h-0.213333l0.853333 0.853333z" fill="#D97757" p-id="18433"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/custom.svg b/channel/web/static/logos/custom.svg
new file mode 100644
index 00000000..63857648
--- /dev/null
+++ b/channel/web/static/logos/custom.svg
@@ -0,0 +1,10 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="200" height="200" fill="none" stroke="#475569" stroke-width="1.8" stroke-linecap="round" stroke-linejoin="round">
+  <!-- Horizontal slider tracks -->
+  <line x1="4" y1="7" x2="20" y2="7"/>
+  <line x1="4" y1="12" x2="20" y2="12"/>
+  <line x1="4" y1="17" x2="20" y2="17"/>
+  <!-- Knobs (filled circles) -->
+  <circle cx="9" cy="7"  r="2.2" fill="#475569" stroke="none"/>
+  <circle cx="15" cy="12" r="2.2" fill="#475569" stroke="none"/>
+  <circle cx="7" cy="17"  r="2.2" fill="#475569" stroke="none"/>
+</svg>
diff --git a/channel/web/static/logos/dashscope.svg b/channel/web/static/logos/dashscope.svg
new file mode 100644
index 00000000..a5801c86
--- /dev/null
+++ b/channel/web/static/logos/dashscope.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251621200" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="17444" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M1019.364785 620.816931L891.797142 397.807295 946.450846 293.15069a29.097778 29.097778 0 0 0 6.399732-36.393472l-70.184053-126.586684a30.078737 30.078737 0 0 0-24.574968-13.652427H597.4945L539.171949 14.549389a27.348852 27.348852 0 0 0-20.906122-14.549389H380.628607a29.139776 29.139776 0 0 0-24.616967 14.549389v5.545767L225.797108 243.062793H100.919352a29.182775 29.182775 0 0 0-25.513928 13.653427L3.428446 384.11187a32.766624 32.766624 0 0 0 0 29.182775L132.831012 638.096205 74.508461 740.064923a32.766624 32.766624 0 0 0 0 29.05478l66.514207 116.561105a29.905744 29.905744 0 0 0 25.513929 14.505391H427.132654l62.845361 109.222414A30.078737 30.078737 0 0 0 512.762058 1024H660.382859a29.139776 29.139776 0 0 0 24.574968-14.549389l128.463606-224.843558h114.76818a31.91366 31.91366 0 0 0 24.660965-15.444352l66.471208-117.414069a28.158818 28.158818 0 0 0 0-30.9747l0.042999 0.042999z m-161.273228 14.591387L791.57735 512.490479 518.265827 993.964261l-74.748861-122.87484h-273.268525l65.618244-119.205994h139.386147L101.856313 272.244568h143.055993L380.671605 30.121735l68.34913 119.247993-70.184053 122.87484H925.501726l-69.202094 121.936879 137.594222 241.183873H858.134555z" fill="#605BEC" p-id="17445"></path><path d="M499.962596 699.320634l174.371677-274.719464H324.694955z" fill="#605BEC" p-id="17446"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/deepseek.svg b/channel/web/static/logos/deepseek.svg
new file mode 100644
index 00000000..ae90d3db
--- /dev/null
+++ b/channel/web/static/logos/deepseek.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251541870" class="icon" viewBox="0 0 1391 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="12864" xmlns:xlink="http://www.w3.org/1999/xlink" width="271.6796875" height="200"><path d="M1333.74443323 82.22042509c-13.80988113-6.90651166-19.77216769 6.25765149-27.83971486 12.94735271-2.7494075 2.15867766-5.09661597 4.96464441-7.44382443 7.55380074-20.17908001 22.01097094-43.75485659 36.47128333-74.589069 34.74465541-45.04943475-2.58915632-83.51757347 11.86958497-117.50810569 47.04629015-7.2285851-43.37779501-31.23798253-69.2740715-67.78939144-85.89149046-19.15315822-8.63156848-38.46813872-17.26470805-51.87582082-36.04080467-9.33227462-13.37940245-11.86958497-28.2701935-16.56243082-42.94417414-2.96778901-8.8483789-5.93557805-17.91199713-15.91514173-19.42338573-10.87194282-1.72662791-15.10760146 7.55380073-19.36996865 15.3228408-16.99448057 31.72344934-23.6040562 66.68491519-22.93005859 102.07685969 1.45797153 79.63383898 34.42258196 143.08073766 99.86633603 188.18516058 7.44539552 5.17831264 9.36055423 10.35819639 7.01334578 17.91042602-4.45089798 15.53808012-9.79260399 30.6456816-14.45874129 46.18376174-2.9693601 9.92771773-7.418687 12.0848243-17.85858007 7.76904006-35.90569092-15.3228408-66.92843413-37.9826719-94.36280776-65.38876592-46.53254371-45.9685224-88.61576053-96.6833077-141.08388229-136.39103645a620.44857966 620.44857966 0 0 0-37.41550843-26.11308694c-53.54746058-53.0887023 7.01334578-96.68173661 21.0416084-101.86162035 14.6472721-5.39512307 5.09818706-23.95440928-42.29845612-23.73916995s-90.74772965 16.40217963-145.99510964 37.98267191c-8.09111351 3.2364454-16.59071043 5.6103624-25.27569597 7.55380074-50.17590143-9.71247839-102.23868196-11.86958497-156.65024201-5.61193348-102.42721275 11.65434565-184.24643792 61.07455278-244.40190308 145.45465466-72.24186053 101.4295706-89.26462071 
216.6721645-68.4115431 336.87626062 21.85071977 126.68012914 85.21592177 231.56295556 182.54651857 313.56914048 100.94410379 85.02739095 217.18433986 126.68012914 349.79847589 118.69584973 80.54978445-4.74940507 170.2181753-15.75489055 271.37751842-103.15776961 25.51921492 12.94892381 52.30629946 18.12880755 96.71001624 22.01254203 34.23248007 3.2364454 67.17038198-1.72662791 92.66288839-7.12175096 39.95124769-8.63156848 37.17198947-46.39900106 22.7399567-53.30394163-117.10276448-55.67942971-91.39501876-33.01959858-114.755556-51.36207439 59.50817604-71.86479892 149.17656689-146.53556459 184.24643795-388.45514546 2.77768711-19.20657529 0.43047867-31.29139958 0-46.82947971-0.21681042-9.49566798 1.88687908-13.16573423 12.54358259-14.24350198 29.32282382-3.45325582 57.80982774-11.65434565 83.9496232-26.32832626 75.85536753-42.29845616 106.47276951-111.78933809 113.70292571-195.09167222 1.07933883-12.73211339-0.21523932-25.89627652-13.40768208-32.58597776M672.59048267 831.93671913c-113.46097785-91.07137422-168.51982701-121.06819563-191.25978372-119.77361748-21.25684774 1.29457817-17.42653031 26.11308695-12.76039301 42.29845614 4.88294773 15.97012989 11.27571295 26.97561536 20.20421747 41.00387801 6.15238845 9.28042865 10.41475564 23.09188086-6.17595481 33.45007725-36.55298001 23.09188086-100.08157538-7.76904006-103.04779332-9.27885757-73.96848843-44.45713381-135.82544403-103.1577696-179.39176984-183.43732658-42.08164574-77.25992199-66.4948133-160.1302064-70.54194114-248.61085317-1.07933883-21.36525295 5.09818706-28.91905367 25.89784762-32.80435928a250.87636497 250.87636497 0 0 1 83.11223228-2.15710656c115.83646593 17.26627914 214.46006978 70.138171 297.11354374 153.8725549 47.18140388 47.69200813 82.87028441 104.66601601 119.66521225 160.34544572 39.08871926 59.12954337 81.17193611 115.45626214 134.71939669 161.63845278 18.90963927 16.18536923 33.98896113 28.48700395 48.44770242 37.55062216-43.56632578 4.96464441-116.26537349 
6.04241215-165.98251663-34.09736632m54.40998899-357.16217477c0-9.49566798 7.44696661-17.04946873 16.80594974-17.04946872q3.18302835 0.05498814 5.71876762 1.07933883a16.91435498 16.91435498 0 0 1 10.84523431 15.97012989 16.83265829 16.83265829 0 0 1-16.77924123 17.04946872 16.6441275 16.6441275 0 0 1-16.59071044-17.04946872m168.95187674 88.48064679c-10.81852576 4.53259466-21.66218896 8.41790022-32.10208201 8.8483789-16.13195215 0.8640995-33.7737218-5.82560173-43.32280686-14.02669155-14.89079105-12.73368447-25.52078602-19.85543545-29.97168398-42.08321683-1.91515871-9.49566798-0.8640995-24.16964861 0.83739098-32.58597774 3.83031742-18.12880755-0.43204976-29.78158209-12.94892381-40.35658891-10.19637413-8.63313957-23.17357754-11.00705657-37.41550843-11.00705657-5.31499747 0-10.19637413-2.37234591-13.81145222-4.31578423a14.16180529 14.16180529 0 0 1-6.15081735-19.85386437c1.48310897-3.02120608 8.71326515-10.35976749 10.41318453-11.65434564 19.34011795-11.2222959 41.64959598-7.55222964 62.25915463 0.8640995 19.1264497 7.9842794 33.55848246 22.65983111 54.4115601 43.37779497 21.25684774 25.03374811 25.08716515 31.94025977 37.17198946 50.71478532 9.57736465 14.67398064 18.29062981 29.78158209 24.22620784 47.04471904 3.64021553 10.79181723-1.0526303 19.63862503-13.59621288 25.03374811" fill="#4D6BFE" p-id="12865"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/doubao.svg b/channel/web/static/logos/doubao.svg
new file mode 100644
index 00000000..d67b4933
--- /dev/null
+++ b/channel/web/static/logos/doubao.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779261485522" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="5381" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M958.976 439.808C804.864 336.896 642.56 321.536 642.56 321.536s8.192 235.008-10.752 306.176c-0.512 9.728-11.776 75.264-43.008 157.696-10.752 28.16-24.064 55.296-39.424 81.408-40.96 74.24-89.6 127.488-89.6 127.488 119.808-48.64 205.312-92.672 309.76-175.616 122.88-96.768 229.376-254.464 189.44-378.88z" fill="#37E1BE" p-id="5382"></path><path d="M329.728 395.776c158.208-100.864 308.736-78.848 312.32-74.752 0.512 0.512 1.024 0.512 1.024 0.512 0-14.336-6.656-60.928-13.312-106.496-11.776-60.928-22.528-124.928-23.04-133.632-170.496-139.264-356.864-78.336-448 25.6-61.44 70.144-103.424 169.984-102.4 224.256V762.88c0.512-12.8 1.536-20.48 2.048-20.48 17.92-197.12 271.36-346.624 271.36-346.624z" fill="#A569FF" p-id="5383"></path><path d="M792.064 272.384c-41.984-43.52-87.552-88.576-122.368-125.44-33.28-34.816-59.392-60.928-62.976-65.536 0.512 8.704 11.264 72.704 23.04 133.632 6.656 45.568 12.8 92.672 13.312 106.496 0 0 162.304 15.36 316.416 118.272-0.512 0-83.456-80.384-167.424-167.424zM549.888 866.816c-2.56 1.024-198.656 107.008-292.352-30.72-20.992-30.72-31.744-68.096-33.28-106.496-3.072-74.752 5.12-227.84 105.472-333.824 0 0-253.44 149.504-270.848 346.624-0.512 0.512-2.048 8.192-2.048 20.48-1.024 32.768 4.608 98.304 43.008 155.136 52.224 78.336 193.024 138.752 328.192 85.504l33.28-9.728c-1.024 0.512 47.616-52.224 88.576-126.976z" fill="#1E37FC" p-id="5384"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/gemini.svg b/channel/web/static/logos/gemini.svg
new file mode 100644
index 00000000..8b63e171
--- /dev/null
+++ b/channel/web/static/logos/gemini.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251750646" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="29551" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M214.101333 512c0-32.512 5.546667-63.701333 15.36-92.928L57.173333 290.218667A491.861333 491.861333 0 0 0 4.693333 512c0 79.701333 18.858667 154.88 52.394667 221.610667l172.202667-129.066667A290.56 290.56 0 0 1 214.101333 512" fill="#FBBC05" p-id="29552"></path><path d="M516.693333 216.192c72.106667 0 137.258667 25.002667 188.458667 65.962667L854.101333 136.533333C763.349333 59.178667 646.997333 11.392 516.693333 11.392c-202.325333 0-376.234667 113.28-459.52 278.826667l172.373334 128.853333c39.68-118.016 152.832-202.88 287.146666-202.88" fill="#EA4335" p-id="29553"></path><path d="M516.693333 807.808c-134.357333 0-247.509333-84.864-287.232-202.88l-172.288 128.853333c83.242667 165.546667 257.152 278.826667 459.52 278.826667 124.842667 0 244.053333-43.392 333.568-124.757333l-163.584-123.818667c-46.122667 28.458667-104.234667 43.776-170.026666 43.776" fill="#34A853" p-id="29554"></path><path d="M1005.397333 512c0-29.568-4.693333-61.44-11.648-91.008H516.650667V614.4h274.602666c-13.696 65.962667-51.072 116.650667-104.533333 149.632l163.541333 123.818667c93.994667-85.418667 155.136-212.650667 155.136-375.850667" fill="#4285F4" p-id="29555"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/linkai.svg b/channel/web/static/logos/linkai.svg
new file mode 100644
index 00000000..44628cc3
--- /dev/null
+++ b/channel/web/static/logos/linkai.svg
@@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="168" height="168" viewBox="0 0 168 168"><image width="168" height="168" xlink:href="data:image/webp;base64,UklGRpIfAABXRUJQVlA4TIYfAAAvp8ApAAFIbhtJkhAOjR3V3f//cGZttzlG9H8C6n+3+zXUO6SXUHUv6DW1tVHT81JMdxWKp1/5/dkZVUnY0gt8ExupkBL6jY5tTy820mOSsBNLINt+w7GdqM/6IR1xQhKPQE6QHlJs/2YjZuynaNvukhNLINt0PyFlA4ktAQmjB9TLr4uBRcgG6QElOboLFqQCx/StX9burtp4VIwTNHe0HJtCti0xM/H0jaMTp0+Y2NMgJ0iXdCRJnxST2JKQk7khuAbEtFASS1dYvr8tVCGwpQJi+krpA67tBtn2BtBcQUASKEMVAqzeIV2ZyQo2UDCQkh64y1pV9W2oqu9Xeur68m89gOO4jSCpIAiN/f9LL7zcPb092YYG2UZqRnAI7w/5kSRJkiJJepplZraFqsruqRbJmWX+/y/2GXvs/wRY+MnSA71HsT3x4S2KjduMDUlgvKRjQBhE3AXLOxQAMACwgLsOC73DVQC8J5AWAIjlDWaEqQMotgFev1CnsOErPcSga8yCfQBFI2i5RfUJTwxQAHeBWOlWBWeHWIkC5q0KMy4HT9hAXAD4KBa6Uj/KqL1mAfBPDOJubJwBoCtU3zZ8ogFigLgLgP/8owu1YkZEFTsABCbFBgCcxfIdc/k06m59A6j2AAAMMMMfALHSV0uYhQBQnY84EmJgavmGb8u3Ueo5RhQAUB1ATO4CYGH5gu+Y9aMMUAGA+MpTaAAAN2OjI2YZCyb2ceIIAHEOio2HAHjPh6Ply2YJj4zaCUDgALYAgGKlk+VvKcCChgcAsQcAiAMxYwMA5hHzAwBwl7sAeDpI++CCQAM60AGWBIeLBcCpAEDsRHECdwFoO6Hl8wnqNUB1GqiIEQAg0AGWyw9mZwAUYPlUH0YAxRhgPblI45Be+eGqgW3bNoKi/de+2NFXX0aEArdtlB3k+EavmL9b2B/gG3gJehEU8GuwFdCcISjg77BPYLnxXPngi4BzeXQvGMxah7zzwXZySdsWTf8Q3y4KXEE8I0FUx6HpXa8YBs12PdKD6Im2fapJEfdI3fMoPiD+cMfhI4N7Lv37Mniy/38tSc6pqu5ZJwICLMGwsULvvTLvvffea1zVDe65ZW7dO/LSR14cFoGoKP+FJz4A5H64StFEG03Y6S8uCheBDBdFnbYQeEWHxzzDQAgEptLC8GWhVIA2/acXxx5chWHTDX+pDACBEJhFMLEoHCRiIATDROaflih0rki8FsPBALe2bdXKPvc+d41waIsGKcctdKLn/t69R2IbSY4kMfbOf1/1/Y/KmAC71bYtk+T8VFVd1V3NAz0rZmZmZmZmZmaWx8zMyzQ8OwzNOMxMzV1djdVV/y/qmYUe/zF3jjJQAMx5jPubJZbKFasEQawlzGLWGmuVwLqfWFMpTApCU75SWLDE0lhiayxBBDxHnqyNYD2xXPk6/zmy+hxZ6/7CAJTFWPI3hR1L55TOWmIpBv1xbAzrCmOQO+ZviaUAFIJMBaAMWK6OIlACnY4ogmVLaKGnz+1NYf21BJZy0FEizDyW2KKH///ZSPr8krRNUrez173ZDnaf4dk2/rRt27Zt2+baNm9U99rBdYy0UZPm9/MtSZIlSZJtEYl6XKq6+///r/8g7+kRoUIxAQvg5/z/U+t1ABenlwgvkUVHRmcACV6Qioza2rCQCCM8Fmzm8p55+rooLKFhFwNy6PFfTdjHiO8E+OzuTByryzCpAOVCxqF99YvFt2mVzxwsBzzbzXjh8IGvP0FlLGozlSuF2EVAohxy6PeBtv
5uGd/6eGIEL3IiAdtCZe2NipT+Qxz2iXeG4JkLdkjczAw/n6C23WTzPe5tr/3zUG0zKQWo7vRGDWRspLpsCmvT2BE8c5ooVjn9q9TBSoYCbx62wW+gbrOCtnTbawU24ZhS4jCE6TNEPfwlImFO20W7lU6/+NToWGcFxyTrbdljPGKfNDzdXiKLMvNa10BxalarDI8ksvu3v2fhQj0n8SN6Z317Noic5WE5WV7GwYhm/DZuthew9dfJLR6DX1LUoKOYDTX/LTAc59Zcf7ZtzOSHYbEM7NDyFagj+YJIGFFppB8wI8mBEtyzVmsz3mO/ZOT+XFQl3b4n3vzh0OI5FAahI983ve/QqV8EI2qKFs7kZcYQUGGJjJ0rEmr0PTLn3ET9HFyPKQjhIIuAI0/ozPe25BPqdtLQBTTPwHIRDimAi5UJ76zui7dHT/zDOYgMtO4p+j7WweAYd335y0FHjkOu+b3/rt8lANuZ5UCzH6/6lsxiy7Oh5I5V8pNs4Zrjhzx8/4aGv24x4kjHG/xK6Q1n9TnL0VjcGAgnsI12SF13271D+P692AhgdLmrOZk8mIxPfbXhEj1692SP/F1LG05FJqZMc7lWd8zQrQnR9lGUqDR4syAkqjZahuQLHm78Nr+nwy/44WDBcxZips/+BGIZC4YG4FiKToFB6x6X2V3sts/uEJLXbur8rAe6mBuAC/Dzfrb/1/dWeJLVzvCUz8U57bufUe24NPQ6wQEqwHCK1DUXTd/Kd6sHbRuVBzbrxvz2aQSSDfBEbuNMniwC1bJLhOAQnY4DDuZMqpFQNJWJCUfz5PZM5nM/14API+D4ie2y6VpiMDYkMDQwkYMZfQxAxkhmXk4tEhFDq4tTY19caiAkd8B4p1zIjjflB4hE3C6qkm1s8uJPUxipBiHIp8QBBtHGAvr2fIVyhk4hpYgyu+YaC4kVJtOpAmN0mHiYKJWYGQdkuwAtxv5uoPd88Ac6SSgSwnCvIJnlyTAnTrtDO0ZTztDZuXQgu0WBiQWmwqO1O3siRg+I+23Sk6FGbf7Z8DDvNchKAu4OYEiqgBlIIFo5ChtjX0u+nTSz6xmvRo0YKmkDPs0s0DvL3h6Yde47bNfGxVsiwvO+FyoGiIF9Aeoq2cnqI9ByLO91UocB5U9TFJa7IIBhb+icQB7Y7vmv7LaH6FneuLWExnFVTEWQiRfoxVzl/azEU7/06EKA1sYOfcg2mSkiQtQgKjc+5QTVwcsNzrTbSx7HjC1amhWXZcvU0EmgQTZjpBwklh+8hFmrnqUT0hUoaAHd77CIIACGDq+6dr1E+BnDwW5bjEQa04nDe2ApCiRg9GQ6wqBYvPQIZgLKbsXI18LHfKQupBMiAUsXvBy3yk1wFlgAFN928MFJOUH5pbiLMYQQJUApJLHsTGBHt0jeCPOgNpJPx9Mvg6mOhRARgXuC3O+VaIYgJzv60dYtRhAaiUUHvPf/gheWpSiQCaIEWDp2Pe1FqtlPt3oMPLFbcR3fRdbMUzoJK0A7IqepkyFePgGc6NYWfJ22ah5J0TYoKGYeveQsieHd0rsSIOiBFJamEbSkxAwYJYvdHzqG2oBdffY+cBxjCIZSYTAhelDll1cH1dHg0JrJwbYBiFtIxg6N9b6LuUJYsmxkABUYH2a+/kvs8gu5HIXdpnGWqJGcpgRPEDjWizwQsS4cOlG6B3t/qK/i6yNvtKZXGNySqBBPsIYHShOSAZkVcpo4YWe9hPSBJk9pQ0jQSyIENnIbtYppduHJMEoMY5hug4/9vgZGgEnr7yyCvCgYouYooA8gcQbN4A3rHiTloWa/Cn2VUlKrybB1SryMmYDie+GMRBd5OWX6A6JVdP3jb8Yd98BYO5NU5aALu0skECsNvJPXbKfeA9svFblvQ0OaQTOVByGrCIoQLvBqhrDdC2dmAFrREUS7pgTLEUZPhckTjwQDZAmBgoDFS1GDry9OefOi8snVkdWnwY9MoGTOvWOkd43uKNgpnE
dnAJG42TT0dZeYJnhfJIMJd2VydfOUAutBSFAzIFNUJ7Kjlq9M153Sr9jxFceZj3kpx6Fnkpr2IWVKPEwSanc2F7Ky2eiUFhgGyzpWuiFjZABDtE8JJKFhNGvD4tb95POW1cHuH3NwYiYSCm7p7NzkTlk7OkqLKfxit8l26CrWqTDWOYgGOiUwubOUiBR4BwQAEXXD96vl53wODp6uUGz/imPBPPDtRopEiWfKec4m198zM/+CNpeqhLjMZXAlFrrFSXyB4QEQIAMIlgdMh44f+lAIwiAph0TL+V/5iBeMIQIZCYSDVwjQDu7wytdNJsMomzWpcbZMTQYyi5snkZmhK5KAjBaR9iCtJzeCR4HTt0IkfZCGD05UY7WzCiQTg+bOwbnKT7SXY84Pq+dGm0qldnOz8ufDjBNHARwYK2T3qFojvdOBoBCqUwdlbqxMWlkzhgOiNoDopaSDh6moFAEBMHdYUAjSHXzK9LDBU/7/sE8iyLAOC80NBcjBCSZ742TZvyIqkUwQAJOZZAeRuAAmTlZOobljzHLT/v2BFAAKg8CpcnOeCgf5kNf/36sW2kwayTeM/Ye3XVoMEgiB+YS5W2U4RY3A+HEoW2Z6q1oDMoAEgehgBoM3FS6fWJA2xBftZ/eDCpBlKCAiUILwKlV7YnQbTbhQYccPkm/wz/dRKzC4IzHpmhA0M9PuyJvBv6dXbqQrMwiWAxwqL5oDxHis1s2PqBVpmPP4adQADMABfdKGsp2+mwxAtIkKydCI+xZrUyediBgzQJK7m5fhmtfEJhxrtF/SSb0ZjZIEQQIVJF2yMMv26fn6JtyhNkRNMl2aCgjxBqAKcskDdYrZR9cDnRlhQ0dnKnOtUisQsKBobSgvHykr5W6++RfYRnymL/KO2zqzHCABkXmPBQYfOf+jrrCbWua5KzsGrdQsyE0Cso84Km5Q2ZOXH70YcQKbZ2QjZznhAuRvjAQS5PjYS/qEqNtmzCKmBcDBOCE7Jb1wLoi7CJJISxweHQaa5e3CFiKxjR26ozaThu61EUACwqUJbWI/iz3GD0GFaxzDOq+zdDf/ZEmoF3gBHJmZG+g7bA+EkQvRzi6KcK5HsGwJSYD7UUCwa2glMhVrQ9Ea2wi1XrMIUy7Vfhhxv2253DCdzC+Mq2jG/DJ3TgwvRUiy7F0i79Chl6WycRWXUGQd8ibv3PwuORliSQYJDAKVsNm1LpGMhVb40IfQsj/9ekAiYdMjyVXvRYcoHX38jx/yZjGehw+ipywLLzCKIATgJkQ5inXrPn0gfBQa3eYNYBIB0RqZs0DUJ2Rq6yqWUKraALQkX9/25TW7UYFBgBjcuvoagXxLNj/1cZvFhx5uBxd95qZ3ESSY0PUeqshvi3J+Fp2l0hrzDyJcCJJYqgoKLl8vk7mHA94gl6P4juNu8MWtFcOpoosqgAogXq4BzNaN6JzJtklE/9K2lMVHMxkCCEgAk1rFSpYk/Vbjpl8SMgdMfvzF2tpxTwgiE8AcDyNw3hr+sWOp51wBbbo3S15BwI9QQDqB74PAFmK+zBzzof1GQYq8ORNNXBwRkUkAKcDLeynMsNHbXwladcT9NY/8l4KogAUo8eAlGSZt8IHyFTnlW+/rP3W8nhdBQniCrky3tVrathLZOaHNocgjJCwlMBJjUAMIUgKQGF5GrCxxDlr4H2qbQmyjRD/d204cjyAhyCrRMDwsg7oIM215ZtElnjZAYub2RMYgvwbYAQEI3Hu93tJG/PxM9bFRN/DmW2nWd5DIWqFiWGLccn5ys273TJGcMnUaGxB8VKLVEKEUdhSxwzl+uhhGYmc0miRqZD+EcxaAAJvKeSgcYovNZjOIr6LQyBAtI1yKugkswCRylO7Khpkm/libJbRKI8WWNjNep5IgM+gLCEEjC5+7TR5wQBrat4UPaWSjZX6yECEAQsAeXWJNbA8aA1QYtDK1WGk3T1RIPIkgaC8cJfIm3O7JB4/llBrCIqQ7d+mxQILIBB
qGCS6LQYMvkqMNWflpM2uzuptIpwIicWPE42XdONROm5shSPFtBg0DynSbPyzfQnel6VfdDJmA+L0XZ2trJFQZkNPISg12SiSYEUgjCCsCQylx7VKJ6NwWr9X931Ab3oOuqAjzDsZhtECBj10CyNsMAfq1VI8sKkg6JVUfBnd6d+VxDg1lBGQVYZlfycmfLwuHcWqhF7F7+mFpQjQEtExUcpYMRvM+85Jt9vYZR8sBmv2aASvZjgcwifbeXibU3ErtFbtNIHyv207+vxh7wt96J38yVZbJP677jBegngFaGW2c77I1VeBxKQQCctYlCitiB+3xAZFQjmqPrhq9Qc12vp5QEogWYZAZbLJn8roOPNiAlg7H2IGxuJF0DsQYcnaYZRBQaYyTW9Aag2vO6EN/0HG3Qe24DLygT1Amian4bVF3M00/S1iKpw3Qej7epQQkElChxzNfXpQdkgcncPlUezhmJjTElOh4xNCB0w0sU2nEUpnwsNXOAiCW+heOV6JCtGrsC1Wjq6+XVr5J+EB5wsBc9Z4sct7+XyIhKokmBEEio8LO6DnW0GXTOEJxlWY3/ZJcVMpA+DTBAEHnloPr+YlDckCcWfEd4524foLqclli2UAGUHgjQGXoqKGORG3oiCPZol10rIxlY4A5g2KV7yUT4JunAR2y9vrw09SCZAIL8CCrMpg58Ako8UcrJ3OuSbAgjMCjAUUgUiBk8JFBxhQ6JiIrfSQHzbazQFHGYTKBQAHpDKr8juwEZ7FbVpmHaYpNmSGLDtd4merIaKgpcDo3yGrZGOv2vOl58J/zUBSA6IRWH86qaE8CmBBMTJUAMCC54SDUBh8SuXqXe9pYCESAX8HLvaohyLSzTkssRl4uoGOWtRQzNnBbEEgCv2LunO4o7IcfpP4TxuNchc6YdDrnvoVBhkCAyoIGj9kyTNOy2qfYQamSeQzJhiYiEw4gQfW6Obcb3d60kw8bVblUPckEnZdjbt7FmmVDOV7xJ1O6wxEZx6XzV/g9nRjPSSmxaNOVtQLrzk0gBFIkEI1oIXfcNhAS37lChHN0iRMQUW1YDSLmYMgEd3PY2NaMXgOITy8TcgcdUYD+EQz7GHzqCA10LE3wWjQh1xbGOcvoWrnd8/fOBEQlU1j2ZQK9/F2Vxo4o325qVh+LGQZhHCoEEAKckOgUMU52snBAf3lUB4Geq4+5XUYHLKvwK8CjayiW+c0ilRrzfOd4bnSZmDyF/8JJUIjEoAfk8KeNii/dABnZxlMBRMJevgJCNYeoEdyQBCbkNYl0aVxD1LhEnPA6TA1vx1TTMxJo0JNMGJjsjfY+DTGMksa5a+il5MK5H0OholUY0OuXzdptaDycTGWFkZpVXeIVp3glWU4DmGCCixFh2GOAb8WNLovKcvJuRDTfQuQxBEwIB+zAYCF1GrcU0AugWtyf3JWxZUgEQSSYEE5AwJ7eecGoE2jjmq+TPrl+e1ElRQLUSgD9/mHY11Zm8QQ6D1mIBJFTLODKw3sgo1JzAgcIYGI6DEsv1sslJE4usDAoR+rrE9SRS3WAoD1BsA78/xSKhJlibVQ+gctI9DRCESSCmESgU6vTkwTtcnh4IuujqBgrxjFAyDD6qIKS/N3z92tv3SsR+SJIJCzkJhegBtRcEqlBcgwI2TuiWSh3K0a+Be40L0WrJe39axaVaeES9FIUYiXHRj45xLkZRQgBhOhxL/Ebz/4a4AE+XKhoTonYe8faTKNkBK0qKdArjeMMJfzINtRGssdP3gflW4BQEjISqAnvifI2rWCaEE7gMiCISl6LuxXC/nFSIcBzdoCylgbQ/hdLtYsBSMHm1yGIVOrnIonaRwp7kPj2b21dF3I5kEmZfYtRDKdh3NDuGJk52anEYmtibmz9+jBHzN+cEIGxCdIiAWK+I2CMzIfS8w2HpqHngn3EzmrX4yX8RAaqJYIiaKQXe/USm8NDfW0hxKEBLRnd3BJ1KSuZxiDVCJGNDmYR5ERYP3
DFchTgvdJIMi0SIBOW5ECnxYNkDTuxcqwMklXuHCAhEILy1+l+mmAxjMROmWOymqU9csRVkKkAqJnEbx6gxOA/b/p1U3zXcBUs0bC3NyZ0HiWAACJny/71eeAt+7+hujiJSP5SKC47gjzhUTcAlWWB1ZaZo8e+iNy3hQ/JZDsyH/o0c4ULAhgcY4MiTH/0/UwcrB2Vu4/qRItT9IASck4APKtY+uqcBzpr2NOL0GNIUztSzQUtvRem2iY+SHZitvgUa9lXVsqKAzUuf9h6+TSIewWSAJTvfm52hjFyAQd866Uh3aT/4kxHwJi6KSYHUwAgrODqvVZr1yTaOnCzRDvNAK2PyAXQUAaBoQk/x2bPGNQGDzrg0wbGsCRkNSoGBNA/ExQXm6ZEzM3pI7oMuvh5ZEJ4PdlDQqQOzmtJH90GgYy1FM7fKTGBDZxsS5ZFEEUJOLm8PjZ/1+GU9N1bvXZyk36LEABZac1D0XrG90y9pgOidarGULOM5Ws4UMaSErd4j+Ey4LE+yTmFtXNKTKg294AivsaqtyOkgATBBLAydvj5dPB0hYqd3nJceDCeKCQBUBWAmQT6tW594WKtjRYzkgpxTbB8bhBHFaCKlDnJ0gG3vCB2R8zOx87kw0rJm8avkNEFRBowTS9XJxWnmb28z1DkNpBb2jG4rG0RdIKEipd+/jJG5IDbHnlHa4QPQ7D2JJ9ZCN4RHhXE8iPLUtS5zPWeLjecc+osIEgwKUjOzzB1wkCFZRngA3YYJR5h1RVCX2yj4kZEeKHTAQ6yakLIgtuBfnYoPqJfJ5ku0zd/XDnm+ybFyyhEMM2MG1fF9V4yDOezu1jS2coL6zizDzm0IPAJSjBWSn7oy4VXmz3X44na0BHNKvYS1ZESMhJkOiQhlvmVy8VujeDDUIZrXFZ0KcLIdFXpPC0ZTVNHiU/BGOf3+15SFLErSLK+SN6Yjv992O3Hs/2DnQHA1r89+fm56mjIKi2BrFdVB0ToT2uNi9UaGYoabuf9L3uxRgdmgAHoHoPoYMIH/P8U1vs+6q7KtsKJogtmpiKHEH90vD2NsBO9ctyP6O66qOx979JBqHDqafIbIA/yDIoCRkx2gqQVdeiuTCwLJWU5p01odXbL/gm1wYMk9WMA7o+r4YAAJKBRDpyd0feSIoCR66S6rivjcO4CZo+9LAF6sjFa0NBTf0W9S3dy3amhipT9jvPIYiKaXwySnHUB6IodYLFoow1zpkBFCIo8JhNsO/rGalWx2dvo9a9pZMsUxksyQIXWB1FrzO5MwillKOsEtQ+hzqPXsc37WY3BPWRIMoT6vkf9yL5cl73rbqQ/oXEt+13YiH1I1TMoZZQYBes0R76IMCEMwISfq8LYfMt6h9qgInnd7dmNZS5iFYNOICowch8R3TM2lSQGxRr1WM81Cej7FtXpEBKG+N+9Uj07pJJ/4VJB/X4Sxvud4P3KF7enY0H0Aj0zL79cCNThSpZ1jMmy+ti/f8Qqpahc44/+huA6NV/fiwHJkSZ1OPXZ0GbOCAtBsN69+3AK8pIEFEboapBZjTRQJJESdHR7l5S8rzEI0BIEw1vLZ81HQbcXvatbMDqQUCZuDwNNi3bnFiPbMHSSNg0G1XvzKi8QsKew4A3439cF7PiWV67XSSMFMwz5UdF0PAGpTngRyweUQWQJ0OzswSVk4o3VA6E16PCIb8DFQ4Y0lYAAQqDEDn0mowY0F1SsDeoCjbiCJ3m2DtAnIghAHV8ywaZ88yc312sEYEBmnN1yMoRFYLgVEUsQYEkAXMF4pZ1DYoyAOD6BJ+5bh4PKxh+jMHl+65VoLRiYSvAc4kboKCzKipHoJta2hnUfXCa6H2KfgYrg5fICfcf6KBNrrZFitcBDkFSnAhBdVlCByITKBOA2ORSxLyNIzOXKygQfUEAA6MPUs+WrMcDq+fcMGIgkRaZkGDRTPSMHE7VBkVtJ97K3leXuFTSZaF/OofLhD7/fiuJ6jTAM9Rb3y/
rFEF3ng4DoKxBJKHJKcaUFDznJ6IC/7z+CZXkekNMAUkGJL55og6OkQ8UhUPmQmb3sb/6/rBZfh6vhCxPRgUtXT2/wUpoQIJXX+qKGvy7m1latgkUuRjGOYAMQqhAKIJggrgzofG+AK1H/ek4IkwHyNCJ2EL1Wa6j/99pxhUZIIuRfzlEi5KB5VomxjUrrvyCVj5DRcZcCARwY1tqw0z5Iv2xUa4R4wYA6CvMKfg+0mJOhwwmWg6qYoEygyTvmyV0UYf0UhATE/WmQ0ZFNzqz5d+vc5XcgA1DVhwzp5VMrrO4ZWqTJY0S07BMSAW4BJjDBw+dn/GwR62TkxWU+pZI+u8WE18yqrCRkWuAsq8PF3oGA+UXLuwZ7Qafx0c/FjM+DfgeqITqMCSwAP37uAh9x9/5dixjbALHLv/7Auy1mSNALMnHnjGvu4Cyvnc574gTWCI0OVYx+J/7IK6nzWE3FxY4iJBWhIvm+II0qI0Vylb2rNI4Xqx3+vyz+ARkGyOgm27uIqWlEKnaInUr8habuL1R+Mxu2/G13cc8DK0VsoHHGyGr/rFb9kkySCKiSqoSi6T98OlmKPDcCGId2PQx9M8PxfMI3FkYhFaEiIY1XO8QOFZdUUmXvUil/+uyXkNMNDAj9NHTZ2MIJsqK8t7F7tSujhCRUsvahNb+XI2RPnZZz3waQgB3gZU1H6KjkVfcjRRISRX/3NEdQ9/PPDviCyDF38pEcneYdR8M4uVa7tNe4d51gvNqNamUn1Oq7QyyvNLxvOFc4rQ+9dFMKj1JizuBhJ7FT6ygGHq4061vu5+1hpY88s97IJu0uBHxZ3Y0DGt++Y//q26EqGfCEpWtHimeNAGR26TRkOTRAgBCtBRAgWgUCEMDut81e17WLD/iV1VWgLoKR4xCn3/2eSydAgKtzEwBDx2G5CoCNr7DCRwHR3lBXiKDKfZjGB8Xc25KXyuUS1LPonOoaxLr9nP9/hkY="/></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/minimax.svg b/channel/web/static/logos/minimax.svg
new file mode 100644
index 00000000..79557d77
--- /dev/null
+++ b/channel/web/static/logos/minimax.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251514432" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="11888" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M415.392 475.808v329.984c-22.304 111.744-170.56 82.944-171.2 1.92-0.672-101.824 0-202.976 0-304.064v-117.184c0-14.656-3.2-26.24-16-35.392-24.96-18.72-54.944 3.264-55.584 30.208-1.408 36.16-0.704 71.616-1.408 107.264 0 28.16 0 55.52 0.64 83.648-18.368 123.776-168.32 103.232-171.808 0.704V487.04c0-28.032 54.944-34.624 52.256 7.36-1.792 20.8-0.64 42.272-1.344 62.912-0.64 36.8 55.648 61.6 68.896 1.408 0.64-49.632 0.64-99.264 0.64-149.344 0-62.752 17.824-113.856 84.352-118.624 28.8-2.56 47.968 9.504 66.336 30.304 7.04 7.36 23.68 30.72 24.32 56.16 0 23.456 0.64 46.752 0.64 70.464 0 46.72-0.64 93.76-0.64 140.48 0 30.304 0.64 60.256 0.64 89.856 0 37.536 0 75.552-0.64 113.152-0.64 48.864 58.816 48.16 68.352-0.768 0-57.632 0.64-114.56 0.64-172.192 0-141.984-0.64-283.968-0.64-425.856 0-14.72-2.048-55.584 5.76-70.464 41.504-101.12 167.392-56.96 168.544 26.72 2.432 171.52 0 344.896 0.64 516.8 0 59.616-48.416 46.816-51.104 23.488 0-178.88 0-358.4 0.64-537.024-2.368-44.832-68.832-38.72-72.672-6.592-1.28 36.864-0.64 74.4-1.28 111.232v219.008h0.64l0.448 0.256h-0.064z" fill="#D4367A" p-id="11889"></path><path d="M610.016 473.184v242.336V143.648c21.632-112.512 169.824-83.264 170.464-2.176 0.704 101.12 0 202.912 0.704 304 0 38.784 0 77.728-0.64 116.544 0 15.36 3.776 26.176 16.64 36.032 24.32 18.24 54.24-3.2 55.584-30.592 1.344-35.488 0.64-70.976 0.64-107.328V376.96c18.56-123.776 168.128-103.232 171.264-0.704v310.592c0 28.16-54.304 34.848-51.872-7.296 1.472-21.44 0-267.104 0.768-288.64 1.28-36.16-55.712-61.664-68.928-0.768v148.576c0 63.68-17.856 113.92-84.96 119.36-63.264 1.504-88.704-42.24-90.752-86.432V271.328c0-38.24 0-75.552 0.64-113.088 
0.64-48.864-58.784-48.864-68.896 0.704V831.36c0 14.592 2.048 55.52-5.184 70.432-41.44 101.056-168 56.864-169.152-26.752v-79.616c3.136-53.6 48.416-40.864 50.464-18.176v94.464c2.432 44.928 68.928 39.488 72.064 6.656 1.344-36.896 1.344-73.728 1.344-111.296v-293.824h-0.192v-0.064z" fill="#ED6D48" p-id="11890"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/moonshot.svg b/channel/web/static/logos/moonshot.svg
new file mode 100644
index 00000000..20d60b5c
--- /dev/null
+++ b/channel/web/static/logos/moonshot.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251592968" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="16416" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M117.9648 684.6464l342.30272 93.57312v75.34592l209.7152 58.5728A428.99456 428.99456 0 0 1 512 942.08c-176.128 0-327.53664-105.8816-394.0352-257.4336zM83.29216 477.42976l407.30624 112.64-9.6256 37.00736-6.0416 35.0208 383.3856 104.96a432.5376 432.5376 0 0 1-65.10592 70.32832l-688.18944-185.9584A429.4656 429.4656 0 0 1 81.92 512c0-11.63264 0.47104-23.1424 1.37216-34.54976z m57.344-182.4768l429.07648 114.21696a279.94112 279.94112 0 0 0-23.06048 35.55328 201.17504 201.17504 0 0 0-14.70464 34.93888l403.08736 110.26432a426.8032 426.8032 0 0 1-23.552 81.7152L86.54848 448.7168a427.25376 427.25376 0 0 1 54.0672-153.76384z m158.47424-156.75392l404.23424 108.31872a190.2592 190.2592 0 0 0-32.80896 24.90368c-9.13408 8.8064-19.8656 21.4016-32.1536 37.74464l285.24544 77.78304c9.216 30.45376 15.03232 61.8496 17.32608 93.5936L156.61056 269.68064a432.27136 432.27136 0 0 1 142.49984-131.4816zM512 81.92c142.90944 0 269.55776 69.71392 347.7504 176.98816L337.26464 118.90688A428.50304 428.50304 0 0 1 512 81.92z" fill="#000000" p-id="16417"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/openai.svg b/channel/web/static/logos/openai.svg
new file mode 100644
index 00000000..b7b1fc50
--- /dev/null
+++ b/channel/web/static/logos/openai.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251225589" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="9015" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M881.664 431.488a218.88 218.88 0 0 0-18.176-177.088A218.624 218.624 0 0 0 628.992 149.76c-40.576-45.824-100.288-71.424-162.176-71.424a219.136 219.136 0 0 0-208 150.4 215.68 215.68 0 0 0-144 104.512 218.944 218.944 0 0 0 26.688 254.912 218.752 218.752 0 0 0 19.2 177.152 217.088 217.088 0 0 0 234.624 104.512 219.136 219.136 0 0 0 162.112 72.512 219.136 219.136 0 0 0 208-150.4 215.68 215.68 0 0 0 144-104.512 219.008 219.008 0 0 0-27.712-256z m-324.288 454.4a158.08 158.08 0 0 1-103.424-37.376c1.088-1.088 4.288-2.176 5.376-3.2l171.712-99.2a28.16 28.16 0 0 0 13.824-24.512V479.488l72.576 41.6c1.024 0 1.024 1.024 1.024 2.112v200.512a160.512 160.512 0 0 1-161.088 162.112z m-347.712-148.288c-19.2-33.088-25.6-71.488-19.2-108.8 1.088 1.024 3.2 2.176 5.376 3.2l171.712 99.2a25.984 25.984 0 0 0 27.712 0l210.112-121.6v84.224c0 1.152 0 2.176-1.024 2.176L430.464 796.16c-76.8 44.8-176 18.176-220.8-58.624z m-44.736-375.424c19.2-32.64 48.896-57.856 84.224-71.488v204.8c0 9.6 5.376 19.2 13.888 24.512l210.176 121.6-72.576 41.6c-1.024 0-2.112 1.088-2.112 0L224.64 582.912a160.448 160.448 0 0 1-59.776-220.8h0.064z m597.312 138.688l-210.112-121.6 72.512-41.6c1.088 0 2.176-1.088 2.176 0l173.824 100.224a161.088 161.088 0 0 1-25.6 291.2V525.44a26.304 26.304 0 0 0-12.8-24.512z m71.488-108.8a23.232 23.232 0 0 0-5.312-3.2L656.64 289.536a26.048 26.048 0 0 0-27.712 0l-210.176 121.6V326.912c0-1.088 0-2.176 1.088-2.176l173.824-100.224a161.152 161.152 0 0 1 220.8 59.712c19.2 32 25.6 70.4 19.2 107.776z m-454.4 149.248l-72.64-41.6c-1.024 0-1.024-1.088-1.024-2.176V297.088A162.048 162.048 0 0 1 467.84 135.04a158.08 158.08 0 0 1 103.424 37.312 22.848 22.848 0 0 1-5.312 3.2L394.24 
274.688a28.16 28.16 0 0 0-13.888 24.512v242.112h-1.088z m39.424-85.312l93.824-54.4 93.888 54.4v107.712l-93.888 54.4-93.824-54.4V456z" fill="#000000" p-id="9016"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/qianfan.svg b/channel/web/static/logos/qianfan.svg
new file mode 100644
index 00000000..a9356678
--- /dev/null
+++ b/channel/web/static/logos/qianfan.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251568791" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="14450" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M96.20121136 636.3124965c-0.1472897-113.41305959-0.29457937-226.8261192-0.29457937-340.23917879 0-14.87625845 7.65906378-26.51214381 20.4732666-34.02391789 45.51251353-26.65943349 91.02502705-53.31886698 136.83211997-79.53643141 71.1409192-40.94653321 142.42912809-81.59848704 213.71733698-122.39773055 7.36448439-4.12411126 14.58167909-8.3955122 21.50429441-13.2560719 19.44223878-13.40336159 39.03176725-16.05457598 60.09419263-3.53495252 27.39588193 16.34915535 54.93905355 32.25644163 82.48222516 48.16372793 88.0792333 50.96223197 176.30575629 101.77717426 264.38498958 152.59211653 9.86840908 5.74429781 19.88410785 11.19401627 29.60522725 17.0856038 14.13981003 8.54280189 21.50429441 21.06242535 21.50429443 37.70616007 0 147.73155685 0.29457937 295.46311371-0.1472897 443.19467057 0 15.46541722-7.2171947 28.57419943-21.7988738 36.96971163-34.7603663 20.17868721-70.55176044 38.88447758-104.57567833 59.94690293-48.90017634 30.19438599-100.00969801 56.11737105-148.76258466 86.60633642-29.01606849 18.11663161-59.50503387 34.02391789-89.11026112 50.96223197-13.10878221 7.51177407-26.07027474 15.17083783-39.03176726 22.9771913-13.84523065 8.3955122-27.83775099 8.83738127-41.97756102 0.73644843-56.41195043-32.55102101-112.82390085-65.10204201-169.38314098-97.653063-61.86166887-35.64410444-123.72333775-71.1409192-185.4377169-106.78502365-11.19401627-6.48074626-22.24074286-12.81420285-32.99289009-19.88410785-11.48859565-7.65906378-17.08560379-19.14765941-17.08560378-32.69831069-0.1472897-34.7603663 0.1472897-69.52073264 0.29457938-104.28109895 1.62018657-0.58915875 1.62018657-1.62018657-0.29457938-2.65121438z 
m356.58833414-225.500512c2.20934532-1.76747625 4.41869063-3.68224221 6.77532565-5.15513907 68.93157389-39.62092601 137.86314777-79.24185204 206.94201135-118.86277807 2.79850407-1.62018657 6.48074626-1.62018657 6.62803594-6.18616688 0.1472897-4.8605597-4.12411126-4.71327001-6.77532564-6.18616688-40.65195383-23.56635005-81.59848704-46.83812071-122.10315117-70.84633984-16.79102442-10.01569877-32.84560039-8.54280189-48.45830728 0.58915876-45.9543826 26.51214381-91.46689612 53.61344636-137.27398903 80.42016953-31.96186226 18.70579035-64.21830387 37.11700133-96.32745581 55.67550198-18.41121097 10.60485751-27.54317163 25.33382629-27.24859225 47.72185885 0.88373813 89.55213018 0.58915875 179.10426036 0.14728969 268.65639053-0.1472897 20.17868721 9.27925033 33.58204881 25.33382629 43.15587853 31.3727035 18.70579035 63.18727606 37.11700133 95.14913832 54.93905355 10.89943689 6.03887719 21.06242535 13.99252034 35.79139414 18.41121096V505.51925374c6.48074626 19.58952848 18.55850066 34.02391789 36.67513226 44.6287754 27.83775099 16.20186565 63.18727606 12.51962347 86.31175705-10.45756784 26.95401286-26.65943349 28.72148912-62.89269668 12.81420282-90.14128893-16.34915535-28.42690974-43.59774757-37.55887038-74.38129233-38.73718787z m82.48222517 429.64401928c14.28709972-3.82953187 25.92298506-13.99252034 38.88447758-21.35700473 40.94653321-23.27177067 81.30390766-47.72185885 122.54502023-70.55176046 26.95401286-15.02354815 52.87699792-31.66728287 80.71474891-45.21793415 16.79102442-8.10093283 29.60522723-22.53532223 29.60522726-43.4504579 0.1472897-92.939793 0.29457937-185.73229631 0.14728969-278.6720893 0-11.19401627-5.15513907-13.99252034-13.84523067-7.06990501-26.51214381 20.76784598-57.29568854 34.46578693-86.16446735 51.25681135-54.49718448 31.81457257-109.14165865 63.33456576-163.78613282 95.00184862-8.54280189 4.8605597-11.78317502 10.45756784-11.63588535 20.47326662 0.29457937 96.18016613 0.1472897 192.50762194 0.1472897 288.68778806-0.29457937 3.5349525-1.47289687 
7.65906378 3.38766282 10.8994369z" fill="#066AF3" p-id="14451"></path><path d="M96.20121136 636.3124965c1.91476594 1.03102783 1.91476594 2.06205563 0 3.09308345v-3.09308345z" fill="#4372E0" p-id="14452"></path><path d="M391.3697457 505.37196405c-5.44971845-44.33419602 13.84523065-74.08671296 61.4197998-94.55997955 30.93083443 1.17831749 58.03213699 10.31027814 74.38129233 38.5898982 15.75999659 27.39588193 14.13981003 63.48185543-12.81420282 90.14128893-23.27177067 22.97719129-58.47400606 26.65943349-86.31175705 10.45756783-18.11663161-10.60485751-30.34167568-25.03924691-36.67513226-44.62877541z" fill="#002A9A" p-id="14453"></path></svg>
\ No newline at end of file
diff --git a/channel/web/static/logos/zhipu.svg b/channel/web/static/logos/zhipu.svg
new file mode 100644
index 00000000..e4b55463
--- /dev/null
+++ b/channel/web/static/logos/zhipu.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251419020" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="10062" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M520.063496 0v77.563152c0 269.231173-144.758953 414.054122-434.212862 434.340854L86.106618 511.968002H76.827198V255.984001l443.236298-255.984001z" fill="#5B55F6" p-id="10063"></path><path d="M520.063496 1023.936004v-77.563152c0-269.231173-144.758953-414.054122-434.212862-434.340854L86.042622 511.968002H76.827198v255.984001l443.236298 255.984001z" fill="#376AF3" p-id="10064"></path><path d="M520.063496 0v77.563152c0 269.231173 144.758953 414.054122 434.276858 434.340854L954.08437 511.968002h9.215424V255.984001L520.063496 0z" fill="#5B55F6" p-id="10065"></path><path d="M520.063496 1023.936004v-77.563152c0-269.231173 144.758953-414.054122 434.276858-434.340854L954.08437 511.968002h9.27942v255.984001l-443.236298 255.984001z" fill="#376AF3" p-id="10066"></path></svg>
\ No newline at end of file
diff --git a/channel/web/web_channel.py b/channel/web/web_channel.py
index 53f26c46..c0faef4d 100644
--- a/channel/web/web_channel.py
+++ b/channel/web/web_channel.py
@@ -1,15 +1,17 @@
+import datetime
 import hashlib
 import hmac
-import time
 import json
 import logging
 import mimetypes
 import os
+import random
+import shutil
 import threading
 import time
 import uuid
 from queue import Queue, Empty
-from typing import Tuple
+from typing import List, Tuple
 
 import web
 
@@ -26,8 +28,16 @@ from config import conf
 IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp", ".svg"}
 VIDEO_EXTENSIONS = {".mp4", ".webm", ".avi", ".mov", ".mkv"}
 
+def _get_web_password() -> str:
+    # Coerce to str so non-string values in config.json (e.g. a numeric password) won't break .encode() or comparisons
+    pwd = conf().get("web_password", "")
+    if pwd is None:
+        return ""
+    return str(pwd)
+
+
 def _is_password_enabled():
-    return bool(conf().get("web_password", ""))
+    return bool(_get_web_password())
 
 
 def _session_expire_seconds():
@@ -38,7 +48,7 @@ def _create_auth_token():
     """Create a stateless signed token: ``<timestamp_hex>.<hmac_hex>``."""
     ts = format(int(time.time()), "x")
     sig = hmac.new(
-        conf().get("web_password", "").encode(),
+        _get_web_password().encode(),
         ts.encode(),
         hashlib.sha256,
     ).hexdigest()
@@ -61,7 +71,7 @@ def _verify_auth_token(token):
     if time.time() - ts > _session_expire_seconds():
         return False
     expected = hmac.new(
-        conf().get("web_password", "").encode(),
+        _get_web_password().encode(),
         ts_hex.encode(),
         hashlib.sha256,
     ).hexdigest()
@@ -83,6 +93,15 @@ def _require_auth():
                             json.dumps({"status": "error", "message": "Unauthorized"}))
 
 
+# Localized text for /cancel system replies. Web is the only channel that
+# honors a per-request `lang`; other channels reply in Chinese by default.
+def _cancel_reply_text(cancelled: int, lang: str) -> str:
+    en = lang.startswith("en")
+    if cancelled > 0:
+        return "🛑 Cancelled." if en else "🛑 已中止"
+    return "Nothing to cancel." if en else "当前没有可中止的任务。"
+
+
 def _get_upload_dir() -> str:
     from common.utils import expand_path
     ws_root = expand_path(conf().get("agent_workspace", "~/cow"))
@@ -294,6 +313,12 @@ class WebChannel(ChatChannel):
                     "timestamp": time.time()
                 })
                 logger.debug(f"SSE done sent for request {request_id}")
+                # Auto-trigger TTS once the bot finishes its text reply. The
+                # synthesis runs in the background so the chat stream is never
+                # blocked; the resulting audio URL is pushed via a follow-up
+                # `voice_attach` SSE event and persisted to messages.extras.
+                if reply.type == ReplyType.TEXT and content.strip():
+                    self._maybe_dispatch_auto_tts(request_id, session_id, content, context)
                 return
 
             # Fallback: polling mode
@@ -340,6 +365,10 @@ class WebChannel(ChatChannel):
         # Use a single-element list as a mutable counter accessible from closure.
         reasoning_chars_sent = [0]
         reasoning_capped_notified = [False]
+        # Captures the first error message emitted by agent_stream so the
+        # subsequent agent_end handler can skip its "empty final_response"
+        # fallback (which would otherwise overwrite the real error).
+        streamed_error: List[str] = []
 
         def on_event(event: dict):
             if request_id not in self.sse_queues:
@@ -398,6 +427,37 @@ class WebChannel(ChatChannel):
                 if tool_calls:
                     q.put({"type": "message_end", "has_tool_calls": True})
 
+            elif event_type == "error":
+                # Agent raised an exception (LLM 401/timeout/etc). Surface the
+                # real message instead of letting the empty-response fallback
+                # below hide it as "(模型未返回任何内容)".
+                err_msg = data.get("error") or "unknown error"
+                logger.warning(
+                    f"[WebChannel] agent_stream emitted error for "
+                    f"request {request_id}: {err_msg}"
+                )
+                # Remember it so the agent_end handler below knows not to
+                # rewrite the message into a generic empty-response notice.
+                streamed_error.append(err_msg)
+                q.put({
+                    "type": "done",
+                    "content": f"❌ {err_msg}",
+                    "request_id": request_id,
+                    "timestamp": time.time(),
+                })
+
+            elif event_type == "agent_cancelled":
+                # Push an explicit cancelled SSE event so the frontend
+                # marks the bubble as stopped. A trailing "done" still
+                # arrives with the partial answer.
+                final_response = data.get("final_response", "")
+                q.put({
+                    "type": "cancelled",
+                    "content": final_response,
+                    "request_id": request_id,
+                    "timestamp": time.time(),
+                })
+
             elif event_type == "agent_end":
                 # Safety net: if the agent finishes with an empty final_response,
                 # chat_channel skips _send_reply (because reply.content is empty),
@@ -406,16 +466,21 @@ class WebChannel(ChatChannel):
                 # here so the frontend always gets closure.
                 final_response = data.get("final_response", "")
                 if not final_response or not str(final_response).strip():
-                    logger.warning(
-                        f"[WebChannel] agent_end with empty final_response for "
-                        f"request {request_id}, sending fallback done"
-                    )
-                    q.put({
-                        "type": "done",
-                        "content": "(模型未返回任何内容，请重试或换一种方式描述你的需求)",
-                        "request_id": request_id,
-                        "timestamp": time.time(),
-                    })
+                    if streamed_error:
+                        # Error was already surfaced via the `error` event
+                        # handler above; nothing more to do here.
+                        pass
+                    else:
+                        logger.warning(
+                            f"[WebChannel] agent_end with empty final_response for "
+                            f"request {request_id}, sending fallback done"
+                        )
+                        q.put({
+                            "type": "done",
+                            "content": "(模型未返回任何内容，请重试或换一种方式描述你的需求)",
+                            "request_id": request_id,
+                            "timestamp": time.time(),
+                        })
 
             elif event_type == "file_to_send":
                 file_path = data.get("path", "")
@@ -432,6 +497,156 @@ class WebChannel(ChatChannel):
 
         return on_event
 
+    # ------------------------------------------------------------------
+    # TTS auto-dispatch
+    # ------------------------------------------------------------------
+    @staticmethod
+    def _resolve_voice_reply_mode() -> str:
+        """
+        Decide the TTS auto-reply policy.
+
+        Source of truth is the cross-channel pair
+        (`always_reply_voice`, `voice_reply_voice`) which chat_channel
+        also consults. The web UI presents these as a single three-state
+        picker (off / voice_if_voice / always) via a lossless mapping.
+        """
+        if conf().get("always_reply_voice", False):
+            return "always"
+        if conf().get("voice_reply_voice", False):
+            return "voice_if_voice"
+        return "off"
+
+    # Mirror of ModelsHandler._TTS_PROVIDERS. zhipu is intentionally omitted
+    # from the UI (GLM-TTS prelude beep); pinning it in config.json still works.
+    _TTS_PROVIDERS_SUGGEST_ORDER = ["openai", "minimax", "dashscope", "linkai"]
+
+    @classmethod
+    def _tts_provider_ready(cls) -> bool:
+        """True if user picked a provider OR any suggested vendor has an API key."""
+        if (conf().get("text_to_voice") or "").strip():
+            return True
+        for pid in cls._TTS_PROVIDERS_SUGGEST_ORDER:
+            meta = ConfigHandler.PROVIDER_MODELS.get(pid) or {}
+            key_field = meta.get("api_key_field")
+            if not key_field:
+                continue
+            val = (conf().get(key_field) or "").strip()
+            if val and val not in ("YOUR API KEY", "YOUR_API_KEY"):
+                return True
+        return False
+
+    def _maybe_dispatch_auto_tts(
+        self,
+        request_id: str,
+        session_id: str,
+        text: str,
+        context: dict,
+    ) -> None:
+        try:
+            mode = self._resolve_voice_reply_mode()
+            if mode == "off":
+                return
+            if mode == "voice_if_voice" and not context.get("is_voice_input"):
+                return
+            if not self._tts_provider_ready():
+                return
+            threading.Thread(
+                target=self._synthesize_tts_async,
+                args=(request_id, session_id, text),
+                daemon=True,
+            ).start()
+        except Exception as e:
+            logger.debug(f"[WebChannel] auto-tts dispatch skipped: {e}")
+
+    def _synthesize_tts_async(
+        self,
+        request_id: str,
+        session_id: str,
+        text: str,
+    ) -> None:
+        try:
+            from bridge.bridge import Bridge
+            reply = Bridge().fetch_text_to_voice(text)
+            if reply is None or reply.type != ReplyType.VOICE or not reply.content:
+                logger.warning(
+                    f"[WebChannel] TTS produced no audio for request {request_id}: "
+                    f"reply={reply}"
+                )
+                return
+            url = self._publish_tts_audio(reply.content)
+            if not url:
+                logger.warning(f"[WebChannel] TTS publish failed for request {request_id}")
+                return
+            payload = {"audio": {"url": url, "kind": "tts"}}
+            try:
+                from agent.memory import get_conversation_store
+                get_conversation_store().attach_extras_to_last_assistant(session_id, payload)
+            except Exception as e:
+                logger.debug(f"[WebChannel] tts persist skipped: {e}")
+            q = self.sse_queues.get(request_id)
+            if q is None:
+                logger.warning(
+                    f"[WebChannel] TTS ready but SSE queue already closed "
+                    f"for request {request_id} (url={url})"
+                )
+                return
+            q.put({
+                "type": "voice_attach",
+                "url": url,
+                "request_id": request_id,
+                "timestamp": time.time(),
+            })
+            logger.info(f"[WebChannel] TTS voice_attach pushed for request {request_id}: {url}")
+        except Exception as e:
+            # TTS failures are intentionally silent (no user-facing error).
+            logger.warning(f"[WebChannel] TTS synthesis failed: {e}")
+
+    @staticmethod
+    def _publish_tts_audio(src_path: str) -> str:
+        """Move a TTS file into uploads/ and return its public URL."""
+        try:
+            if not src_path or not os.path.isfile(src_path):
+                logger.warning(f"[WebChannel] publish_tts_audio missing source: {src_path!r}")
+                return ""
+            ext = os.path.splitext(src_path)[1].lower() or ".mp3"
+            upload_dir = _get_upload_dir()
+            os.makedirs(upload_dir, exist_ok=True)
+            ts = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
+            dst_name = f"voice_reply_{ts}_{random.randint(0, 9999)}{ext}"
+            dst_path = os.path.join(upload_dir, dst_name)
+            shutil.move(src_path, dst_path)
+            logger.debug(f"[WebChannel] publish_tts_audio moved {src_path} -> {dst_path}")
+            return f"/uploads/{dst_name}"
+        except Exception as e:
+            logger.warning(f"[WebChannel] publish_tts_audio failed: {e}")
+            return ""
+
+    @staticmethod
+    def _cleanup_stale_voice_recordings(max_age_seconds: int = 3600) -> None:
+        """Drop voice_input_* uploads older than max_age_seconds (run at startup)."""
+        try:
+            upload_dir = _get_upload_dir()
+            if not os.path.isdir(upload_dir):
+                return
+            now = time.time()
+            removed = 0
+            for name in os.listdir(upload_dir):
+                if not name.startswith("voice_input_"):
+                    continue
+                full = os.path.join(upload_dir, name)
+                try:
+                    if not os.path.isfile(full):
+                        continue
+                    if now - os.path.getmtime(full) > max_age_seconds:
+                        os.remove(full)
+                        removed += 1
+                except OSError:
+                    continue
+            if removed:
+                logger.info(f"[WebChannel] cleaned up {removed} stale voice recording(s) from {upload_dir}")
+        except Exception as e:
+            logger.warning(f"[WebChannel] voice cleanup failed: {e}")
+
     def upload_file(self):
         """Handle file or directory upload via multipart/form-data."""
         try:
@@ -557,6 +772,29 @@ class WebChannel(ChatChannel):
             prompt = json_data.get('message', '')
             use_sse = json_data.get('stream', True)
             attachments = json_data.get('attachments', [])
+            # Tag the message as originating from voice input so the post-reply
+            # TTS hook can honour the `voice_if_voice` policy (mirrors the
+            # desire_rtype concept used by other channels).
+            is_voice_input = bool(json_data.get('is_voice', False))
+
+            # Fast path for /cancel: bypass the session queue and SSE setup.
+            # Web frontend (stream=true) only listens to SSE, so we return an
+            # inline_reply payload to be rendered synchronously.
+            stripped_prompt = (prompt or "").strip().lower()
+            if stripped_prompt == "/cancel":
+                from agent.protocol import get_cancel_registry
+                cancelled = get_cancel_registry().cancel_session(session_id)
+                lang = (json_data.get('lang') or 'zh').lower()
+                msg_text = _cancel_reply_text(cancelled, lang)
+                logger.info(
+                    f"[WebChannel] /cancel fast-path: session={session_id}, cancelled={cancelled}, lang={lang}"
+                )
+                return json.dumps({
+                    "status": "success",
+                    "request_id": "",
+                    "stream": False,
+                    "inline_reply": msg_text,
+                })
 
             # Append file references to the prompt (same format as QQ channel)
             if attachments:
@@ -607,6 +845,11 @@ class WebChannel(ChatChannel):
             context["session_id"] = session_id
             context["receiver"] = session_id
             context["request_id"] = request_id
+            if is_voice_input:
+                # Web channel runs its own TTS post-pipeline via
+                # _maybe_dispatch_auto_tts; don't set desire_rtype here or
+                # chat_channel would synthesize a duplicate VOICE reply.
+                context["is_voice_input"] = True
 
             if use_sse:
                 context["on_event"] = self._make_sse_callback(request_id)
@@ -634,28 +877,98 @@ class WebChannel(ChatChannel):
         q = self.sse_queues[request_id]
         idle_timeout = 600  # 10 minutes without any real event
         deadline = time.time() + idle_timeout
-        done = False
+        # After the main reply is done we keep the stream open for a short
+        # tail so async post-processing (TTS auto-synthesis) can deliver a
+        # `voice_attach` event before the client disconnects.
+        POST_DONE_TAIL_SECONDS = 60
+        post_done = False
+        post_deadline = 0.0
 
         try:
             while time.time() < deadline:
                 try:
                     item = q.get(timeout=1)
                 except Empty:
+                    if post_done and time.time() >= post_deadline:
+                        break
                     yield b": keepalive\n\n"
                     continue
 
-                # Real event received, reset idle deadline
                 deadline = time.time() + idle_timeout
-
                 payload = json.dumps(item, ensure_ascii=False)
                 yield f"data: {payload}\n\n".encode("utf-8")
 
-                if item.get("type") == "done":
-                    done = True
-                    break
+                itype = item.get("type")
+                if itype == "done":
+                    post_done = True
+                    post_deadline = time.time() + POST_DONE_TAIL_SECONDS
+                elif itype == "cancelled":
+                    # Close SSE tail quickly after cancel; don't wait for the
+                    # full TTS tail since the user already pressed Stop.
+                    post_done = True
+                    post_deadline = time.time() + 3
+                elif itype == "voice_attach":
+                    # WSGI buffers the previous chunk until the next yield;
+                    # shrink the tail so the generator wakes up quickly to
+                    # emit a couple of keepalive comments that push the
+                    # voice_attach payload through to the browser.
+                    post_done = True
+                    post_deadline = time.time() + 2  # 2s post-attach tail
         finally:
-            if done:
-                self.sse_queues.pop(request_id, None)
+            self.sse_queues.pop(request_id, None)
+
+    def cancel_request(self):
+        """
+        Cancel an in-flight agent run.
+
+        Body: {"request_id": "...", "session_id": "...", "lang": "..."}
+        Either ID is sufficient; request_id is preferred when known, and
+        lang is optional. Returns success even when nothing was running,
+        so repeated cancel calls are idempotent for the client.
+        """
+        try:
+            from agent.protocol import get_cancel_registry
+
+            data = web.data()
+            try:
+                json_data = json.loads(data) if data else {}
+            except Exception:
+                json_data = {}
+
+            request_id = (json_data.get("request_id") or "").strip()
+            session_id = (json_data.get("session_id") or "").strip()
+            lang = (json_data.get("lang") or "zh").lower()
+
+            registry = get_cancel_registry()
+            cancelled = 0
+
+            if request_id:
+                if registry.cancel_request(request_id):
+                    cancelled = 1
+
+            if cancelled == 0 and session_id:
+                cancelled = registry.cancel_session(session_id)
+
+            if request_id and request_id in self.sse_queues:
+                self.sse_queues[request_id].put({
+                    "type": "cancelled",
+                    "content": "Cancelled" if lang.startswith("en") else "已中止",
+                    "request_id": request_id,
+                    "timestamp": time.time(),
+                })
+
+            logger.info(
+                f"[WebChannel] cancel request: request_id={request_id!r}, "
+                f"session_id={session_id!r}, cancelled={cancelled}"
+            )
+            return json.dumps({
+                "status": "success",
+                "cancelled": cancelled,
+            })
+
+        except Exception as e:
+            logger.error(f"[WebChannel] cancel_request error: {e}")
+            return json.dumps({"status": "error", "message": str(e)})
 
     def poll_response(self):
         """
@@ -703,6 +1016,8 @@ class WebChannel(ChatChannel):
         port = conf().get("web_port", 9899)
         is_public_bind = host in ("0.0.0.0", "::")
 
+        self._cleanup_stale_voice_recordings()
+
         # 打印可用渠道类型提示
         logger.info(
             "[WebChannel] 全部可用通道如下，可修改 config.json 配置文件中的 channel_type 字段进行切换，多个通道用逗号分隔：")
@@ -747,10 +1062,14 @@ class WebChannel(ChatChannel):
             '/upload', 'UploadHandler',
             '/uploads/(.*)', 'UploadsHandler',
             '/api/file', 'FileServeHandler',
+            '/api/voice/asr', 'VoiceAsrHandler',
+            '/api/voice/tts', 'VoiceTtsHandler',
             '/poll', 'PollHandler',
             '/stream', 'StreamHandler',
+            '/cancel', 'CancelHandler',
             '/chat', 'ChatHandler',
             '/config', 'ConfigHandler',
+            '/api/models', 'ModelsHandler',
             '/api/channels', 'ChannelsHandler',
             '/api/weixin/qrlogin', 'WeixinQrHandler',
             '/api/feishu/register', 'FeishuRegisterHandler',
@@ -839,8 +1158,8 @@ class AuthLoginHandler:
             data = json.loads(web.data())
         except Exception:
             return json.dumps({"status": "error", "message": "Invalid request"})
-        password = data.get("password", "")
-        expected = conf().get("web_password", "")
+        password = str(data.get("password", "") or "")
+        expected = _get_web_password()
         if not hmac.compare_digest(password, expected):
             logger.warning("[WebChannel] Invalid login attempt")
             return json.dumps({"status": "error", "message": "Wrong password"})
@@ -870,6 +1189,103 @@ class UploadHandler:
         return WebChannel().upload_file()
 
 
+class VoiceAsrHandler:
+    """Receive a mic recording, persist it under uploads/ and run ASR.
+    Returns {status, text, audio_url} so the UI can render a playback bubble."""
+    def POST(self):
+        _require_auth()
+        web.header('Content-Type', 'application/json; charset=utf-8')
+
+        saved_path = None
+        try:
+            params = _raw_web_input()
+            file_obj = params.get("file")
+            if file_obj is None:
+                return json.dumps({"status": "error", "message": "no audio file"})
+
+            filename = getattr(file_obj, "filename", "") or "recording.webm"
+            ext = os.path.splitext(filename)[1].lower() or ".webm"
+            if ext not in (".webm", ".ogg", ".opus", ".mp4", ".m4a", ".mp3", ".wav"):
+                ext = ".webm"
+
+            upload_dir = _get_upload_dir()
+            os.makedirs(upload_dir, exist_ok=True)
+            ts = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
+            saved_name = f"voice_input_{ts}_{random.randint(0, 9999)}{ext}"
+            saved_path = os.path.join(upload_dir, saved_name)
+            with open(saved_path, "wb") as f:
+                f.write(file_obj.file.read() if hasattr(file_obj, "file") else file_obj.value)
+
+            audio_url = f"/uploads/{saved_name}"
+
+            from bridge.bridge import Bridge
+            reply = Bridge().fetch_voice_to_text(saved_path)
+            if reply is None:
+                return json.dumps({
+                    "status": "error",
+                    "message": "ASR returned no reply",
+                    "audio_url": audio_url,
+                })
+
+            from bridge.reply import ReplyType
+            if reply.type == ReplyType.TEXT:
+                return json.dumps({
+                    "status": "success",
+                    "text": reply.content or "",
+                    "audio_url": audio_url,
+                })
+            return json.dumps({
+                "status": "error",
+                "message": reply.content or "ASR failed",
+                "audio_url": audio_url,
+            })
+        except Exception as e:
+            logger.exception(f"[VoiceAsrHandler] failed: {e}")
+            return json.dumps({"status": "error", "message": str(e)})
+
+
+class VoiceTtsHandler:
+    """On-demand TTS for the in-chat "read aloud" button. Returns the
+    audio URL and (when session_id is given) persists it onto the message."""
+    def POST(self):
+        _require_auth()
+        web.header('Content-Type', 'application/json; charset=utf-8')
+        try:
+            data = json.loads(web.data() or b"{}")
+            text = (data.get("text") or "").strip()
+            session_id = (data.get("session_id") or "").strip()
+            if not text:
+                return json.dumps({"status": "error", "message": "empty text"})
+            # `@singleton` turns WebChannel into a factory function, so go through an instance.
+            channel = WebChannel()
+            if not channel._tts_provider_ready():
+                return json.dumps({"status": "error", "message": "tts not configured"})
+
+            from bridge.bridge import Bridge
+            reply = Bridge().fetch_text_to_voice(text)
+            if reply is None or reply.type != ReplyType.VOICE or not reply.content:
+                msg = getattr(reply, "content", "") or "tts failed"
+                return json.dumps({"status": "error", "message": str(msg)})
+
+            url = channel._publish_tts_audio(reply.content)
+            if not url:
+                return json.dumps({"status": "error", "message": "publish failed"})
+
+            if session_id:
+                try:
+                    from agent.memory import get_conversation_store
+                    get_conversation_store().attach_extras_to_last_assistant(
+                        session_id, {"audio": {"url": url, "kind": "tts"}},
+                    )
+                except Exception as e:
+                    logger.debug(f"[VoiceTtsHandler] persist skipped: {e}")
+
+            return json.dumps({"status": "success", "audio_url": url})
+        except Exception as e:
+            logger.exception(f"[VoiceTtsHandler] failed: {e}")
+            return json.dumps({"status": "error", "message": str(e)})
+
+
 class UploadsHandler:
     def GET(self, file_name):
         _require_auth()
@@ -900,7 +1316,20 @@ class FileServeHandler:
             file_path = params.path
             if not file_path or not os.path.isabs(file_path):
                 raise web.notfound()
-            file_path = os.path.normpath(file_path)
+            # Resolve symlinks and confine access to the allowed root dirs, so this
+            # endpoint can't be abused to read files outside them (e.g. /etc/passwd).
+            # Defaults to the user home dir plus the agent workspace, so files under the home dir
+            # (including ~/.ssh) stay readable; set web_file_serve_root="/" to allow the whole filesystem.
+            file_path = os.path.realpath(file_path)
+            serve_root = conf().get("web_file_serve_root", "~") or "~"
+            allowed_roots = [
+                os.path.realpath(os.path.expanduser(serve_root)),
+                os.path.realpath(os.path.expanduser(conf().get("agent_workspace", "~/cow"))),
+            ]
+            if os.sep not in allowed_roots and not any(
+                os.path.commonpath([file_path, root]) == root for root in allowed_roots
+            ):
+                raise web.notfound()
             if not os.path.isfile(file_path):
                 raise web.notfound()
             content_type = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
@@ -924,6 +1353,12 @@ class PollHandler:
         return WebChannel().poll_response()
 
 
+class CancelHandler:
+    def POST(self):
+        _require_auth()
+        return WebChannel().cancel_request()
+
+
 class StreamHandler:
     def GET(self):
         _require_auth()
@@ -958,14 +1393,15 @@ class ConfigHandler:
     _RECOMMENDED_MODELS = [
         const.DEEPSEEK_V4_FLASH, const.DEEPSEEK_V4_PRO, const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER,
         const.MINIMAX_M2_7_HIGHSPEED, const.MINIMAX_M2_7, const.MINIMAX_M2_5, const.MINIMAX_M2_1, const.MINIMAX_M2_1_LIGHTNING,
-        const.CLAUDE_4_6_SONNET, const.CLAUDE_4_7_OPUS, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET,
-        const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE,
-        const.GPT_54, const.GPT_54_MINI, const.GPT_54_NANO, const.GPT_5, const.GPT_41, const.GPT_4o,
+        const.CLAUDE_4_8_OPUS, const.CLAUDE_4_7_OPUS, const.CLAUDE_4_6_SONNET, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET,
+        const.GEMINI_35_FLASH, const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE,
+        const.GPT_55, const.GPT_54, const.GPT_54_MINI, const.GPT_54_NANO, const.GPT_5, const.GPT_41, const.GPT_4o,
         const.GLM_5_1, const.GLM_5_TURBO, const.GLM_5, const.GLM_4_7,
-        const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX,
+        const.QWEN36_PLUS, const.QWEN37_MAX, const.QWEN35_PLUS, const.QWEN3_MAX,
         const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE,
         const.KIMI_K2_6, const.KIMI_K2_5, const.KIMI_K2,
         const.ERNIE_5_1, const.ERNIE_5, const.ERNIE_X1_1, const.ERNIE_45_TURBO_128K, const.ERNIE_45_TURBO_32K,
+        const.MIMO_V2_5_PRO, const.MIMO_V2_5,
     ]
 
     # Generic placeholder hints surfaced in the web console. We deliberately
@@ -1002,7 +1438,7 @@ class ConfigHandler:
             "api_base_key": "claude_api_base",
             "api_base_default": "https://api.anthropic.com/v1",
             "api_base_placeholder": _PLACEHOLDER_V1,
-            "models": [const.CLAUDE_4_6_SONNET, const.CLAUDE_4_7_OPUS, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET],
+            "models": [const.CLAUDE_4_8_OPUS, const.CLAUDE_4_7_OPUS, const.CLAUDE_4_6_SONNET, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET],
         }),
         ("gemini", {
             "label": "Gemini",
@@ -1010,7 +1446,7 @@ class ConfigHandler:
             "api_base_key": "gemini_api_base",
             "api_base_default": "https://generativelanguage.googleapis.com",
             "api_base_placeholder": _PLACEHOLDER_GEMINI,
-            "models": [const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE],
+            "models": [const.GEMINI_35_FLASH, const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE],
         }),
         ("openai", {
             "label": "OpenAI",
@@ -1018,10 +1454,10 @@ class ConfigHandler:
             "api_base_key": "open_ai_api_base",
             "api_base_default": "https://api.openai.com/v1",
             "api_base_placeholder": _PLACEHOLDER_V1,
-            "models": [const.GPT_54, const.GPT_54_MINI, const.GPT_54_NANO, const.GPT_5, const.GPT_41, const.GPT_4o],
+            "models": [const.GPT_55, const.GPT_54, const.GPT_54_MINI, const.GPT_54_NANO, const.GPT_5, const.GPT_41, const.GPT_4o],
         }),
         ("zhipu", {
-            "label": "智谱AI",
+            "label": {"zh": "智谱AI", "en": "GLM"},
             "api_key_field": "zhipu_ai_api_key",
             "api_base_key": "zhipu_ai_api_base",
             "api_base_default": "https://open.bigmodel.cn/api/paas/v4",
@@ -1029,15 +1465,15 @@ class ConfigHandler:
             "models": [const.GLM_5_1, const.GLM_5_TURBO, const.GLM_5, const.GLM_4_7],
         }),
         ("dashscope", {
-            "label": "通义千问",
+            "label": {"zh": "通义千问", "en": "Qwen"},
             "api_key_field": "dashscope_api_key",
             "api_base_key": None,
             "api_base_default": None,
             "api_base_placeholder": "",
-            "models": [const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX],
+            "models": [const.QWEN36_PLUS, const.QWEN37_MAX, const.QWEN35_PLUS, const.QWEN3_MAX],
         }),
         ("doubao", {
-            "label": "豆包",
+            "label": {"zh": "豆包", "en": "Doubao"},
             "api_key_field": "ark_api_key",
             "api_base_key": "ark_base_url",
             "api_base_default": "https://ark.cn-beijing.volces.com/api/v3",
@@ -1053,20 +1489,20 @@ class ConfigHandler:
             "models": [const.KIMI_K2_6, const.KIMI_K2_5, const.KIMI_K2],
         }),
         ("qianfan", {
-            "label": "百度千帆",
+            "label": {"zh": "百度千帆", "en": "ERNIE"},
             "api_key_field": "qianfan_api_key",
             "api_base_key": "qianfan_api_base",
             "api_base_default": "https://qianfan.baidubce.com/v2",
             "api_base_placeholder": _PLACEHOLDER_QIANFAN,
             "models": [const.ERNIE_5_1, const.ERNIE_5, const.ERNIE_X1_1, const.ERNIE_45_TURBO_128K, const.ERNIE_45_TURBO_32K],
         }),
-        ("modelscope", {
-            "label": "ModelScope",
-            "api_key_field": "modelscope_api_key",
-            "api_base_key": None,
-            "api_base_default": None,
-            "api_base_placeholder": "",
-            "models": [const.QWEN3_5_27B, const.QWEN3_235B_A22B_INSTRUCT_2507],
+        ("mimo", {
+            "label": {"zh": "小米 MiMo", "en": "MiMo"},
+            "api_key_field": "mimo_api_key",
+            "api_base_key": "mimo_api_base",
+            "api_base_default": "https://api.xiaomimimo.com/v1",
+            "api_base_placeholder": _PLACEHOLDER_V1,
+            "models": [const.MIMO_V2_5_PRO, const.MIMO_V2_5],
         }),
         ("linkai", {
             "label": "LinkAI",
@@ -1077,7 +1513,7 @@ class ConfigHandler:
             "models": _RECOMMENDED_MODELS,
         }),
         ("custom", {
-            "label": "自定义",
+            "label": {"zh": "自定义", "en": "Custom"},
             "api_key_field": "custom_api_key",
             "api_base_key": "custom_api_base",
             "api_base_default": "",
@@ -1089,10 +1525,10 @@ class ConfigHandler:
     EDITABLE_KEYS = {
         "model", "bot_type", "use_linkai",
         "open_ai_api_base", "deepseek_api_base", "qianfan_api_base", "claude_api_base", "gemini_api_base",
-        "zhipu_ai_api_base", "moonshot_base_url", "ark_base_url", "custom_api_base",
+        "zhipu_ai_api_base", "moonshot_base_url", "ark_base_url", "custom_api_base", "mimo_api_base",
         "open_ai_api_key", "deepseek_api_key", "qianfan_api_key", "claude_api_key", "gemini_api_key",
         "zhipu_ai_api_key", "dashscope_api_key", "moonshot_api_key",
-        "ark_api_key", "minimax_api_key", "linkai_api_key", "custom_api_key",
+        "ark_api_key", "minimax_api_key", "linkai_api_key", "custom_api_key", "mimo_api_key",
         "agent_max_context_tokens", "agent_max_context_turns", "agent_max_steps",
         "enable_thinking", "web_password",
     }
@@ -1134,7 +1570,7 @@ class ConfigHandler:
                     "api_key_field": p.get("api_key_field"),
                 }
 
-            raw_pwd = local_config.get("web_password", "")
+            raw_pwd = str(local_config.get("web_password", "") or "")
             masked_pwd = ("*" * len(raw_pwd)) if raw_pwd else ""
 
             return json.dumps({
@@ -1213,6 +1649,1209 @@ class ConfigHandler:
             return json.dumps({"status": "error", "message": str(e)})
 
 
+class ModelsHandler:
+    """API for the unified Models console.
+
+    Layered model:
+      Layer 1 (providers): vendor credentials shared across capabilities.
+                            Stored as flat *_api_key / *_api_base fields in
+                            config.json — the same fields ConfigHandler
+                            already manages.
+      Layer 2 (capabilities): which provider/model is used by chat / vision /
+                            asr / tts / embedding / image / search.
+
+    GET  /api/models           -> overview (providers + capabilities)
+    POST /api/models/provider  -> upsert a vendor credential
+    DELETE /api/models/provider -> clear a vendor credential
+    POST /api/models/capability -> set provider/model for a capability
+    """
+
+    # Capability -> provider ids drawn from ConfigHandler.PROVIDER_MODELS.
+    _ASR_PROVIDERS = ["openai", "dashscope", "zhipu", "linkai"]
+    # Web-console allowlist. Other vendors stay usable by editing the config directly.
+    _TTS_PROVIDERS = ["openai", "minimax", "dashscope", "mimo", "linkai"]
+
+    # TTS engine catalog (speech models, not voice timbres). Entries are
+    # either a bare code or {value, hint?} when a friendly label helps.
+    _TTS_PROVIDER_MODELS = {
+        "openai":    ["tts-1", "tts-1-hd", "gpt-4o-mini-tts"],
+        "minimax": [
+            {"value": "speech-2.8-hd",    "hint": "情绪渲染融合语气词,自然听感"},
+            {"value": "speech-2.8-turbo", "hint": "极致生成速度,更自然逼真"},
+            {"value": "speech-2.6-hd",    "hint": "超低延时,归一化升级"},
+            {"value": "speech-2.6-turbo", "hint": "更快更便宜,适合语音聊天/数字人"},
+        ],
+        "dashscope": [
+            {"value": "qwen3-tts-flash", "hint": "覆盖普通话、方言与主流外语"},
+        ],
+        # Xiaomi MiMo TTS family, synthesized through the chat completions endpoint.
+        "mimo": [
+            {"value": "mimo-v2.5-tts", "hint": "预置音色 · 支持唱歌模式"},
+        ],
+        # Aggregating gateway: a single endpoint multiplexes several
+        # underlying TTS engines, selected via the `model` field.
+        # Each engine exposes its own voice catalog (see _TTS_PROVIDER_VOICES).
+        "linkai": [
+            {"value": "tts-1",  "hint": "OpenAI · 多语种通用"},
+            {"value": "doubao", "hint": "字节豆包 · 中文音色丰富"},
+            {"value": "baidu",  "hint": "百度 · 中文主播音色"},
+        ],
+    }
+
+    # Per-provider voice timbres. Entries can be a bare code string
+    # (label = code) or {value, hint?} when a friendly secondary label
+    # helps recognition. We keep `value` as the raw API code so power
+    # users can cross-reference config.json.
+    _TTS_PROVIDER_VOICES = {
+        "openai":    [
+            "alloy", "echo", "fable", "onyx", "nova", "shimmer",
+            "ash", "ballad", "coral", "sage", "verse",
+        ],
+        "minimax": [
+            # Mandarin Chinese (full catalog)
+            {"value": "male-qn-qingse",                           "hint": "中文 · 青涩青年（男）"},
+            {"value": "male-qn-jingying",                         "hint": "中文 · 精英青年（男）"},
+            {"value": "male-qn-badao",                            "hint": "中文 · 霸道青年（男）"},
+            {"value": "male-qn-daxuesheng",                       "hint": "中文 · 青年大学生（男）"},
+            {"value": "female-shaonv",                            "hint": "中文 · 少女（女）"},
+            {"value": "female-yujie",                             "hint": "中文 · 御姐（女）"},
+            {"value": "female-chengshu",                          "hint": "中文 · 成熟女性（女）"},
+            {"value": "female-tianmei",                           "hint": "中文 · 甜美女性（女）"},
+            {"value": "male-qn-qingse-jingpin",                   "hint": "中文 · 青涩青年-beta（男）"},
+            {"value": "male-qn-jingying-jingpin",                 "hint": "中文 · 精英青年-beta（男）"},
+            {"value": "male-qn-badao-jingpin",                    "hint": "中文 · 霸道青年-beta（男）"},
+            {"value": "male-qn-daxuesheng-jingpin",               "hint": "中文 · 青年大学生-beta（男）"},
+            {"value": "female-shaonv-jingpin",                    "hint": "中文 · 少女-beta（女）"},
+            {"value": "female-yujie-jingpin",                     "hint": "中文 · 御姐-beta（女）"},
+            {"value": "female-chengshu-jingpin",                  "hint": "中文 · 成熟女性-beta（女）"},
+            {"value": "female-tianmei-jingpin",                   "hint": "中文 · 甜美女性-beta（女）"},
+            {"value": "clever_boy",                               "hint": "中文 · 聪明男童"},
+            {"value": "cute_boy",                                 "hint": "中文 · 可爱男童"},
+            {"value": "lovely_girl",                              "hint": "中文 · 萌萌女童"},
+            {"value": "cartoon_pig",                              "hint": "中文 · 卡通猪小琪"},
+            {"value": "bingjiao_didi",                            "hint": "中文 · 病娇弟弟"},
+            {"value": "junlang_nanyou",                           "hint": "中文 · 俊朗男友"},
+            {"value": "chunzhen_xuedi",                           "hint": "中文 · 纯真学弟"},
+            {"value": "lengdan_xiongzhang",                       "hint": "中文 · 冷淡学长"},
+            {"value": "badao_shaoye",                             "hint": "中文 · 霸道少爷"},
+            {"value": "tianxin_xiaoling",                         "hint": "中文 · 甜心小玲"},
+            {"value": "qiaopi_mengmei",                           "hint": "中文 · 俏皮萌妹"},
+            {"value": "wumei_yujie",                              "hint": "中文 · 妩媚御姐"},
+            {"value": "diadia_xuemei",                            "hint": "中文 · 嗲嗲学妹"},
+            {"value": "danya_xuejie",                             "hint": "中文 · 淡雅学姐"},
+            {"value": "Chinese (Mandarin)_Reliable_Executive",    "hint": "中文 · 沉稳高管"},
+            {"value": "Chinese (Mandarin)_News_Anchor",           "hint": "中文 · 新闻女声"},
+            {"value": "Chinese (Mandarin)_Mature_Woman",          "hint": "中文 · 傲娇御姐"},
+            {"value": "Chinese (Mandarin)_Unrestrained_Young_Man","hint": "中文 · 不羁青年"},
+            {"value": "Arrogant_Miss",                            "hint": "中文 · 嚣张小姐"},
+            {"value": "Robot_Armor",                              "hint": "中文 · 机械战甲"},
+            {"value": "Chinese (Mandarin)_Kind-hearted_Antie",    "hint": "中文 · 热心大婶"},
+            {"value": "Chinese (Mandarin)_HK_Flight_Attendant",   "hint": "中文 · 港普空姐"},
+            {"value": "Chinese (Mandarin)_Humorous_Elder",        "hint": "中文 · 搞笑大爷"},
+            {"value": "Chinese (Mandarin)_Gentleman",             "hint": "中文 · 温润男声"},
+            {"value": "Chinese (Mandarin)_Warm_Bestie",           "hint": "中文 · 温暖闺蜜"},
+            {"value": "Chinese (Mandarin)_Male_Announcer",        "hint": "中文 · 播报男声"},
+            {"value": "Chinese (Mandarin)_Sweet_Lady",            "hint": "中文 · 甜美女声"},
+            {"value": "Chinese (Mandarin)_Southern_Young_Man",    "hint": "中文 · 南方小哥"},
+            {"value": "Chinese (Mandarin)_Wise_Women",            "hint": "中文 · 阅历姐姐"},
+            {"value": "Chinese (Mandarin)_Gentle_Youth",          "hint": "中文 · 温润青年"},
+            {"value": "Chinese (Mandarin)_Warm_Girl",             "hint": "中文 · 温暖少女"},
+            {"value": "Chinese (Mandarin)_Kind-hearted_Elder",    "hint": "中文 · 花甲奶奶"},
+            {"value": "Chinese (Mandarin)_Cute_Spirit",           "hint": "中文 · 憨憨萌兽"},
+            {"value": "Chinese (Mandarin)_Radio_Host",            "hint": "中文 · 电台男主播"},
+            {"value": "Chinese (Mandarin)_Lyrical_Voice",         "hint": "中文 · 抒情男声"},
+            {"value": "Chinese (Mandarin)_Straightforward_Boy",   "hint": "中文 · 率真弟弟"},
+            {"value": "Chinese (Mandarin)_Sincere_Adult",         "hint": "中文 · 真诚青年"},
+            {"value": "Chinese (Mandarin)_Gentle_Senior",         "hint": "中文 · 温柔学姐"},
+            {"value": "Chinese (Mandarin)_Stubborn_Friend",       "hint": "中文 · 嘴硬竹马"},
+            {"value": "Chinese (Mandarin)_Crisp_Girl",            "hint": "中文 · 清脆少女"},
+            {"value": "Chinese (Mandarin)_Pure-hearted_Boy",      "hint": "中文 · 清澈邻家弟弟"},
+            {"value": "Chinese (Mandarin)_Soft_Girl",             "hint": "中文 · 柔和少女"},
+            # Cantonese (full catalog)
+            {"value": "Cantonese_ProfessionalHost（F)",            "hint": "粤语 · 专业女主持"},
+            {"value": "Cantonese_GentleLady",                     "hint": "粤语 · 温柔女声"},
+            {"value": "Cantonese_ProfessionalHost（M)",            "hint": "粤语 · 专业男主持"},
+            {"value": "Cantonese_PlayfulMan",                     "hint": "粤语 · 活泼男声"},
+            {"value": "Cantonese_CuteGirl",                       "hint": "粤语 · 可爱女孩"},
+            {"value": "Cantonese_KindWoman",                      "hint": "粤语 · 善良女声"},
+            # English (curated: 1F + 1M)
+            {"value": "English_Graceful_Lady",                    "hint": "英文 · Graceful Lady（女）"},
+            {"value": "English_Trustworthy_Man",                  "hint": "英文 · Trustworthy Man（男）"},
+            # Japanese (curated: 1F + 1M)
+            {"value": "Japanese_KindLady",                        "hint": "日文 · Kind Lady（女）"},
+            {"value": "Japanese_LoyalKnight",                     "hint": "日文 · Loyal Knight（男）"},
+            # Korean (curated: 1F + 1M)
+            {"value": "Korean_SweetGirl",                         "hint": "韩文 · Sweet Girl（女）"},
+            {"value": "Korean_CheerfulBoyfriend",                 "hint": "韩文 · Cheerful Boyfriend（男）"},
+        ],
+        "dashscope": [
+            {"value": "Cherry",   "hint": "芊悦 · 阳光女声"},
+            {"value": "Serena",   "hint": "苏瑶 · 温柔女声"},
+            {"value": "Chelsie",  "hint": "千雪 · 二次元少女"},
+            {"value": "Ethan",    "hint": "晨煦 · 阳光男声"},
+            {"value": "Moon",     "hint": "月白 · 率性男声"},
+            {"value": "Kai",      "hint": "凯 · 治愈男声"},
+            {"value": "Nofish",   "hint": "不吃鱼 · 设计师男声"},
+            {"value": "Bella",    "hint": "萌宝 · 小萝莉"},
+            {"value": "Bunny",    "hint": "萌小姬 · 萌系少女"},
+            {"value": "Stella",   "hint": "少女阿月 · 元气少女"},
+            {"value": "Neil",     "hint": "阿闻 · 新闻主播"},
+            {"value": "Seren",    "hint": "小婉 · 助眠女声"},
+            {"value": "Jada",     "hint": "上海话 · 阿珍"},
+            {"value": "Dylan",    "hint": "北京话 · 晓东"},
+            {"value": "Sunny",    "hint": "四川话 · 晴儿"},
+            {"value": "Eric",     "hint": "四川话 · 程川"},
+            {"value": "Rocky",    "hint": "粤语 · 阿强"},
+            {"value": "Kiki",     "hint": "粤语 · 阿清"},
+            {"value": "Peter",    "hint": "天津话 · 李彼得"},
+            {"value": "Marcus",   "hint": "陕西话 · 秦川"},
+            {"value": "Roy",      "hint": "闽南语 · 阿杰"},
+        ],
+        # Xiaomi MiMo preset voice list (mimo-v2.5-tts); docs:
+        # https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/speech-synthesis-v2.5
+        "mimo": [
+            {"value": "冰糖",   "hint": "中文 · 女声 · 冰糖"},
+            {"value": "茉莉",   "hint": "中文 · 女声 · 茉莉"},
+            {"value": "苏打",   "hint": "中文 · 男声 · 苏打"},
+            {"value": "白桦",   "hint": "中文 · 男声 · 白桦"},
+            {"value": "Mia",   "hint": "英文 · 女声 · Mia"},
+            {"value": "Chloe", "hint": "英文 · 女声 · Chloe"},
+            {"value": "Milo",  "hint": "英文 · 男声 · Milo"},
+            {"value": "Dean",  "hint": "英文 · 男声 · Dean"},
+        ],
+        # Aggregating gateway: voices are scoped per engine model. The
+        # frontend picks the correct list based on the selected model so
+        # users don't see incompatible timbres for the active engine.
+        "linkai": {
+            "tts-1": [
+                "alloy", "echo", "fable", "onyx", "nova", "shimmer",
+            ],
+            "doubao": [
+                {"value": "zh_female_wanwanxiaohe_moon_bigtts",       "hint": "湾湾小何"},
+                {"value": "BV007_streaming",                          "hint": "亲切女声"},
+                {"value": "BV001_streaming",                          "hint": "通用女声"},
+                {"value": "BV002_streaming",                          "hint": "通用男声"},
+                {"value": "BV051_streaming",                          "hint": "奶气萌娃"},
+                {"value": "zh_female_linjianvhai_moon_bigtts",        "hint": "邻家女孩"},
+                {"value": "BV700_streaming",                          "hint": "灿灿"},
+                {"value": "BV019_streaming",                          "hint": "重庆小伙"},
+                {"value": "BV524_streaming",                          "hint": "日语男声"},
+                {"value": "BV021_streaming",                          "hint": "东北老铁"},
+                {"value": "BV701_streaming",                          "hint": "擎苍"},
+                {"value": "BV113_streaming",                          "hint": "甜宠少御"},
+                {"value": "BV056_streaming",                          "hint": "阳光男声"},
+                {"value": "BV213_streaming",                          "hint": "广西表哥"},
+                {"value": "BV119_streaming",                          "hint": "通用赘婿"},
+                {"value": "BV705_streaming",                          "hint": "炀炀"},
+                {"value": "BV033_streaming",                          "hint": "温柔小哥"},
+                {"value": "BV102_streaming",                          "hint": "儒雅青年"},
+                {"value": "BV522_streaming",                          "hint": "气质女生"},
+                {"value": "BV034_streaming",                          "hint": "知性姐姐 · 双语"},
+                {"value": "BV005_streaming",                          "hint": "活泼女声"},
+                {"value": "zh_female_wanqudashu_moon_bigtts",         "hint": "湾区大叔"},
+                {"value": "zh_female_daimengchuanmei_moon_bigtts",    "hint": "呆萌川妹"},
+                {"value": "zh_male_guozhoudege_moon_bigtts",          "hint": "广州德哥"},
+                {"value": "zh_male_beijingxiaoye_moon_bigtts",        "hint": "北京小爷"},
+                {"value": "zh_male_shaonianzixin_moon_bigtts",        "hint": "少年梓辛 / Brayan"},
+                {"value": "zh_female_meilinvyou_moon_bigtts",         "hint": "魅力女友"},
+                {"value": "zh_male_shenyeboke_moon_bigtts",           "hint": "深夜播客"},
+                {"value": "zh_female_sajiaonvyou_moon_bigtts",        "hint": "柔美女友"},
+                {"value": "zh_female_yuanqinvyou_moon_bigtts",        "hint": "撒娇学妹"},
+                {"value": "zh_male_haoyuxiaoge_moon_bigtts",          "hint": "浩宇小哥"},
+                {"value": "zh_male_guangxiyuanzhou_moon_bigtts",      "hint": "广西远舟"},
+                {"value": "zh_female_meituojieer_moon_bigtts",        "hint": "妹坨洁儿"},
+                {"value": "zh_male_yuzhouzixuan_moon_bigtts",         "hint": "豫州子轩"},
+                {"value": "BV115_streaming",                          "hint": "古风少御"},
+                {"value": "zh_female_gaolengyujie_moon_bigtts",       "hint": "高冷御姐"},
+                {"value": "zh_male_yuanboxiaoshu_moon_bigtts",        "hint": "渊博小叔"},
+                {"value": "zh_male_yangguangqingnian_moon_bigtts",    "hint": "阳光青年"},
+                {"value": "zh_male_aojiaobazong_moon_bigtts",         "hint": "傲娇霸总"},
+                {"value": "zh_male_jingqiangkanye_moon_bigtts",       "hint": "京腔侃爷 / Harmony"},
+                {"value": "zh_female_shuangkuaisisi_moon_bigtts",     "hint": "爽快思思 / Skye"},
+                {"value": "zh_male_wennuanahu_moon_bigtts",           "hint": "温暖阿虎 / Alvin"},
+                {"value": "multi_female_shuangkuaisisi_moon_bigtts",  "hint": "はるこ / Esmeralda"},
+                {"value": "multi_male_jingqiangkanye_moon_bigtts",    "hint": "かずね / Javier or Álvaro"},
+                {"value": "multi_female_gaolengyujie_moon_bigtts",    "hint": "あけみ"},
+                {"value": "multi_male_wanqudashu_moon_bigtts",        "hint": "ひろし / Roberto"},
+                {"value": "ICL_zh_female_bingruoshaonv_tob",          "hint": "病弱少女"},
+                {"value": "ICL_zh_female_huoponvhai_tob",             "hint": "活泼女孩"},
+                {"value": "ICL_zh_female_heainainai_tob",             "hint": "和蔼奶奶"},
+                {"value": "ICL_zh_female_linjuayi_tob",               "hint": "邻居阿姨"},
+                {"value": "zh_female_wenrouxiaoya_moon_bigtts",       "hint": "温柔小雅"},
+                {"value": "zh_female_tianmeixiaoyuan_moon_bigtts",    "hint": "甜美小源"},
+                {"value": "zh_female_qingchezizi_moon_bigtts",        "hint": "清澈梓梓"},
+                {"value": "zh_male_dongfanghaoran_moon_bigtts",       "hint": "东方浩然"},
+                {"value": "zh_male_jieshuoxiaoming_moon_bigtts",      "hint": "解说小明"},
+                {"value": "zh_female_kailangjiejie_moon_bigtts",      "hint": "开朗姐姐"},
+                {"value": "zh_male_linjiananhai_moon_bigtts",         "hint": "邻家男孩"},
+                {"value": "zh_female_tianmeiyueyue_moon_bigtts",      "hint": "甜美悦悦"},
+                {"value": "zh_female_xinlingjitang_moon_bigtts",      "hint": "心灵鸡汤"},
+            ],
+            "baidu": [
+                {"value": "baidu_0",    "hint": "度小美 · 标准女主播"},
+                {"value": "baidu_1",    "hint": "度小宇 · 亲切男声"},
+                {"value": "baidu_3",    "hint": "度逍遥 · 情感男声"},
+                {"value": "baidu_4",    "hint": "度丫丫 · 童声"},
+                {"value": "baidu_5",    "hint": "度小娇 · 成熟女主播"},
+                {"value": "baidu_5003", "hint": "度逍遥 · 情感男声"},
+                {"value": "baidu_5118", "hint": "度小鹿 · 甜美女声"},
+                {"value": "baidu_103",  "hint": "度米朵 · 可爱童声"},
+                {"value": "baidu_106",  "hint": "度博文 · 专业男主播"},
+                {"value": "baidu_110",  "hint": "度小童 · 童声主播"},
+                {"value": "baidu_111",  "hint": "度小萌 · 软萌妹子"},
+                {"value": "baidu_4003", "hint": "度逍遥 · 情感男声"},
+                {"value": "baidu_4100", "hint": "度小雯 · 活力女主播"},
+                {"value": "baidu_4103", "hint": "度米朵 · 可爱女声"},
+                {"value": "baidu_4105", "hint": "度灵儿 · 清澈女声"},
+                {"value": "baidu_4106", "hint": "度博文 · 专业男主播"},
+                {"value": "baidu_4115", "hint": "度小贤 · 电台男主播"},
+                {"value": "baidu_4117", "hint": "度小乔 · 活泼女声"},
+                {"value": "baidu_4119", "hint": "度小鹿 · 甜美女声"},
+                {"value": "baidu_4129", "hint": "度小彦 · 知识男主播"},
+                {"value": "baidu_4140", "hint": "度小新 · 专业女主播"},
+                {"value": "baidu_4143", "hint": "度清风 · 配音男声"},
+                {"value": "baidu_4144", "hint": "度姗姗 · 娱乐女声"},
+                {"value": "baidu_4149", "hint": "度星河 · 广告男声"},
+                {"value": "baidu_4206", "hint": "度博文 · 综艺男声"},
+                {"value": "baidu_4226", "hint": "南方 · 电台女主播"},
+                {"value": "baidu_4254", "hint": "度小清 · 广告女声"},
+                {"value": "baidu_4278", "hint": "度小贝 · 知识女主播"},
+            ],
+        },
+    }
+    _EMBEDDING_PROVIDERS = ["openai", "dashscope", "doubao", "zhipu", "linkai"]
+
+    # Capability-scoped model catalogs. The chat dropdown can reuse the
+    # provider's generic model list, but vision and image generation are
+    # served by a narrower subset that the runtime actually dispatches to —
+    # see agent/tools/vision/vision.py and skills/image-generation/SKILL.md.
+    # Anything not listed here intentionally hides the model dropdown so
+    # users cannot pin a chat-only model and silently get a 4xx at runtime.
+    _VISION_PROVIDER_MODELS = {
+        # OpenAI ordering: const.GPT_55 first, then the GPT-5.4 family, then
+        # GPT-5 and the GPT-4.1/4o backstops.
+        "openai":    [
+            const.GPT_55,
+            const.GPT_54,
+            const.GPT_54_MINI,
+            const.GPT_54_NANO,
+            const.GPT_5,
+            const.GPT_41,
+            const.GPT_41_MINI,
+            const.GPT_4o,
+        ],
+        "doubao":    [const.DOUBAO_SEED_2_PRO],
+        "moonshot":  [const.KIMI_K2_6],
+        "dashscope": [const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX],
+        "claudeAPI": [const.CLAUDE_4_8_OPUS, const.CLAUDE_4_7_OPUS, const.CLAUDE_4_6_SONNET, const.CLAUDE_4_6_OPUS],
+        "gemini":    [const.GEMINI_35_FLASH, const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE],
+        "qianfan":   [const.ERNIE_45_TURBO_VL],
+        # Zhipu's bot hard-codes the call to glm-5v-turbo regardless of what
+        # name is passed in (see models/zhipuai/zhipuai_bot.py::call_vision),
+        # so listing the chat models here would silently route to the same
+        # endpoint. Surface only the model the runtime can truly dispatch to.
+        "zhipu":     [const.GLM_5V_TURBO],
+        # MiniMax's vision endpoint is similarly hard-coded to MiniMax-Text-01
+        # (see models/minimax/minimax_bot.py::call_vision); the M2.x chat
+        # family is text-only.
+        "minimax":   [const.MINIMAX_TEXT_01],
+        # MiMo native omni-modal models: v2.5-pro / v2.5 accept image/audio/video input
+        "mimo":      [const.MIMO_V2_5_PRO, const.MIMO_V2_5],
+        # LinkAI proxies the underlying vendor; surface a curated set of
+        # multimodal models. Order: gpt-4.1-mini → gpt-5.4-mini as the
+        # cross-vendor baselines, then each vendor's recommended default.
+        "linkai":    [
+            const.GPT_41_MINI,
+            const.GPT_54_MINI,
+            const.QWEN36_PLUS,
+            const.DOUBAO_SEED_2_PRO,
+            const.KIMI_K2_6,
+            const.CLAUDE_4_6_SONNET,
+            const.GEMINI_31_FLASH_LITE_PRE,
+        ],
+    }
+
+    # Image-generation catalog. Source of truth: skills/image-generation/SKILL.md.
+    # Listed verbatim (not via const.*) because these are skill-side names
+    # the script forwards directly to the vendor's image endpoint.
+    #
+    # Two shapes are accepted per model entry:
+    #   - bare string                           → the model id, no hint
+    #   - {"value": ..., "hint": "..."}         → model id + dim secondary
+    #                                             label rendered on the right
+    #                                             of the dropdown row. Useful
+    #                                             for surfacing brand names
+    #                                             (e.g. "Nano Banana 2" next
+    #                                             to gemini-3.1-flash-image-preview).
+    # The skill itself maps either form to the real vendor endpoint, so the
+    # hint is purely cosmetic.
+    _IMAGE_PROVIDER_MODELS = {
+        "openai":    ["gpt-image-2", "gpt-image-1"],
+        "gemini": [
+            {"value": "gemini-3.1-flash-image-preview", "hint": "Nano Banana 2"},
+            {"value": "gemini-3-pro-image-preview",     "hint": "Nano Banana Pro"},
+            {"value": "gemini-2.5-flash-image",         "hint": "Nano Banana"},
+        ],
+        "doubao":    ["seedream-5.0-lite", "seedream-4.5"],
+        "dashscope": ["qwen-image-2.0-pro", "qwen-image-2.0"],
+        "minimax":   ["image-01"],
+        "linkai": [
+            "gpt-image-2",
+            {"value": "gemini-3.1-flash-image-preview", "hint": "Nano Banana 2"},
+            {"value": "gemini-3-pro-image-preview",     "hint": "Nano Banana Pro"},
+            "seedream-5.0-lite",
+        ],
+    }
+
+    @staticmethod
+    def _config_path() -> str:
+        return os.path.join(
+            os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))),
+            "config.json",
+        )
+
+    @classmethod
+    def _read_file_config(cls) -> dict:
+        path = cls._config_path()
+        if not os.path.exists(path):
+            return {}
+        with open(path, "r", encoding="utf-8") as f:
+            return json.load(f)
+
+    @classmethod
+    def _write_file_config(cls, data: dict) -> None:
+        with open(cls._config_path(), "w", encoding="utf-8") as f:
+            json.dump(data, f, indent=4, ensure_ascii=False)
+
+    @staticmethod
+    def _is_real_key(value: str) -> bool:
+        return bool(value) and value not in ("YOUR API KEY", "YOUR_API_KEY")
+
+    @classmethod
+    def _provider_overview(cls) -> List[dict]:
+        """All known providers (configured first, unconfigured after).
+        Re-uses ConfigHandler.PROVIDER_MODELS for the canonical list."""
+        local_config = conf()
+        items = []
+        for pid, p in ConfigHandler.PROVIDER_MODELS.items():
+            key_field = p.get("api_key_field")
+            base_field = p.get("api_base_key")
+            raw_key = local_config.get(key_field, "") if key_field else ""
+            raw_base = local_config.get(base_field, "") if base_field else ""
+            configured = cls._is_real_key(raw_key)
+            items.append({
+                "id": pid,
+                "label": p["label"],
+                "configured": configured,
+                "api_key_field": key_field,
+                "api_base_field": base_field,
+                "api_key_masked": ConfigHandler._mask_key(raw_key) if configured else "",
+                "api_base": raw_base or (p.get("api_base_default") or ""),
+                "api_base_default": p.get("api_base_default") or "",
+                "api_base_placeholder": p.get("api_base_placeholder") or "",
+                "models": list(p.get("models") or []),
+            })
+        items.sort(key=lambda it: not it["configured"])  # stable sort keeps PROVIDER_MODELS order within each group
+        return items
+
+    @classmethod
+    def _chat_capability(cls, local_config: dict) -> dict:
+        """Main chat model — drives the agent. bot_type maps to a provider id."""
+        bot_type = local_config.get("bot_type") or ""
+        provider_id = "openai" if bot_type == "chatGPT" else bot_type
+        if provider_id not in ConfigHandler.PROVIDER_MODELS and local_config.get("use_linkai"):
+            provider_id = "linkai"
+        return {
+            "editable": True,
+            "current_provider": provider_id,
+            "current_model": local_config.get("model", ""),
+            "providers": list(ConfigHandler.PROVIDER_MODELS.keys()),
+            "use_linkai": bool(local_config.get("use_linkai", False)),
+        }
+
+    # Auto-fallback order for vision when no explicit model is pinned.
+    # Mirrors agent/tools/vision/vision.py::_resolve_providers — DeepSeek and
+    # other text-only chat bots are intentionally absent, since they cannot
+    # actually serve a vision request. Each entry is
+    #   (provider_id, api_key_field, default_vision_model)
+    # and lookups are case-insensitive on the api_key_field. LinkAI and
+    # OpenAI are handled separately below so use_linkai can promote LinkAI
+    # to the front of the chain.
+    _VISION_AUTO_ORDER = [
+        ("moonshot",  "moonshot_api_key",  const.KIMI_K2_6),
+        ("doubao",    "ark_api_key",       const.DOUBAO_SEED_2_PRO),
+        ("dashscope", "dashscope_api_key", const.QWEN36_PLUS),
+        ("claudeAPI", "claude_api_key",    const.CLAUDE_4_6_SONNET),
+        ("gemini",    "gemini_api_key",    const.GEMINI_35_FLASH),
+        ("qianfan",   "qianfan_api_key",   const.ERNIE_45_TURBO_VL),
+        ("zhipu",     "zhipu_ai_api_key",  const.GLM_5V_TURBO),
+        ("minimax",   "minimax_api_key",   const.MINIMAX_TEXT_01),
+        ("mimo",      "mimo_api_key",      const.MIMO_V2_5_PRO),
+    ]
+
+    @classmethod
+    def _predict_vision_auto(cls, local_config: dict) -> dict:
+        """Predict which provider vision.py will actually dispatch to when
+        no tools.vision.model is set. Mirrors the fallback order in
+        agent/tools/vision/vision.py::_resolve_providers so the UI hint
+        matches reality."""
+        chat = cls._chat_capability(local_config)
+        main_provider = chat["current_provider"]
+        main_model = chat["current_model"]
+        use_linkai_flag = bool(local_config.get("use_linkai", False))
+        linkai_configured = cls._is_real_key(local_config.get("linkai_api_key", ""))
+
+        def _try(pid: str, model_default: str):
+            # Look up the api_key for this provider via the canonical
+            # provider table so we don't hardcode field names here.
+            meta = ConfigHandler.PROVIDER_MODELS.get(pid) or {}
+            key_field = meta.get("api_key_field")
+            if not key_field:
+                return None
+            if not cls._is_real_key(local_config.get(key_field, "")):
+                return None
+            # Pick a model that the vision runtime can actually dispatch to
+            # for this provider. Using `main_model` here is unsafe — for
+            # vendors like Zhipu/MiniMax the bot hard-codes the vision model
+            # name regardless of the chat-model name, so surfacing the chat
+            # model name in the hint is misleading. Trust the curated
+            # _VISION_PROVIDER_MODELS list: prefer the main model only if
+            # it appears there; otherwise show the vendor's first vision-
+            # capable model.
+            allowed = cls._VISION_PROVIDER_MODELS.get(pid, [])
+            if pid == main_provider and main_model and main_model in allowed:
+                return {"provider": pid, "model": main_model}
+            fallback = allowed[0] if allowed else model_default
+            return {"provider": pid, "model": fallback}
+
+        # 1. use_linkai → suppress the hint entirely. LinkAI is a proxy and
+        #    we don't observe which underlying model it picks; surfacing
+        #    "LinkAI" with no model would not tell the user anything useful.
+        if use_linkai_flag and linkai_configured:
+            return {"provider": "", "model": ""}
+
+        # 2. Main bot — only when it natively supports vision. We approximate
+        #    "natively supports" by membership in _VISION_PROVIDER_MODELS,
+        #    which is the same set vision.py's _DISCOVERABLE_MODELS covers
+        #    (minus the chat-only DeepSeek family).
+        if main_provider in cls._VISION_PROVIDER_MODELS:
+            hit = _try(main_provider, main_model)
+            if hit:
+                return hit
+
+        # 3. Other discoverable providers in declared order
+        for pid, _key, default_model in cls._VISION_AUTO_ORDER:
+            hit = _try(pid, default_model)
+            if hit:
+                return hit
+
+        # 4. OpenAI raw HTTP
+        if cls._is_real_key(local_config.get("open_ai_api_key", "")):
+            return {"provider": "openai", "model": const.GPT_55}
+
+        # 5. LinkAI as last resort (only reached when use_linkai is off)
+        if linkai_configured:
+            return {"provider": "linkai", "model": const.GPT_41_MINI}
+
+        return {"provider": "", "model": ""}
+
+    @classmethod
+    def _vision_capability(cls, local_config: dict) -> dict:
+        """Vision model. tools.vision.model is the explicit override; otherwise
+        the runtime fallback chain in agent/tools/vision/vision.py decides."""
+        tools_conf = local_config.get("tools") or local_config.get("tool") or {}
+        if not isinstance(tools_conf, dict):
+            tools_conf = {}
+        vision_conf = tools_conf.get("vision") or {}
+        if not isinstance(vision_conf, dict):
+            vision_conf = {}
+        user_specified = (vision_conf.get("model") or "").strip()
+        explicit_provider = (vision_conf.get("provider") or "").strip()
+
+        # Provider resolution priority:
+        #   1. Explicit `tools.vision.provider` (persisted via UI; supports
+        #      custom model names that prefix-inference can't recognize).
+        #   2. Scan per-provider model lists by model name.
+        # Empty provider keeps the dropdown on "auto" when we can't tell.
+        inferred_provider = ""
+        if explicit_provider and explicit_provider in cls._VISION_PROVIDER_MODELS:
+            inferred_provider = explicit_provider
+        elif user_specified:
+            for pid, models in cls._VISION_PROVIDER_MODELS.items():
+                if user_specified in models:
+                    inferred_provider = pid
+                    break
+
+        # In auto mode the hint should reflect what vision.py will actually
+        # dispatch to — surface that prediction via fallback_* so the UI
+        # shows e.g. "openai / gpt-4.1-mini" instead of the chat-model name.
+        predicted = cls._predict_vision_auto(local_config)
+
+        return {
+            "editable": True,
+            "strategy": "specified" if user_specified else "auto",
+            "user_specified_model": user_specified,
+            "current_provider": inferred_provider,
+            "current_model": user_specified,
+            "fallback_provider": predicted["provider"],
+            "fallback_model": predicted["model"],
+            "providers": list(cls._VISION_PROVIDER_MODELS.keys()),
+            "provider_models": cls._VISION_PROVIDER_MODELS,
+        }
+
+    @classmethod
+    def _asr_capability(cls, local_config: dict) -> dict:
+        # "Pick or empty" — when voice_to_text is unset we don't show a
+        # current selection. `suggested_provider` previews which vendor
+        # the bridge auto-picker would land on (purely a UX hint, NOT
+        # persisted). Once the user saves a vendor, we lock onto it.
+        explicit = (local_config.get("voice_to_text") or "").strip().lower()
+        suggested = ""
+        if not explicit:
+            for pid in cls._ASR_PROVIDERS:
+                meta = ConfigHandler.PROVIDER_MODELS.get(pid) or {}
+                key_field = meta.get("api_key_field")
+                if key_field and cls._is_real_key(local_config.get(key_field, "")):
+                    suggested = pid
+                    break
+        return {
+            "editable": True,
+            "current_provider": explicit,
+            "suggested_provider": suggested,
+            "current_model": "",
+            "providers": cls._ASR_PROVIDERS,
+        }
+
+    @classmethod
+    def _tts_capability(cls, local_config: dict) -> dict:
+        explicit = (local_config.get("text_to_voice") or "").strip().lower()
+        # Providers outside the white-list don't drive the picker, but their
+        # underlying runtime config is preserved so bridge still routes them.
+        ui_provider = explicit if explicit in cls._TTS_PROVIDERS else ""
+        suggested = ""
+        if not ui_provider:
+            for pid in cls._TTS_PROVIDERS:
+                meta = ConfigHandler.PROVIDER_MODELS.get(pid) or {}
+                key_field = meta.get("api_key_field")
+                if key_field and cls._is_real_key(local_config.get(key_field, "")):
+                    suggested = pid
+                    break
+        return {
+            "editable": True,
+            "current_provider": ui_provider,
+            "suggested_provider": suggested,
+            "current_model": (local_config.get("text_to_voice_model") or "") if ui_provider else "",
+            "current_voice": (local_config.get("tts_voice_id") or "") if ui_provider else "",
+            "providers": cls._TTS_PROVIDERS,
+            "provider_models": cls._TTS_PROVIDER_MODELS,
+            "provider_voices": cls._TTS_PROVIDER_VOICES,
+            "reply_mode": cls._tts_reply_mode(local_config),
+        }
+
+    @staticmethod
+    def _tts_reply_mode(local_config: dict) -> str:
+        if local_config.get("always_reply_voice", False):
+            return "always"
+        if local_config.get("voice_reply_voice", False):
+            return "voice_if_voice"
+        return "off"
+
+    @classmethod
+    def _embedding_capability(cls, local_config: dict) -> dict:
+        # Embedding is "pick or empty" — runtime's legacy openai/linkai
+        # fallback is a safety net, not a UX-visible auto mode.
+        # `suggested_provider` is a UI-only hint (NOT persisted) that
+        # preselects the dropdown to whichever configured vendor we'd
+        # recommend, so users don't have to expand the menu to find it.
+        explicit = (local_config.get("embedding_provider") or "").strip().lower()
+        suggested = ""
+        if not explicit:
+            for pid in cls._EMBEDDING_PROVIDERS:
+                meta = ConfigHandler.PROVIDER_MODELS.get(pid) or {}
+                key_field = meta.get("api_key_field")
+                if key_field and cls._is_real_key(local_config.get(key_field, "")):
+                    suggested = pid
+                    break
+        return {
+            "editable": True,
+            "current_provider": explicit,
+            "suggested_provider": suggested,
+            "current_model": local_config.get("embedding_model", "") or "",
+            "current_dim": int(local_config.get("embedding_dimensions") or 0) or None,
+            "providers": cls._EMBEDDING_PROVIDERS,
+        }
+
+    # Auto-fallback order for image generation. Mirrors the global priority
+    # used inside skills/image-generation/scripts/generate.py
+    # (`_DEFAULT_PROVIDER_ORDER`): OpenAI → Gemini → Seedream(Ark/doubao) →
+    # Qwen(dashscope) → MiniMax → LinkAI. Each entry maps the
+    # provider-card id to the script's per-provider DEFAULT_MODEL so the
+    # hint matches what the runtime would actually request.
+    _IMAGE_AUTO_ORDER = [
+        ("openai",    "gpt-image-2"),
+        ("gemini",    "gemini-3.1-flash-image-preview"),  # nano-banana-2
+        ("doubao",    "seedream-5.0-lite"),
+        ("dashscope", "qwen-image-2.0"),
+        ("minimax",   "image-01"),
+        ("linkai",    "gpt-image-2"),
+    ]
+
+    @classmethod
+    def _predict_image_auto(cls, local_config: dict) -> dict:
+        """Predict which provider/model the image-generation skill will hit
+        when no SKILL_IMAGE_GENERATION_MODEL override is set. Mirrors
+        skills/image-generation/scripts/generate.py::_build_providers so
+        the UI hint matches reality. Chat-only providers (DeepSeek etc.)
+        are absent by design — image generation never falls back to a chat
+        bot regardless of the main model.
+
+        When use_linkai is enabled the hint is suppressed entirely — LinkAI
+        proxies to whichever backend it deems appropriate and surfacing
+        "LinkAI" alone tells the user nothing actionable."""
+        use_linkai_flag = bool(local_config.get("use_linkai", False))
+        linkai_configured = cls._is_real_key(local_config.get("linkai_api_key", ""))
+        if use_linkai_flag and linkai_configured:
+            return {"provider": "", "model": ""}
+
+        for pid, default_model in cls._IMAGE_AUTO_ORDER:
+            meta = ConfigHandler.PROVIDER_MODELS.get(pid) or {}
+            key_field = meta.get("api_key_field")
+            if not key_field:
+                continue
+            if cls._is_real_key(local_config.get(key_field, "")):
+                return {"provider": pid, "model": default_model}
+        return {"provider": "", "model": ""}
+
+    @classmethod
+    def _image_capability(cls, local_config: dict) -> dict:
+        """Image generation. Source of truth: config["skills"]["image-generation"]["model"]
+        (mirrors the per-skill config schema documented in skills/image-generation).
+        The runtime resolver in skills/image-generation/scripts/generate.py
+        reads this via the SKILL_IMAGE_GENERATION_MODEL env var that the
+        agent_initializer syncs at startup; provider is inferred from the
+        model name prefix, mirroring vision.py's design.
+
+        ``skill`` (singular) is still tolerated as a legacy fallback —
+        config.load_config() folds it into ``skills`` at startup.
+        """
+        skills_node = local_config.get("skills") or local_config.get("skill") or {}
+        if not isinstance(skills_node, dict):
+            skills_node = {}
+        img_node = skills_node.get("image-generation") or {}
+        if not isinstance(img_node, dict):
+            img_node = {}
+        explicit_model = (img_node.get("model") or "").strip()
+        explicit_provider = (img_node.get("provider") or "").strip()
+
+        # Provider resolution priority:
+        #   1. Explicit `skills.image-generation.provider` (persisted via UI;
+        #      supports custom model names that prefix-inference can't catch).
+        #   2. Scan per-provider model catalog by model name.
+        # Empty provider keeps the dropdown on "auto" when we can't tell.
+        inferred_provider = ""
+        if explicit_provider and explicit_provider in cls._IMAGE_PROVIDER_MODELS:
+            inferred_provider = explicit_provider
+        elif explicit_model:
+            for pid, models in cls._IMAGE_PROVIDER_MODELS.items():
+                for entry in models:
+                    val = entry if isinstance(entry, str) else (entry.get("value") or "")
+                    if val == explicit_model:
+                        inferred_provider = pid
+                        break
+                if inferred_provider:
+                    break
+
+        # In auto mode the hint should reflect what generate.py will actually
+        # dispatch to — surface that prediction via fallback_* so the UI
+        # never claims a chat-only bot (e.g. minimax/MiniMax-M2.7) "would
+        # generate the image", which is impossible.
+        predicted = cls._predict_image_auto(local_config)
+
+        return {
+            "editable": True,
+            "strategy": "specified" if explicit_model else "auto",
+            "current_provider": inferred_provider,
+            "current_model": explicit_model,
+            "fallback_provider": predicted["provider"],
+            "fallback_model": predicted["model"],
+            "providers": list(cls._IMAGE_PROVIDER_MODELS.keys()),
+            "provider_models": cls._IMAGE_PROVIDER_MODELS,
+            # The dispatcher that honors a pinned provider isn't wired up
+            # yet; advertise this so the UI can show a "saved but not active"
+            # banner until the runtime catches up.
+            "runtime_active": False,
+            "note": "router_pending",
+        }
+
+    # Canonical search provider order. Mirrors PROVIDER_ORDER in
+    # agent/tools/web_search/web_search.py — keep them in sync.
+    _SEARCH_PROVIDERS = ("bocha", "qianfan", "zhipu", "linkai")
+
+    _SEARCH_PROVIDER_LABELS = {
+        "bocha":   {"zh": "博查", "en": "Bocha"},
+        "zhipu":   {"zh": "智谱", "en": "GLM"},
+        "qianfan": {"zh": "百度千帆", "en": "ERNIE"},
+        "linkai":  {"zh": "LinkAI", "en": "LinkAI"},
+    }
+
+    @classmethod
+    def _search_provider_key(cls, provider: str, local_config: dict) -> str:
+        """Resolve the (raw) key for a given search provider."""
+        if provider == "bocha":
+            tools_cfg = local_config.get("tools") or {}
+            block = (tools_cfg.get("web_search") or {}) if isinstance(tools_cfg, dict) else {}
+            return (block.get("bocha_api_key") if isinstance(block, dict) else "") or os.environ.get("BOCHA_API_KEY", "")
+        if provider == "zhipu":
+            return local_config.get("zhipu_ai_api_key") or os.environ.get("ZHIPUAI_API_KEY", "")
+        if provider == "qianfan":
+            return local_config.get("qianfan_api_key") or os.environ.get("QIANFAN_API_KEY", "")
+        if provider == "linkai":
+            return local_config.get("linkai_api_key") or os.environ.get("LINKAI_API_KEY", "")
+        return ""
+
+    @classmethod
+    def _search_capability(cls, local_config: dict) -> dict:
+        """Search is editable: pick auto (default) or pin a specific backend.
+        Providers reuse model-vendor keys (zhipu/qianfan/linkai) so they show
+        up as configured once the user adds those vendors; bocha keeps its
+        own key under tools.web_search."""
+        tools_cfg = local_config.get("tools") or {}
+        ws_cfg = (tools_cfg.get("web_search") or {}) if isinstance(tools_cfg, dict) else {}
+        if not isinstance(ws_cfg, dict):
+            ws_cfg = {}
+
+        providers = []
+        configured_ids = []
+        for pid in cls._SEARCH_PROVIDERS:
+            raw_key = cls._search_provider_key(pid, local_config)
+            ok = cls._is_real_key(raw_key)
+            providers.append({
+                "id": pid,
+                "label": cls._SEARCH_PROVIDER_LABELS.get(pid, pid),
+                "configured": ok,
+                # bocha owns its key under tools.web_search; the other three
+                # piggy-back on a model-vendor credential. Frontend uses
+                # this hint to decide which credential editor to surface.
+                "needs_dedicated_key": pid == "bocha",
+                "api_key_masked": ConfigHandler._mask_key(raw_key) if ok else "",
+            })
+            if ok:
+                configured_ids.append(pid)
+
+        strategy = (ws_cfg.get("strategy") or "auto").strip().lower()
+        if strategy not in ("auto", "fixed"):
+            strategy = "auto"
+        fixed_provider = (ws_cfg.get("provider") or "").strip().lower()
+        if fixed_provider and fixed_provider not in configured_ids:
+            fixed_provider = ""
+
+        # current_provider drives the chip in the header — show the actually
+        # active backend (pinned or first auto-picked).
+        if strategy == "fixed" and fixed_provider:
+            current = fixed_provider
+        else:
+            current = configured_ids[0] if configured_ids else ""
+
+        return {
+            "editable": True,
+            "strategy": strategy,
+            "providers": providers,
+            "configured_providers": configured_ids,
+            "current_provider": current,
+            "fixed_provider": fixed_provider,
+            "available": bool(current),
+        }
+
+    @classmethod
+    def _capabilities(cls, local_config: dict) -> dict:
+        return {
+            "chat":      cls._chat_capability(local_config),
+            "vision":    cls._vision_capability(local_config),
+            "asr":       cls._asr_capability(local_config),
+            "tts":       cls._tts_capability(local_config),
+            "embedding": cls._embedding_capability(local_config),
+            "image":     cls._image_capability(local_config),
+            "search":    cls._search_capability(local_config),
+        }
+
+    def GET(self):
+        _require_auth()
+        web.header("Content-Type", "application/json; charset=utf-8")
+        try:
+            local_config = conf()
+            return json.dumps({
+                "status": "success",
+                "providers": self._provider_overview(),
+                "capabilities": self._capabilities(local_config),
+            }, ensure_ascii=False)
+        except Exception as e:
+            logger.error(f"[ModelsHandler] GET failed: {e}")
+            return json.dumps({"status": "error", "message": str(e)})
+
+    def POST(self):
+        _require_auth()
+        web.header("Content-Type", "application/json; charset=utf-8")
+        try:
+            data = json.loads(web.data() or b"{}")
+            action = data.get("action") or ""
+            if action == "set_provider":
+                return self._handle_set_provider(data)
+            if action == "delete_provider":
+                return self._handle_delete_provider(data)
+            if action == "set_capability":
+                return self._handle_set_capability(data)
+            if action == "set_voice_reply_mode":
+                return self._handle_set_voice_reply_mode(data)
+            if action == "set_search_credential":
+                return self._handle_set_search_credential(data)
+            return json.dumps({"status": "error", "message": f"unknown action: {action!r}"})
+        except Exception as e:
+            logger.error(f"[ModelsHandler] POST failed: {e}")
+            return json.dumps({"status": "error", "message": str(e)})
+
+    def _handle_set_provider(self, data: dict) -> str:
+        provider_id = (data.get("provider_id") or "").strip()
+        meta = ConfigHandler.PROVIDER_MODELS.get(provider_id)
+        if not meta:
+            return json.dumps({"status": "error", "message": f"unknown provider: {provider_id}"})
+
+        # api_key absent / empty / null => leave the existing key untouched
+        # (used by the "edit only base url" flow). To clear the key, callers
+        # must use action=delete_provider explicitly.
+        api_key_raw = data.get("api_key")
+        api_key = api_key_raw.strip() if isinstance(api_key_raw, str) else ""
+
+        # api_base presence is significant: an explicit "" means "reset to
+        # default", whereas a missing key means "no change".
+        api_base_present = "api_base" in data
+        api_base = (data.get("api_base") or "").strip() if api_base_present else None
+
+        applied = {}
+        local_config = conf()
+        file_cfg = self._read_file_config()
+
+        key_field = meta.get("api_key_field")
+        if key_field and api_key:
+            local_config[key_field] = api_key
+            file_cfg[key_field] = api_key
+            applied[key_field] = True
+        base_field = meta.get("api_base_key")
+        if base_field and api_base_present:
+            local_config[base_field] = api_base
+            file_cfg[base_field] = api_base
+            applied[base_field] = True
+
+        if not applied:
+            # Nothing actually changed (e.g. user opened the modal and hit
+            # save without editing). Treat as a successful no-op so the
+            # frontend can show "Saved" instead of surfacing an error.
+            return json.dumps({"status": "success", "provider": provider_id, "noop": True})
+
+        self._write_file_config(file_cfg)
+        logger.info(f"[ModelsHandler] provider {provider_id} updated: {sorted(applied.keys())}")
+
+        # Vendor credentials affect bot routing for any capability that uses
+        # them; safest to reset Bridge so the next request rebuilds bots.
+        self._reset_bridge()
+        return json.dumps({"status": "success", "provider": provider_id})
+
+    def _handle_delete_provider(self, data: dict) -> str:
+        provider_id = (data.get("provider_id") or "").strip()
+        meta = ConfigHandler.PROVIDER_MODELS.get(provider_id)
+        if not meta:
+            return json.dumps({"status": "error", "message": f"unknown provider: {provider_id}"})
+
+        local_config = conf()
+        file_cfg = self._read_file_config()
+
+        cleared = []
+        for field_name in (meta.get("api_key_field"), meta.get("api_base_key")):
+            if not field_name:
+                continue
+            # Always write the key — even if it was absent before — so the
+            # in-memory conf() reflects the cleared state without needing a
+            # restart. (`in local_config` was too strict: provider keys that
+            # were ever set then deleted manually wouldn't get reset.)
+            local_config[field_name] = ""
+            file_cfg[field_name] = ""
+            cleared.append(field_name)
+
+        self._write_file_config(file_cfg)
+        logger.info(f"[ModelsHandler] provider {provider_id} cleared: {cleared}")
+        self._reset_bridge()
+        return json.dumps({"status": "success", "provider": provider_id, "cleared": cleared})
+
+    def _handle_set_capability(self, data: dict) -> str:
+        capability = (data.get("capability") or "").strip()
+        provider_id = (data.get("provider_id") or "").strip()
+        model = (data.get("model") or "").strip()
+
+        if capability == "chat":
+            return self._set_chat(provider_id, model)
+        if capability == "vision":
+            return self._set_vision(provider_id, model)
+        if capability == "asr":
+            return self._set_simple("voice_to_text", provider_id)
+        if capability == "tts":
+            return self._set_tts(provider_id, model, (data.get("voice") or "").strip())
+        if capability == "embedding":
+            return self._set_embedding(provider_id, model)
+        if capability == "image":
+            return self._set_image(provider_id, model)
+        if capability == "search":
+            return self._set_search(
+                (data.get("strategy") or "").strip().lower(),
+                (data.get("provider") or "").strip().lower(),
+            )
+        return json.dumps({"status": "error", "message": f"capability not editable: {capability}"})
+
+    def _set_image(self, provider_id: str, model: str) -> str:
+        # Source of truth: skills.image-generation.{provider, model}. The
+        # provider field is persisted so users picking a custom model under
+        # a specific vendor still get routed there — runtime falls back to
+        # model-name prefix inference only when provider is empty.
+        local_config = conf()
+        file_cfg = self._read_file_config()
+
+        self._set_nested_namespace_value(local_config, "skills", "image-generation", "model", model or "")
+        self._set_nested_namespace_value(file_cfg, "skills", "image-generation", "model", model or "")
+        self._set_nested_namespace_value(local_config, "skills", "image-generation", "provider", provider_id or "")
+        self._set_nested_namespace_value(file_cfg, "skills", "image-generation", "provider", provider_id or "")
+        self._drop_legacy_namespace(local_config, "skill", "skills", child="image-generation")
+        self._drop_legacy_namespace(file_cfg, "skill", "skills", child="image-generation")
+
+        self._write_file_config(file_cfg)
+
+        # The skill subprocess reads SKILL_IMAGE_GENERATION_{MODEL,PROVIDER}
+        # from env at startup; mirror the change so live edits apply without
+        # restart.
+        model_env = "SKILL_IMAGE_GENERATION_MODEL"
+        provider_env = "SKILL_IMAGE_GENERATION_PROVIDER"
+        if model:
+            os.environ[model_env] = model
+        else:
+            os.environ.pop(model_env, None)
+        if provider_id:
+            os.environ[provider_env] = provider_id
+        else:
+            os.environ.pop(provider_env, None)
+
+        logger.info(f"[ModelsHandler] image updated: provider={provider_id!r} model={model!r}")
+        return json.dumps({
+            "status": "success",
+            "provider": provider_id,
+            "model": model,
+            "router_pending": True,
+        })
+
+    def _set_chat(self, provider_id: str, model: str) -> str:
+        if provider_id and provider_id not in ConfigHandler.PROVIDER_MODELS:
+            return json.dumps({"status": "error", "message": f"unknown provider: {provider_id}"})
+
+        applied = {}
+        local_config = conf()
+        file_cfg = self._read_file_config()
+
+        if provider_id:
+            bot_type_value = "chatGPT" if provider_id == "openai" else provider_id
+            local_config["bot_type"] = bot_type_value
+            file_cfg["bot_type"] = bot_type_value
+            applied["bot_type"] = bot_type_value
+            use_linkai = (provider_id == "linkai")
+            local_config["use_linkai"] = use_linkai
+            file_cfg["use_linkai"] = use_linkai
+            applied["use_linkai"] = use_linkai
+        if model:
+            local_config["model"] = model
+            file_cfg["model"] = model
+            applied["model"] = model
+
+        if not applied:
+            return json.dumps({"status": "success", "applied": {}, "noop": True})
+
+        self._write_file_config(file_cfg)
+        logger.info(f"[ModelsHandler] chat updated: {applied}")
+        self._reset_bridge()
+        return json.dumps({"status": "success", "applied": applied})
+
+    def _set_vision(self, provider_id: str, model: str) -> str:
+        # Source of truth: tools.vision.{provider, model}. The provider field
+        # is persisted so users picking a custom model under a specific vendor
+        # still get routed there — runtime falls back to model-name prefix
+        # inference only when provider is empty.
+        local_config = conf()
+        file_cfg = self._read_file_config()
+        self._set_nested_namespace_value(file_cfg, "tools", "vision", "model", model)
+        self._set_nested_namespace_value(local_config, "tools", "vision", "model", model)
+        self._set_nested_namespace_value(file_cfg, "tools", "vision", "provider", provider_id or "")
+        self._set_nested_namespace_value(local_config, "tools", "vision", "provider", provider_id or "")
+        self._drop_legacy_namespace(file_cfg, "tool", "tools", child="vision")
+        self._drop_legacy_namespace(local_config, "tool", "tools", child="vision")
+
+        self._write_file_config(file_cfg)
+        logger.info(f"[ModelsHandler] vision updated: provider={provider_id!r} model={model!r}")
+        return json.dumps({"status": "success", "provider": provider_id, "model": model})
+
+    @staticmethod
+    def _set_nested_namespace_value(cfg, top: str, name: str, key: str, value):
+        """Set ``cfg[top][name][key] = value``, creating missing dicts."""
+        bucket = cfg.get(top)
+        if not isinstance(bucket, dict):
+            bucket = {}
+        node = bucket.get(name)
+        if not isinstance(node, dict):
+            node = {}
+        node[key] = value
+        bucket[name] = node
+        cfg[top] = bucket
+
+    @staticmethod
+    def _drop_legacy_namespace(cfg, legacy: str, canonical: str, child: str) -> None:
+        """Strip the deprecated singular key so config.json stays single-source."""
+        legacy_section = cfg.get(legacy)
+        if not isinstance(legacy_section, dict):
+            return
+        legacy_section.pop(child, None)
+        if legacy_section:
+            cfg[legacy] = legacy_section
+        else:
+            cfg.pop(legacy, None)
+
+    def _handle_set_voice_reply_mode(self, data: dict) -> str:
+        # UI picker (off / voice_if_voice / always) maps to the legacy
+        # always_reply_voice + voice_reply_voice pair that chat_channel.py
+        # reads, so all channels (web/feishu/wecom/...) share the routing.
+        mode = (data.get("mode") or "").strip().lower()
+        if mode not in ("off", "voice_if_voice", "always"):
+            return json.dumps({"status": "error", "message": f"invalid mode: {mode!r}"})
+        always = (mode == "always")
+        if_voice = (mode == "voice_if_voice")
+        local_config = conf()
+        file_cfg = self._read_file_config()
+        local_config["always_reply_voice"] = always
+        local_config["voice_reply_voice"] = if_voice
+        file_cfg["always_reply_voice"] = always
+        file_cfg["voice_reply_voice"] = if_voice
+        self._write_file_config(file_cfg)
+        logger.info(
+            f"[ModelsHandler] voice reply mode set: {mode!r} "
+            f"(always_reply_voice={always}, voice_reply_voice={if_voice})"
+        )
+        return json.dumps({"status": "success", "mode": mode})
+
+    def _set_simple(self, key: str, value: str) -> str:
+        local_config = conf()
+        file_cfg = self._read_file_config()
+        local_config[key] = value
+        file_cfg[key] = value
+        self._write_file_config(file_cfg)
+        logger.info(f"[ModelsHandler] {key} set: {value!r}")
+        # Hot-swap the cached voice bot so the change takes effect immediately.
+        if key in ("voice_to_text", "text_to_voice"):
+            self._refresh_voice_routing()
+        return json.dumps({"status": "success", key: value})
+
+    def _set_tts(self, provider_id: str, model: str, voice: str = "") -> str:
+        local_config = conf()
+        file_cfg = self._read_file_config()
+        local_config["text_to_voice"] = provider_id
+        file_cfg["text_to_voice"] = provider_id
+        local_config["text_to_voice_model"] = model
+        file_cfg["text_to_voice_model"] = model
+        local_config["tts_voice_id"] = voice
+        file_cfg["tts_voice_id"] = voice
+        self._write_file_config(file_cfg)
+        logger.info(
+            f"[ModelsHandler] tts updated: provider={provider_id!r} "
+            f"model={model!r} voice={voice!r}"
+        )
+        self._refresh_voice_routing()
+        return json.dumps({
+            "status": "success",
+            "provider": provider_id, "model": model, "voice": voice,
+        })
+
+    @staticmethod
+    def _refresh_voice_routing() -> None:
+        try:
+            from bridge.bridge import Bridge
+            Bridge().refresh_voice()
+        except Exception as e:
+            logger.warning(f"[ModelsHandler] Bridge voice refresh failed: {e}")
+
+    def _set_embedding(self, provider_id: str, model: str) -> str:
+        # Two valid states: both empty (embedding reset to unconfigured) or
+        # both set. A provider without a model leaves the runtime in a broken
+        # half-state, so reject that explicitly instead of silently writing it.
+        if provider_id and not model:
+            return json.dumps({
+                "status": "error",
+                "message": "embedding model is required when a provider is selected",
+            })
+        local_config = conf()
+        file_cfg = self._read_file_config()
+        local_config["embedding_provider"] = provider_id
+        file_cfg["embedding_provider"] = provider_id
+        local_config["embedding_model"] = model
+        file_cfg["embedding_model"] = model
+        self._write_file_config(file_cfg)
+        logger.info(f"[ModelsHandler] embedding updated: provider={provider_id!r} model={model!r}")
+        # The next /memory rebuild-index command hot-swaps the provider onto
+        # the running MemoryManager (see plugins/cow_cli). The dim may have
+        # changed, so the frontend prompts the user to rebuild.
+        return json.dumps({"status": "success", "provider": provider_id, "model": model})
+
+    def _set_search(self, strategy: str, provider: str) -> str:
+        """Persist search routing under tools.web_search.{strategy,provider}.
+
+        strategy 'auto'  -> provider field is cleared (auto picks at call time)
+        strategy 'fixed' -> provider must be in the canonical list; runtime
+                            silently falls back to auto if its key is missing.
+        """
+        if strategy not in ("auto", "fixed"):
+            return json.dumps({"status": "error", "message": f"invalid strategy: {strategy!r}"})
+        if strategy == "fixed":
+            if provider not in self._SEARCH_PROVIDERS:
+                return json.dumps({"status": "error", "message": f"unknown provider: {provider!r}"})
+        else:
+            provider = ""
+
+        local_config = conf()
+        file_cfg = self._read_file_config()
+        self._set_nested_namespace_value(local_config, "tools", "web_search", "strategy", strategy)
+        self._set_nested_namespace_value(file_cfg,     "tools", "web_search", "strategy", strategy)
+        self._set_nested_namespace_value(local_config, "tools", "web_search", "provider", provider)
+        self._set_nested_namespace_value(file_cfg,     "tools", "web_search", "provider", provider)
+        self._write_file_config(file_cfg)
+        logger.info(f"[ModelsHandler] search updated: strategy={strategy!r} provider={provider!r}")
+        return json.dumps({"status": "success", "strategy": strategy, "provider": provider})
+
+    def _handle_set_search_credential(self, data: dict) -> str:
+        """Persist the bocha API key under tools.web_search.bocha_api_key.
+
+        The other three providers (zhipu/qianfan/linkai) reuse model-vendor
+        credentials, so they go through set_provider with the standard
+        model-vendor flow.
+        """
+        api_key = (data.get("api_key") or "").strip() if isinstance(data.get("api_key"), str) else ""
+        local_config = conf()
+        file_cfg = self._read_file_config()
+        self._set_nested_namespace_value(local_config, "tools", "web_search", "bocha_api_key", api_key)
+        self._set_nested_namespace_value(file_cfg,     "tools", "web_search", "bocha_api_key", api_key)
+        self._write_file_config(file_cfg)
+        logger.info(f"[ModelsHandler] search credential set: bocha_api_key={'***' if api_key else ''}")
+        return json.dumps({"status": "success", "provider": "bocha"})
+
+    @staticmethod
+    def _reset_bridge() -> None:
+        try:
+            from bridge.bridge import Bridge
+            Bridge().reset_bot()
+            logger.info("[ModelsHandler] Bridge bot routing reset")
+        except Exception as e:
+            logger.warning(f"[ModelsHandler] Bridge reset failed: {e}")
+
+
 class ChannelsHandler:
     """API for managing external channel configurations (feishu, dingtalk, etc)."""
 
@@ -1296,6 +2935,23 @@ class ChannelsHandler:
                 {"key": "wechatmp_port", "label": "Port", "type": "number", "default": 8080},
             ],
         }),
+        ("telegram", {
+            "label": {"zh": "Telegram", "en": "Telegram"},
+            "icon": "fa-paper-plane",
+            "color": "sky",
+            "fields": [
+                {"key": "telegram_token", "label": "Bot Token", "type": "secret"},
+            ],
+        }),
+        ("slack", {
+            "label": {"zh": "Slack", "en": "Slack"},
+            "icon": "fa-hashtag",
+            "color": "purple",
+            "fields": [
+                {"key": "slack_bot_token", "label": "Bot Token (xoxb-)", "type": "secret"},
+                {"key": "slack_app_token", "label": "App Token (xapp-)", "type": "secret"},
+            ],
+        }),
     ])
 
     @staticmethod
@@ -2255,7 +3911,12 @@ class AssetsHandler:
                 raise web.notfound()
 
             if not os.path.exists(full_path) or not os.path.isfile(full_path):
-                logger.error(f"File not found: {full_path}")
+                # Browsers routinely probe optional asset variants (e.g. a
+                # .ttf fallback declared alongside .woff2 in @font-face);
+                # logging these as errors floods the console with harmless
+                # noise. Keep it at debug level — real misconfigurations
+                # will still surface via the network panel.
+                logger.debug(f"Static file not found: {full_path}")
                 raise web.notfound()
 
             # Set the correct Content-Type
@@ -2270,8 +3931,12 @@ class AssetsHandler:
             with open(full_path, 'rb') as f:
                 return f.read()
 
+        except web.HTTPError:
+            # The 404 path above already logged at debug; re-raise as-is so
+            # web.py returns the original status to the client.
+            raise
         except Exception as e:
-            logger.error(f"Error serving static file: {e}", exc_info=True)  # 添加更详细的错误信息
+            logger.error(f"Error serving static file: {e}", exc_info=True)
             raise web.notfound()
 
 
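Review note on the `_set_nested_namespace_value` / `_drop_legacy_namespace` pair added in this file: the snippet below is a minimal standalone sketch of the same idea, not the handler code itself. The function names `set_nested` and `drop_legacy` and the sample config are hypothetical. It shows that a non-dict value at any level is replaced rather than raising, and that the deprecated singular section disappears once its last child is removed.

```python
def set_nested(cfg: dict, top: str, name: str, key: str, value) -> None:
    """Set cfg[top][name][key] = value, replacing any non-dict node on the way."""
    bucket = cfg.get(top)
    if not isinstance(bucket, dict):
        bucket = {}
    node = bucket.get(name)
    if not isinstance(node, dict):
        node = {}
    node[key] = value
    bucket[name] = node
    cfg[top] = bucket


def drop_legacy(cfg: dict, legacy: str, child: str) -> None:
    """Remove cfg[legacy][child]; drop cfg[legacy] itself once it is empty."""
    section = cfg.get(legacy)
    if not isinstance(section, dict):
        return
    section.pop(child, None)
    if not section:
        cfg.pop(legacy, None)


# Hypothetical config: a legacy singular "tool" section plus a corrupt
# (non-dict) "tools" value that has to be replaced.
cfg = {"tool": {"vision": {"model": "old-model"}}, "tools": "corrupt"}
set_nested(cfg, "tools", "vision", "model", "new-model")
set_nested(cfg, "tools", "vision", "provider", "")
drop_legacy(cfg, "tool", "vision")
print(cfg)
```

Writing the empty string for `provider` rather than omitting the key mirrors the `provider_id or ""` calls in the diff, so a reader of the config can tell "explicitly unset" from "never written".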
diff --git a/channel/wechatmp/passive_reply.py b/channel/wechatmp/passive_reply.py
index d03efc4d..85b3b402 100644
--- a/channel/wechatmp/passive_reply.py
+++ b/channel/wechatmp/passive_reply.py
@@ -103,14 +103,21 @@ class Query:
                 task_running = True
                 waiting_until = request_time + 4
                 while time.time() < waiting_until:
-                    if from_user in channel.running:
-                        time.sleep(0.1)
-                    else:
+                    if from_user not in channel.running:
                         task_running = False
                         break
+                    # Task still running, but if it has already produced cached
+                    # segments (e.g. multi-turn thinking output), return them now
+                    # instead of forcing the user to wait for the whole task. The
+                    # remaining segments are fetched by the user's next message.
+                    if channel.cache_dict.get(from_user):
+                        break
+                    time.sleep(0.1)
 
                 reply_text = ""
-                if task_running:
+                # Only fall back to retry / "thinking" hint when the task is still
+                # running AND there is nothing cached to send yet.
+                if task_running and not channel.cache_dict.get(from_user):
                     if request_cnt < 3:
                         # waiting for timeout (the POST request will be closed by Wechat official server)
                         time.sleep(2)
@@ -131,8 +138,22 @@ class Query:
 
                 # Only one request can access the cached data
                 try:
-                    (reply_type, reply_content) = channel.cache_dict[from_user].pop(0)
-                    if not channel.cache_dict[from_user]:  # If popping the message makes the list empty, delete the user entry from cache
+                    # WeChat passive reply allows only a single reply per request.
+                    # To avoid forcing the user to send an extra message for every
+                    # segment of multi-turn agent output, drain all consecutive
+                    # cached text segments at once and merge them into one reply.
+                    # Media (voice/image) can only be returned one at a time, so it
+                    # stops the merge and is returned on its own.
+                    cached = channel.cache_dict[from_user]
+                    if cached[0][0] == "text":
+                        reply_type = "text"
+                        merged_parts = []
+                        while cached and cached[0][0] == "text":
+                            merged_parts.append(cached.pop(0)[1])
+                        reply_content = "\n\n".join(merged_parts)
+                    else:
+                        (reply_type, reply_content) = cached.pop(0)
+                    if not channel.cache_dict[from_user]:  # If draining empties the list, delete the user entry from cache
                         del channel.cache_dict[from_user]
                 except IndexError:
                     return "success"
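Review note on the merge logic in this hunk: the sketch below isolates the "drain consecutive text segments" step so it can be read and tested on its own. `drain_next_reply` is a hypothetical helper name, and the cache is assumed to be a plain list of `(reply_type, reply_content)` tuples, as the indexing in the diff suggests.

```python
def drain_next_reply(cached: list):
    """Pop one reply from the front of `cached`.

    A leading run of text segments is merged into a single text reply;
    any other type (voice, image, ...) is returned alone.
    Returns None when the cache is empty.
    """
    if not cached:
        return None
    if cached[0][0] == "text":
        parts = []
        while cached and cached[0][0] == "text":
            parts.append(cached.pop(0)[1])
        return ("text", "\n\n".join(parts))
    return cached.pop(0)


queue = [
    ("text", "step 1"),
    ("text", "step 2"),
    ("image", "media-id"),
    ("text", "done"),
]
first = drain_next_reply(queue)   # merged text run
second = drain_next_reply(queue)  # media goes out on its own
third = drain_next_reply(queue)   # trailing text after the media item
print(first, second, third)
```

Note that the text segment after the image is not merged with the earlier ones: the merge only covers a contiguous run at the head of the queue, which keeps the original ordering of text and media intact.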
diff --git a/channel/wechatmp/wechatmp_channel.py b/channel/wechatmp/wechatmp_channel.py
index c066f286..dc0ffb26 100644
--- a/channel/wechatmp/wechatmp_channel.py
+++ b/channel/wechatmp/wechatmp_channel.py
@@ -134,10 +134,16 @@ class WechatMPChannel(ChatChannel):
 
             elif reply.type == ReplyType.IMAGE_URL:  # Download the image from the network
                 img_url = reply.content
-                pic_res = requests.get(img_url, stream=True)
                 image_storage = io.BytesIO()
-                for block in pic_res.iter_content(1024):
-                    image_storage.write(block)
+                if img_url.startswith("file://") or os.path.isfile(img_url):
+                    # Local file produced by the agent (e.g. a generated image)
+                    local_path = img_url[len("file://"):] if img_url.startswith("file://") else img_url
+                    with open(local_path, "rb") as f:
+                        image_storage.write(f.read())
+                else:
+                    pic_res = requests.get(img_url, stream=True)
+                    for block in pic_res.iter_content(1024):
+                        image_storage.write(block)
                 image_storage.seek(0)
                 image_type = imghdr.what(image_storage)
                 filename = receiver + "-" + str(context["msg"].msg_id) + "." + image_type
@@ -258,10 +264,16 @@ class WechatMPChannel(ChatChannel):
                 logger.info("[wechatmp] Do send voice to {}".format(receiver))
             elif reply.type == ReplyType.IMAGE_URL:  # Download the image from the network
                 img_url = reply.content
-                pic_res = requests.get(img_url, stream=True)
                 image_storage = io.BytesIO()
-                for block in pic_res.iter_content(1024):
-                    image_storage.write(block)
+                if img_url.startswith("file://") or os.path.isfile(img_url):
+                    # Local file produced by the agent (e.g. a generated image)
+                    local_path = img_url[len("file://"):] if img_url.startswith("file://") else img_url
+                    with open(local_path, "rb") as f:
+                        image_storage.write(f.read())
+                else:
+                    pic_res = requests.get(img_url, stream=True)
+                    for block in pic_res.iter_content(1024):
+                        image_storage.write(block)
                 image_storage.seek(0)
                 image_type = imghdr.what(image_storage)
                 filename = receiver + "-" + str(context["msg"].msg_id) + "." + image_type
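Review note: the same local-file-or-download branch is added twice in this file. One possible way to share it is sketched below. This is an assumption about how it could be factored, not code from the PR; `resolve_local_path`, `load_image`, and the injected `download` callable are hypothetical names. Injecting the downloader keeps the sketch runnable without network access; in the channel it would wrap the existing `requests.get(...).iter_content(...)` loop.

```python
import io
import os
import tempfile


def resolve_local_path(img_url: str):
    """Return a filesystem path if img_url points at a local file, else None."""
    if img_url.startswith("file://"):
        return img_url[len("file://"):]
    if os.path.isfile(img_url):
        return img_url
    return None


def load_image(img_url: str, download) -> io.BytesIO:
    """Read image bytes from a local file, or via download(url) -> bytes."""
    storage = io.BytesIO()
    local_path = resolve_local_path(img_url)
    if local_path is not None:
        with open(local_path, "rb") as f:
            storage.write(f.read())
    else:
        storage.write(download(img_url))
    storage.seek(0)
    return storage


# Demo with a temporary file standing in for an agent-generated image.
with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(b"fake-image-bytes")

local_copy = load_image("file://" + tmp.name, download=lambda url: b"")
plain_copy = load_image(tmp.name, download=lambda url: b"")
remote_copy = load_image(
    "https://example.invalid/a.png",
    download=lambda url: b"remote-bytes",  # stub, no network call
)
os.remove(tmp.name)
```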
diff --git a/channel/wecom_bot/wecom_bot_channel.py b/channel/wecom_bot/wecom_bot_channel.py
index 7aaca56b..ebc1104b 100644
--- a/channel/wecom_bot/wecom_bot_channel.py
+++ b/channel/wecom_bot/wecom_bot_channel.py
@@ -81,6 +81,8 @@ def _loads_wecom_ws_json(raw):
 @singleton
 class WecomBotChannel(ChatChannel):
 
+    NOT_SUPPORT_REPLYTYPE = []
+
     def __init__(self):
         super().__init__()
         self.bot_id = ""
@@ -438,6 +440,17 @@ class WecomBotChannel(ChatChannel):
                     state["current"] = ""
                 _push_stream(state, force=True)
 
+            elif event_type == "agent_cancelled":
+                # Flush partial output and strip trailing "---" separator
+                # left over from previous turn, to avoid a dangling divider.
+                if state["current"]:
+                    state["committed"] += state["current"]
+                    state["current"] = ""
+                state["committed"] = state["committed"].rstrip()
+                if state["committed"].endswith("---"):
+                    state["committed"] = state["committed"][:-3].rstrip()
+                _push_stream(state, force=True)
+
         return on_event
 
     # ------------------------------------------------------------------
@@ -472,6 +485,8 @@ class WecomBotChannel(ChatChannel):
             else:
                 context.type = ContextType.TEXT
             context.content = content.strip()
+            if "desire_rtype" not in context and conf().get("always_reply_voice"):
+                context["desire_rtype"] = ReplyType.VOICE
 
         return context
 
@@ -498,6 +513,8 @@ class WecomBotChannel(ChatChannel):
             self._send_file(reply.content, receiver, is_group, req_id)
         elif reply.type == ReplyType.VIDEO or reply.type == ReplyType.VIDEO_URL:
             self._send_file(reply.content, receiver, is_group, req_id, media_type="video")
+        elif reply.type == ReplyType.VOICE:
+            self._send_voice(reply.content, receiver, is_group, req_id)
         else:
             logger.warning(f"[WecomBot] Unsupported reply type: {reply.type}, falling back to text")
             self._send_text(str(reply.content), receiver, is_group, req_id)
@@ -730,6 +747,65 @@ class WecomBotChannel(ChatChannel):
                 },
             })
 
+    def _send_voice(self, voice_path: str, receiver: str, is_group: bool, req_id: str = None):
+        """Send native voice reply. WeCom voice media must be amr."""
+        local_path = voice_path
+        if local_path.startswith("file://"):
+            local_path = local_path[7:]
+
+        if local_path.startswith(("http://", "https://")):
+            try:
+                resp = requests.get(local_path, timeout=60)
+                resp.raise_for_status()
+                ext = os.path.splitext(local_path)[1] or ".mp3"
+                tmp_path = f"/tmp/wecom_voice_{uuid.uuid4().hex[:8]}{ext}"
+                with open(tmp_path, "wb") as f:
+                    f.write(resp.content)
+                local_path = tmp_path
+            except Exception as e:
+                logger.error(f"[WecomBot] Failed to download voice for sending: {e}")
+                return
+
+        if not os.path.exists(local_path):
+            logger.error(f"[WecomBot] Voice file not found: {local_path}")
+            return
+
+        amr_path = local_path
+        if not local_path.lower().endswith(".amr"):
+            try:
+                from voice.audio_convert import any_to_amr
+                amr_path = os.path.splitext(local_path)[0] + ".amr"
+                any_to_amr(local_path, amr_path)
+            except Exception as e:
+                logger.error(f"[WecomBot] Failed to convert voice to amr: {e}")
+                return
+
+        media_id = self._upload_media(amr_path, "voice")
+        if not media_id:
+            logger.error("[WecomBot] Failed to upload voice media")
+            return
+
+        if req_id:
+            self._ws_send({
+                "cmd": "aibot_respond_msg",
+                "headers": {"req_id": req_id},
+                "body": {
+                    "msgtype": "voice",
+                    "voice": {"media_id": media_id},
+                },
+            })
+        else:
+            self._ws_send({
+                "cmd": "aibot_send_msg",
+                "headers": {"req_id": self._gen_req_id()},
+                "body": {
+                    "chatid": receiver,
+                    "chat_type": 2 if is_group else 1,
+                    "msgtype": "voice",
+                    "voice": {"media_id": media_id},
+                },
+            })
+
     def _active_send_markdown(self, content: str, receiver: str, is_group: bool):
         """Proactively send markdown message (for scheduled tasks, no req_id)."""
         self._ws_send({
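Review note on the `agent_cancelled` branch above: the sketch below restates the flush-and-trim step as a pure function so the edge cases are easy to check. `finalize_cancelled` is a hypothetical name, and the two string arguments stand in for `state["committed"]` and `state["current"]`.

```python
def finalize_cancelled(committed: str, current: str) -> str:
    """Flush partial output, then drop a dangling trailing '---' divider."""
    text = (committed + current).rstrip()
    if text.endswith("---"):
        text = text[:-3].rstrip()
    return text


# Cancelled right after a turn separator: the divider is removed.
trimmed = finalize_cancelled("Answer one\n\n---\n", "")

# Cancelled mid-turn: the divider sits between two turns and is kept.
kept = finalize_cancelled("Answer one\n\n---\n\n", "partial")

print(repr(trimmed), repr(kept))
```

Only a divider at the very end of the combined text is removed, so separators between completed turns survive a cancellation.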
diff --git a/channel/weixin/weixin_channel.py b/channel/weixin/weixin_channel.py
index dba9060f..309aecab 100644
--- a/channel/weixin/weixin_channel.py
+++ b/channel/weixin/weixin_channel.py
@@ -47,19 +47,24 @@ def _load_credentials(cred_path: str) -> dict:
 
 
 def _save_credentials(cred_path: str, data: dict):
-    """Save credentials to JSON file."""
+    """Atomically save credentials to JSON file (tmp + rename)."""
     os.makedirs(os.path.dirname(cred_path), exist_ok=True)
-    with open(cred_path, "w") as f:
+    tmp_path = f"{cred_path}.tmp"
+    with open(tmp_path, "w") as f:
         json.dump(data, f, indent=2)
     try:
-        os.chmod(cred_path, 0o600)
+        os.chmod(tmp_path, 0o600)
     except Exception:
         pass
+    os.replace(tmp_path, cred_path)
 
 
 @singleton
 class WeixinChannel(ChatChannel):
 
+    # ilink bot protocol has no outbound voice item; deliver TTS as a file.
+    NOT_SUPPORT_REPLYTYPE = []
+
     LOGIN_STATUS_IDLE = "idle"
     LOGIN_STATUS_WAITING = "waiting_scan"
     LOGIN_STATUS_SCANNED = "scanned"
@@ -70,7 +75,10 @@ class WeixinChannel(ChatChannel):
         self.api = None
         self._stop_event = threading.Event()
         self._poll_thread = None
-        self._context_tokens = {}  # user_id -> context_token
+        # user_id -> context_token. Guarded by _context_tokens_lock for any
+        # mutation that races with disk persistence.
+        self._context_tokens = {}
+        self._context_tokens_lock = threading.Lock()
         self._received_msgs = ExpiredDict(60 * 60 * 7.1)
         self._get_updates_buf = ""
         self._credentials_path = ""
@@ -92,12 +100,19 @@ class WeixinChannel(ChatChannel):
             conf().get("weixin_credentials_path", "~/.weixin_cow_credentials.json")
         )
 
+        # Always load credentials so we can restore context_tokens even when
+        # the bot token itself comes from config.
+        creds = _load_credentials(self._credentials_path)
         if not token:
-            creds = _load_credentials(self._credentials_path)
             token = creds.get("token", "")
             if creds.get("base_url"):
                 base_url = creds["base_url"]
 
+        # Restore persisted context_tokens so scheduler can deliver pushes
+        # immediately after restart, without waiting for the user to ping
+        # the bot first.
+        self._restore_context_tokens_from_creds(creds)
+
         if not token:
             token, base_url = self._login_with_retry(base_url)
             if not token:
@@ -137,11 +152,16 @@ class WeixinChannel(ChatChannel):
     def _relogin(self) -> bool:
         """Re-login after session expiry. Returns True on success."""
         base_url = self.api.base_url if self.api else DEFAULT_BASE_URL
-        if os.path.exists(self._credentials_path):
-            try:
-                os.remove(self._credentials_path)
-            except Exception:
-                pass
+        # Clearing the whole credentials file is intentional: the new login
+        # will issue a fresh `token` and persisted context_tokens belong to
+        # the previous bot identity, so they must not survive.
+        with self._context_tokens_lock:
+            self._context_tokens.clear()
+            if os.path.exists(self._credentials_path):
+                try:
+                    os.remove(self._credentials_path)
+                except Exception:
+                    pass
         self.login_status = self.LOGIN_STATUS_WAITING
         result = self._qr_login(base_url)
         if not result:
@@ -153,9 +173,62 @@ class WeixinChannel(ChatChannel):
             cdn_base_url=self.api.cdn_base_url if self.api else CDN_BASE_URL,
         )
         self.login_status = self.LOGIN_STATUS_OK
-        self._context_tokens.clear()
         return True
 
+    # ── Context token persistence ──────────────────────────────────────
+    # ilink requires every outbound send to echo the context_token from the
+    # user's latest inbound message. We mirror the in-memory map into the
+    # credentials JSON so scheduled pushes survive process restarts.
+    # All mutation + disk IO is serialized via _context_tokens_lock so that
+    # concurrent updates can never lose each other's writes.
+
+    def _restore_context_tokens_from_creds(self, creds: dict) -> None:
+        if not isinstance(creds, dict):
+            return
+        tokens = creds.get("context_tokens")
+        if not isinstance(tokens, dict):
+            return
+        restored = 0
+        with self._context_tokens_lock:
+            for user_id, token in tokens.items():
+                if isinstance(user_id, str) and isinstance(token, str) and token:
+                    self._context_tokens[user_id] = token
+                    restored += 1
+        if restored:
+            logger.info(f"[Weixin] Restored {restored} context_tokens from credentials")
+
+    def _persist_context_tokens_locked(self) -> None:
+        """Flush the token map to disk. Caller must hold _context_tokens_lock."""
+        if not self._credentials_path:
+            return
+        try:
+            creds = _load_credentials(self._credentials_path) or {}
+            creds["context_tokens"] = dict(self._context_tokens)
+            _save_credentials(self._credentials_path, creds)
+        except Exception as e:
+            logger.warning(f"[Weixin] Failed to persist context_tokens: {e}")
+
+    def _update_context_token(self, user_id: str, token: str) -> None:
+        """Update the in-memory token for a user; flush to disk only on change."""
+        if not user_id or not token:
+            return
+        with self._context_tokens_lock:
+            if self._context_tokens.get(user_id) == token:
+                return
+            self._context_tokens[user_id] = token
+            self._persist_context_tokens_locked()
+
+    def _invalidate_context_token(self, user_id: str) -> None:
+        """Drop the cached token for a user (used after -14 / send rejection)."""
+        if not user_id:
+            return
+        with self._context_tokens_lock:
+            if user_id not in self._context_tokens:
+                return
+            del self._context_tokens[user_id]
+            logger.info(f"[Weixin] Invalidated stale context_token for {user_id}")
+            self._persist_context_tokens_locked()
+
     # ── QR Login ───────────────────────────────────────────────────────
 
     @staticmethod
@@ -388,7 +461,7 @@ class WeixinChannel(ChatChannel):
         context_token = raw_msg.get("context_token", "")
 
         if context_token and from_user:
-            self._context_tokens[from_user] = context_token
+            self._update_context_token(from_user, context_token)
 
         cdn_base_url = self.api.cdn_base_url if self.api else CDN_BASE_URL
         try:
@@ -464,6 +537,14 @@ class WeixinChannel(ChatChannel):
             else:
                 context.type = ContextType.TEXT
             context.content = content.strip()
+            if "desire_rtype" not in context and conf().get("always_reply_voice"):
+                context["desire_rtype"] = ReplyType.VOICE
+
+        elif ctype == ContextType.VOICE:
+            if "desire_rtype" not in context and (
+                conf().get("voice_reply_voice") or conf().get("always_reply_voice")
+            ):
+                context["desire_rtype"] = ReplyType.VOICE
 
         return context
 
@@ -486,6 +567,9 @@ class WeixinChannel(ChatChannel):
             self._send_file(reply.content, receiver, context_token)
         elif reply.type in (ReplyType.VIDEO, ReplyType.VIDEO_URL):
             self._send_video(reply.content, receiver, context_token)
+        elif reply.type == ReplyType.VOICE:
+            # ilink has no outbound voice item; deliver TTS as a file attachment.
+            self._send_file(reply.content, receiver, context_token)
         else:
             logger.warning(f"[Weixin] Unsupported reply type: {reply.type}, fallback to text")
             self._send_text(str(reply.content), receiver, context_token)
@@ -496,10 +580,30 @@ class WeixinChannel(ChatChannel):
             return msg.context_token
         return self._context_tokens.get(receiver, "")
 
+    def _check_send_response(self, resp, receiver: str) -> None:
+        """Inspect a send-API response; drop stale context_token on -14.
+
+        ilink uses ret/errcode = -14 to signal that the session (and any
+        cached context_token) is no longer valid. The plugin keeps running
+        because the bot itself can re-login; we just need to forget the
+        per-user token so the next push won't retry forever.
+        """
+        if not isinstance(resp, dict):
+            return
+        ret = resp.get("ret")
+        errcode = resp.get("errcode")
+        if ret == -14 or errcode == -14:
+            logger.warning(
+                f"[Weixin] Send returned -14 (session expired) for "
+                f"receiver={receiver}; dropping cached context_token"
+            )
+            self._invalidate_context_token(receiver)
+
     def _send_text(self, text: str, receiver: str, context_token: str):
         if len(text) <= TEXT_CHUNK_LIMIT:
             try:
-                self.api.send_text(receiver, text, context_token)
+                resp = self.api.send_text(receiver, text, context_token)
+                self._check_send_response(resp, receiver)
                 logger.debug(f"[Weixin] Text sent to {receiver}, len={len(text)}")
             except Exception as e:
                 logger.error(f"[Weixin] Failed to send text: {e}")
@@ -508,7 +612,8 @@ class WeixinChannel(ChatChannel):
         chunks = self._split_text(text, TEXT_CHUNK_LIMIT)
         for i, chunk in enumerate(chunks):
             try:
-                self.api.send_text(receiver, chunk, context_token)
+                resp = self.api.send_text(receiver, chunk, context_token)
+                self._check_send_response(resp, receiver)
                 logger.debug(f"[Weixin] Text chunk {i+1}/{len(chunks)} sent to {receiver}, len={len(chunk)}")
             except Exception as e:
                 logger.error(f"[Weixin] Failed to send text chunk {i+1}/{len(chunks)}: {e}")
@@ -542,13 +647,14 @@ class WeixinChannel(ChatChannel):
             return
         try:
             result = upload_media_to_cdn(self.api, local_path, receiver, media_type=1)
-            self.api.send_image_item(
+            resp = self.api.send_image_item(
                 to=receiver,
                 context_token=context_token,
                 encrypt_query_param=result["encrypt_query_param"],
                 aes_key_b64=result["aes_key_b64"],
                 ciphertext_size=result["ciphertext_size"],
             )
+            self._check_send_response(resp, receiver)
             logger.info(f"[Weixin] Image sent to {receiver}")
         except Exception as e:
             logger.error(f"[Weixin] Image send failed: {e}")
@@ -561,7 +667,7 @@ class WeixinChannel(ChatChannel):
             return
         try:
             result = upload_media_to_cdn(self.api, local_path, receiver, media_type=3)
-            self.api.send_file_item(
+            resp = self.api.send_file_item(
                 to=receiver,
                 context_token=context_token,
                 encrypt_query_param=result["encrypt_query_param"],
@@ -569,6 +675,7 @@ class WeixinChannel(ChatChannel):
                 file_name=os.path.basename(local_path),
                 file_size=result["raw_size"],
             )
+            self._check_send_response(resp, receiver)
             logger.info(f"[Weixin] File sent to {receiver}")
         except Exception as e:
             logger.error(f"[Weixin] File send failed: {e}")
@@ -581,13 +688,14 @@ class WeixinChannel(ChatChannel):
             return
         try:
             result = upload_media_to_cdn(self.api, local_path, receiver, media_type=2)
-            self.api.send_video_item(
+            resp = self.api.send_video_item(
                 to=receiver,
                 context_token=context_token,
                 encrypt_query_param=result["encrypt_query_param"],
                 aes_key_b64=result["aes_key_b64"],
                 ciphertext_size=result["ciphertext_size"],
             )
+            self._check_send_response(resp, receiver)
             logger.info(f"[Weixin] Video sent to {receiver}")
         except Exception as e:
             logger.error(f"[Weixin] Video send failed: {e}")
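The context-token handling added to `WeixinChannel` above combines three ideas: every mutation happens under one lock, the map is flushed to disk only when a value actually changes, and a `-14` send response drops the cached token. A minimal standalone sketch of that pattern follows. The `TokenStore` class, its method names, and the plain JSON file are assumptions made for illustration; the real channel goes through `_load_credentials` / `_save_credentials`, whose implementation is not shown in this diff.

```python
import json
import os
import tempfile
import threading


class TokenStore:
    """Hypothetical stand-in for the channel's context-token bookkeeping."""

    def __init__(self, path: str):
        self._path = path
        self._lock = threading.Lock()
        self._tokens = {}

    def _persist_locked(self) -> None:
        # Caller must hold self._lock, mirroring _persist_context_tokens_locked.
        with open(self._path, "w", encoding="utf-8") as f:
            json.dump({"context_tokens": dict(self._tokens)}, f)

    def update(self, user_id: str, token: str) -> bool:
        """Store a token; return True only when a disk write happened."""
        if not user_id or not token:
            return False
        with self._lock:
            if self._tokens.get(user_id) == token:
                return False  # unchanged value: skip the disk write
            self._tokens[user_id] = token
            self._persist_locked()
            return True

    def invalidate(self, user_id: str) -> bool:
        """Forget a user's token; return True if one was cached."""
        with self._lock:
            if user_id not in self._tokens:
                return False
            del self._tokens[user_id]
            self._persist_locked()
            return True

    def check_response(self, resp, user_id: str) -> None:
        """Drop the cached token when ret or errcode equals -14."""
        if isinstance(resp, dict) and -14 in (resp.get("ret"), resp.get("errcode")):
            self.invalidate(user_id)

    def get(self, user_id: str) -> str:
        with self._lock:
            return self._tokens.get(user_id, "")
```

Holding the lock across both the in-memory change and the file write is what keeps two concurrent updates from overwriting each other's snapshot; the cost is that a slow disk briefly blocks other senders.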
diff --git a/cli/VERSION b/cli/VERSION
index 815e68dd..09843e3b 100644
--- a/cli/VERSION
+++ b/cli/VERSION
@@ -1 +1 @@
-2.0.8
+2.0.9
diff --git a/common/const.py b/common/const.py
index dccac1a4..4e4aca24 100644
--- a/common/const.py
+++ b/common/const.py
@@ -15,6 +15,7 @@ ZHIPU_AI = "zhipu"
 MOONSHOT = "moonshot"
 MiniMax = "minimax"
 DEEPSEEK = "deepseek"
+MIMO = "mimo"  # Xiaomi MiMo LLM
 CUSTOM = "custom"  # custom OpenAI-compatible API, bot_type won't auto-switch on model change
 MODELSCOPE = "modelscope"
 
@@ -29,8 +30,9 @@ CLAUDE_35_SONNET = "claude-3-5-sonnet-latest"  # 带 latest 标签的模型名
 CLAUDE_35_SONNET_1022 = "claude-3-5-sonnet-20241022"  # 带具体日期的模型名称，会固定为该日期发布的模型
 CLAUDE_35_SONNET_0620 = "claude-3-5-sonnet-20240620"
 CLAUDE_4_OPUS = "claude-opus-4-0"
+CLAUDE_4_8_OPUS = "claude-opus-4-8"      # Claude Opus 4.8 - Agent recommended model
 CLAUDE_4_7_OPUS = "claude-opus-4-7"      # Claude Opus 4.7
-CLAUDE_4_6_OPUS = "claude-opus-4-6"      # Claude Opus 4.6 - Agent推荐模型
+CLAUDE_4_6_OPUS = "claude-opus-4-6"      # Claude Opus 4.6
 CLAUDE_4_SONNET = "claude-sonnet-4-0"    # Claude Sonnet 4.0
 CLAUDE_4_5_SONNET = "claude-sonnet-4-5"  # Claude Sonnet 4.5 - Agent推荐模型
 CLAUDE_4_6_SONNET = "claude-sonnet-4-6"  # Claude Sonnet 4.6 - Agent推荐模型
@@ -47,6 +49,7 @@ GEMINI_3_FLASH_PRE = "gemini-3-flash-preview"  # Gemini 3 Flash Preview - Agent
 GEMINI_3_PRO_PRE = "gemini-3-pro-preview"  # Gemini 3 Pro Preview
 GEMINI_31_PRO_PRE = "gemini-3.1-pro-preview"  # Gemini 3.1 Pro Preview - Agent推荐模型
 GEMINI_31_FLASH_LITE_PRE = "gemini-3.1-flash-lite-preview"  # Gemini 3.1 Flash Lite Preview - Agent推荐模型
+GEMINI_35_FLASH = "gemini-3.5-flash"  # Gemini 3.5 Flash - Agent recommended model
 
 # OpenAI
 GPT35 = "gpt-3.5-turbo"
@@ -74,6 +77,7 @@ GPT_5_NANO = "gpt-5-nano"
 GPT_54 = "gpt-5.4"  # GPT-5.4 - Agent recommended model
 GPT_54_MINI = "gpt-5.4-mini"
 GPT_54_NANO = "gpt-5.4-nano"
+GPT_55 = "gpt-5.5"  # GPT-5.5 - top-tier (expensive), not default
 O1 = "o1-preview"
 O1_MINI = "o1-mini"
 WHISPER_1 = "whisper-1"
@@ -104,10 +108,12 @@ QWEN_LONG = "qwen-long"
 QWEN3_MAX = "qwen3-max"  # Qwen3 Max - Agent推荐模型
 QWEN35_PLUS = "qwen3.5-plus"  # Qwen3.5 Plus - Omni model (MultiModalConversation)
 QWEN36_PLUS = "qwen3.6-plus"  # Qwen3.6 Plus - Omni model (MultiModalConversation)
+QWEN37_MAX = "qwen3.7-max"  # Qwen3.7 Max - Agent recommended model
 QWQ_PLUS = "qwq-plus"
 
 # MiniMax
 MINIMAX_M2_7 = "MiniMax-M2.7"  # MiniMax M2.7 - Latest
+MINIMAX_TEXT_01 = "MiniMax-Text-01"  # MiniMax multimodal (vision)
 MINIMAX_M2_7_HIGHSPEED = "MiniMax-M2.7-highspeed"  # MiniMax M2.7 highspeed
 MINIMAX_M2_5 = "MiniMax-M2.5"  # MiniMax M2.5
 MINIMAX_M2_1 = "MiniMax-M2.1"  # MiniMax M2.1
@@ -119,6 +125,7 @@ MINIMAX_ABAB6_5 = "abab6.5-chat"  # MiniMax abab6.5
 GLM_5_1 = "glm-5.1"  # 智谱 GLM-5.1 - Agent recommended model (default)
 GLM_5_TURBO = "glm-5-turbo"  # 智谱 GLM-5-Turbo
 GLM_5 = "glm-5"  # 智谱 GLM-5
+GLM_5V_TURBO = "glm-5v-turbo"  # Zhipu multimodal (vision)
 GLM_4 = "glm-4"
 GLM_4_PLUS = "glm-4-plus"
 GLM_4_flash = "glm-4-flash"
@@ -135,6 +142,13 @@ KIMI_K2 = "kimi-k2"
 KIMI_K2_5 = "kimi-k2.5"
 KIMI_K2_6 = "kimi-k2.6"  # Kimi K2.6 - Agent recommended model (default)
 
+# Xiaomi MiMo
+MIMO_V2_5_PRO = "mimo-v2.5-pro"      # MiMo V2.5 Pro - flagship, long context (recommended default)
+MIMO_V2_5 = "mimo-v2.5"              # MiMo V2.5 - multimodal (text/image/audio/video)
+MIMO_V2_PRO = "mimo-v2-pro"          # MiMo V2 Pro
+MIMO_V2_OMNI = "mimo-v2-omni"        # MiMo V2 Omni - multimodal
+MIMO_V2_FLASH = "mimo-v2-flash"      # MiMo V2 Flash - high-speed edition
+
 # Doubao (Volcengine Ark)
 DOUBAO = "doubao"
 DOUBAO_SEED_2_CODE = "doubao-seed-2-0-code-preview-260215"
@@ -177,13 +191,16 @@ MODEL_LIST = [
               # MiniMax
               MiniMax, MINIMAX_M2_7, MINIMAX_M2_7_HIGHSPEED, MINIMAX_M2_5, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,
 
+              # Xiaomi MiMo
+              MIMO, MIMO_V2_5_PRO, MIMO_V2_5, MIMO_V2_PRO, MIMO_V2_OMNI, MIMO_V2_FLASH,
+
               # Claude
-              CLAUDE3, CLAUDE_4_6_SONNET, CLAUDE_4_7_OPUS, CLAUDE_4_6_OPUS, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229,
+              CLAUDE3, CLAUDE_4_8_OPUS, CLAUDE_4_7_OPUS, CLAUDE_4_6_SONNET, CLAUDE_4_6_OPUS, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229,
               CLAUDE_35_SONNET, CLAUDE_35_SONNET_1022, CLAUDE_35_SONNET_0620, CLAUDE_3_SONNET, CLAUDE_3_HAIKU,
               "claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3.5-sonnet",
 
               # Gemini
-              GEMINI_31_FLASH_LITE_PRE, GEMINI_31_PRO_PRE, GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
+              GEMINI_35_FLASH, GEMINI_31_FLASH_LITE_PRE, GEMINI_31_PRO_PRE, GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
               GEMINI_20_FLASH, GEMINI_20_flash_exp, GEMINI_15_PRO, GEMINI_15_flash, GEMINI_PRO, GEMINI,
 
               # OpenAI
@@ -193,7 +210,7 @@ MODEL_LIST = [
               GPT_4o, GPT_4O_0806, GPT_4o_MINI,
               GPT_41, GPT_41_MINI, GPT_41_NANO,
               GPT_5, GPT_5_MINI, GPT_5_NANO,
-              GPT_54, GPT_54_MINI, GPT_54_NANO,
+              GPT_54, GPT_55, GPT_54_MINI, GPT_54_NANO,
               O1, O1_MINI,
 
               # GLM (智谱AI)
@@ -201,7 +218,7 @@ MODEL_LIST = [
               GLM_4_0520, GLM_4_AIR, GLM_4_AIRX, GLM_4_7,
 
               # Qwen (通义千问)
-              QWEN36_PLUS, QWEN35_PLUS, QWEN3_MAX, QWEN_MAX, QWEN_PLUS, QWEN_TURBO, QWEN_LONG,
+              QWEN37_MAX, QWEN36_PLUS, QWEN35_PLUS, QWEN3_MAX, QWEN_MAX, QWEN_PLUS, QWEN_TURBO, QWEN_LONG,
 
               # Doubao (豆包)
               DOUBAO, DOUBAO_SEED_2_CODE, DOUBAO_SEED_2_PRO, DOUBAO_SEED_2_LITE, DOUBAO_SEED_2_MINI,
@@ -227,4 +244,6 @@ DINGTALK = "dingtalk"
 WECOM_BOT = "wecom_bot"
 QQ = "qq"
 WEIXIN = "weixin"
-WECHAT_KF = "wechat_kf"  # WeCom customer service (微信客服) channel
+WECHAT_KF = "wechat_kf"
+TELEGRAM = "telegram"
+SLACK = "slack"
diff --git a/common/utils.py b/common/utils.py
index 812b20ab..e7264e20 100644
--- a/common/utils.py
+++ b/common/utils.py
@@ -117,6 +117,18 @@ def expand_path(path: str) -> str:
     return expanded
 
 
+def is_cloud_deployment() -> bool:
+    if os.environ.get("CLOUD_DEPLOYMENT_ID"):
+        return True
+    try:
+        from config import conf
+        if conf().get("cloud_deployment_id"):
+            return True
+    except Exception:
+        pass
+    return False
+
+
 def get_cloud_headers(api_key: str) -> dict:
     """
     Build standard headers for LinkAI API requests,
diff --git a/config-template.json b/config-template.json
index bf7e8b3c..4e4a7d36 100644
--- a/config-template.json
+++ b/config-template.json
@@ -16,8 +16,8 @@
   "open_ai_api_base": "https://api.openai.com/v1",
   "gemini_api_key": "",
   "gemini_api_base": "https://generativelanguage.googleapis.com",
-  "voice_to_text": "openai",
-  "text_to_voice": "openai",
+  "voice_to_text": "",
+  "text_to_voice": "",
   "voice_reply_voice": false,
   "speech_recognition": true,
   "group_speech_recognition": false,
diff --git a/config.py b/config.py
index 19355576..62896d17 100644
--- a/config.py
+++ b/config.py
@@ -173,6 +173,15 @@ available_setting = {
     # 企微智能机器人配置(长连接模式)
     "wecom_bot_id": "",  # 企微智能机器人BotID
     "wecom_bot_secret": "",  # 企微智能机器人长连接Secret
+    # Telegram settings
+    "telegram_token": "",  # bot token obtained from @BotFather
+    "telegram_proxy": "",  # optional HTTP/SOCKS5 proxy, e.g. http://127.0.0.1:7890 or socks5://127.0.0.1:1080 (leave empty to use the system environment variables)
+    "telegram_group_trigger": "mention_or_reply",  # group chat trigger: mention_or_reply (@mention or reply, recommended) | mention_only (@mention only) | all (all messages)
+    "telegram_register_commands": True,  # whether to register the command menu with BotFather at startup (matches the web console slash commands)
+    # Slack settings (Socket Mode, no public IP required)
+    "slack_bot_token": "",  # Bot User OAuth Token, of the form xoxb-...
+    "slack_app_token": "",  # App-Level Token (generated after enabling Socket Mode), of the form xapp-...
+    "slack_group_trigger": "mention_or_reply",  # channel trigger: mention_or_reply (@mention or reply in thread, recommended) | mention_only (@mention only) | all (all messages)
     # 微信配置
     "weixin_token": "",  # 微信登录后获取的bot_token，留空则启动时自动扫码登录
     "weixin_base_url": "https://ilinkai.weixin.qq.com",  # Weixin ilink API base URL
@@ -181,7 +190,7 @@ available_setting = {
     # chatgpt指令自定义触发词
     "clear_memory_commands": ["#清除记忆"],  # 重置会话指令，必须以#开头
     # channel配置
-    "channel_type": "",  # 通道类型，支持多渠道同时运行。单个: "feishu"，多个: "feishu, dingtalk" 或 ["feishu", "dingtalk"]。可选值: web,feishu,dingtalk,wecom_bot,weixin,wechatmp,wechatmp_service,wechatcom_app,wechat_kf
+    "channel_type": "",  # Channel type; multiple channels can run at the same time. Single: "feishu"; multiple: "feishu, dingtalk" or ["feishu", "dingtalk"]. Allowed values: web,feishu,dingtalk,wecom_bot,weixin,wechatmp,wechatmp_service,wechatcom_app,wechat_kf,telegram,slack
     "web_console": True,  # 是否自动启动Web控制台（默认启动）。设为False可禁用
     "subscribe_msg": "",  # 订阅消息, 支持: wechatmp, wechatmp_service, wechatcom_app
     "debug": False,  # 是否开启debug模式，开启后会打印更多日志
@@ -216,10 +225,14 @@ available_setting = {
     "Minimax_base_url": "",
     "deepseek_api_key": "",
     "deepseek_api_base": "https://api.deepseek.com/v1",
+    # Xiaomi MiMo LLM
+    "mimo_api_key": "",
+    "mimo_api_base": "https://api.xiaomimimo.com/v1",
     "web_host": "",  # Web console bind address; empty means auto
     "web_port": 9899,
     "web_password": "",  # Web console password; empty means no authentication required
     "web_session_expire_days": 30,  # Auth session expiry in days
+    "web_file_serve_root": "~",  # Root dir the /api/file endpoint may serve; "/" allows the whole filesystem
     "agent": True,  # 是否开启Agent模式
     "agent_workspace": "~/cow",  # agent工作空间路径，用于存储skills、memory等
     "agent_max_context_tokens": 50000,  # Agent模式下最大上下文tokens
@@ -337,8 +350,18 @@ def load_config():
     config_str = read_file(config_path)
     logger.debug("[INIT] config str: {}".format(drag_sensitive(config_str)))
 
-    # 将json字符串反序列化为dict类型
-    config = Config(json.loads(config_str))
+    # Deserialize the JSON string into a dict.
+    # `object_pairs_hook` lets us catch users who accidentally typed the
+    # same key twice (e.g. two `"tools"` blocks); json.loads would
+    # otherwise silently drop all but the last occurrence.
+    config = Config(json.loads(config_str, object_pairs_hook=_merge_duplicate_keys))
+
+    # Migrate legacy singular keys (`tool`, `skill`) into the canonical
+    # plural buckets so the rest of the codebase only reads one schema.
+    # Deep-merge so existing `tools`/`skills` entries are preserved and
+    # only missing namespaces are filled in from the legacy section.
+    _merge_legacy_namespace(config, legacy="tool",  canonical="tools")
+    _merge_legacy_namespace(config, legacy="skill", canonical="skills")
 
     # override config with environment variables.
     # Some online deployment platforms (e.g. Railway) deploy project from github directly. So you shouldn't put your secrets like api key in a config file, instead use environment variables to override the default config.
@@ -398,6 +421,8 @@ def load_config():
         "minimax_api_base": "MINIMAX_API_BASE",
         "deepseek_api_key": "DEEPSEEK_API_KEY",
         "deepseek_api_base": "DEEPSEEK_API_BASE",
+        "mimo_api_key": "MIMO_API_KEY",
+        "mimo_api_base": "MIMO_API_BASE",
         "qianfan_api_key": "QIANFAN_API_KEY",
         "qianfan_api_base": "QIANFAN_API_BASE",
         "zhipu_ai_api_key": "ZHIPU_AI_API_KEY",
@@ -434,7 +459,7 @@ def load_config():
                 os.environ[env_key] = str(val)
                 injected += 1
 
-    injected += _sync_skill_config_to_env(config.get("skill", {}))
+    injected += _sync_skill_config_to_env(config.get("skills", {}))
 
     if injected:
         logger.info("[INIT] Synced {} config values to environment variables".format(injected))
@@ -442,11 +467,90 @@ def load_config():
     config.load_user_datas()
 
 
+def _deep_merge_dicts(base: dict, incoming: dict) -> dict:
+    """Recursively merge ``incoming`` into ``base`` (incoming wins on leaves)."""
+    for key, val in incoming.items():
+        if (
+            key in base
+            and isinstance(base[key], dict)
+            and isinstance(val, dict)
+        ):
+            _deep_merge_dicts(base[key], val)
+        else:
+            base[key] = val
+    return base
+
+
+def _merge_duplicate_keys(pairs):
+    """object_pairs_hook for json.loads: merge duplicate keys within any JSON
+    object (dicts deep-merge, lists concatenate, scalars keep the last value)."""
+    out = {}
+    duplicates = []
+    for key, val in pairs:
+        if key not in out:
+            out[key] = val
+            continue
+        duplicates.append(key)
+        prev = out[key]
+        if isinstance(prev, dict) and isinstance(val, dict):
+            _deep_merge_dicts(prev, val)
+        elif isinstance(prev, list) and isinstance(val, list):
+            prev.extend(val)
+        else:
+            out[key] = val
+    if duplicates:
+        # logger may not be wired yet — fall back to print so we never lose the warning.
+        unique = sorted(set(duplicates))
+        try:
+            logger.warning("[INIT] config.json has duplicate keys (merged): %s", unique)
+        except Exception:
+            print("[INIT] config.json has duplicate keys (merged):", unique)
+    return out
+
+
+def _merge_legacy_namespace(cfg, legacy: str, canonical: str) -> None:
+    """Fold deprecated singular keys (``tool`` / ``skill``) into their plural
+    canonical counterparts at load time. Canonical entries always win."""
+    legacy_section = cfg.get(legacy)
+    if not isinstance(legacy_section, dict) or not legacy_section:
+        cfg.pop(legacy, None)
+        return
+    canonical_section = cfg.get(canonical)
+    if not isinstance(canonical_section, dict):
+        canonical_section = {}
+    merged_keys = []
+    for name, val in legacy_section.items():
+        if name in canonical_section:
+            if isinstance(canonical_section[name], dict) and isinstance(val, dict):
+                for sub_key, sub_val in val.items():
+                    if (
+                        sub_key in canonical_section[name]
+                        and isinstance(canonical_section[name][sub_key], dict)
+                        and isinstance(sub_val, dict)
+                    ):
+                        _deep_merge_dicts(sub_val, canonical_section[name][sub_key])
+                        canonical_section[name][sub_key] = sub_val
+                    else:
+                        canonical_section[name].setdefault(sub_key, sub_val)
+            continue
+        canonical_section[name] = val
+        merged_keys.append(name)
+    cfg[canonical] = canonical_section
+    cfg.pop(legacy, None)
+    if merged_keys:
+        logger.warning(
+            "[INIT] Legacy config key '{}' is deprecated; merged into '{}': {}. "
+            "Please rename '{}' to '{}' in your config.json.".format(
+                legacy, canonical, merged_keys, legacy, canonical,
+            )
+        )
+
+
 def _sync_skill_config_to_env(skill_section) -> int:
     """Flatten skill-namespaced config into environment variables.
 
-    Mapping rule: ``config["skill"][<name>][<key>]`` -> ``SKILL_<NAME>_<KEY>``
-    (e.g. ``skill["image-generation"].model`` -> ``SKILL_IMAGE_GENERATION_MODEL``).
+    Mapping rule: ``config["skills"][<name>][<key>]`` -> ``SKILL_<NAME>_<KEY>``
+    (e.g. ``skills["image-generation"].model`` -> ``SKILL_IMAGE_GENERATION_MODEL``).
 
     This lets subprocess-based skill scripts read their own settings without
     importing project code. Existing env vars are NOT overwritten so the
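The duplicate-key handling introduced in `load_config` above relies on the standard `object_pairs_hook` parameter of `json.loads`, which receives every JSON object as a list of `(key, value)` pairs before a dict is built. The sketch below shows the difference against the default behaviour. The helper names `deep_merge` and `merge_duplicates` and the sample config string are illustrative assumptions, simplified from `_deep_merge_dicts` and `_merge_duplicate_keys` (no logging).

```python
import json


def deep_merge(base: dict, incoming: dict) -> dict:
    """Recursively merge incoming into base; incoming wins on leaf values."""
    for key, val in incoming.items():
        if isinstance(base.get(key), dict) and isinstance(val, dict):
            deep_merge(base[key], val)
        else:
            base[key] = val
    return base


def merge_duplicates(pairs):
    """Called once per JSON object, at every nesting level."""
    out = {}
    for key, val in pairs:
        if key not in out:
            out[key] = val
        elif isinstance(out[key], dict) and isinstance(val, dict):
            deep_merge(out[key], val)
        elif isinstance(out[key], list) and isinstance(val, list):
            out[key].extend(val)
        else:
            out[key] = val  # scalars: the later value replaces the earlier one
    return out


# Hypothetical config with "tools" and "model" each typed twice.
raw = (
    '{"tools": {"web": {"k": 1}}, '
    '"tools": {"shell": {"t": 2}}, '
    '"model": "a", "model": "b"}'
)

plain = json.loads(raw)  # default: the last occurrence silently wins
merged = json.loads(raw, object_pairs_hook=merge_duplicates)
```

With the default parser the first `tools` block disappears; with the hook both namespaces survive, while scalar duplicates still resolve to the last value.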
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 00000000..f406cc2a
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,30 @@
+# Documentation
+
+This directory contains the Mintlify documentation site for the project.
+
+## Prerequisites
+
+- Node.js v20.17.0 or higher (LTS recommended)
+
+## Install the CLI (one-time, global)
+
+```bash
+npm i -g mint
+```
+
+## Run the docs locally
+
+From this `docs/` directory:
+
+```bash
+mint dev
+```
+
+Then open http://localhost:3000 (or the port Mint reports if 3000 is in use).
+
+> The first run downloads the Mint preview framework (~90 MB) into `~/.mintlify/`.
+> Subsequent runs start instantly from the local cache.
+
+## More
+
+- Mintlify docs: https://www.mintlify.com/docs
diff --git a/docs/channels/index.mdx b/docs/channels/index.mdx
new file mode 100644
index 00000000..97ba16ab
--- /dev/null
+++ b/docs/channels/index.mdx
@@ -0,0 +1,43 @@
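+---
+title: Channel Overview
+description: Channels supported by CowAgent and their capability matrix
+---
+
+CowAgent can connect to multiple chat channels, selected at startup via `channel_type`. The Web console is enabled by default and can run alongside other channels.
+
+## Capability Matrix
+
+The table below summarizes the inbound message types, bot reply types, and group chat support of each channel, to help you choose by scenario.
+
+| Channel | Text | Image | File | Voice | Group chat |
+| --- | :-: | :-: | :-: | :-: | :-: |
+| [WeChat](/channels/weixin) | ✅ | ✅ | ✅ | ✅ | |
+| [Web console](/channels/web) | ✅ | ✅ | ✅ | ✅ | |
+| [Feishu](/channels/feishu) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [DingTalk](/channels/dingtalk) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [WeCom Smart Bot](/channels/wecom-bot) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [QQ](/channels/qq) | ✅ | ✅ | ✅ | | ✅ |
+| [WeCom App](/channels/wecom) | ✅ | ✅ | ✅ | ✅ | |
+| [WeChat Official Account](/channels/wechatmp) | ✅ | ✅ | | ✅ | |
+| [Telegram](/channels/telegram) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Slack](/channels/slack) | ✅ | ✅ | ✅ | | ✅ |
+
+- The **Image / File / Voice** columns indicate that the channel can send and receive that message type; see each channel's documentation for details
+- The **Group chat** column indicates that the channel can recognize and respond to group messages
+
+<Tip>
+  Each channel's voice and image capabilities depend on the configuration of the corresponding model provider; see [Model Overview](/models).
+</Tip>
+
+## Channel List
+
+- [Web console](/channels/web): built-in browser chat and admin panel, enabled by default
+- [WeChat](/channels/weixin): log in by scanning a QR code with a personal WeChat account
+- [Feishu](/channels/feishu): custom Feishu bot
+- [DingTalk](/channels/dingtalk): custom DingTalk bot
+- [WeCom Smart Bot](/channels/wecom-bot): WeCom (WeChat Work) smart bot
+- [QQ](/channels/qq): official QQ bot open platform
+- [WeCom App](/channels/wecom): WeCom (WeChat Work) custom app integration
+- [WeChat Official Account](/channels/wechatmp): WeChat Official Account (subscription / service account)
+- [Telegram](/channels/telegram): international IM, set up in 5 minutes, no public IP required
+- [Slack](/channels/slack): team collaboration IM, connected via Socket Mode, no public IP required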
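The `channel_type` setting that selects channels accepts three shapes according to the comment in `config.py`: a single name, a comma-separated string, or a JSON list. A small sketch of normalizing those shapes into one list is shown below. The function name `parse_channel_types` is hypothetical and this is not the project's actual startup code, only an illustration of the documented input formats.

```python
def parse_channel_types(value) -> list:
    """Normalize a channel_type value into an ordered list of unique names."""
    if isinstance(value, str):
        items = value.split(",")
    elif isinstance(value, (list, tuple)):
        items = value
    else:
        return []

    seen = set()
    result = []
    for item in items:
        name = str(item).strip()
        # Skip blanks (e.g. from an empty string or a trailing comma)
        # and repeated names, keeping the first occurrence's position.
        if name and name not in seen:
            seen.add(name)
            result.append(name)
    return result
```

For example, `parse_channel_types("feishu, dingtalk")` and `parse_channel_types(["feishu", "dingtalk"])` both yield the same two-element list, so later code only has to handle one representation.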
diff --git a/docs/channels/slack.mdx b/docs/channels/slack.mdx
new file mode 100644
index 00000000..1103f1c0
--- /dev/null
+++ b/docs/channels/slack.mdx
@@ -0,0 +1,118 @@
+---
+title: Slack
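+description: Connect CowAgent to a Slack App
+---
+
+> Connect CowAgent through a Slack App in **Socket Mode**, with support for direct messages (DMs) and channels (triggered by @mentioning the bot or replying in a thread). Socket Mode uses a long-lived connection, so no public IP or callback URL is needed and it works out of the box.
+
+## 1. Setup
+
+### Step 1: Create a Slack App
+
+1. Open the [Slack API app management page](https://api.slack.com/apps) and click **Create New App** → **From scratch**.
+2. Enter an **App Name** (e.g. `CowAgent`), choose the **Workspace** to install into, and click create.
+
+### Step 2: Enable Socket Mode and get the App Token
+
+1. In the left menu go to **Settings → Socket Mode** and turn on **Enable Socket Mode**.
+2. You will be prompted to generate an **App-Level Token**; select the `connections:write` scope, then save the generated token, which starts with `xapp-`.
+
+<Tip>
+  Socket Mode receives events over a long-lived WebSocket connection, so no callback URL has to be exposed publicly, which suits local or intranet deployments.
+</Tip>
+
+### Step 3: Configure bot permissions and install
+
+1. Go to **Features → OAuth & Permissions**, click **Add an OAuth Scope** under **Bot Token Scopes**, and add the following scopes one by one: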
+
+   ```
+   app_mentions:read
+   channels:history
+   chat:write
+   commands
+   files:read
+   files:write
+   groups:history
+   im:history
+   mpim:history
+   users:read
+   ```
+
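+   <Note>
+     `files:read` / `files:write` are used to send and receive images and files; they can be omitted if you only need text conversations.
+   </Note>
+
+2. Go to **Features → Event Subscriptions**, turn on **Enable Events**, and under **Subscribe to bot events** click **Add Bot User Event** to add the following events: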
+
+   ```
+   app_mention
+   message.im
+   message.channels
+   ```
+
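+   <Note>
+     To use the bot in private channels, also add `message.groups`.
+   </Note>
+3. Go to **Features → App Home**, check **Messages Tab** in the **Show Tabs** section, and also check **Allow users to send Slash commands and messages from the messages tab** below it; otherwise the DM input box is disabled and users cannot message the bot.
+4. Return to **OAuth & Permissions** and click **Install to Workspace** to finish installation; afterwards you get the **Bot User OAuth Token**, which starts with `xoxb-`.
+
+<Tip>
+  If the Slack client still says that sending messages to this app has been turned off, confirm that the App Home setting in the previous step is done, then refresh or restart the Slack client (if necessary, remove the App from the conversation list and reopen it).
+</Tip>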
+
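+### Step 4: Connect to CowAgent
+
+<Tabs>
+  <Tab title="Web console (recommended)">
+    Open the Web console (local URL: http://127.0.0.1:9899 ), select the **Channels** menu, click **Connect channel**, choose **Slack**, enter the Bot Token (`xoxb-`) and the App Token (`xapp-`), and click connect.
+  </Tab>
+  <Tab title="Config file">
+    Add the following to `config.json` and start the service: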
+
+    ```json
+    {
+      "channel_type": "slack",
+      "slack_bot_token": "xoxb-xxxxxxxxxxxx",
+      "slack_app_token": "xapp-xxxxxxxxxxxx",
+      "slack_group_trigger": "mention_or_reply"
+    }
+    ```
+
+    | Parameter | Description | Default |
+    | --- | --- | --- |
+    | `slack_bot_token` | Bot User OAuth Token, of the form `xoxb-...` | - |
+    | `slack_app_token` | App-Level Token (generated after enabling Socket Mode), of the form `xapp-...` | - |
+    | `slack_group_trigger` | Channel trigger mode: `mention_or_reply` (@mention or reply in thread) / `mention_only` (@mention only) / `all` (all messages) | `mention_or_reply` |
+  </Tab>
+</Tabs>
+
+After starting Cow, the following log output indicates a successful connection:
+
+```
+[Slack] Bot logged in as user_id=U0XXXXXXX, team=Txxxxxxxx
+[Slack] ✅ Slack bot ready, listening for events
+```
+
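+## 2. Features
+
+| Feature | Support |
+| --- | --- |
+| Direct messages (DM) | ✅ |
+| Channels (@mention the bot / reply in thread) | ✅ |
+| Text messages | ✅ Send and receive |
+| Image messages | ✅ Send and receive |
+| File messages | ✅ Send and receive (PDF / Word / Excel, etc.) |
+| Thread replies | ✅ Replies are sent to the thread of the triggering message |
+
+<Note>
+  Slack organizes conversations in threads. The bot posts its reply in the thread of the triggering message, which keeps the channel tidy.
+</Note>
+
+## 3. Usage
+
+Once connected:
+
+- **Direct messages (DM)**: find your App under **Apps** in the Slack sidebar and message it directly.
+- **Channels**: invite the App into a channel (`/invite @YourApp`), then use `@YourApp hello` to start a conversation; after that, reply in the same thread to continue.
+
+When sending an image or file, you can **add a text description** (a caption or question) in the attachment's input box and send them together; the bot answers using the attachment. You can also send the attachment first and the question afterwards; the two messages are merged into one query automatically.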
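The three `slack_group_trigger` modes documented above can be read as a small decision function. The sketch below is an interpretation of the documented behaviour, not the channel's actual code: the function name `should_respond`, its boolean parameters, and the fallback to `mention_or_reply` for unknown values are all assumptions.

```python
VALID_TRIGGERS = ("mention_or_reply", "mention_only", "all")


def should_respond(trigger: str, is_dm: bool, mentioned: bool, in_bot_thread: bool) -> bool:
    """Decide whether the bot should answer an incoming Slack message.

    trigger        value of slack_group_trigger
    is_dm          True for a direct message to the bot
    mentioned      True when the message @mentions the bot
    in_bot_thread  True when the message is a reply in a thread the bot joined
    """
    if is_dm:
        return True  # direct messages always get a reply
    if trigger not in VALID_TRIGGERS:
        trigger = "mention_or_reply"  # assumed fallback to the documented default
    if trigger == "all":
        return True
    if trigger == "mention_only":
        return mentioned
    return mentioned or in_bot_thread
```

Under the default mode a user mentions the bot once to open a thread and then simply replies in that thread, which matches the usage described in the section above.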
diff --git a/docs/channels/telegram.mdx b/docs/channels/telegram.mdx
new file mode 100644
index 00000000..d7ab7a44
--- /dev/null
+++ b/docs/channels/telegram.mdx
@@ -0,0 +1,112 @@
+---
+title: Telegram
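+description: Connect CowAgent to a Telegram Bot
+---
+
+> Connect CowAgent through the Telegram Bot API, with support for private chats and group chats (triggered by @mentioning or replying to the bot). It uses Long Polling, so no public IP is needed and it works out of the box.
+
+
+## 1. Setup
+
+### Step 1: Create a bot with BotFather
+
+1. In Telegram, search for and open the official account [@BotFather](https://t.me/BotFather).
+2. Send the `/newbot` command and enter, when prompted:
+   - **Bot name** (display name, Chinese is allowed, e.g. `My CowAgent Bot`)
+   - **Bot username** (must end with `bot`, e.g. `my_cowagent_bot`)
+3. On success, BotFather returns an **HTTP API Token** (of the form `123456789:ABCdefGhIJKlmNoPQRsTUVwxyZ`); store it safely.
+
+<Tip>
+  This token is effectively the bot's password; do not disclose it. If it leaks, send `/revoke` to `@BotFather` to reset it.
+</Tip>
+
+### Step 2: (group chats) Disable Privacy Mode
+
+Skip this step if you only use private chats. Telegram bots have **Privacy Mode** enabled by default, so in group chats they only receive commands addressed with `@bot` (such as `/start@your_bot`) and replies to the bot's messages; **a plain text message like `@bot hello` is not delivered**, which makes the bot unresponsive in groups.
+
+Send the following to `@BotFather`:
+
+1. `/setprivacy`
+2. Select the bot you just created
+3. Select `Disable`
+
+<Note>
+  If the group still gets no response after this change, try removing the bot from the group and adding it back.
+</Note>
+
+### Step 3: Connect to CowAgent
+
+<Tabs>
+  <Tab title="Web console (recommended)">
+    Open the Web console (local URL: http://127.0.0.1:9899 ), select the **Channels** menu, click **Connect channel**, choose **Telegram**, enter the Bot Token, and click connect.
+  </Tab>
+  <Tab title="Config file">
+    Add the following to `config.json` and start the service:
+
+    ```json
+    {
+      "channel_type": "telegram",
+      "telegram_token": "123456789:ABCdefGhIJKlmNoPQRsTUVwxyZ",
+      "telegram_group_trigger": "mention_or_reply"
+    }
+    ```
+
+    | Parameter | Description | Default |
+    | --- | --- | --- |
+    | `telegram_token` | HTTP API Token returned by BotFather | - |
+    | `telegram_group_trigger` | Group chat trigger mode: `mention_or_reply` (@mention or reply to the bot) / `mention_only` (@mention only) / `all` (all messages) | `mention_or_reply` |
+    | `telegram_register_commands` | Whether to register the command menu with BotFather automatically at startup | `true` |
+    | `telegram_proxy` | (Optional) proxy address, e.g. `http://127.0.0.1:7890` or `socks5://127.0.0.1:1080`; set it when the runtime cannot reach `api.telegram.org` directly, leave empty to use the `HTTPS_PROXY` environment variable | `""` |
+  </Tab>
+</Tabs>
+
+After starting Cow, the following log output indicates a successful connection:
+
+```
+[Telegram] Bot logged in as @my_cowagent_bot (id=123456789)
+[Telegram] Registered 10 bot commands
+[Telegram] ✅ Telegram bot ready, polling for updates
+```
+
+## 2. Features
+
+| Feature | Support |
+| --- | --- |
+| Private chat | ✅ |
+| Group chat (@mention / reply to the bot) | ✅ |
+| Text messages | ✅ Send and receive |
+| Image messages | ✅ Send and receive |
+| Voice messages | ✅ Send and receive (receives OGG/Opus, sends OGG/Opus) |
+| Video messages | ✅ Send and receive |
+| File messages | ✅ Send and receive (PDF / Word / Excel, etc.) |
+| Command menu | ✅ Same as the Web console slash commands |
+
+### Command menu
+
+At startup the command menu is registered with BotFather automatically; typing `/` in the Telegram input box shows a dropdown of suggestions:
+
+| Command | Description |
+| --- | --- |
+| `/help` | Show command help |
+| `/status` | Show runtime status |
+| `/context` | Show the conversation context (`/context clear` clears it) |
+| `/skill` | Manage skills (`/skill list`, `/skill install`, etc.) |
+| `/memory` | Manage memory (`/memory dream`) |
+| `/knowledge` | Manage the knowledge base (`/knowledge list` / `on` / `off`) |
+| `/config` | Show the current configuration |
+| `/cancel` | Abort the currently running Agent task |
+| `/logs` | Show recent logs |
+| `/version` | Show the version |
+
+<Note>
+  The Telegram command menu can only list top-level commands; enter subcommands after a space, for example `/skill list` or `/context clear`.
+</Note>
+
+## 3. Usage
+
+Once connected:
+
+- **Private chat**: search Telegram for the bot username you created (e.g. `@my_cowagent_bot`) and click `Start` to begin chatting.
+- **Group chat**: add the bot to a group, then use `@bot hello` or **reply to one of the bot's messages** to start a conversation. If the group gets no response, check that Privacy Mode has been disabled as described in [Step 2](#step-2-group-chats-disable-privacy-mode).
+
+When sending an image or file, you can **add a caption** (a description or question) in the input box above the attachment and send them together; the bot answers using the attachment. You can also send the attachment first and the question afterwards; the two messages are merged into one query automatically.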
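The command forms mentioned in the Telegram page above (`/skill list`, `/context clear`, and group-style `/start@your_bot`) share one shape: a leading slash, a command name, an optional `@username` suffix, and space-separated subcommands. A minimal parsing sketch under those assumptions is given below. The function name `parse_command` and its return shape are hypothetical and do not describe how CowAgent's Telegram channel is actually implemented.

```python
def parse_command(text: str, bot_username: str = ""):
    """Split a slash command into (name, args), or return None.

    Returns None for non-commands and for commands explicitly
    addressed to a different bot via the @username suffix.
    """
    text = text.strip()
    if not text.startswith("/"):
        return None

    head, _, rest = text.partition(" ")
    name, _, target = head[1:].partition("@")
    if not name:
        return None
    if target and bot_username and target.lower() != bot_username.lower():
        return None  # e.g. /start@other_bot in a group with several bots

    return name.lower(), rest.split()
```

Returning the subcommand words as a list keeps the top-level command aligned with the menu entries while leaving `list`, `clear`, and similar arguments to the individual command handlers.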
diff --git a/docs/channels/web.mdx b/docs/channels/web.mdx
index 29d9ed97..30bea09b 100644
--- a/docs/channels/web.mdx
+++ b/docs/channels/web.mdx
@@ -59,9 +59,9 @@ Web 控制台是 CowAgent 的默认通道，启动后会自动运行，通过浏
 
 ### 模型管理
 
-支持在线管理模型配置，无需手动编辑配置文件：
+支持在线管理不同模型厂商的文本、图像、语音、向量模型配置，无需手动编辑配置文件：
 
-<img width="850" src="https://cdn.link-ai.tech/doc/20260227173811.png" />
+<img width="850" src="https://cdn.link-ai.tech/doc/20260521212949.png" />
 
 ### 技能管理
 
diff --git a/docs/cli/general.mdx b/docs/cli/general.mdx
index cb3f933d..36af1783 100644
--- a/docs/cli/general.mdx
+++ b/docs/cli/general.mdx
@@ -39,6 +39,14 @@ Mode:    agent
 Session: 12 messages | 8 skills loaded
 ```
 
+## cancel
+
+中止当前会话正在运行的 Agent 任务。在 Agent 执行长时间任务（例如多轮工具调用、长流式输出）时，可随时发送 `/cancel`，Agent 会在下一次工具执行前停止。Web 端、微信、企业微信、飞书等各通道均可使用。
+
+```text
+/cancel
+```
+
 ## config
 
 查看或修改运行时配置。修改后立即生效，无需重启服务。
diff --git a/docs/cli/index.mdx b/docs/cli/index.mdx
index ce67be27..f6462ecb 100644
--- a/docs/cli/index.mdx
+++ b/docs/cli/index.mdx
@@ -57,6 +57,7 @@ Others:
 | --- | --- |
 | `/help` | 显示命令帮助 |
 | `/status` | 查看服务状态和配置 |
+| `/cancel` | 中止当前正在运行的 Agent 任务 |
 | `/config` | 查看或修改运行时配置 |
 | `/skill` | 管理技能（安装、卸载、启用、禁用等） |
 | `/memory dream [N]` | 手动触发记忆蒸馏（默认 3 天，最大 30） |
@@ -82,6 +83,7 @@ Others:
 | version | ✓ | ✓ |
 | status | ✓ | ✓ |
 | logs | ✓ | ✓ |
+| cancel | ✗ | ✓ |
 | config | ✗ | ✓ |
 | context | — | ✓ |
 | memory (子命令) | ✗ | ✓ |
diff --git a/docs/docs.json b/docs/docs.json
index 355f9e9e..7cad9585 100644
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -38,6 +38,12 @@
       {
         "language": "zh",
         "default": true,
+        "navbar": {
+          "links": [
+            { "label": "官网", "href": "https://cowagent.ai/?lang=zh" },
+            { "label": "GitHub", "href": "https://github.com/zhayujie/CowAgent" }
+          ]
+        },
         "tabs": [
           {
             "tab": "项目介绍",
@@ -82,6 +88,7 @@
                   "models/doubao",
                   "models/kimi",
                   "models/qianfan",
+                  "models/mimo",
                   "models/linkai",
                   "models/coding-plan",
                   "models/custom"
@@ -181,6 +188,7 @@
               {
                 "group": "接入渠道",
                 "pages": [
+                  "channels/index",
                   "channels/weixin",
                   "channels/web",
                   "channels/feishu",
@@ -189,7 +197,9 @@
                   "channels/qq",
                   "channels/wecom",
                   "channels/wechat-kf",
-                  "channels/wechatmp"
+                  "channels/wechatmp",
+                  "channels/telegram",
+                  "channels/slack"
                 ]
               }
             ]
@@ -216,6 +226,7 @@
                 "group": "发布记录",
                 "pages": [
                   "releases/overview",
+                  "releases/v2.0.9",
                   "releases/v2.0.8",
                   "releases/v2.0.7",
                   "releases/v2.0.6",
@@ -233,6 +244,12 @@
       },
       {
         "language": "en",
+        "navbar": {
+          "links": [
+            { "label": "Website", "href": "https://cowagent.ai/" },
+            { "label": "GitHub", "href": "https://github.com/zhayujie/CowAgent" }
+          ]
+        },
         "tabs": [
           {
             "tab": "Introduction",
@@ -254,7 +271,8 @@
                 "group": "Installation",
                 "pages": [
                   "en/guide/quick-start",
-                  "en/guide/manual-install"
+                  "en/guide/manual-install",
+                  "en/guide/upgrade"
                 ]
               }
             ]
@@ -276,6 +294,7 @@
                   "en/models/doubao",
                   "en/models/kimi",
                   "en/models/qianfan",
+                  "en/models/mimo",
                   "en/models/linkai",
                   "en/models/coding-plan",
                   "en/models/custom"
@@ -331,6 +350,7 @@
                 "pages": [
                   "en/skills/index",
                   "en/skills/install",
+                  "en/skills/create",
                   "en/skills/hub"
                 ]
               },
@@ -374,6 +394,7 @@
               {
                 "group": "Platforms",
                 "pages": [
+                  "en/channels/index",
                   "en/channels/weixin",
                   "en/channels/web",
                   "en/channels/feishu",
@@ -382,7 +403,9 @@
                   "en/channels/qq",
                   "en/channels/wecom",
                   "en/channels/wechat-kf",
-                  "en/channels/wechatmp"
+                  "en/channels/wechatmp",
+                  "en/channels/telegram",
+                  "en/channels/slack"
                 ]
               }
             ]
@@ -397,7 +420,7 @@
                   "en/cli/process",
                   "en/cli/skill",
                   "en/cli/memory-knowledge",
-                  "en/cli/chat"
+                  "en/cli/general"
                 ]
               }
             ]
@@ -409,6 +432,7 @@
                 "group": "Release Notes",
                 "pages": [
                   "en/releases/overview",
+                  "en/releases/v2.0.9",
                   "en/releases/v2.0.8",
                   "en/releases/v2.0.7",
                   "en/releases/v2.0.6",
@@ -426,6 +450,12 @@
       },
       {
         "language": "ja",
+        "navbar": {
+          "links": [
+            { "label": "ウェブサイト", "href": "https://cowagent.ai/" },
+            { "label": "GitHub", "href": "https://github.com/zhayujie/CowAgent" }
+          ]
+        },
         "tabs": [
           {
             "tab": "紹介",
@@ -470,6 +500,7 @@
                   "ja/models/doubao",
                   "ja/models/kimi",
                   "ja/models/qianfan",
+                  "ja/models/mimo",
                   "ja/models/linkai",
                   "ja/models/coding-plan",
                   "ja/models/custom"
@@ -569,6 +600,7 @@
               {
                 "group": "プラットフォーム",
                 "pages": [
+                  "ja/channels/index",
                   "ja/channels/weixin",
                   "ja/channels/web",
                   "ja/channels/feishu",
@@ -577,7 +609,9 @@
                   "ja/channels/qq",
                   "ja/channels/wecom",
                   "ja/channels/wechat-kf",
-                  "ja/channels/wechatmp"
+                  "ja/channels/wechatmp",
+                  "ja/channels/telegram",
+                  "ja/channels/slack"
                 ]
               }
             ]
@@ -604,6 +638,7 @@
                 "group": "リリースノート",
                 "pages": [
                   "ja/releases/overview",
+                  "ja/releases/v2.0.9",
                   "ja/releases/v2.0.8",
                   "ja/releases/v2.0.7",
                   "ja/releases/v2.0.6",
diff --git a/docs/en/README.md b/docs/en/README.md
deleted file mode 100644
index 56665285..00000000
--- a/docs/en/README.md
+++ /dev/null
@@ -1,250 +0,0 @@
-<p align="center"><img src="https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="CowAgent" width="550" /></p>
-
-<p align="center">
-  <a href="https://github.com/zhayujie/CowAgent/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/CowAgent" alt="Latest release"></a>
-  <a href="https://github.com/zhayujie/CowAgent/blob/master/LICENSE"><img src="https://img.shields.io/github/license/zhayujie/CowAgent" alt="License: MIT"></a>
-  <a href="https://github.com/zhayujie/CowAgent"><img src="https://img.shields.io/github/stars/zhayujie/CowAgent?style=flat-square" alt="Stars"></a> <br/>
-  [<a href="https://github.com/zhayujie/CowAgent/blob/master/README.md">中文</a>] | [English] | [<a href="https://github.com/zhayujie/CowAgent/blob/master/docs/ja/README.md">日本語</a>]
-</p>
-
-**CowAgent** is an AI super assistant powered by LLMs, capable of autonomous task planning, operating computers and external resources, creating and executing Skills, and continuously growing with long-term memory and a personal knowledge base. It supports flexible model switching, handles text, voice, images, and files, and can be integrated into WeChat, Web, Feishu, DingTalk, WeCom Bot, WeCom App, and WeChat Official Account — running 7×24 hours on your personal computer or server.
-
-<p align="center">
-  <a href="https://cowagent.ai/">🌐 Website</a> &nbsp;·&nbsp;
-  <a href="https://docs.cowagent.ai/en/intro/index">📖 Docs</a> &nbsp;·&nbsp;
-  <a href="https://docs.cowagent.ai/en/guide/quick-start">🚀 Quick Start</a> &nbsp;·&nbsp;
-  <a href="https://skills.cowagent.ai/">🧩 Skill Hub</a> &nbsp;·&nbsp;
-  <a href="https://link-ai.tech/cowagent/create">☁️ Try Online</a>
-</p>
-
-## Introduction
-
-> CowAgent is both an out-of-the-box AI super assistant and a highly extensible Agent framework. You can extend it with new model interfaces, channels, built-in tools, and the Skills system to flexibly implement various customization needs.
-
-- ✅ **Autonomous Task Planning**: Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved.
-- ✅ **Long-term Memory**: Automatically persists conversation memory to local files and databases, including core memory, daily memory, and Deep Dream distillation, with keyword and vector retrieval support.
-- ✅ **Personal Knowledge Base**: Automatically organizes structured knowledge with cross-references to build a knowledge graph, with web-based visualization and conversational management.
-- ✅ **Skills System**: Implements a Skills creation and execution engine, supports installing skills from [Skill Hub](https://skills.cowagent.ai), GitHub, etc., or creating custom Skills through conversation.
-- ✅ **Tool System**: Built-in tools for file I/O, terminal execution, browser automation, scheduled tasks, messaging, and more — autonomously invoked by the Agent.
-- ✅ **CLI System**: Provides terminal commands and in-chat commands for process management, skill installation, configuration, and more.
-- ✅ **Multimodal Messages**: Supports parsing, processing, generating, and sending text, images, voice, files, and other message types.
-- ✅ **Multiple Model Support**: Supports DeepSeek, MiniMax, Claude, Gemini, OpenAI, GLM, Qwen, Doubao, Kimi, and other mainstream model providers.
-- ✅ **Multi-platform Deployment**: Runs on local computers or servers, integrable into WeChat, Web, Feishu, DingTalk, WeChat Official Account, and WeCom applications.
-
-## Disclaimer
-
-1. This project follows the [MIT License](/LICENSE) and is intended for technical research and learning. Users must comply with local laws, regulations, policies, and corporate bylaws. Any illegal or rights-infringing use is prohibited.
-2. Agent mode consumes more tokens than normal chat mode. Choose models based on effectiveness and cost. Agent has access to the host OS — please deploy in trusted environments.
-3. CowAgent focuses on open-source development and does not participate in, authorize, or issue any cryptocurrency.
-
-## Demo
-
-Try online (no deployment needed): [CowAgent](https://link-ai.tech/cowagent/create)
-
-## Changelog
-
-> **2026.04.14:** [v2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6) — Knowledge Base, Deep Dream Memory Distillation, Smart Context Compression, Web Console upgrades.
-
-> **2026.04.01:** [v2.0.5](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5) — Cow CLI, Skill Hub open source, Browser tool, WeCom Bot QR scan, and more.
-
-> **2026.02.27:** [v2.0.2](https://github.com/zhayujie/CowAgent/releases/tag/2.0.2) — Web console overhaul (streaming chat, model/skill/memory/channel/scheduler/log management), multi-channel concurrent running, session persistence, new models including Gemini 3.1 Pro / Claude 4.6 Sonnet / Qwen3.5 Plus.
-
-> **2026.02.13:** [v2.0.1](https://github.com/zhayujie/CowAgent/releases/tag/2.0.1) — Built-in Web Search tool, smart context trimming, runtime info dynamic update, Windows compatibility, fixes for scheduler memory loss, Feishu connection issues, and more.
-
-> **2026.02.03:** [v2.0.0](https://github.com/zhayujie/CowAgent/releases/tag/2.0.0) — Full upgrade to AI super assistant with multi-step task planning, long-term memory, built-in tools, Skills framework, new models, and optimized channels.
-
-> **2025.05.23:** [v1.7.6](https://github.com/zhayujie/CowAgent/releases/tag/1.7.6) — Web channel optimization, AgentMesh multi-agent plugin, Baidu TTS, claude-4-sonnet/opus support.
-
-> **2025.04.11:** [v1.7.5](https://github.com/zhayujie/CowAgent/releases/tag/1.7.5) — wechatferry protocol, DeepSeek model, Tencent Cloud voice, ModelScope and Gitee-AI support.
-
-> **2024.12.13:** [v1.7.4](https://github.com/zhayujie/CowAgent/releases/tag/1.7.4) — Gemini 2.0 model, Web channel, memory leak fix.
-
-Full changelog: [Release Notes](https://docs.cowagent.ai/en/releases/overview)
-
-<br/>
-
-## 🚀 Quick Start
-
-The project provides a one-click script for installation, configuration, startup, and management:
-
-**Linux / macOS:**
-```bash
-bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
-```
-
-**Windows (PowerShell):**
-```powershell
-irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
-```
-
-After running, the Web service starts by default. Access `http://localhost:9899/chat` to chat.
-
-Script usage: [One-click Install](https://docs.cowagent.ai/en/guide/quick-start). After installation, you can also use `cow start`, `cow stop`, and other [CLI commands](https://docs.cowagent.ai/en/cli/index) to manage the service.
-
-### Manual Installation
-
-**1. Clone the project**
-
-```bash
-git clone https://github.com/zhayujie/CowAgent
-cd CowAgent/
-```
-
-**2. Install dependencies**
-
-```bash
-pip3 install -r requirements.txt
-pip3 install -r requirements-optional.txt   # optional but recommended
-```
-
-**3. Install Cow CLI (recommended)**
-
-```bash
-pip3 install -e .
-```
-
-After installation, use `cow` commands to manage the service (start, stop, update, etc.) and skills. See [Command Docs](https://docs.cowagent.ai/en/cli/index).
-
-**4. Install browser (optional)**
-
-If you need the Agent to operate a browser (visit web pages, fill forms, etc.):
-
-```bash
-cow install-browser
-```
-
-This auto-installs `playwright` and Chromium. See [Browser Tool Docs](https://docs.cowagent.ai/en/tools/browser).
-
-**5. Configure**
-
-```bash
-cp config-template.json config.json
-```
-
-Fill in your model API key and channel type in `config.json`. See the [configuration docs](https://docs.cowagent.ai/en/guide/manual-install) for details.
-
-**6. Run**
-
-```bash
-cow start              # recommended, requires Cow CLI
-python3 app.py         # or run directly
-```
-
-For server deployment, use `cow` commands to manage the service:
-
-```bash
-cow start              # start in background
-cow stop               # stop service
-cow restart            # restart service
-cow status             # check running status
-cow logs               # view logs
-cow update             # pull latest code and restart
-```
-
-Or use the traditional way:
-
-```bash
-nohup python3 app.py & tail -f nohup.out
-```
-
-### Docker Deployment
-
-```bash
-curl -O https://cdn.link-ai.tech/code/cow/docker-compose.yml
-# Edit docker-compose.yml with your config
-sudo docker compose up -d
-sudo docker logs -f chatgpt-on-wechat
-```
-
-<br/>
-
-## Models
-
-Supports mainstream model providers. Recommended models for Agent mode:
-
-| Provider | Recommended Model |
-| --- | --- |
-| DeepSeek | `deepseek-v4-flash` |
-| MiniMax | `MiniMax-M2.7` |
-| Claude | `claude-sonnet-4-6` |
-| Gemini | `gemini-3.1-pro-preview` |
-| OpenAI | `gpt-5.4` |
-| GLM | `glm-5.1` |
-| Qwen | `qwen3.6-plus` |
-| Doubao | `doubao-seed-2-0-code-preview-260215` |
-| Kimi | `kimi-k2.6` |
-
-For detailed configuration of each model, see the [Models documentation](https://docs.cowagent.ai/en/models/index).
-
-### Coding Plan
-
-Coding Plan is a monthly subscription package offered by various providers, ideal for high-frequency Agent usage. All providers can be accessed via OpenAI-compatible mode:
-
-```json
-{
-  "bot_type": "openai",
-  "model": "MODEL_NAME",
-  "open_ai_api_base": "PROVIDER_CODING_PLAN_API_BASE",
-  "open_ai_api_key": "YOUR_API_KEY"
-}
-```
-
-- `bot_type`: Must be `openai`
-- `model`: Model name supported by the provider
-- `open_ai_api_base`: Provider's Coding Plan API Base (different from standard pay-as-you-go)
-- `open_ai_api_key`: Provider's Coding Plan API Key
-
-> Note: Coding Plan API Base and API Key are usually separate from standard pay-as-you-go ones. Please obtain them from each provider's platform.
-
-Supported providers include Alibaba Cloud, MiniMax, Zhipu GLM, Kimi, Volcengine, and more. For detailed configuration of each provider, see the [Coding Plan documentation](https://docs.cowagent.ai/en/models/coding-plan).
-
-<br/>
-
-## Channels
-
-Supports multiple platforms. Set `channel_type` in `config.json` to switch:
-
-| Channel | `channel_type` | Docs |
-| --- | --- | --- |
-| WeChat | `weixin` | [WeChat Setup](https://docs.cowagent.ai/en/channels/weixin) |
-| Web (default) | `web` | [Web Channel](https://docs.cowagent.ai/en/channels/web) |
-| Feishu | `feishu` | [Feishu Setup](https://docs.cowagent.ai/en/channels/feishu) |
-| DingTalk | `dingtalk` | [DingTalk Setup](https://docs.cowagent.ai/en/channels/dingtalk) |
-| WeCom Bot | `wecom_bot` | [WeCom Bot Setup](https://docs.cowagent.ai/en/channels/wecom-bot) |
-| WeCom App | `wechatcom_app` | [WeCom Setup](https://docs.cowagent.ai/en/channels/wecom) |
-| WeChat MP | `wechatmp` / `wechatmp_service` | [WeChat MP Setup](https://docs.cowagent.ai/en/channels/wechatmp) |
-| Terminal | `terminal` | — |
-
-Multiple channels can be enabled simultaneously, separated by commas: `"channel_type": "feishu,dingtalk"`.
-
-<br/>
-
-## Enterprise Services
-
-<a href="https://link-ai.tech" target="_blank"><img width="720" src="https://cdn.link-ai.tech/image/link-ai-intro.jpg"></a>
-
-> [LinkAI](https://link-ai.tech/) is a one-stop AI agent platform for enterprises and developers, integrating multimodal LLMs, knowledge bases, Agent plugins, and workflows. Supports one-click integration with mainstream platforms, SaaS and private deployment.
-
-<br/>
-
-## 🔗 Related Projects
-
-- [Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub): Open skill marketplace for AI Agents — browse, search, install, and publish skills for CowAgent, OpenClaw, Claude Code, and more.
-- [bot-on-anything](https://github.com/zhayujie/bot-on-anything): Lightweight and highly extensible LLM application framework supporting Slack, Telegram, Discord, Gmail, and more.
-- [AgentMesh](https://github.com/MinimalFuture/AgentMesh): Open-source Multi-Agent framework for complex problem solving through agent team collaboration.
-
-## 🔎 FAQ
-
-FAQs: <https://github.com/zhayujie/CowAgent/wiki/FAQs>
-
-## 🛠️ Contributing
-
-Welcome to add new channels, referring to the [Feishu channel](https://github.com/zhayujie/CowAgent/blob/master/channel/feishu/feishu_channel.py) as an example. Also welcome to contribute new Skills, see the [Skill Creation docs](https://docs.cowagent.ai/en/skills/create), or submit to [Skill Hub](https://skills.cowagent.ai/submit).
-
-## ✉ Contact
-
-Welcome to submit PRs and Issues, and support the project with a 🌟 Star. For questions, check the [FAQ list](https://github.com/zhayujie/CowAgent/wiki/FAQs) or search [Issues](https://github.com/zhayujie/CowAgent/issues).
-
-## 🌟 Contributors
-
-![cow contributors](https://contrib.rocks/image?repo=zhayujie/CowAgent&max=1000)
diff --git a/docs/en/channels/feishu.mdx b/docs/en/channels/feishu.mdx
index 9c317a9d..1283d0c1 100644
--- a/docs/en/channels/feishu.mdx
+++ b/docs/en/channels/feishu.mdx
@@ -15,8 +15,11 @@ description: Integrate CowAgent into Feishu via a custom enterprise app
 
 No need to manually create an app on the Feishu Developer Platform. Start the Cow project, open the web console (default `http://127.0.0.1:9899/`), go to **Channels**, click **Add Channel**, choose **Feishu**, then under the **Scan QR** tab click **One-click Create Feishu App** and scan with the **Feishu App** to complete app creation and connection automatically.
 
+<img src="https://cdn.link-ai.tech/doc/20260505181126.png" width="800"/>
+
 <Note>
-  The created app comes with all required permissions (messaging, card read/write, group events, etc.) and event subscriptions pre-configured. Currently only the Feishu mainland version is supported (Lark international not yet supported).
+  1. Requires `lark-oapi` ≥ 1.5.5.
+  2. The created app comes with all required permissions (messaging, card read/write, group events, etc.) and event subscriptions pre-configured — no manual setup on the developer console needed. Currently only the Feishu mainland version is supported (Lark international not yet supported).
 </Note>
 
 When starting from CLI without `feishu_app_id` configured, the QR code is also printed to the terminal.
diff --git a/docs/en/channels/index.mdx b/docs/en/channels/index.mdx
new file mode 100644
index 00000000..8b6a25e9
--- /dev/null
+++ b/docs/en/channels/index.mdx
@@ -0,0 +1,43 @@
+---
+title: Channels Overview
+description: Channels supported by CowAgent and their capability matrix
+---
+
+CowAgent supports multiple chat channels. Select one at startup by setting `channel_type` in `config.json`. The Web Console is enabled by default and can run in parallel with other channels.
+
+## Capability Matrix
+
+The table below summarizes the message types and group chat support of each channel, to help you pick the right one for your scenario.
+
+| Channel | Text | Image | File | Voice | Group Chat |
+| --- | :-: | :-: | :-: | :-: | :-: |
+| [WeChat](/en/channels/weixin) | ✅ | ✅ | ✅ | ✅ | |
+| [Web Console](/en/channels/web) | ✅ | ✅ | ✅ | ✅ | |
+| [Feishu](/en/channels/feishu) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [DingTalk](/en/channels/dingtalk) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [WeCom Bot](/en/channels/wecom-bot) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [QQ](/en/channels/qq) | ✅ | ✅ | ✅ | | ✅ |
+| [WeCom App](/en/channels/wecom) | ✅ | ✅ | ✅ | ✅ | |
+| [Official Account](/en/channels/wechatmp) | ✅ | ✅ | | ✅ | |
+| [Telegram](/en/channels/telegram) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Slack](/en/channels/slack) | ✅ | ✅ | ✅ | | ✅ |
+
+- The **Image / File / Voice** columns indicate that the channel can send and receive the corresponding message types; see each channel's docs for details
+- The **Group Chat** column indicates the ability to recognize and respond to group messages
+
+<Tip>
+  The voice / image capabilities of each channel depend on the configuration of the corresponding model provider. See [Models Overview](/en/models/index) for details.
+</Tip>
+
+## Channel List
+
+- [Web Console](/en/channels/web) — built-in browser-based chat and management panel, enabled by default
+- [WeChat](/en/channels/weixin) — log in via personal WeChat QR scan
+- [Feishu](/en/channels/feishu) — Feishu custom enterprise app
+- [DingTalk](/en/channels/dingtalk) — DingTalk custom bot
+- [WeCom Bot](/en/channels/wecom-bot) — WeCom AI Bot via WebSocket long connection
+- [QQ](/en/channels/qq) — QQ Official Bot open platform
+- [WeCom App](/en/channels/wecom) — WeCom custom app integration
+- [Official Account](/en/channels/wechatmp) — WeChat Official Account (subscription / service)
+- [Telegram](/en/channels/telegram) — global IM, 5-minute setup, no public IP needed
+- [Slack](/en/channels/slack) — team collaboration IM, Socket Mode integration, no public IP needed
diff --git a/docs/en/channels/slack.mdx b/docs/en/channels/slack.mdx
new file mode 100644
index 00000000..f95272ca
--- /dev/null
+++ b/docs/en/channels/slack.mdx
@@ -0,0 +1,118 @@
+---
+title: Slack
+description: Integrate CowAgent with a Slack App
+---
+
+> Integrate CowAgent into Slack via a Slack App in **Socket Mode**. Supports direct messages (DM) and channels (triggered by @mention or replying within a thread). Socket Mode uses a persistent WebSocket connection — no public IP or callback URL required, works out of the box.
+
+## 1. Setup
+
+### Step 1: Create a Slack App
+
+1. Open the [Slack API apps page](https://api.slack.com/apps), click **Create New App** → **From scratch**.
+2. Enter an **App Name** (e.g. `CowAgent`), pick the **Workspace** to install into, and create it.
+
+### Step 2: Enable Socket Mode and get the App Token
+
+1. In the left sidebar go to **Settings → Socket Mode** and turn on **Enable Socket Mode**.
+2. You will be prompted to generate an **App-Level Token** with the `connections:write` scope. Save this token; it starts with `xapp-`.
+
+<Tip>
+  Socket Mode receives events over a WebSocket connection, so you don't need to expose a public callback URL — ideal for local or intranet deployments.
+</Tip>
+
+### Step 3: Configure bot scopes and install
+
+1. Go to **Features → OAuth & Permissions**, click **Add an OAuth Scope** under **Bot Token Scopes**, and add the following scopes one by one:
+
+   ```
+   app_mentions:read
+   channels:history
+   chat:write
+   commands
+   files:read
+   files:write
+   groups:history
+   im:history
+   mpim:history
+   users:read
+   ```
+
+   <Note>
+     `files:read` is needed to receive images and files, and `files:write` to send them; omit both if you only need text conversations.
+   </Note>
+
+2. Go to **Features → Event Subscriptions**, turn on **Enable Events**, and under **Subscribe to bot events** click **Add Bot User Event** to add:
+
+   ```
+   app_mention
+   message.im
+   message.channels
+   ```
+
+   <Note>
+     Add `message.groups` if you need to use the bot in private channels.
+   </Note>
+3. Go to **Features → App Home**, enable **Messages Tab** under **Show Tabs**, and check **Allow users to send Slash commands and messages from the messages tab**. Otherwise the DM input box is disabled and users cannot message the bot.
+4. Back in **OAuth & Permissions**, click **Install to Workspace**. After installing, copy the **Bot User OAuth Token** starting with `xoxb-`.
+
+<Tip>
+  If the Slack client still shows "Sending messages to this app has been turned off", make sure you completed the App Home step above, then refresh or restart the Slack client (remove the app from your conversations and reopen it if needed).
+</Tip>
+
+### Step 4: Connect to CowAgent
+
+<Tabs>
+  <Tab title="Web Console (Recommended)">
+    Open the Web Console (default `http://127.0.0.1:9899`), go to **Channels**, click **Add Channel**, choose **Slack**, paste the Bot Token (`xoxb-`) and App Token (`xapp-`), and click connect.
+  </Tab>
+  <Tab title="Config File">
+    Add the following to `config.json` and start Cow:
+
+    ```json
+    {
+      "channel_type": "slack",
+      "slack_bot_token": "xoxb-xxxxxxxxxxxx",
+      "slack_app_token": "xapp-xxxxxxxxxxxx",
+      "slack_group_trigger": "mention_or_reply"
+    }
+    ```
+
+    | Key | Description | Default |
+    | --- | --- | --- |
+    | `slack_bot_token` | Bot User OAuth Token, like `xoxb-...` | - |
+    | `slack_app_token` | App-Level Token (generated after enabling Socket Mode), like `xapp-...` | - |
+    | `slack_group_trigger` | Channel trigger: `mention_or_reply` (@ or reply in thread) / `mention_only` (@ only) / `all` (all messages) | `mention_or_reply` |
+  </Tab>
+</Tabs>
+
+The integration is ready when you see logs like:
+
+```
+[Slack] Bot logged in as user_id=U0XXXXXXX, team=Txxxxxxxx
+[Slack] ✅ Slack bot ready, listening for events
+```
+
+## 2. Capabilities
+
+| Feature | Support |
+| --- | --- |
+| Direct message (DM) | ✅ |
+| Channel (@bot / reply in thread) | ✅ |
+| Text messages | ✅ send / receive |
+| Image messages | ✅ send / receive |
+| File messages | ✅ send / receive (PDF / Word / Excel, etc.) |
+| Thread replies | ✅ replies are posted to the thread of the triggering message |
+
+<Note>
+  Slack organizes conversations into threads. The bot posts replies into the thread of the triggering message, keeping channels tidy.
+</Note>
+
+## 3. Usage
+
+Once connected:
+
+- **Direct message (DM)**: find your App under **Apps** in the Slack sidebar and message it directly.
+- **Channel**: invite the App into a channel (`/invite @your-app`), then trigger it with `@your-app hello`; continue the conversation by replying within the same thread.
+
+When sending an image or file, you can **add a text caption** (description / question) in the attachment input — the bot will answer based on both. Sending an attachment first and then a follow-up question also works; the two messages are merged automatically.
diff --git a/docs/en/channels/telegram.mdx b/docs/en/channels/telegram.mdx
new file mode 100644
index 00000000..f90da992
--- /dev/null
+++ b/docs/en/channels/telegram.mdx
@@ -0,0 +1,111 @@
+---
+title: Telegram
+description: Integrate CowAgent with Telegram via the Bot API
+---
+
+> Integrate CowAgent into Telegram via the official Bot API. Supports private chat and group chat (triggered by @mention or replying to the bot). Uses Long Polling — no public IP required, works out of the box.
+
+
+## 1. Setup
+
+### Step 1: Create a Bot via BotFather
+
+1. Open the official account [@BotFather](https://t.me/BotFather) in Telegram.
+2. Send `/newbot` and follow the prompts:
+   - **Bot name** (display name, e.g. `My CowAgent Bot`)
+   - **Bot username** (must end with `bot`, e.g. `my_cowagent_bot`)
+3. Once created, BotFather returns an **HTTP API Token** (e.g. `123456789:ABCdefGhIJKlmNoPQRsTUVwxyZ`). Keep it safe.
+
+<Tip>
+  The token is the password of your bot — never share it. If it leaks, send `/revoke` to `@BotFather` to reset it.
+</Tip>
+
+### Step 2: (Group chat only) Disable Privacy Mode
+
+Skip this step if you only use private chat. Telegram bots run in **Privacy Mode** by default — in groups they can only see commands suffixed with `@bot` (e.g. `/start@your_bot`) and replies to bot messages; **plain `@bot hello` text messages are not delivered**, so the bot will appear unresponsive in groups.
+
+Send the following to `@BotFather`:
+
+1. `/setprivacy`
+2. Pick the bot you just created
+3. Choose `Disable`
+
+<Note>
+  If the bot is still silent in groups after this, try removing it from the group and adding it back.
+</Note>
+
+### Step 3: Connect to CowAgent
+
+<Tabs>
+  <Tab title="Web Console (Recommended)">
+    Open the Web Console (default `http://127.0.0.1:9899`), go to **Channels**, click **Add Channel**, choose **Telegram**, paste the Bot Token, and click connect.
+  </Tab>
+  <Tab title="Config File">
+    Add the following to `config.json` and start Cow:
+
+    ```json
+    {
+      "channel_type": "telegram",
+      "telegram_token": "123456789:ABCdefGhIJKlmNoPQRsTUVwxyZ",
+      "telegram_group_trigger": "mention_or_reply"
+    }
+    ```
+
+    | Key | Description | Default |
+    | --- | --- | --- |
+    | `telegram_token` | HTTP API Token returned by BotFather | - |
+    | `telegram_group_trigger` | Group trigger: `mention_or_reply` (@ or reply) / `mention_only` (@ only) / `all` (all messages) | `mention_or_reply` |
+    | `telegram_register_commands` | Whether to register the command menu with Telegram on startup | `true` |
+  </Tab>
+</Tabs>
+
+The integration is ready when you see logs like:
+
+```
+[Telegram] Bot logged in as @my_cowagent_bot (id=123456789)
+[Telegram] Registered 10 bot commands
+[Telegram] ✅ Telegram bot ready, polling for updates
+```
+
+## 2. Capabilities
+
+| Feature | Support |
+| --- | --- |
+| Private chat | ✅ |
+| Group chat (@bot / reply to bot) | ✅ |
+| Text messages | ✅ send / receive |
+| Image messages | ✅ send / receive |
+| Voice messages | ✅ send / receive (OGG/Opus) |
+| Video messages | ✅ send / receive |
+| File messages | ✅ send / receive (PDF / Word / Excel, etc.) |
+| Command menu | ✅ aligned with Web Console slash commands |
+
+### Command Menu
+
+On startup, the channel registers a command menu with Telegram. Typing `/` in the Telegram input box shows a dropdown:
+
+| Command | Description |
+| --- | --- |
+| `/help` | Show command help |
+| `/status` | View runtime status |
+| `/context` | View conversation context (`/context clear` to clear) |
+| `/skill` | Skill management (`/skill list`, `/skill install`, ...) |
+| `/memory` | Memory management (`/memory dream`) |
+| `/knowledge` | Knowledge base (`/knowledge list` / `on` / `off`) |
+| `/config` | View current config |
+| `/cancel` | Cancel the running Agent task |
+| `/logs` | View recent logs |
+| `/version` | Show version |
+
+<Note>
+  Telegram's command menu only displays top-level commands; subcommands are entered with a space, e.g. `/skill list`, `/context clear`.
+</Note>
+
+## 3. Usage
+
+Once connected:
+
+- **Private chat**: search for your bot username (e.g. `@my_cowagent_bot`) in Telegram, click `Start` and chat away.
+- **Group chat**: add the bot to a group, then trigger it with `@bot hello` or by **replying to one of the bot's messages**. If the bot doesn't respond in groups, double-check Privacy Mode in [Step 2](#step-2-group-chat-only-disable-privacy-mode).
+
+When sending an image or file, you can **add a caption** (description / question) directly in the attachment input — the bot will answer based on both. Sending an attachment first and then a follow-up question also works; the two messages are merged automatically.
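Note on `telegram_group_trigger`: the three modes in the config table above can be read as a small decision function. The following is a minimal sketch, not the channel's actual code. `should_respond` and its parameters are hypothetical names, and the rule that private chats always reach the bot is an assumption based on the capabilities table.

```python
VALID_TRIGGERS = {"mention_or_reply", "mention_only", "all"}


def should_respond(trigger: str, is_private: bool,
                   mentioned: bool, is_reply_to_bot: bool) -> bool:
    """Hypothetical helper: decide whether an incoming message triggers the bot."""
    if trigger not in VALID_TRIGGERS:
        raise ValueError(f"unknown telegram_group_trigger: {trigger!r}")
    if is_private:
        # Assumption: private chats are always answered.
        return True
    if trigger == "all":
        return True
    if trigger == "mention_only":
        return mentioned
    # Remaining mode: mention_or_reply
    return mentioned or is_reply_to_bot
```

With `mention_only`, a reply to one of the bot's messages that does not also @-mention it is ignored, which is the main practical difference from the default mode.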
diff --git a/docs/en/channels/web.mdx b/docs/en/channels/web.mdx
index 503fe95b..a2a39c72 100644
--- a/docs/en/channels/web.mdx
+++ b/docs/en/channels/web.mdx
@@ -1,23 +1,32 @@
 ---
 title: Web Console
-description: Use CowAgent through the web console
+description: Use CowAgent through the Web Console
 ---
 
-The Web Console is CowAgent's default channel. It starts automatically after launch, allowing you to chat with the Agent through a browser and manage models, skills, memory, channels, and other configurations online.
+The Web Console is CowAgent's default channel. It starts automatically when CowAgent launches, letting you chat with the Agent in a browser and manage models, skills, memory, channels, and other configuration online.
 
 ## Configuration
 
 ```json
 {
   "channel_type": "web",
-  "web_port": 9899
+  "web_host": "0.0.0.0",
+  "web_port": 9899,
+  "web_password": "",
+  "enable_thinking": false
 }
 ```
 
 | Parameter | Description | Default |
 | --- | --- | --- |
 | `channel_type` | Set to `web` | `web` |
+| `web_host` | Web service listen address. When left empty, the service listens on `127.0.0.1` (local only); set to `0.0.0.0` for public access and configure a password | `""` |
 | `web_port` | Web service listen port | `9899` |
+| `web_password` | Access password. Leave empty to disable password protection; recommended when listening on `0.0.0.0` | `""` |
+| `web_session_expire_days` | Login session validity in days | `30` |
+| `enable_thinking` | Whether to enable deep thinking mode | `false` |
+
+Once a password is configured, you must enter it to log in when accessing the console. The login session is kept for 30 days by default, so restarting the service during that period does not require re-login. The password can also be changed online from the "Configuration" page in the console.
 
 ## Access URL
 
@@ -34,13 +43,13 @@ After starting the project, visit:
 
 ### Chat Interface
 
-Supports streaming output with real-time display of the Agent's reasoning process and tool calls, providing intuitive observation of the Agent's decision-making:
+Supports streaming output with real-time display of the Agent's reasoning process and tool calls, so you can follow the Agent's decision-making as it happens. Deep thinking can be toggled via the `enable_thinking` config option or the "Agent Configuration" switch in the console.
 
 <img width="850" src="https://cdn.link-ai.tech/doc/20260227180120.png" />
 
 #### Multi-Session Management
 
-The chat interface supports multi-session management. All session records are persistently stored in a SQLite database:
+The chat interface supports multi-session management. All session records are persistently stored in the database:
 
 - **Session List**: Click the history icon on the left to expand/collapse the session list panel, with scroll-to-load support for all historical sessions
 - **AI-Generated Titles**: After the first exchange in a new session, the model is automatically called to generate a short summary title
@@ -50,9 +59,9 @@ The chat interface supports multi-session management. All session records are pe
 
 ### Model Management
 
-Manage model configurations online without manually editing config files:
+Manage text, image, voice, and embedding model configurations for different providers online — no need to edit config files manually:
 
-<img width="850" src="https://cdn.link-ai.tech/doc/20260227173811.png" />
+<img width="850" src="https://cdn.link-ai.tech/doc/20260521212949.png" />
 
 ### Skill Management
 
@@ -80,6 +89,6 @@ View and manage scheduled tasks online, including one-time tasks, fixed interval
 
 ### Logs
 
-View Agent runtime logs in real-time for monitoring and troubleshooting:
+View Agent runtime logs in real time for monitoring and troubleshooting:
 
 <img width="850" src="https://cdn.link-ai.tech/doc/20260227173514.png" />
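Note on the Web Console settings: the parameter table recommends a password whenever the console listens on `0.0.0.0`. A pre-start check along those lines might look like the sketch below. `web_config_warnings` is a hypothetical helper, not part of CowAgent, and treating an empty `web_host` as `127.0.0.1` is an assumption taken from the table's description.

```python
def web_config_warnings(cfg: dict) -> list[str]:
    """Hypothetical check: return human-readable warnings for risky web settings."""
    warnings = []
    # Assumption: an empty or missing web_host means local-only access.
    host = cfg.get("web_host") or "127.0.0.1"
    password = cfg.get("web_password", "")
    if host == "0.0.0.0" and not password:
        warnings.append("web_host is 0.0.0.0 but web_password is empty")
    port = cfg.get("web_port", 9899)
    if not (isinstance(port, int) and 1 <= port <= 65535):
        warnings.append(f"web_port {port!r} is not a valid TCP port")
    return warnings
```

Run against the example config in this file (`web_host` set to `0.0.0.0`, empty `web_password`), the function returns one warning.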
diff --git a/docs/en/channels/wecom-bot.mdx b/docs/en/channels/wecom-bot.mdx
index 4959c8b5..2cb51fff 100644
--- a/docs/en/channels/wecom-bot.mdx
+++ b/docs/en/channels/wecom-bot.mdx
@@ -3,71 +3,88 @@ title: WeCom Bot
 description: Connect CowAgent to WeCom AI Bot (WebSocket long connection)
 ---
 
-Connect CowAgent via WeCom AI Bot, supporting both direct messages and group chats. No public IP required — uses WebSocket long connection with Markdown rendering and streaming output.
+> Connect CowAgent via WeCom AI Bot, supporting both internal direct messages and group chats. No public IP required — uses a WebSocket long connection, with Markdown rendering and streaming output.
 
 <Note>
-  WeCom Bot and WeCom App are two different integration methods. WeCom Bot uses WebSocket long connection, requiring no public IP or domain, making it easier to set up.
+  WeCom Bot and WeCom App are two different integration methods. WeCom Bot uses a WebSocket long connection and requires no public IP or domain, making setup much simpler.
 </Note>
 
-## 1. Create an AI Bot
+## 1. Connection methods
+
+### Option A: One-click QR scan (recommended)
+
+No need to create the bot ahead of time. Start CowAgent and open the Web console (local URL: http://127.0.0.1:9899/), go to the **Channels** tab, click **Connect Channel**, choose **WeCom Bot**, switch to **QR scan** mode, and scan the QR code with **WeCom** — bot creation and connection complete automatically.
+
+<img src="https://cdn.link-ai.tech/doc/20260401121213.png" width="800"/>
+
+<Note>
+  After a successful scan, you can further configure the bot (name, avatar, visibility scope, etc.) in **WeCom Workbench → AI Bot**.
+</Note>
+
+### Option B: Manual creation
+
+Create the AI Bot in WeCom and obtain the Bot ID and Secret, then connect via the Web console or config file.
+
+**Step 1: Create the AI Bot**
 
 1. Open the WeCom client, go to **Workbench**, and click **AI Bot**:
 
 <img src="https://cdn.link-ai.tech/doc/20260316180959.png" width="800"/>
 
-2. Click **Create Bot** → **Manual Creation**:
+2. Click **Create Bot → Manual Creation**:
 
-<img src="https://cdn.link-ai.tech/doc/20260316181118.png" width="600"/>
+<img src="https://cdn.link-ai.tech/doc/20260316181118.png" width="800"/>
 
 3. Scroll to the bottom of the right panel and select **API Mode**:
 
-<img src="https://cdn.link-ai.tech/doc/20260316181215.png" width="600"/>
+<img src="https://cdn.link-ai.tech/doc/20260316181215.png" width="800"/>
 
-4. Set the bot name, avatar, and visibility scope. Select **Long Connection** mode, note down the **Bot ID** and **Secret**, then click Save.
+4. Set the bot name, avatar, and visibility scope. Choose **Long Connection** mode, save the **Bot ID** and **Secret**, then click Save.
 
-## 2. Configuration
+**Step 2: Connect to CowAgent**
 
-### Option A: Web Console
+<Tabs>
+  <Tab title="Web Console">
+    Open the Web console, go to the **Channels** tab, click **Connect Channel**, choose **WeCom Bot**, switch to **Manual** mode, enter the Bot ID and Secret, and click Connect.
 
-Start the program and open the Web console (local access: http://127.0.0.1:9899). Go to the **Channels** tab, click **Connect Channel**, select **WeCom Bot**, fill in the Bot ID and Secret from the previous step, and click Connect.
+    <img src="https://cdn.link-ai.tech/doc/20260316181711.png" width="800"/>
+  </Tab>
+  <Tab title="Config File">
+    Add the following to `config.json`, then start CowAgent:
 
-<img src="https://cdn.link-ai.tech/doc/20260316181711.png" width="600"/>
+    ```json
+    {
+      "channel_type": "wecom_bot",
+      "wecom_bot_id": "YOUR_BOT_ID",
+      "wecom_bot_secret": "YOUR_SECRET"
+    }
+    ```
 
-### Option B: Config File
+    | Parameter | Description |
+    | --- | --- |
+    | `wecom_bot_id` | Bot ID of the AI Bot |
+    | `wecom_bot_secret` | Secret of the AI Bot |
+  </Tab>
+</Tabs>
 
-Add the following to your `config.json`:
+The log line `[WecomBot] Subscribe success` confirms the connection is established.
 
-```json
-{
-  "channel_type": "wecom_bot",
-  "wecom_bot_id": "YOUR_BOT_ID",
-  "wecom_bot_secret": "YOUR_SECRET"
-}
-```
-
-| Parameter | Description |
-| --- | --- |
-| `wecom_bot_id` | Bot ID of the AI Bot |
-| `wecom_bot_secret` | Secret for the AI Bot |
-
-After configuration, start the program. The log message `[WecomBot] Subscribe success` indicates a successful connection.
-
-## 3. Supported Features
+## 2. Supported features
 
 | Feature | Status |
 | --- | --- |
-| Direct Messages | ✅ |
-| Group Chat (@bot) | ✅ |
-| Text Messages | ✅ Send & Receive |
-| Image Messages | ✅ Send & Receive |
-| File Messages | ✅ Send & Receive |
-| Streaming Reply | ✅ |
-| Scheduled Push | ✅ |
+| Direct chat | ✅ |
+| Group chat (@bot) | ✅ |
+| Text messages | ✅ Send / Receive |
+| Image messages | ✅ Send / Receive |
+| File messages | ✅ Send / Receive |
+| Streaming replies | ✅ |
+| Scheduled push messages | ✅ |
 
-## 4. Usage
+## 3. Usage
 
-Search for the bot name in WeCom to start a direct conversation.
+Search for the bot's name inside WeCom to start a direct chat.
 
-To use in group chats, add the bot to a group and @mention it to send messages.
+To use the bot in an internal group chat, add it to the group and @-mention it.
 
 <img src="https://cdn.link-ai.tech/doc/20260316182902.png" width="800"/>
diff --git a/docs/en/channels/weixin.mdx b/docs/en/channels/weixin.mdx
index 6cd90a45..0acb0a43 100644
--- a/docs/en/channels/weixin.mdx
+++ b/docs/en/channels/weixin.mdx
@@ -1,19 +1,21 @@
 ---
 title: WeChat
-description: Connect CowAgent to personal WeChat
+description: Connect CowAgent to personal WeChat (via the official API)
 ---
 
-> Connect CowAgent to your personal WeChat. Simply scan a QR code to log in — no public IP required. Supports text, image, voice, file, and video messages.
+> Connect CowAgent to your personal WeChat — scan to log in, no public IP required. Supports text, image, voice, file, and video messages in 1-on-1 chats. Backed by WeChat's official API; safe to use. After connecting, a bot assistant is added to your conversation list without affecting normal account usage.
 
-## 1. Configuration
+## 1. Setup and run
 
-### Option A: Web Console
+### Option A: Web console
 
-Start the program and open the Web console (local access: http://127.0.0.1:9899). Go to the **Channels** tab, click **Connect Channel**, select **WeChat**, and follow the prompts to scan the QR code.
+Start CowAgent and open the Web console (local URL: http://127.0.0.1:9899/). Go to the **Channels** tab, click **Connect Channel**, select **WeChat**, and follow the prompts to scan the QR code.
 
-### Option B: Config File
+<img src="https://cdn.link-ai.tech/doc/20260322195114.png" width="800" />
 
-Set `channel_type` to `weixin` in your `config.json`:
+### Option B: Config file
+
+Set `channel_type` to `weixin` in `config.json`:
 
 ```json
 {
@@ -21,52 +23,49 @@ Set `channel_type` to `weixin` in your `config.json`:
 }
 ```
 
-After starting the program, a QR code will be displayed in the terminal. Scan it with WeChat and confirm on your phone to complete login.
+After starting CowAgent, a QR code is displayed in the terminal. Scan it with WeChat to complete login.
+
+<img src="https://cdn.link-ai.tech/doc/20260322195509.png" width="800" />
 
 <Note>
-  For backward compatibility, setting `channel_type` to `wx` also activates the WeChat channel.
+  1. For backward compatibility, setting `channel_type` to `wx` also activates the WeChat channel.
+  2. The WeChat client must be on version **8.0.69** or higher.
 </Note>
 
-## 2. Parameters
+## 2. Usage
 
-| Parameter | Description | Default |
-| --- | --- | --- |
-| `channel_type` | Set to `weixin` or `wx` | — |
+Once you authorize the login, the integration is complete and you can start chatting. A bot assistant appears in your WeChat conversation list and does not affect normal use of your account.
 
-Login credentials are automatically saved to `~/.weixin_cow_credentials.json`. To force a re-login, delete this file and restart.
+> You can find the bot at any time by searching for **"微信ClawBot"**. You may also rename it, change its avatar, pin it to the top of your conversation list, and so on.
+
+<img src="https://cdn.link-ai.tech/doc/83ae8251d896219fde4803f4205205be.jpg" width="250" />
 
 ## 3. Login
 
-### QR Code Login
+### QR code login
 
-On first startup, a QR code is displayed in the terminal (valid for approximately 2 minutes). Scan it with WeChat and confirm on your phone.
+On first startup, a QR code appears in the terminal (valid for around 2 minutes). Scan it with WeChat and confirm on your phone to log in.
 
-- The QR code automatically refreshes when it expires
-- The `qrcode` dependency is already included in `requirements.txt`, enabling QR code rendering directly in the terminal
+- The QR code refreshes automatically when it expires
+- The `qrcode` dependency is already included in `requirements.txt`, so the QR code renders directly in the terminal after install
 
-### Credential Persistence
+### Credential persistence
 
-After successful login, credentials are saved to `~/.weixin_cow_credentials.json`. Subsequent startups will reuse the saved credentials without requiring a new scan.
+After a successful login, credentials are saved to `~/.weixin_cow_credentials.json`. Subsequent startups reuse the saved credentials with no need to re-scan.
 
-To force a re-login, delete the credentials file and restart the program.
+To force a re-login, delete the credentials file and restart.
 
-### Session Expiry
+### Session expiry
 
-When the WeChat session expires (errcode -14), the program automatically clears old credentials and initiates a new QR login — no manual intervention required.
+When the WeChat session expires (errcode `-14`), CowAgent automatically clears old credentials and initiates a new QR login — no manual intervention required.
 
-## 4. Supported Features
+## 4. Supported features
 
 | Feature | Status |
 | --- | --- |
-| Direct Messages | ✅ |
-| Text Messages | ✅ Send & Receive |
-| Image Messages | ✅ Send & Receive |
-| File Messages | ✅ Send & Receive |
-| Video Messages | ✅ Send & Receive |
-| Voice Messages | ✅ Receive |
-
-## 5. Notes
-
-1. Ensure network access to `ilinkai.weixin.qq.com`.
-2. Media files (images, files, videos) are transferred via CDN with AES-128-ECB encryption, handled automatically by the program.
-3. A stable network connection is recommended to avoid frequent disconnections that would require re-scanning.
+| Direct messages | ✅ |
+| Text messages | ✅ Send & Receive |
+| Image messages | ✅ Send & Receive |
+| File messages | ✅ Send & Receive |
+| Video messages | ✅ Send & Receive |
+| Voice messages | ✅ Receive (built-in speech recognition) |
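Note on credential persistence: the WeChat page says that deleting `~/.weixin_cow_credentials.json` and restarting forces a new QR login. A minimal sketch of the deletion step follows. `clear_credentials` is a hypothetical helper; only the file path comes from the documentation.

```python
from pathlib import Path

# Path taken from the documentation above.
DEFAULT_CREDENTIALS = Path.home() / ".weixin_cow_credentials.json"


def clear_credentials(path: Path = DEFAULT_CREDENTIALS) -> bool:
    """Delete the saved credentials file; return True if a file was removed."""
    try:
        path.unlink()
        return True
    except FileNotFoundError:
        # Nothing saved yet, so the next start will show a QR code anyway.
        return False
```

After calling it, restart CowAgent so the QR code is shown again.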
diff --git a/docs/en/cli/general.mdx b/docs/en/cli/general.mdx
index ae78f3e4..8107fcb5 100644
--- a/docs/en/cli/general.mdx
+++ b/docs/en/cli/general.mdx
@@ -25,6 +25,14 @@ View current session and service status, including process info, model configura
 /status
 ```
 
+## cancel
+
+Abort the agent task currently running in this session. When the agent is busy with a long task (e.g. multi-turn tool calls or a long streaming response), send `/cancel` and the agent will stop before the next tool execution. Available across all channels — Web, WeChat, WeCom, Feishu, etc.
+
+```text
+/cancel
+```
+
 ## config
 
 View or modify runtime configuration. Changes take effect immediately without restarting.
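Note on `/cancel`: the new section states that the agent stops before the next tool execution, which suggests cooperative cancellation rather than killing a running tool. The sketch below illustrates that pattern under this assumption. `run_agent` and the shape of `steps` are hypothetical and not taken from the CowAgent source.

```python
import threading
from typing import Callable, Iterable


def run_agent(steps: Iterable[Callable[[], object]],
              cancel: threading.Event) -> list:
    """Run tool steps in order, checking the cancel flag before each one."""
    results = []
    for tool in steps:
        if cancel.is_set():
            # A tool that is already running is never interrupted;
            # the loop simply does not start the next one.
            break
        results.append(tool())
    return results
```

A `/cancel` handler would call `cancel.set()` from another thread, and the loop exits at the next check.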
diff --git a/docs/en/cli/index.mdx b/docs/en/cli/index.mdx
index 36147261..e13b45a3 100644
--- a/docs/en/cli/index.mdx
+++ b/docs/en/cli/index.mdx
@@ -57,6 +57,7 @@ In the Web console or any connected channel, type `/` to see command suggestions
 | --- | --- |
 | `/help` | Show command help |
 | `/status` | View service status and configuration |
+| `/cancel` | Abort the currently running agent task |
 | `/config` | View or modify runtime configuration |
 | `/skill` | Manage skills (install, uninstall, enable, disable, etc.) |
 | `/memory dream [N]` | Manually trigger memory distillation (default 3 days, max 30) |
@@ -80,6 +81,7 @@ In the Web console or any connected channel, type `/` to see command suggestions
 | version | ✓ | ✓ |
 | status | ✓ | ✓ |
 | logs | ✓ | ✓ |
+| cancel | ✗ | ✓ |
 | config | ✗ | ✓ |
 | context | — | ✓ |
 | memory (subcommands) | ✗ | ✓ |
diff --git a/docs/en/cli/skill.mdx b/docs/en/cli/skill.mdx
index d712b4d2..99e41dec 100644
--- a/docs/en/cli/skill.mdx
+++ b/docs/en/cli/skill.mdx
@@ -19,6 +19,24 @@ cow skill list
 ```
 </CodeGroup>
 
+Example output:
+
+```
+📦 Installed skills (3/4)
+
+✅ pptx
+   Use this skill any time a .pptx file is involved…
+   Source: cowhub
+
+✅ skill-creator
+   Create, install, or update skills…
+   Source: builtin
+
+⏸️ image-vision (disabled)
+   Image understanding and visual analysis
+   Source: builtin
+```
+
 **Browse the Skill Hub** (view all available skills):
 
 <CodeGroup>
diff --git a/docs/en/guide/manual-install.mdx b/docs/en/guide/manual-install.mdx
index e610ae28..e17c0e84 100644
--- a/docs/en/guide/manual-install.mdx
+++ b/docs/en/guide/manual-install.mdx
@@ -81,7 +81,7 @@ nohup python3 app.py & tail -f nohup.out
 ```
 
 <Tip>
-  If deploying on a server, open port `9899` in your firewall or security group to access the Web console. It's recommended to restrict access to specific IPs for security.
+  **Deploying on a server?** By default the Web console listens only on `127.0.0.1` (local access). Set `web_host` to `0.0.0.0` in `config.json` to make the console reachable from outside, and set `web_password` to protect it. Remember to open port `9899` in your firewall or security group, ideally restricted to specific IPs.
 </Tip>
 
 ## Docker Deployment
@@ -113,7 +113,7 @@ sudo docker logs -f chatgpt-on-wechat
 ```
 
 <Tip>
-  If deploying on a server, open port `9899` in your firewall or security group to access the Web console. It's recommended to restrict access to specific IPs for security.
+  **Running in Docker?** Set `WEB_HOST` to `0.0.0.0` in `docker-compose.yml` so the console is reachable from outside the container, and set `WEB_PASSWORD` to protect it. Make sure port `9899` is mapped to the host and open in your firewall or security group.
 </Tip>
 
 ## Core Configuration
diff --git a/docs/en/guide/quick-start.mdx b/docs/en/guide/quick-start.mdx
index a23691e7..343956dc 100644
--- a/docs/en/guide/quick-start.mdx
+++ b/docs/en/guide/quick-start.mdx
@@ -33,6 +33,10 @@ The script automatically performs these steps:
 
 By default, the Web console starts after installation. Access `http://localhost:9899` to begin chatting.
 
+<Note>
+  **Deploying on a server?** By default the Web console listens only on `127.0.0.1` (local access). Set `web_host` to `0.0.0.0` in `config.json` to make the console reachable from outside, and set `web_password` to protect it. Remember to open port `9899` in your firewall or security group, ideally restricted to specific IPs.
+</Note>
+
 ## Management Commands
 
 After installation, use the `cow` command to manage the service:
diff --git a/docs/en/guide/upgrade.mdx b/docs/en/guide/upgrade.mdx
new file mode 100644
index 00000000..d1cd5df6
--- /dev/null
+++ b/docs/en/guide/upgrade.mdx
@@ -0,0 +1,61 @@
+---
+title: Upgrade
+description: How to upgrade CowAgent
+---
+
+## Recommended: One-line upgrade
+
+Use `cow update` to pull the latest code and restart the service in one step:
+
+```bash
+cow update
+```
+
+The command runs the following automatically:
+
+1. Pull the latest code (`git pull`)
+2. Stop the running service
+3. Update Python dependencies
+4. Reinstall the CLI
+5. Start the service
+
+<Note>
+  If the Cow CLI is not installed, `./run.sh update` performs the same operations.
+</Note>
+
+## Manual upgrade
+
+Run the following inside the project root:
+
+```bash
+git pull
+pip3 install -r requirements.txt
+pip3 install -e .
+```
+
+Then restart the service:
+
+```bash
+# Using Cow CLI (recommended)
+cow restart
+
+# Or using run.sh
+./run.sh restart
+
+# Or restart manually with nohup
+kill $(ps -ef | grep app.py | grep -v grep | awk '{print $2}')
+nohup python3 app.py & tail -f nohup.out
+```
+
+## Docker upgrade
+
+Run the following in the directory containing `docker-compose.yml`:
+
+```bash
+sudo docker compose pull
+sudo docker compose up -d
+```
+
+<Tip>
+  Back up `config.json` before upgrading. For Docker deployments, mount the workspace directory as a volume to persist data across upgrades.
+</Tip>
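Note on the backup tip: the upgrade page recommends backing up `config.json` before upgrading. One way to do that is a timestamped copy next to the original, sketched below. `backup_config` and the `.bak` naming scheme are illustrative assumptions, not a CowAgent feature.

```python
import shutil
import time
from pathlib import Path


def backup_config(config_path: str = "config.json") -> Path:
    """Copy the config file to a timestamped sibling and return the new path."""
    src = Path(config_path)
    if not src.is_file():
        raise FileNotFoundError(src)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(f"{src.name}.{stamp}.bak")
    shutil.copy2(src, dst)  # copy2 also preserves file metadata
    return dst
```

Run it from the project root before `cow update` or `git pull`.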
diff --git a/docs/en/intro/architecture.mdx b/docs/en/intro/architecture.mdx
index 7de50bac..9fce8e5b 100644
--- a/docs/en/intro/architecture.mdx
+++ b/docs/en/intro/architecture.mdx
@@ -9,7 +9,7 @@ CowAgent 2.0 has evolved from a simple chatbot into a super intelligent assistan
 
 CowAgent's architecture consists of the following core modules:
 
-<img src="https://cdn.link-ai.tech/doc/cow-agent-arch-en.jpg.jpg" alt="CowAgent Architecture" />
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/architecture/en/architecture.jpg" alt="CowAgent Architecture" />
 
 | Module | Description |
 | --- | --- |
@@ -39,8 +39,8 @@ The Agent workspace is located at `~/cow` by default and stores system prompts,
 
 ```
 ~/cow/
-├── system.md          # Agent system prompt
-├── user.md            # User profile
+├── SYSTEM.md          # Agent system prompt
+├── USER.md            # User profile
 ├── MEMORY.md          # Core memory
 ├── memory/            # Long-term memory storage
 │   └── YYYY-MM-DD.md  # Daily memory
@@ -67,9 +67,10 @@ Configure Agent mode parameters in `config.json`:
 {
   "agent": true,
   "agent_workspace": "~/cow",
-  "agent_max_context_tokens": 40000,
-  "agent_max_context_turns": 30,
-  "agent_max_steps": 15
+  "agent_max_context_tokens": 50000,
+  "agent_max_context_turns": 20,
+  "agent_max_steps": 20,
+  "enable_thinking": false
 }
 ```
 
@@ -77,7 +78,9 @@ Configure Agent mode parameters in `config.json`:
 | --- | --- | --- |
 | `agent` | Enable Agent mode | `true` |
 | `agent_workspace` | Workspace path | `~/cow` |
-| `agent_max_context_tokens` | Max context tokens | `40000` |
-| `agent_max_context_turns` | Max context turns | `30` |
-| `agent_max_steps` | Max decision steps per task | `15` |
+| `agent_max_context_tokens` | Max context tokens | `50000` |
+| `agent_max_context_turns` | Max context turns | `20` |
+| `agent_max_steps` | Max decision steps per task | `20` |
+| `enable_thinking` | Enable deep-thinking mode | `false` |
+| `knowledge` | Enable personal knowledge base | `true` |
 | `knowledge` | Enable personal knowledge base | `true` |
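Note on the updated Agent defaults: the table now lists `50000` / `20` / `20` for the context and step limits. The sketch below shows one way such defaults could be merged with a user's `config.json`. `AGENT_DEFAULTS` and `resolve_agent_config` are hypothetical names, and the values are copied from the table rather than from the source code.

```python
import os

# Defaults as documented in the parameter table above.
AGENT_DEFAULTS = {
    "agent": True,
    "agent_workspace": "~/cow",
    "agent_max_context_tokens": 50000,
    "agent_max_context_turns": 20,
    "agent_max_steps": 20,
    "enable_thinking": False,
    "knowledge": True,
}


def resolve_agent_config(user_cfg: dict) -> dict:
    """Overlay user settings on the documented defaults."""
    merged = {**AGENT_DEFAULTS, **user_cfg}
    # Expand "~" so the workspace path can be used directly.
    merged["agent_workspace"] = os.path.expanduser(merged["agent_workspace"])
    return merged
```

Keys missing from the user's config fall back to the documented values, and explicit settings always win.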
diff --git a/docs/en/intro/features.mdx b/docs/en/intro/features.mdx
index 9a3b21c2..8b65f18d 100644
--- a/docs/en/intro/features.mdx
+++ b/docs/en/intro/features.mdx
@@ -84,7 +84,7 @@ Secrets required by skills are stored in an environment variable file, managed b
 
 The Skills system provides infinite extensibility for the Agent. Each Skill consists of a description file, execution scripts (optional), and resources (optional), describing how to complete specific types of tasks. Skills allow the Agent to follow instructions for complex workflows, invoke tools, or integrate third-party systems.
 
-- **[Skill Hub](https://skills.cowagent.ai/):** An open skill marketplace featuring official, community, and third-party skills. Install with one command.
+- [Skill Hub](https://skills.cowagent.ai/): An open skill marketplace featuring official, community, and third-party skills. Install with one command.
 - **Built-in skills:** Located in the project's `skills/` directory, including skill creator, image recognition, LinkAI agent, web fetch, and more. Built-in skills are automatically enabled based on dependency conditions (API keys, system commands, etc.).
 - **Custom skills:** Created by users through conversation, stored in the workspace (`~/cow/skills/`), capable of implementing any complex business process or third-party integration.
 
diff --git a/docs/en/intro/index.mdx b/docs/en/intro/index.mdx
index aec04e9c..373383b2 100644
--- a/docs/en/intro/index.mdx
+++ b/docs/en/intro/index.mdx
@@ -1,53 +1,60 @@
 ---
 title: Introduction
-description: CowAgent - AI Super Assistant powered by LLMs
+description: CowAgent - Open-source super AI assistant and Agent Harness
 ---
 
-<img src="https://cdn.link-ai.tech/doc/78c5dd674e2c828642ecc0406669fed7.png" alt="CowAgent" width="600px"/>
+<div align="center">
+  <img src="https://cdn.link-ai.tech/doc/78c5dd674e2c828642ecc0406669fed7.png" alt="CowAgent" width="450px"/>
+</div>
 
-**CowAgent** is an AI super assistant powered by LLMs with autonomous task planning, long-term memory, skills system, multimodal messages, multiple model support, and multi-platform deployment.
+**CowAgent** is an open-source super AI assistant and Agent Harness. It proactively plans tasks, runs tools and skills, and autonomously grows with memory and knowledge.
 
-CowAgent can proactively think and plan tasks, operate computers and external resources, create and execute Skills, and continuously grow with long-term memory. It supports flexible switching between multiple models, handles text, voice, images, files and other multimodal messages, and can be integrated into WeChat, web, Feishu, DingTalk, WeCom, and WeChat Official Account. It runs 7x24 hours on your personal computer or server.
+CowAgent is lightweight, easy to deploy, and built to extend. Plug in any major LLM provider and run it 24/7 on a personal computer or server, across the Web and major IM platforms.
 
-<Card title="GitHub" icon="github" href="https://github.com/zhayujie/CowAgent">
-  github.com/zhayujie/CowAgent
-</Card>
+<CardGroup cols={2}>
+  <Card title="GitHub" icon="github" href="https://github.com/zhayujie/CowAgent">
+    Open-source repository — Star and contribute
+  </Card>
+  <Card title="Try Online" icon="cloud" href="https://link-ai.tech/cowagent/create">
+    No setup required — experience CowAgent instantly
+  </Card>
+</CardGroup>
 
 ## Core Capabilities
 
 <CardGroup cols={2}>
   <Card title="Autonomous Task Planning" icon="brain" href="/en/intro/architecture">
-    Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved. Supports accessing file systems, terminals, browsers, schedulers, and other system resources through tools.
+    Decomposes complex tasks and executes them step by step, looping over tools and skills until the goal is reached.
   </Card>
-  <Card title="Long-term Memory" icon="database" href="/en/memory">
-    Three-tier memory flow (context → daily memory → global memory) with daily Deep Dream distillation, keyword and vector retrieval support.
+  <Card title="Long-term Memory" icon="database" href="/en/memory/index">
+    Three-tier architecture (context → daily → core), automatic Deep Dream distillation, hybrid keyword + vector retrieval.
   </Card>
-  <Card title="Knowledge Base" icon="book" href="/en/knowledge">
-    Automatically organizes structured knowledge with knowledge graph visualization, building a continuously growing knowledge network through cross-references.
+  <Card title="Personal Knowledge Base" icon="book" href="/en/knowledge/index">
+    Auto-curates structured knowledge into a Markdown wiki, builds an evolving knowledge graph with visual browsing.
   </Card>
   <Card title="Skills System" icon="puzzle-piece" href="/en/skills/index">
-    Implements a Skills creation and execution engine with built-in skills, and supports custom Skills development through natural language conversation.
+    A complete skill creation and execution engine. Install from Skill Hub or generate custom skills via natural-language conversation.
   </Card>
-  <Card title="Multimodal Messages" icon="image" href="/en/channels/web">
-    Supports parsing, processing, generating, and sending text, images, voice, files, and other message types.
+  <Card title="Multimodal Messaging" icon="image" href="/en/channels/web">
+    First-class support for text, images, voice, and files — recognition, generation, and delivery.
   </Card>
   <Card title="Tool System" icon="wrench" href="/en/tools/index">
-    Built-in tools for file I/O, terminal execution, browser automation, scheduled tasks, messaging, and more. The Agent autonomously invokes tools to accomplish complex tasks.
+    Built-in file I/O, terminal, browser, scheduler, memory retrieval, web search, and more — with native MCP integration.
   </Card>
   <Card title="Command System" icon="terminal" href="/en/cli/index">
-    Provides terminal CLI and in-chat commands for process management, skill installation, configuration, context inspection, and other common operations.
+    Terminal CLI and in-chat commands for process management, skill installation, configuration, and context inspection.
   </Card>
-  <Card title="Multiple Model Support" icon="microchip" href="/en/models/index">
-    Supports mainstream model providers including OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao, and more.
+  <Card title="Pluggable Models" icon="microchip" href="/en/models/index">
+    Claude, GPT, Gemini, DeepSeek, Qwen, GLM, Kimi, MiniMax, Doubao, and more — swap providers from the Web console with one click.
   </Card>
-  <Card title="Multi-platform Deployment" icon="server" href="/en/channels/weixin">
-    Runs on local computers or servers, integrable into WeChat, web, Feishu, DingTalk, WeChat Official Account, and WeCom applications.
+  <Card title="Multi-channel Integration" icon="server" href="/en/channels/index">
+    A single Agent simultaneously serves Web, WeChat, Feishu, DingTalk, WeCom, QQ, and Official Accounts.
   </Card>
 </CardGroup>
 
-## Quick Experience
+## Quick Start
 
-Run the following command in your terminal for one-click install, configuration, and startup:
+Run one of the commands below to install, configure, and start CowAgent in a single step:
 
 <Tabs>
   <Tab title="Linux / macOS">
@@ -62,25 +69,25 @@ Run the following command in your terminal for one-click install, configuration,
   </Tab>
 </Tabs>
 
-By default, the Web service starts after running. Access `http://localhost:9899/chat` to chat in the web interface.
+Once started, open `http://localhost:9899` to access the **Web console** — the unified place to chat, configure providers, connect channels, and install skills.
 
 <CardGroup cols={2}>
   <Card title="Quick Start" icon="rocket" href="/en/guide/quick-start">
     Complete installation and run guide
   </Card>
   <Card title="Architecture" icon="sitemap" href="/en/intro/architecture">
-    CowAgent system architecture design
+    CowAgent system architecture
   </Card>
 </CardGroup>
 
 ## Disclaimer
 
-1. This project follows the [MIT License](https://github.com/zhayujie/CowAgent/blob/master/LICENSE) and is intended for technical research and learning. Users must comply with local laws, regulations, policies, and corporate bylaws. Any illegal or rights-infringing use is prohibited.
-2. Agent mode consumes more tokens than normal chat mode. Choose models based on effectiveness and cost. Agent has access to the host operating system — deploy with caution.
-3. CowAgent focuses on open-source development and does not participate in, authorize, or issue any cryptocurrency.
+1. This project is licensed under the [MIT License](https://github.com/zhayujie/CowAgent/blob/master/LICENSE) and is intended for technical research and learning. You are responsible for complying with applicable laws and regulations in your jurisdiction; the maintainers assume no liability for any consequences arising from use of this project.
+2. **Cost & safety:** Agent mode consumes substantially more tokens than plain chat — pick models that balance quality and cost. The Agent has access to your local operating system; deploy only in trusted environments.
+3. CowAgent is a pure open-source project and does not participate in, authorize, or issue any cryptocurrency.
 
 ## Community
 
-Add our assistant on WeChat to join the open-source community:
+Scan the WeChat QR code to join the open-source community group:
 
 <img width="140" src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/open-community.png" />
diff --git a/docs/en/knowledge/index.mdx b/docs/en/knowledge/index.mdx
index a877c386..f1610dc9 100644
--- a/docs/en/knowledge/index.mdx
+++ b/docs/en/knowledge/index.mdx
@@ -5,6 +5,10 @@ description: CowAgent personal knowledge base — structured knowledge accumulat
 
 The personal knowledge base is the Agent's long-term structured knowledge store, saved in the `knowledge/` directory within the workspace. Unlike memory, which is organized by timeline, the knowledge base organizes content by topic — articles, conversation insights, and learning materials are structured into interlinked Markdown pages, forming a continuously growing knowledge network.
 
+<Frame>
+  <img src="https://cdn.link-ai.tech/doc/20260413105435.png" width="800" />
+</Frame>
+
 ## Core Concepts
 
 ### Knowledge vs Memory
@@ -43,7 +47,7 @@ Knowledge writing is an autonomous Agent behavior, triggered in these scenarios:
 Each knowledge page includes cross-reference links to related pages, gradually building a knowledge graph.
 
 <Frame>
-  <img src="https://gist.github.com/user-attachments/assets/3ce92f78-1863-4820-8fa8-660c0f2b7f09" alt="Conversational knowledge ingest" />
+  <img src="https://cdn.link-ai.tech/doc/20260413110104.png" alt="Conversational knowledge ingest" width="800" />
 </Frame>
 
 ## Knowledge Retrieval
@@ -63,11 +67,11 @@ The web console provides a dedicated "Knowledge" module with:
 - **Chat integration** — Knowledge document links referenced in Agent replies are clickable for direct navigation
 
 <Frame>
-  <img src="https://gist.github.com/user-attachments/assets/b7b9d6be-0ac1-4c65-803b-2c6b36bd59a7" alt="Knowledge document browsing" />
+  <img src="https://cdn.link-ai.tech/doc/17aad553d3e9e428c52ff9dc31726fda.png" alt="Knowledge document browsing" width="800" />
 </Frame>
 
 <Frame>
-  <img src="https://gist.github.com/user-attachments/assets/44ae68ca-96cc-40b9-ab33-cdbec34c2379" alt="Knowledge graph visualization" />
+  <img src="https://cdn.link-ai.tech/doc/20260413105402.png" alt="Knowledge graph visualization" width="800" />
 </Frame>
 
 ## CLI Commands
diff --git a/docs/en/memory/index.mdx b/docs/en/memory/index.mdx
index d54daf33..e3f6513f 100644
--- a/docs/en/memory/index.mdx
+++ b/docs/en/memory/index.mdx
@@ -27,7 +27,7 @@ The Agent automatically persists conversation content to long-term memory throug
 
 - **On context trimming** — When conversation turns or tokens exceed the configured limit, the oldest half of the context is trimmed, and the discarded content is summarized by LLM into key information and written to the daily memory file. The summary is also asynchronously injected into the retained context for conversational continuity
 - **Daily scheduled summary** — A full summary is automatically triggered at 23:55 every day, ensuring memory is preserved even on low-activity days (skipped if content hasn't changed)
-- **[Deep Dream (memory distillation)](/en/memory/deep-dream)** — Runs automatically after the daily summary, distilling daily memories into MEMORY.md and generating a dream diary
+- [Deep Dream (memory distillation)](/en/memory/deep-dream) — Runs automatically after the daily summary, distilling daily memories into MEMORY.md and generating a dream diary
 - **On API context overflow** — When the model API returns a context overflow error, the current conversation summary is saved as an emergency measure
 
 All memory writes run asynchronously in a background thread (LLM summarization + file writing), never blocking normal conversation replies.
diff --git a/docs/en/models/claude.mdx b/docs/en/models/claude.mdx
index 3ac7651e..bb831eb8 100644
--- a/docs/en/models/claude.mdx
+++ b/docs/en/models/claude.mdx
@@ -1,17 +1,50 @@
 ---
 title: Claude
-description: Claude model configuration
+description: Anthropic Claude model configuration (Text Chat + Image Understanding)
 ---
 
+Claude is provided by Anthropic and supports both text chat and image understanding. The mainstream Sonnet / Opus models natively support vision, so no separate Vision model needs to be specified.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
+
 ```json
 {
-  "model": "claude-sonnet-4-6",
+  "model": "claude-opus-4-8",
   "claude_api_key": "YOUR_API_KEY"
 }
 ```
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `claude-sonnet-4-6`, `claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, etc. See [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
-| `claude_api_key` | Create at [Claude Console](https://console.anthropic.com/settings/keys) |
-| `claude_api_base` | Optional. Defaults to `https://api.anthropic.com/v1`. Change to use third-party proxy |
+| `model` | Supports `claude-opus-4-8`, `claude-opus-4-7`, `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, etc. See [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
+| `claude_api_key` | Create one in the [Claude Console](https://console.anthropic.com/settings/keys) |
+| `claude_api_base` | Optional, defaults to `https://api.anthropic.com/v1`. Can be changed to a third-party proxy |
+
+### Model Selection
+
+| Model | Use Case |
+| --- | --- |
+| `claude-opus-4-8` | Recommended default and latest flagship; best for complex reasoning and long-running tasks |
+| `claude-opus-4-7` | Previous-generation Opus flagship |
+| `claude-sonnet-4-6` | Balances quality and speed at a lower cost |
+| `claude-opus-4-6` / `claude-sonnet-4-5` / `claude-sonnet-4-0` | Earlier-generation models at a lower price |
+
+## Image Understanding
+
+Once `claude_api_key` is configured, the Agent's Vision tool automatically uses the Claude main model to recognize images, with no extra setup required.
+
+To manually specify a Vision model, set it explicitly in the configuration file:
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "claude-sonnet-4-6"
+    }
+  }
+}
+```
diff --git a/docs/en/models/coding-plan.mdx b/docs/en/models/coding-plan.mdx
index cf48906a..b09715eb 100644
--- a/docs/en/models/coding-plan.mdx
+++ b/docs/en/models/coding-plan.mdx
@@ -77,7 +77,7 @@ Reference: [China Key](https://platform.minimaxi.com/docs/coding-plan/quickstart
 
 ---
 
-## Zhipu GLM
+## GLM
 
 ```json
 {
diff --git a/docs/en/models/custom.mdx b/docs/en/models/custom.mdx
index a70ca31d..45a7d2e1 100644
--- a/docs/en/models/custom.mdx
+++ b/docs/en/models/custom.mdx
@@ -1,26 +1,26 @@
 ---
 title: Custom
-description: Custom provider for third-party APIs and local models
+description: Custom vendor configuration for third-party API proxies and local models
 ---
 
-For models accessed via OpenAI-compatible APIs, such as:
+The custom vendor is for model services accessed via the OpenAI-compatible protocol and for locally deployed models, such as:
 
-- **Third-party API proxies**: Use a unified API Base to call multiple models
-- **Local models**: Models deployed locally via Ollama, vLLM, LocalAI, etc.
-- **Private deployments**: Self-hosted model services within your organization
+- **Third-party API proxies**: call multiple models through a unified API base
+- **Local models**: models deployed locally with tools such as Ollama, vLLM, or LocalAI
+- **Private deployments**: model services deployed inside an enterprise
 
 <Note>
-  Unlike the `openai` provider, switching models under the Custom provider will not auto-switch the provider type. Your custom API address is always preserved.
+  Difference from the `openai` vendor: when a custom vendor is selected, switching models via `/config model` does not automatically switch the vendor type — the custom API address is always used.
 </Note>
 
-## Configuration
+## Text Chat
 
-### Third-party API Proxy
+### Third-party API proxy
 
 ```json
 {
   "bot_type": "custom",
-  "model": "deepseek-v4-flash",
+  "model": "YOUR_MODEL_NAME",
   "custom_api_key": "YOUR_API_KEY",
   "custom_api_base": "https://{your-proxy.com}/v1"
 }
@@ -29,13 +29,13 @@ For models accessed via OpenAI-compatible APIs, such as:
 | Parameter | Description |
 | --- | --- |
 | `bot_type` | Must be set to `custom` |
-| `model` | Model name, any model supported by your proxy service |
-| `custom_api_key` | API key provided by your proxy service |
-| `custom_api_base` | API base URL, must be OpenAI-compatible |
+| `model` | Model name; any model name supported by the proxy service |
+| `custom_api_key` | API key provided by the proxy service |
+| `custom_api_base` | API endpoint provided by the proxy service; must be OpenAI-compatible |
 
-### Local Models
+### Local models
 
-Local models typically don't require an API key — just set the API base:
+Local models usually do not require an API key — only the API base needs to be filled in:
 
 ```json
 {
@@ -45,7 +45,7 @@ Local models typically don't require an API key — just set the API base:
 }
 ```
 
-Common local deployment tools and their default addresses:
+Common local deployment tools and their default endpoints:
 
 | Tool | Default API Base |
 | --- | --- |
@@ -53,9 +53,9 @@ Common local deployment tools and their default addresses:
 | [vLLM](https://docs.vllm.ai) | `http://localhost:8000/v1` |
 | [LocalAI](https://localai.io) | `http://localhost:8080/v1` |
 
-## Switching Models
+### Switching models
 
-Under the Custom provider, switching models only changes `model` without affecting `bot_type` or the API address:
+Switching models under a custom vendor only changes `model` — `bot_type` and the API endpoint remain unchanged:
 
 ```
 /config model qwen3.5:27b
diff --git a/docs/en/models/deepseek.mdx b/docs/en/models/deepseek.mdx
index 1bafb076..6de8d09b 100644
--- a/docs/en/models/deepseek.mdx
+++ b/docs/en/models/deepseek.mdx
@@ -1,9 +1,11 @@
 ---
 title: DeepSeek
-description: DeepSeek model configuration
+description: DeepSeek model configuration (Text Chat + Thinking Mode)
 ---
 
-Option 1: Native integration (recommended):
+DeepSeek is one of the default recommended vendors in Agent mode, focused on cost-effective text chat and task planning.
+
+## Text Chat
 
 ```json
 {
@@ -14,24 +16,24 @@ Option 1: Native integration (recommended):
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Supports `deepseek-v4-flash` (default) and `deepseek-v4-pro` |
-| `deepseek_api_key` | Create at [DeepSeek Platform](https://platform.deepseek.com/api_keys) |
+| `model` | Supports `deepseek-v4-flash` (default) and `deepseek-v4-pro` |
+| `deepseek_api_key` | Create one on the [DeepSeek Platform](https://platform.deepseek.com/api_keys) |
 | `deepseek_api_base` | Optional, defaults to `https://api.deepseek.com/v1`. Can be changed to a third-party proxy |
 
-## Model Selection
+### Model Selection
 
 | Model | Use Case |
 | --- | --- |
-| `deepseek-v4-flash` | Default: fast and cost-effective |
-| `deepseek-v4-pro` | Stronger on complex tasks |
+| `deepseek-v4-flash` | Recommended default; fast and low-cost |
+| `deepseek-v4-pro` | More capable; better suited to complex tasks |
 
 ## Thinking Mode
 
-The V4 series (`deepseek-v4-flash` / `deepseek-v4-pro`) supports an explicit "thinking mode": the model emits a chain-of-thought (`reasoning_content`) before the final answer to improve answer quality.
+The V4 series (`deepseek-v4-flash` / `deepseek-v4-pro`) supports an explicit "thinking mode": before producing the final answer, the model emits a chain of thought (`reasoning_content`) to improve answer quality.
 
 ### Toggle
 
-Controlled by the global `enable_thinking` setting:
+Thinking mode is controlled by the global `enable_thinking` setting and can also be toggled from the Web Console's configuration page:
 
 ```json
 {
@@ -39,12 +41,12 @@ Controlled by the global `enable_thinking` setting:
 }
 ```
 
-- `true`: thinking is on across all channels. The Web console renders the reasoning trace; IM channels (WeChat / WeCom / DingTalk / Feishu) don't render it but still benefit from higher answer quality.
-- `false`: thinking off, faster responses with lower first-token latency.
+- `true`: the model thinks before answering across all channels. The Web Console displays the thinking process; IM channels (WeChat / WeCom / DingTalk / Feishu) do not show it but still get better answers.
+- `false`: thinking is disabled, responses are faster, and time-to-first-token is lower.
 
 ### Reasoning Effort
 
-Under thinking mode, `reasoning_effort` controls how hard the model thinks:
+Under thinking mode, `reasoning_effort` controls reasoning intensity:
 
 ```json
 {
@@ -55,27 +57,16 @@ Under thinking mode, `reasoning_effort` controls how hard the model thinks:
 
 | Value | Use Case |
 | --- | --- |
-| `high` (default) | Day-to-day agent tasks; balanced thinking depth and latency |
-| `max` | Complex coding, long-horizon planning, strict-constraint tasks. Deeper reasoning at the cost of more output tokens and higher latency |
+| `high` (default) | Day-to-day Agent tasks; balances reasoning depth and speed |
+| `max` | Complex coding, long-horizon planning, strictly constrained tasks; deeper reasoning but more time and output tokens |
 
-`reasoning_effort` only takes effect when `enable_thinking` is `true`. It is silently ignored on models that do not support thinking mode.
+`reasoning_effort` only takes effect when `enable_thinking` is `true`; it is ignored automatically when the model does not support thinking mode.
 
-### Notes
+### Behavior Notes
 
-- **Sampling parameters**: under thinking mode, `temperature`, `top_p`, `presence_penalty`, and `frequency_penalty` are silently ignored by the server (no error). CowAgent skips sending them automatically.
-- **Multi-turn tool calls**: once the history contains any tool-call turn, DeepSeek requires `reasoning_content` on every assistant message. CowAgent handles the round-trip automatically, including across mid-session toggles of the thinking switch.
+- **Sampling parameters**: in thinking mode, `temperature`, `top_p`, `presence_penalty`, and `frequency_penalty` are ignored by the server (without errors). CowAgent automatically omits them from the request.
+- **Multi-turn tool calls**: when the history contains tool calls, DeepSeek requires every assistant message to include `reasoning_content`. CowAgent handles this automatically, so toggling thinking mode across turns will not cause errors.
 
 <Tip>
-  Start with `deepseek-v4-flash`; switch to `deepseek-v4-pro` for harder tasks; enable `enable_thinking` when you want deeper reasoning.
+  `deepseek-v4-flash` is used by default; switch to `deepseek-v4-pro` for complex tasks; enable `enable_thinking` when deep reasoning is needed.
 </Tip>
-
-Option 2: OpenAI-compatible configuration:
-
-```json
-{
-  "model": "deepseek-v4-flash",
-  "bot_type": "openai",
-  "open_ai_api_key": "YOUR_API_KEY",
-  "open_ai_api_base": "https://api.deepseek.com/v1"
-}
-```
diff --git a/docs/en/models/doubao.mdx b/docs/en/models/doubao.mdx
index fd9fc015..818275e5 100644
--- a/docs/en/models/doubao.mdx
+++ b/docs/en/models/doubao.mdx
@@ -1,17 +1,66 @@
 ---
-title: Doubao (ByteDance)
-description: Doubao (Volcano Ark) model configuration
+title: Doubao
+description: Doubao (Volcengine Ark) model configuration (Text / Image Understanding / Image Generation / Embedding)
 ---
 
+Doubao (Volcengine Ark) supports text chat, image understanding, image generation (Seedream), and embedding. A single `ark_api_key` enables all capabilities.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
+
 ```json
 {
-  "model": "doubao-seed-2-0-code-preview-260215",
+  "model": "doubao-seed-2-0-pro-260215",
   "ark_api_key": "YOUR_API_KEY"
 }
 ```
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `doubao-seed-2-0-code-preview-260215`, `doubao-seed-2-0-pro-260215`, `doubao-seed-2-0-lite-260215`, etc. |
-| `ark_api_key` | Create at [Volcano Ark Console](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) |
-| `ark_base_url` | Optional. Defaults to `https://ark.cn-beijing.volces.com/api/v3` |
+| `model` | Can be `doubao-seed-2-0-pro-260215`, `doubao-seed-2-0-code-preview-260215`, `doubao-seed-2-0-lite-260215`, etc. |
+| `ark_api_key` | Create one in the [Volcengine Ark Console](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) |
+| `ark_base_url` | Optional, defaults to `https://ark.cn-beijing.volces.com/api/v3` |
+
+## Image Understanding
+
+Once `ark_api_key` is configured, the Agent's Vision tool automatically uses `doubao-seed-2-0-pro-260215` to recognize images, with no extra setup required.
+
+To manually specify a Vision model:
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "doubao-seed-2-0-pro-260215"
+    }
+  }
+}
+```
+
+## Image Generation
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "seedream-5.0-lite"
+    }
+  }
+}
+```
+
+Available models: `seedream-5.0-lite`, `seedream-4.5`.
+
+## Embedding
+
+```json
+{
+  "embedding_provider": "doubao",
+  "embedding_model": "doubao-embedding-vision-251215"
+}
+```
+
+The default model is `doubao-embedding-vision-251215` (multimodal embedding); the dimension (1024 or 2048) can be set via `embedding_dimensions` in the configuration file. After changing the embedding model or dimension, run `/memory rebuild-index` to rebuild the index.
diff --git a/docs/en/models/gemini.mdx b/docs/en/models/gemini.mdx
index dd857db2..b2d9520b 100644
--- a/docs/en/models/gemini.mdx
+++ b/docs/en/models/gemini.mdx
@@ -1,16 +1,59 @@
 ---
 title: Gemini
-description: Google Gemini model configuration
+description: Google Gemini model configuration (Text Chat + Image Understanding + Image Generation)
 ---
 
+Google Gemini supports text chat, image understanding, and image generation (Nano Banana series). A single `gemini_api_key` enables all capabilities.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
+
 ```json
 {
-  "model": "gemini-3.1-pro-preview",
+  "model": "gemini-3.5-flash",
   "gemini_api_key": "YOUR_API_KEY"
 }
 ```
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `gemini-3.1-flash-lite-preview`, `gemini-3.1-pro-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, etc. See [official docs](https://ai.google.dev/gemini-api/docs/models) |
-| `gemini_api_key` | Create at [Google AI Studio](https://aistudio.google.com/app/apikey) |
+| `model` | Recommended: `gemini-3.5-flash`; also supports `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, etc. See [official docs](https://ai.google.dev/gemini-api/docs/models) |
+| `gemini_api_key` | Create one in [Google AI Studio](https://aistudio.google.com/app/apikey) |
+| `gemini_api_base` | Optional, defaults to `https://generativelanguage.googleapis.com`. Can be changed to a third-party proxy |
+
+## Image Understanding
+
+All Gemini models natively support vision. Once `gemini_api_key` is configured, the Agent's Vision tool automatically uses the main model to recognize images, with no extra setup required.
+
+To manually specify a Vision model:
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gemini-3.1-flash-lite-preview"
+    }
+  }
+}
+```
+
+## Image Generation
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gemini-3.1-flash-image-preview"
+    }
+  }
+}
+```
+
+| Model ID | Alias |
+| --- | --- |
+| `gemini-3.1-flash-image-preview` | Nano Banana 2 |
+| `gemini-3-pro-image-preview` | Nano Banana Pro |
+| `gemini-2.5-flash-image` | Nano Banana |
diff --git a/docs/en/models/glm.mdx b/docs/en/models/glm.mdx
index 5f236f2b..473a805c 100644
--- a/docs/en/models/glm.mdx
+++ b/docs/en/models/glm.mdx
@@ -1,8 +1,16 @@
 ---
-title: GLM (Zhipu AI)
-description: Zhipu AI GLM model configuration
+title: GLM
+description: Zhipu AI GLM model configuration (Text / Image Understanding / Speech-to-Text / Embedding)
 ---
 
+Zhipu AI supports text chat, image understanding, speech-to-text (ASR), and embedding. A single `zhipu_ai_api_key` enables all capabilities.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
+
 ```json
 {
   "model": "glm-5.1",
@@ -12,16 +20,37 @@ description: Zhipu AI GLM model configuration
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `glm-5.1`, `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, etc. See [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
-| `zhipu_ai_api_key` | Create at [Zhipu AI Console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) |
+| `model` | Can be `glm-5.1`, `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, etc. See [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
+| `zhipu_ai_api_key` | Create one in the [Zhipu AI Console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) |
+| `zhipu_ai_api_base` | Optional, defaults to `https://open.bigmodel.cn/api/paas/v4` |
 
-OpenAI-compatible configuration is also supported:
+## Image Understanding
+
+Zhipu's chat models (`glm-5.1`, `glm-5-turbo`, etc.) do not support vision; vision calls are uniformly routed to `glm-5v-turbo`. Once `zhipu_ai_api_key` is configured, the Agent's Vision tool automatically uses this model, with no need to specify it explicitly in the configuration file.
+
+## Speech-to-Text (ASR)
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "glm-5.1",
-  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "voice_to_text": "zhipu",
+  "voice_to_text_model": "glm-asr-2512"
 }
 ```
+
+| Parameter | Description |
+| --- | --- |
+| `voice_to_text` | Set to `zhipu` to enable Zhipu ASR |
+| `voice_to_text_model` | Optional, defaults to `glm-asr-2512` |
+
+Credentials are automatically reused from `zhipu_ai_api_key`. Audio files should be smaller than 25 MB; oversized files may be rejected by the server.
+
+## Embedding
+
+```json
+{
+  "embedding_provider": "zhipu",
+  "embedding_model": "embedding-3"
+}
+```
+
+Available models: `embedding-3`, `embedding-2`. After changing the embedding model, run `/memory rebuild-index` to rebuild the index.
diff --git a/docs/en/models/index.mdx b/docs/en/models/index.mdx
index 5fae1f7d..1a82d162 100644
--- a/docs/en/models/index.mdx
+++ b/docs/en/models/index.mdx
@@ -1,58 +1,38 @@
 ---
 title: Models Overview
-description: Supported models and recommended choices for CowAgent
+description: Model vendors supported by CowAgent and their capability matrix
 ---
 
-CowAgent supports mainstream LLMs from domestic and international providers. Model interfaces are implemented in the project's `models/` directory.
+CowAgent supports a wide range of mainstream large language models. Model interfaces live under the project's `models/` directory. Beyond text chat, several vendors also provide vision understanding, image generation, speech-to-text, text-to-speech, and embeddings — all of which can be invoked on demand in the Agent flow.
 
-<Note>
-  For Agent mode, the following models are recommended based on quality and cost: deepseek-v4-flash, MiniMax-M2.7, claude-sonnet-4-6, gemini-3.1-pro-preview, glm-5.1, qwen3.6-plus, kimi-k2.6, ernie-5.1
-</Note>
+## Capability Matrix
 
-## Configuration
+A snapshot of each vendor's capabilities. "Text" refers to the main chat model; the remaining columns show which Agent capabilities the vendor can power.
 
-Configure the model name and API key in `config.json` according to your chosen model. Each model also supports OpenAI-compatible access by setting `bot_type` to `openai` and configuring `open_ai_api_base` and `open_ai_api_key`.
-
-You can also use the [LinkAI](https://link-ai.tech) platform interface to flexibly switch between multiple models with support for knowledge base, workflows, and other Agent capabilities.
-
-## Supported Models
-
-<CardGroup cols={2}>
-  <Card title="DeepSeek" href="/en/models/deepseek">
-    deepseek-v4-flash, deepseek-v4-pro, and more
-  </Card>
-  <Card title="Baidu Qianfan / ERNIE" href="/en/models/qianfan">
-    ernie-5.1, ernie-5.0, ernie-4.5-turbo-128k, and more
-  </Card>
-  <Card title="MiniMax" href="/en/models/minimax">
-    MiniMax-M2.7 and other series models
-  </Card>
-  <Card title="Claude" href="/en/models/claude">
-    claude-sonnet-4-6 and more
-  </Card>
-  <Card title="Gemini" href="/en/models/gemini">
-    gemini-3.1-pro-preview and more
-  </Card>
-  <Card title="OpenAI" href="/en/models/openai">
-    gpt-5.4, gpt-4.1, o-series and more
-  </Card>
-  <Card title="GLM (Zhipu AI)" href="/en/models/glm">
-    glm-5.1, glm-5-turbo, glm-5 and other series models
-  </Card>
-  <Card title="Qwen (Tongyi Qianwen)" href="/en/models/qwen">
-    qwen3.6-plus, qwen3-max and more
-  </Card>
-  <Card title="Doubao (ByteDance)" href="/en/models/doubao">
-    doubao-seed series models
-  </Card>
-  <Card title="Kimi" href="/en/models/kimi">
-    kimi-k2.6, kimi-k2.5, kimi-k2 and more
-  </Card>
-  <Card title="LinkAI" href="/en/models/linkai">
-    Unified multi-model interface + knowledge base
-  </Card>
-</CardGroup>
+| Vendor | Representative Models | Text | Vision | Image Gen | STT | TTS | Embedding |
+| --- | --- | :-: | :-: | :-: | :-: | :-: | :-: |
+| [DeepSeek](/en/models/deepseek) | deepseek-v4-flash / pro | ✅ | | | | | |
+| [MiniMax](/en/models/minimax) | MiniMax-M2.7 | ✅ | ✅ | ✅ | | ✅ | |
+| [Claude](/en/models/claude) | claude-opus-4-8 | ✅ | ✅ | | | | |
+| [Gemini](/en/models/gemini) | gemini-3.5-flash | ✅ | ✅ | ✅ | | | |
+| [OpenAI](/en/models/openai) | gpt-5.5, o-series | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [GLM](/en/models/glm) | glm-5.1, glm-5v-turbo | ✅ | ✅ | | ✅ | | ✅ |
+| [Qwen](/en/models/qwen) | qwen3.7-max | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Doubao](/en/models/doubao) | doubao-seed-2.0 series | ✅ | ✅ | ✅ | | | ✅ |
+| [Kimi](/en/models/kimi) | kimi-k2.6 | ✅ | ✅ | | | | |
+| [ERNIE](/en/models/qianfan) | ernie-5.1 | ✅ | ✅ | | | | |
+| [MiMo](/en/models/mimo) | mimo-v2.5-pro / v2.5 | ✅ | ✅ | | | ✅ | |
+| [LinkAI](/en/models/linkai) | 100+ models from multiple vendors | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Custom](/en/models/custom) | Local models / third-party proxies | ✅ | | | | | |
 
 <Tip>
-  For a full list of model names, refer to the project's [`common/const.py`](https://github.com/zhayujie/CowAgent/blob/master/common/const.py) file.
+  Every capability in the Web console (Vision / Image / STT / TTS / Embedding / Web Search) can be configured independently with its own vendor and model — there is no forced binding between them.
 </Tip>
+
+## How to Configure
+
+**Option 1 (recommended):** Manage models and capabilities online via the [Web console](/en/channels/web), with no need to edit the configuration file:
+
+<img width="900" src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-models-config.png" />
+
+**Option 2:** Edit `config.json` manually and fill in the model name and API key for the selected vendor. Every model also supports OpenAI-compatible access — just set `bot_type` to `openai` and configure `open_ai_api_base` and `open_ai_api_key`.
diff --git a/docs/en/models/kimi.mdx b/docs/en/models/kimi.mdx
index f882fa72..3292a976 100644
--- a/docs/en/models/kimi.mdx
+++ b/docs/en/models/kimi.mdx
@@ -1,8 +1,16 @@
 ---
-title: Kimi (Moonshot)
-description: Kimi (Moonshot) model configuration
+title: Kimi
+description: Kimi (Moonshot) model configuration (Text Chat + Image Understanding)
 ---
 
+Kimi is provided by Moonshot and supports both text chat and image understanding. The `kimi-k2.x` series natively supports vision.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
+
 ```json
 {
   "model": "kimi-k2.6",
@@ -12,16 +20,22 @@ description: Kimi (Moonshot) model configuration
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `kimi-k2.6`, `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
-| `moonshot_api_key` | Create at [Moonshot Console](https://platform.moonshot.cn/console/api-keys) |
+| `model` | Can be `kimi-k2.6`, `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
+| `moonshot_api_key` | Create one in the [Moonshot Console](https://platform.moonshot.cn/console/api-keys) |
+| `moonshot_base_url` | Optional, defaults to `https://api.moonshot.cn/v1` |
 
-OpenAI-compatible configuration is also supported:
+## Image Understanding
+
+Once `moonshot_api_key` is configured, the Agent's Vision tool automatically uses `kimi-k2.6` to recognize images, with no extra setup required.
+
+To manually specify a Vision model:
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "kimi-k2.6",
-  "open_ai_api_base": "https://api.moonshot.cn/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "tools": {
+    "vision": {
+      "model": "kimi-k2.6"
+    }
+  }
 }
 ```
diff --git a/docs/en/models/linkai.mdx b/docs/en/models/linkai.mdx
index 41c29817..f60c2160 100644
--- a/docs/en/models/linkai.mdx
+++ b/docs/en/models/linkai.mdx
@@ -1,9 +1,15 @@
 ---
 title: LinkAI
-description: Unified access to multiple models via LinkAI platform
+description: Access text, vision, image, speech, and embedding capabilities through the LinkAI platform
 ---
 
-The [LinkAI](https://link-ai.tech) platform lets you flexibly switch between OpenAI, Claude, Gemini, DeepSeek, MiniMax, Qwen, Kimi, and other models, with support for knowledge base, workflows, plugins, and other Agent capabilities.
+A single `linkai_api_key` gives you access to all capabilities of mainstream vendors such as OpenAI, Claude, Gemini, DeepSeek, MiniMax, Qwen, Kimi, and Doubao.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
 
 ```json
 {
@@ -14,8 +20,84 @@ The [LinkAI](https://link-ai.tech) platform lets you flexibly switch between Ope
 
 | Parameter | Description |
 | --- | --- |
-| `use_linkai` | Set to `true` to enable LinkAI interface |
-| `linkai_api_key` | Create at [LinkAI Console](https://link-ai.tech/console/interface) |
-| `model` | Leave empty to use the agent's default model. Can be switched flexibly on the platform. All models in the [model list](https://link-ai.tech/console/models) are supported |
+| `use_linkai` | Set to `true` to enable |
+| `linkai_api_key` | Create one in the [LinkAI Console](https://link-ai.tech/console/interface) |
+| `model` | Any model code from the [model list](https://link-ai.tech/console/models) |
 
-See the [API documentation](https://docs.link-ai.tech/platform/api) for more details.
+See [Model Service](https://link-ai.tech/console/models) for more.
+
+## Image Understanding
+
+Once `linkai_api_key` is configured, the Agent's Vision tool automatically calls multimodal models via the gateway, with no extra setup required. To manually specify a Vision model:
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gpt-5.4-mini"
+    }
+  }
+}
+```
+
+Available models: `gpt-4.1-mini`, `gpt-5.4-mini`, `qwen3.6-plus`, `doubao-seed-2-0-pro-260215`, `kimi-k2.6`, `claude-sonnet-4-6`, `gemini-3.1-flash-lite-preview`, etc.
+
+## Image Generation
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gpt-image-2"
+    }
+  }
+}
+```
+
+| Model ID | Vendor / Alias |
+| --- | --- |
+| `gpt-image-2` | OpenAI |
+| `gemini-3.1-flash-image-preview` | Nano Banana 2 |
+| `gemini-3-pro-image-preview` | Nano Banana Pro |
+| `seedream-5.0-lite` | ByteDance Doubao Seedream |
+
+## Speech-to-Text (ASR)
+
+```json
+{
+  "voice_to_text": "linkai"
+}
+```
+
+ASR uses Whisper by default; credentials are automatically reused from `linkai_api_key`.
+
+## Text-to-Speech (TTS)
+
+The TTS gateway supports multiple underlying engines. The engine is selected by `text_to_voice_model`, and the available voices change with the engine.
+
+```json
+{
+  "text_to_voice": "linkai",
+  "text_to_voice_model": "doubao",
+  "tts_voice_id": "BV001_streaming"
+}
+```
+
+| `text_to_voice_model` | Engine |
+| --- | --- |
+| `tts-1` | OpenAI · Multi-language (voices such as `alloy`, `nova`, `echo`) |
+| `doubao` | ByteDance Doubao · Rich Chinese voices |
+| `baidu` | Baidu · Chinese broadcaster voices |
+
+Voices differ by engine; we recommend selecting them visually in the Web Console under "Model Management → Text-to-Speech".
+
+## Embedding
+
+```json
+{
+  "embedding_provider": "linkai",
+  "embedding_model": "text-embedding-3-small"
+}
+```
+
+The default model is `text-embedding-3-small` (OpenAI-compatible). After changing the embedding model, run `/memory rebuild-index` to rebuild the index.
diff --git a/docs/en/models/mimo.mdx b/docs/en/models/mimo.mdx
new file mode 100644
index 00000000..6f808b8e
--- /dev/null
+++ b/docs/en/models/mimo.mdx
@@ -0,0 +1,136 @@
+---
+title: MiMo
+description: Xiaomi MiMo model configuration (Text Chat + Image Understanding + Text-to-Speech)
+---
+
+Xiaomi MiMo is a native omni-modal large model. A single `mimo_api_key` enables text chat, image understanding, and text-to-speech all at once.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console — no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
+
+```json
+{
+  "model": "mimo-v2.5-pro",
+  "mimo_api_key": "YOUR_API_KEY",
+  "mimo_api_base": "https://api.xiaomimimo.com/v1"
+}
+```
+
+| Parameter | Description |
+| --- | --- |
+| `model` | Default recommendation: `mimo-v2.5-pro`; `mimo-v2.5` is also supported |
+| `mimo_api_key` | Create one in the [MiMo Open Platform](https://platform.xiaomimimo.com/console/api-keys) |
+| `mimo_api_base` | Optional, defaults to `https://api.xiaomimimo.com/v1` |
+
+### Model Selection
+
+| Model | Use Case |
+| --- | --- |
+| `mimo-v2.5-pro` | Flagship: native omni-modal + Agent capability, up to 1M tokens context |
+| `mimo-v2.5` | General-purpose, native omni-modal (text / image / video / audio) |
+
+## Thinking Mode
+
+The MiMo V2.5 series enables "thinking mode" by default: the model emits `reasoning_content` (chain-of-thought) before the final answer, improving performance on complex tasks.
+
+Use the global `enable_thinking` flag to toggle visibility (also switchable from the Web Console settings):
+
+```json
+{
+  "enable_thinking": true
+}
+```
+
+## Image Understanding
+
+Once `mimo_api_key` is configured, the Agent's Vision tool can automatically use MiMo's vision models:
+
+- When the main model itself is multimodal (`mimo-v2.5-pro` / `mimo-v2.5`), images are handled directly by the main model with no extra setup.
+- When the main model belongs to another vendor, the Vision tool falls back to `mimo-v2.5-pro`.
+
+To force a specific Vision model, set it explicitly in the configuration:
+
+```json
+{
+  "tools": {
+    "vision": {
+      "provider": "mimo",
+      "model": "mimo-v2.5-pro"
+    }
+  }
+}
+```
+
+## Text-to-Speech (TTS)
+
+```json
+{
+  "text_to_voice": "mimo",
+  "text_to_voice_model": "mimo-v2.5-tts",
+  "tts_voice_id": "冰糖"
+}
+```
+
+| Parameter | Description |
+| --- | --- |
+| `text_to_voice_model` | Currently only `mimo-v2.5-tts` (preset voices + singing mode) |
+| `tts_voice_id` | Preset voice name (Chinese voice IDs use the Chinese name directly) |
+
+### Preset Voices
+
+| Voice ID | Description |
+| --- | --- |
+| `Mia` | English · Female |
+| `Chloe` | English · Female |
+| `Milo` | English · Male |
+| `Dean` | English · Male |
+| `冰糖` | Chinese · Female (default) |
+| `茉莉` | Chinese · Female |
+| `苏打` | Chinese · Male |
+| `白桦` | Chinese · Male |
+
+
+You can also pick a voice visually from the Web Console under "Model Management → Text-to-Speech".
+
+### Style Control
+
+MiMo TTS supports embedding **audio tags** in the synthesis text to control emotion, tone, dialect, persona, and even singing. Tags must appear in the **text that will be synthesized to speech (i.e. the Agent's reply)**, with the overall style tag placed at the very beginning:
+
+```
+(style)content-to-synthesize
+```
+
+Half-width `()`, full-width `（）`, and `[]` brackets are all accepted. Both Chinese and English style descriptors work — pick whichever language expresses the timbre most precisely. Common examples:
+
+| Category | Example tags |
+| --- | --- |
+| Basic emotions | `happy` `sad` `angry` `fear` `surprised` `excited` `aggrieved` `calm` `indifferent` |
+| Compound emotions | `wistful` `relieved` `helpless` `guilty` `at ease` `uneasy` `touched` |
+| Overall tone | `gentle` `aloof` `lively` `serious` `languid` `playful` `deep` `sharp` `cutting` |
+| Voice character | `magnetic` `mellow` `bright` `ethereal` `childlike` `aged` `sweet` `husky` |
+| Persona | `squeaky` `mature lady` `young boy` `uncle` `Taiwanese accent` |
+| Dialect | `Northeastern` `Sichuan` `Henan` `Cantonese` |
+| Role-play | `Sun Wukong` `Lin Daiyu` |
+| Singing | `sing` / `singing` |
+
+Examples:
+
+- `(magnetic)The night is deep, and the city is still breathing.`
+- `(gentle)Take a breath. You've got this.`
+- `(serious)This is the final warning before the system reboots.`
+- `(singing)Oh, when the saints go marching in…`
+
+You can also insert fine-grained audio tags at any position in the text to control breathing, laughter, pauses, etc. For example:
+
+```
+(nervous, deep breath) Phew… stay calm, stay calm. (faster pace) I've rehearsed this intro fifty times, it'll be fine.
+```
+
+See the [MiMo speech synthesis documentation](https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/speech-synthesis-v2.5) for the full tag list.
+
+<Tip>
+  When CowAgent calls TTS, the Agent's reply text (including any `(...)` tags) is forwarded directly to MiMo for synthesis. Tell the model in its persona / system prompt to "prefix replies with a `(style)` tag to control the tone", and IM channels (WeChat / Feishu / DingTalk / WeCom) will play voice replies with the corresponding emotion, dialect, or even singing.
+</Tip>
diff --git a/docs/en/models/minimax.mdx b/docs/en/models/minimax.mdx
index c3137ca2..d945d2ea 100644
--- a/docs/en/models/minimax.mdx
+++ b/docs/en/models/minimax.mdx
@@ -1,8 +1,16 @@
 ---
 title: MiniMax
-description: MiniMax model configuration
+description: MiniMax model configuration (Text / Image Understanding / Image Generation / Text-to-Speech)
 ---
 
+MiniMax supports text chat, image understanding, image generation, and text-to-speech. A single `minimax_api_key` enables all capabilities.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
+
 ```json
 {
   "model": "MiniMax-M2.7",
@@ -12,16 +20,52 @@ description: MiniMax model configuration
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `MiniMax-M2.7`, `MiniMax-M2.5`, `MiniMax-M2.1`, `MiniMax-M2.1-lightning`, `MiniMax-M2`, etc. |
-| `minimax_api_key` | Create at [MiniMax Console](https://platform.minimaxi.com/user-center/basic-information/interface-key) |
+| `model` | Can be `MiniMax-M2.7`, `MiniMax-M2.7-highspeed`, `MiniMax-M2.5`, `MiniMax-M2.1`, `MiniMax-M2.1-lightning`, `MiniMax-M2`, etc. |
+| `minimax_api_key` | Create one in the [MiniMax Console](https://platform.minimaxi.com/user-center/basic-information/interface-key) |
 
-OpenAI-compatible configuration is also supported:
+## Image Understanding
+
+MiniMax's M2.x chat models do not support vision natively; vision calls are uniformly routed to `MiniMax-Text-01`. Once `minimax_api_key` is configured, the Agent's Vision tool automatically uses this model, with no need to specify it explicitly in the configuration file.
+
+## Image Generation
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "MiniMax-M2.7",
-  "open_ai_api_base": "https://api.minimaxi.com/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "skills": {
+    "image-generation": {
+      "model": "image-01"
+    }
+  }
 }
 ```
+
+Available models: `image-01`.
+
+## Text-to-Speech (TTS)
+
+```json
+{
+  "text_to_voice": "minimax",
+  "text_to_voice_model": "speech-2.8-hd",
+  "tts_voice_id": "female-shaonv"
+}
+```
+
+| Parameter | Description |
+| --- | --- |
+| `text_to_voice_model` | `speech-2.8-hd` (emotional rendering, natural sound), `speech-2.8-turbo` (ultra-fast), `speech-2.6-hd`, `speech-2.6-turbo` |
+| `tts_voice_id` | Voice ID; supports Chinese / Cantonese / English / Japanese / Korean — 70+ voices in total |
+
+Common voice examples:
+
+| Voice ID | Description |
+| --- | --- |
+| `female-shaonv` | Chinese · Young Girl (Female) |
+| `female-yujie` | Chinese · Mature Lady (Female) |
+| `female-tianmei` | Chinese · Sweet Female (Female) |
+| `male-qn-jingying` | Chinese · Elite Youth (Male) |
+| `male-qn-badao` | Chinese · Dominant Youth (Male) |
+| `Cantonese_GentleLady` | Cantonese · Gentle Female Voice |
+| `English_Graceful_Lady` | English · Graceful Lady |
+
+For the full voice list (70+ voices across Chinese / Cantonese / English / Japanese / Korean), see the [system voice list](https://platform.minimaxi.com/docs/faq/system-voice-id), or select visually in the Web Console under "Model Management → Text-to-Speech".
diff --git a/docs/en/models/openai.mdx b/docs/en/models/openai.mdx
index dc2c0be7..f8715562 100644
--- a/docs/en/models/openai.mdx
+++ b/docs/en/models/openai.mdx
@@ -1,11 +1,20 @@
 ---
 title: OpenAI
-description: OpenAI model configuration
+description: OpenAI model configuration (Text / Vision / Image / Speech / Embedding)
 ---
 
+OpenAI offers the most complete coverage, serving text chat, vision understanding, image generation, speech-to-text (ASR), text-to-speech (TTS), and embedding. A single `open_ai_api_key` lets the Agent use all of these capabilities.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+
+## Text Chat
+
 ```json
 {
-  "model": "gpt-5.4",
+  "model": "gpt-5.5",
   "open_ai_api_key": "YOUR_API_KEY",
   "open_ai_api_base": "https://api.openai.com/v1"
 }
@@ -13,7 +22,82 @@ description: OpenAI model configuration
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Matches the [model parameter](https://platform.openai.com/docs/models) of the OpenAI API. Supports o-series, gpt-5.4, gpt-5 series, gpt-4.1, etc. Recommended for Agent mode: `gpt-5.4` |
-| `open_ai_api_key` | Create at [OpenAI Platform](https://platform.openai.com/api-keys) |
-| `open_ai_api_base` | Optional. Change to use third-party proxy |
-| `bot_type` | Not required for official OpenAI models. Set to `openai` when using Claude or other non-OpenAI models via proxy |
+| `model` | Same as OpenAI's [model parameter](https://platform.openai.com/docs/models); supports `gpt-5.5`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, the `gpt-5` series, `gpt-4.1`, the o-series, etc. Agent mode defaults to `gpt-5.5`; use `gpt-5.4` for better cost-efficiency |
+| `open_ai_api_key` | Create one on the [OpenAI Platform](https://platform.openai.com/api-keys) |
+| `open_ai_api_base` | Optional; change it to access a third-party proxy |
+| `bot_type` | Not required when using OpenAI's official models; set to `openai` when accessing other vendors via the compatible protocol |
+
+## Image Understanding
+
+OpenAI models such as `gpt-5.5`, `gpt-5.4`, `gpt-4o`, and `gpt-4.1` natively support vision. Once `open_ai_api_key` is configured, the Agent's Vision tool automatically uses the main model to recognize images. If the main model does not support vision, or you want to pin a specific Vision model, set it in the configuration file:
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gpt-5.4-mini"
+    }
+  }
+}
+```
+
+Supported Vision models: `gpt-5.5`, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, `gpt-5`, `gpt-4.1`, `gpt-4.1-mini`, `gpt-4o`.
+
+## Image Generation
+
+Specify the image generation model in the configuration file; the Agent automatically routes image generation skill calls to OpenAI:
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gpt-image-2"
+    }
+  }
+}
+```
+
+Supported image generation models: `gpt-image-2`, `gpt-image-1`.
+
+## Speech-to-Text (ASR)
+
+```json
+{
+  "voice_to_text": "openai",
+  "voice_to_text_model": "gpt-4o-mini-transcribe"
+}
+```
+
+| Parameter | Description |
+| --- | --- |
+| `voice_to_text` | Set to `openai` to enable OpenAI speech-to-text |
+| `voice_to_text_model` | Optional, defaults to `gpt-4o-mini-transcribe`; can also be `gpt-4o-transcribe`, `whisper-1` |
+
+Credentials are automatically reused from `open_ai_api_key`.
+
+## Text-to-Speech (TTS)
+
+```json
+{
+  "text_to_voice": "openai",
+  "text_to_voice_model": "tts-1",
+  "tts_voice_id": "alloy"
+}
+```
+
+| Parameter | Description |
+| --- | --- |
+| `text_to_voice_model` | `tts-1`, `tts-1-hd`, `gpt-4o-mini-tts` |
+| `tts_voice_id` | Voices: `alloy`, `echo`, `fable`, `onyx`, `nova`, `shimmer`, `ash`, `ballad`, `coral`, `sage`, `verse` |
+
+## Embedding
+
+```json
+{
+  "embedding_provider": "openai",
+  "embedding_model": "text-embedding-3-small"
+}
+```
+
+Available models: `text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`. After changing the embedding model, run `/memory rebuild-index` to rebuild the index.
+
diff --git a/docs/en/models/qianfan.mdx b/docs/en/models/qianfan.mdx
index aa88d040..13525967 100644
--- a/docs/en/models/qianfan.mdx
+++ b/docs/en/models/qianfan.mdx
@@ -1,6 +1,6 @@
 ---
-title: Baidu Qianfan / ERNIE
-description: Baidu Qianfan ERNIE model configuration
+title: ERNIE
+description: ERNIE model configuration (Baidu Qianfan)
 ---
 
 Option 1: Native integration (recommended):
@@ -40,7 +40,7 @@ To force a specific Vision model, set it explicitly in `config.json`:
 
 ```json
 {
-  "tool": {
+  "tools": {
     "vision": {
       "model": "ernie-4.5-turbo-vl"
     }
diff --git a/docs/en/models/qwen.mdx b/docs/en/models/qwen.mdx
index 30f7e485..8e27269c 100644
--- a/docs/en/models/qwen.mdx
+++ b/docs/en/models/qwen.mdx
@@ -1,8 +1,16 @@
 ---
-title: Qwen (Tongyi Qianwen)
-description: Tongyi Qianwen model configuration
+title: Qwen
+description: Qwen model configuration (Text / Image Understanding / Image Generation / Speech-to-Text / Text-to-Speech / Embedding)
 ---
 
+Qwen (Alibaba DashScope / Bailian) is one of the most fully-featured vendors. Text, image understanding, image generation, speech-to-text, text-to-speech, and embedding can all be enabled with a single `dashscope_api_key`.
+
+<Tip>
+  All capabilities below can be configured in one place via the "Model Management" page in the Web Console, with no need to manually edit the configuration file.
+</Tip>
+
+## Text Chat
+
 ```json
 {
   "model": "qwen3.6-plus",
@@ -12,16 +20,93 @@ description: Tongyi Qianwen model configuration
 
 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `qwen3.6-plus`, `qwen3.5-plus`, `qwen3-max`, `qwen-max`, `qwen-plus`, `qwen-turbo`, `qwq-plus`, etc. |
-| `dashscope_api_key` | Create at [Bailian Console](https://bailian.console.aliyun.com/?tab=model#/api-key). See [official docs](https://bailian.console.aliyun.com/?tab=api#/api) |
+| `model` | Can be `qwen3.6-plus`, `qwen3.7-max`, `qwen3.5-plus`, `qwen3-max`, `qwen-max`, `qwen-plus`, `qwen-turbo`, `qwq-plus`, etc. |
+| `dashscope_api_key` | Create one in the [Bailian Console](https://bailian.console.aliyun.com/?tab=model#/api-key); see the [official docs](https://bailian.console.aliyun.com/?tab=api#/api) |
 
-OpenAI-compatible configuration is also supported:
+## Image Understanding
+
+Once `dashscope_api_key` is configured, the Agent's Vision tool automatically calls Qwen's vision models to recognize images. Models such as `qwen3-max`, `qwen3.5-plus`, and `qwen3.6-plus` are already multimodal; if the main model is text-only (e.g. `qwen-turbo`), the Vision tool automatically falls back to `qwen-vl-max`.
+
+To manually specify a Vision model:
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "qwen3.6-plus",
-  "open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "tools": {
+    "vision": {
+      "model": "qwen3.6-plus"
+    }
+  }
 }
 ```
+
+Supported models: `qwen3.6-plus`, `qwen3.5-plus`, `qwen3-max`.
+
+## Image Generation
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "qwen-image-2.0"
+    }
+  }
+}
+```
+
+Available models: `qwen-image-2.0`, `qwen-image-2.0-pro`.
+
+## Speech-to-Text (ASR)
+
+```json
+{
+  "voice_to_text": "dashscope",
+  "voice_to_text_model": "qwen3-asr-flash"
+}
+```
+
+| Parameter | Description |
+| --- | --- |
+| `voice_to_text` | Set to `dashscope` to enable Qwen ASR |
+| `voice_to_text_model` | Optional, defaults to `qwen3-asr-flash` |
+
+Credentials are automatically reused from `dashscope_api_key`. A single audio segment should be smaller than 10 MB and no longer than 300 seconds.
+
+## Text-to-Speech (TTS)
+
+```json
+{
+  "text_to_voice": "dashscope",
+  "text_to_voice_model": "qwen3-tts-flash",
+  "tts_voice_id": "Cherry"
+}
+```
+
+| Parameter | Description |
+| --- | --- |
+| `text_to_voice_model` | Optional, defaults to `qwen3-tts-flash`; covers Mandarin, dialects, and major foreign languages |
+| `tts_voice_id` | Voice ID; see the common list below |
+
+Common voice examples:
+
+| Voice ID | Description |
+| --- | --- |
+| `Cherry` | Qianyue · Sunny Female Voice |
+| `Serena` | Suyao · Gentle Female Voice |
+| `Ethan` | Chenxu · Sunny Male Voice |
+| `Chelsie` | Qianxue · Anime Girl |
+| `Dylan` | Beijing Dialect · Xiaodong |
+| `Rocky` | Cantonese · Aqiang |
+| `Sunny` | Sichuan Dialect · Qing'er |
+
+The full voice list (Mandarin / regional dialects / bilingual, etc.) can be selected visually in the Web Console under "Model Management → Text-to-Speech".
+
+## Embedding
+
+```json
+{
+  "embedding_provider": "dashscope",
+  "embedding_model": "text-embedding-v4"
+}
+```
+
+The default model is `text-embedding-v4`. After changing the embedding model, run `/memory rebuild-index` to rebuild the index.
diff --git a/docs/en/releases/overview.mdx b/docs/en/releases/overview.mdx
index b0a0dca1..ce932884 100644
--- a/docs/en/releases/overview.mdx
+++ b/docs/en/releases/overview.mdx
@@ -5,12 +5,15 @@ description: CowAgent version history
 
 | Version | Date | Description |
 | --- | --- | --- |
+| [2.0.9](/en/releases/v2.0.9) | 2026.05.22 | Model management console, MCP protocol support, browser persistent login, new models (gpt-5.5, gemini-3.5-flash, qwen3.7-max, etc.), deployment hardening |
+| [2.0.8](/en/releases/v2.0.8) | 2026.05.06 | Major Feishu channel upgrade (voice, streaming and Markdown, one-click QR-scan setup), DeepSeek V4 and Baidu models, scheduler tool enhancements |
 | [2.0.7](/en/releases/v2.0.7) | 2026.04.22 | Image Generation Skill (6-provider auto-routing), new models (Kimi K2.6, Claude Opus 4.7, GLM 5.1), knowledge base and Web Console improvements |
-| [2.0.6](/en/releases/v2.0.6) | 2026.04.14 | Knowledge Base, Deep Dream Memory Distillation, Smart Context Compression, Web Console upgrades |
+| [2.0.6](/en/releases/v2.0.6) | 2026.04.14 | Project rename, Knowledge Base system, Deep Dream Memory Distillation, Smart Context Compression, Web Console multi-session and various improvements |
 | [2.0.5](/en/releases/v2.0.5) | 2026.04.01 | Cow CLI, Skill Hub open source, Browser tool, WeCom Bot QR scan, and more |
 | [2.0.4](/en/releases/v2.0.4) | 2026.03.22 | Personal WeChat channel, new model support, Japanese docs, script refactoring and bug fixes |
+| [2.0.3](/en/releases/v2.0.3) | 2026.03.18 | WeCom Smart Bot and QQ channels, Coding Plan support, multiple new models, Web file processing, memory system upgrade |
 | [2.0.2](/en/releases/v2.0.2) | 2026.02.27 | Web Console upgrade, multi-channel concurrency, session persistence |
-| [2.0.1](/en/releases/v2.0.1) | 2026.02.27 | Built-in Web Search tool, smart context management, multiple fixes |
+| [2.0.1](/en/releases/v2.0.1) | 2026.02.13 | Built-in Web Search tool, smart context management, multiple fixes |
 | [2.0.0](/en/releases/v2.0.0) | 2026.02.03 | Full upgrade to AI super assistant |
 | 1.7.6 | 2025.05.23 | Web Channel optimization, AgentMesh plugin |
 | 1.7.5 | 2025.04.11 | DeepSeek model |
@@ -21,6 +24,8 @@ description: CowAgent version history
 | 1.6.9 | 2024.07.19 | gpt-4o-mini, Alibaba voice recognition |
 | 1.6.8 | 2024.07.05 | Claude 3.5, Gemini 1.5 Pro |
 | 1.6.0 | 2024.04.26 | Kimi integration, gpt-4-turbo upgrade |
+| 1.5.8 | 2024.03.26 | GLM-4, Claude-3, edge-tts |
+| 1.5.2 | 2023.11.10 | Feishu channel, image recognition chat |
 | 1.5.0 | 2023.11.10 | gpt-4-turbo, dall-e-3, tts multimodal |
 | 1.0.0 | 2022.12.12 | Project created, first ChatGPT integration |
 
diff --git a/docs/en/releases/v2.0.3.mdx b/docs/en/releases/v2.0.3.mdx
index 9547cc43..5f9a837d 100644
--- a/docs/en/releases/v2.0.3.mdx
+++ b/docs/en/releases/v2.0.3.mdx
@@ -34,7 +34,7 @@ Related commits: [30c6d9b](https://github.com/zhayujie/CowAgent/commit/30c6d9b)
 
 ## 💰 Coding Plan Support
 
-Added integration with vendor Coding Plan (monthly programming subscription) tiers via the unified OpenAI-compatible path. Supported vendors include Aliyun, MiniMax, Zhipu GLM, Kimi, and Volcengine.
+Added integration with vendor Coding Plan (monthly programming subscription) tiers via the unified OpenAI-compatible path. Supported vendors include Aliyun, MiniMax, GLM, Kimi, and Volcengine.
 
 See [Coding Plan docs](https://docs.cowagent.ai/en/models/coding-plan) for detailed configuration.
 
diff --git a/docs/en/releases/v2.0.7.mdx b/docs/en/releases/v2.0.7.mdx
index 3519812c..522e5339 100644
--- a/docs/en/releases/v2.0.7.mdx
+++ b/docs/en/releases/v2.0.7.mdx
@@ -11,7 +11,7 @@ New built-in `image-generation` skill supporting text-to-image, image-to-image,
 - **Zero model selection**: Just configure an API key and it works — no need to manually specify a model. You can also name a specific model in conversation (e.g. "draw a cat with seedream")
 - **Flexible control**: Supports `quality`, `size` (512/1K–4K), and `aspect_ratio` parameters, with each provider automatically mapping to its supported values
 - **Image editing**: Pass existing images for editing, style transfer, or multi-image fusion (Seedream supports up to 14 reference images)
-- **Skill-level config**: Pin a default model via `skill.image-generation.model` in `config.json`
+- **Skill-level config**: Pin a default model via `skills.image-generation.model` in `config.json`
 - **Image lightbox**: All images in the Web console now support click-to-enlarge preview
 
 Docs: [Image Generation Skill](https://docs.cowagent.ai/en/skills/image-generation)
diff --git a/docs/en/releases/v2.0.8.mdx b/docs/en/releases/v2.0.8.mdx
index 13a63d3a..3fcc29da 100644
--- a/docs/en/releases/v2.0.8.mdx
+++ b/docs/en/releases/v2.0.8.mdx
@@ -1,6 +1,6 @@
 ---
 title: v2.0.8
-description: CowAgent 2.0.8 - Major Feishu channel upgrade (voice, streaming typewriter, one-click QR app creation), DeepSeek V4 / Baidu Qianfan ERNIE 5.0 support, scheduler memory enhancements and multiple fixes
+description: CowAgent 2.0.8 - Major Feishu channel upgrade (voice, streaming typewriter, one-click QR app creation), DeepSeek V4 / ERNIE 5.0 support, scheduler memory enhancements and multiple fixes
 ---
 
 ## 🪶 Major Feishu Channel Upgrade
@@ -30,9 +30,9 @@ The voice and streaming building blocks come from a community contribution #2791
 
 - **DeepSeek V4 series**: Added `deepseek-v4-pro` / `deepseek-v4-flash`, with `deepseek-v4-flash` set as the new default
 - **Unified thinking-mode toggle**: DeepSeek V4, Qwen3 and other thinking-capable models now share the same `enable_thinking` switch
-- **Baidu Qianfan / ERNIE first-class integration**: New `qianfan` provider supporting `ernie-5.0` (default recommendation), `ernie-x1.1`, `ernie-4.5-turbo-128k`, `ernie-4.5-turbo-32k`. Dedicated `qianfan_api_key` / `qianfan_api_base` settings keep OpenAI config clean; legacy `wenxin` / `wenxin-4` paths are fully preserved. #2790 Thanks [@jimmyzhuu](https://github.com/jimmyzhuu)
+- **ERNIE first-class integration**: New `qianfan` provider supporting `ernie-5.0` (default recommendation), `ernie-x1.1`, `ernie-4.5-turbo-128k`, `ernie-4.5-turbo-32k`. Dedicated `qianfan_api_key` / `qianfan_api_base` settings keep OpenAI config clean; legacy `wenxin` / `wenxin-4` paths are fully preserved. #2790 Thanks [@jimmyzhuu](https://github.com/jimmyzhuu)
 
-  Documentation: [Baidu Qianfan / ERNIE](https://docs.cowagent.ai/en/models/qianfan)
+  Documentation: [ERNIE](https://docs.cowagent.ai/en/models/qianfan)
 
 ## 🌐 Translation Provider
 
@@ -51,7 +51,7 @@ The voice and streaming building blocks come from a community contribution #2791
 
 ## 🔧 Tools and Safety
 
-- **Vision model selection**: `tool.vision.model` config now actually takes effect, with automatic fallback when unconfigured #2792
+- **Vision model selection**: `tools.vision.model` config now actually takes effect, with automatic fallback when unconfigured #2792
 - **Bash safety prompt**: The destructive-deletion confirm prompt is now scoped to paths outside the workspace — routine in-workspace operations are no longer interrupted
 
 ## 🐛 Other Fixes
diff --git a/docs/en/releases/v2.0.9.mdx b/docs/en/releases/v2.0.9.mdx
new file mode 100644
index 00000000..ccae36fc
--- /dev/null
+++ b/docs/en/releases/v2.0.9.mdx
@@ -0,0 +1,65 @@
+---
+title: v2.0.9
+description: CowAgent 2.0.9 - Web Console model management, MCP protocol support, browser persistent login, new models and deployment hardening
+---
+
+## 🖥️ Model Management Console
+
+The Web Console adds a new **Models** page that organizes everything by **provider × capability**, covering chat, image, voice, embedding and search models in one place:
+
+- **Per-provider configuration**: Each provider's API Key / API Base is configured once at the top, and every capability below picks it up automatically — no more re-entering credentials
+- **Image models**: Image understanding and image generation can each pick their own provider and model independently; falls back to the main model when unspecified
+- **Voice models**: ASR (speech-to-text) and TTS (text-to-speech) can be configured independently, with new Qwen and Zhipu ASR/TTS models added
+- **Embedding models**: Configurable embedding models (used for memory and knowledge-base retrieval), with new support for OpenAI, Tongyi, Doubao, Zhipu and others; run `/memory rebuild-index` after switching to rebuild the index online
+- **Search capability**: Web search has been upgraded to support Bocha, Baidu, Zhipu and more providers — in auto mode the agent can synthesize results from multiple sources for deeper research
+
+Documentation: [Models Overview](https://docs.cowagent.ai/en/models)
+
+<img width="720" alt="20260522113305" src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-models-config.png" />
+
+
+## 🧩 MCP Protocol Support
+
+Adds support for **MCP (Model Context Protocol)**, expanding from a fixed built-in toolset to an open, pluggable tool ecosystem — any MCP-compatible service can be plugged in directly as an agent tool.
+
+- Native JSON-RPC implementation, zero extra dependencies, supports both `stdio` and `sse` transports
+- Compatible with the `mcpServers` configuration style used by Claude Desktop / Cursor, reads `~/cow/mcp.json` by default
+
+Documentation: [MCP Tools](https://docs.cowagent.ai/en/tools/mcp). Thanks [@yangluxin613](https://github.com/yangluxin613) (#2801)
+
+## 🌐 Browser Persistent Login
+
+For sites that require login or have anti-bot protection, the browser tool can now persist a login session for long-term reuse, and supports attaching to your real Chrome browser to bypass fingerprint detection:
+
+- **Persistent user profile (default)**: Uses `~/.cow/browser_profile` as the browser user data dir by default; once logged in, sessions are reused automatically on subsequent runs
+- **CDP mode**: Configure `tools.browser.cdp_endpoint` to take over a real Chrome instance with full browser permissions
+
+Documentation: [Browser Tool](https://docs.cowagent.ai/en/tools/browser). Thanks [@leafmove](https://github.com/leafmove) (#2809)
+
+## 🤖 New Models and Improvements
+
+- **New models**: `gpt-5.5`, `gemini-3.5-flash`, `qwen3.7-max`, `ernie-5.1`
+- **Improvements**: DeepSeek V4 supports the `reasoning_effort` thinking-depth parameter; fixed an issue where thinking models such as MiMo failed to connect via the OpenAI-compatible protocol
+
+## 🔒 Deployment & Security
+
+- **Bind to localhost by default**: The Web Console `web_host` now defaults to `127.0.0.1`; for server deployments, set it to `0.0.0.0` and configure a password manually. Thanks @August829, @yidaozhongqing, @YLChen-007, @icysun
+- **Fully bundled frontend assets**: All third-party CSS / JS are now served locally — the console works offline and on intranet deployments. Thanks [@gitlayzer](https://github.com/gitlayzer) (#2816)
+
+## 🛠 UX Improvements & Fixes
+
+- **TTS rolls out to more channels**: Web Console, Personal WeChat, Feishu, DingTalk and WeCom Smart Bot all support voice replies — see the [Channels Overview](https://docs.cowagent.ai/en/channels)
+- **Log panel enhancements**: Differentiated highlighting by log level, with level-based filtering. Thanks [@yangluxin613](https://github.com/yangluxin613) (#2807)
+- **Auto-launch Web Console**: The Web Console now opens automatically on startup. Thanks [@yangluxin613](https://github.com/yangluxin613) (#2804)
+- **Clean Ctrl+C exit**: No more long `KeyboardInterrupt` stack traces. Thanks [@yangluxin613](https://github.com/yangluxin613) (#2806)
+- **Folder upload**: Web Console supports directory uploads, with path validation adapted for Windows. Thanks [@TryToMakeUsBetter](https://github.com/TryToMakeUsBetter) (#2814)
+- Fixed scheduled tasks running more than once under certain conditions. Thanks [@CNXudiandian](https://github.com/CNXudiandian) (#2820)
+- Fixed one-shot scheduled tasks with a timezone set not firing. Thanks @AethericSpace
+- Fixed failed tool calls not being displayed after page refresh. Thanks [@a1094174619](https://github.com/a1094174619) (#2822)
+- Fixed WeCom bot messages with illegal control characters failing to be delivered. Thanks [@Jacques-Zhao](https://github.com/Jacques-Zhao) (#2810)
+
+## 📦 Upgrade
+
+Source-code deployments can run `cow update` for a one-click upgrade, or pull the latest code and restart manually. See the [Upgrade Guide](https://docs.cowagent.ai/en/guide/upgrade) for details.
+
+**Release Date**: 2026.05.22 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.8...2.0.9)
diff --git a/docs/en/skills/hub.mdx b/docs/en/skills/hub.mdx
new file mode 100644
index 00000000..0a9e73e1
--- /dev/null
+++ b/docs/en/skills/hub.mdx
@@ -0,0 +1,65 @@
+---
+title: Skill Hub
+description: Browse, search, and install AI Agent skills
+---
+
+[Cow Skill Hub](https://skills.cowagent.ai/) is an open-source skill marketplace for AI Agents, aggregating official picks, community contributions, and third-party skills from GitHub, ClawHub, and beyond.
+
+Source code: [github.com/zhayujie/cow-skill-hub](https://github.com/zhayujie/cow-skill-hub)
+
+<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="800" />
+
+## Features
+
+- **Browse skills** — filter by category (Featured / Community / Third-party) and tags
+- **Search skills** — find skills by name or description
+- **View details** — read the skill manifest, file contents, install command, and required environment variables
+- **One-click install** — copy the install command and run it in CowAgent
+
+## Installing a skill
+
+Run the install command in chat or in your terminal:
+
+<CodeGroup>
+```text Chat
+/skill install <name>
+```
+
+```bash Terminal
+cow skill install <name>
+```
+</CodeGroup>
+
+You can also browse the marketplace directly from chat:
+
+```text
+/skill list --remote
+/skill search <keyword>
+```
+
+Beyond the curated list, you can install third-party skills from **GitHub, ClawHub, LinkAI, or any URL** via the CLI. See [Installing skills](/en/skills/install) for details.
+
+## Contributing a skill
+
+To submit your own skill:
+
+1. Visit [skills.cowagent.ai/submit](https://skills.cowagent.ai/submit)
+2. Sign in with GitHub or Google
+3. Upload a folder or zip file containing `SKILL.md`
+4. Skill name, display name, and description are auto-detected — adjust as needed
+5. Submit for review; skills go live after security and quality checks
+
+<img src="https://cdn.link-ai.tech/doc/20260401111904.png" width="800" />
+
+Skill file layout:
+
+```
+your-skill/
+├── SKILL.md        # required, in the root
+├── scripts/        # optional, runtime scripts
+└── resources/      # optional, additional assets
+```
+
+<Tip>
+  Skills are built around the `SKILL.md` manifest. You can also download `SKILL.md` from a skill's detail page and use it with any Agent that supports custom instructions (OpenClaw, Cursor, Claude Code, and more).
+</Tip>
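Note on the skill layout described in `hub.mdx` above: the layout rules (a required `SKILL.md` in the root, optional `scripts/` and `resources/` folders) can be checked locally before uploading. The following is a minimal sketch under stated assumptions; `check_skill_layout` is a hypothetical helper and is not part of CowAgent or the Skill Hub.

```python
from pathlib import Path


def check_skill_layout(skill_dir: Path) -> list[str]:
    """Return a list of layout problems; an empty list means the folder looks OK.

    Hypothetical pre-upload check based only on the documented layout.
    """
    if not skill_dir.is_dir():
        return [f"{skill_dir} is not a directory"]

    problems = []
    # SKILL.md is the only required file and must sit in the skill root.
    if not (skill_dir / "SKILL.md").is_file():
        problems.append("SKILL.md is missing from the skill root")

    # scripts/ and resources/ are optional, but should be folders if present.
    for optional in ("scripts", "resources"):
        path = skill_dir / optional
        if path.exists() and not path.is_dir():
            problems.append(f"{optional} exists but is not a directory")
    return problems
```

Running a check like this before step 3 of the submission flow catches the most common mistake, a `SKILL.md` nested one level too deep inside the archive.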
diff --git a/docs/en/skills/image-generation.mdx b/docs/en/skills/image-generation.mdx
index 49c0ed7d..608fa3bc 100644
--- a/docs/en/skills/image-generation.mdx
+++ b/docs/en/skills/image-generation.mdx
@@ -1,149 +1,89 @@
 ---
-title: image-generation - Image Generation
+title: image-generation
 description: Text-to-image / image-to-image / multi-image fusion with automatic multi-provider routing and fallback
 ---
 
-A general-purpose image generation and editing skill supporting six providers: OpenAI, Gemini, Seedream (Volcengine Ark), Qwen (DashScope), MiniMax, and LinkAI. No need to choose a model manually — the script automatically selects a configured provider based on a fixed priority order.
+A general-purpose image generation and editing skill supporting six providers: OpenAI, Gemini, Seedream (Volcengine Ark), Qwen (DashScope), MiniMax, and LinkAI. Configure any one provider's key to start using it; configure multiple to enable automatic fallback.
 
-## Model Selection
-
-`image-generation` uses a "fixed priority + automatic fallback" strategy — just configure your keys and it works:
-
-1. **Priority order**: `OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI`
-2. **Unconfigured providers are skipped**: only providers with an API key participate
-3. **Automatic fallback on failure**: on errors like 401, model not enabled, or network issues, the next provider is tried
-4. **Specified model goes first**: if a specific model name is provided, its provider is promoted to the front
-
-### Supported Models
+## Supported Models
 
 | Provider | Models / Aliases | Notes |
 | --- | --- | --- |
 | OpenAI | `gpt-image-2`, `gpt-image-1` | General-purpose, high quality, supports `quality` parameter |
-| Gemini Nano Banana | `nano-banana-2`, `nano-banana-pro`, `nano-banana` | Corresponds to `gemini-3.1-flash`, `gemini-3-pro`, `gemini-2.5-flash` image variants |
+| Gemini Nano Banana | `nano-banana-2`, `nano-banana-pro`, `nano-banana` | Corresponds to the image variants of `gemini-3.1-flash`, `gemini-3-pro`, `gemini-2.5-flash` |
 | Seedream (Volcengine Ark) | `seedream-5.0-lite`, `seedream-4.5` | Native 2K–4K, up to 14 reference images for fusion |
 | Qwen (DashScope) | `qwen-image-2.0`, `qwen-image-2.0-pro` | Strong with Chinese text rendering and text-image layouts |
-| MiniMax | `image-01` | Fast and simple image generation |
-| LinkAI | Any model | Universal proxy, used as fallback |
+| MiniMax | `image-01` | Fast and simple |
+| LinkAI | Any model | Universal gateway, used as fallback |
 
-<Note>
-By default, the Agent does not pick a model — it uses automatic routing. If you want a specific model, just say so in the conversation, e.g. "use seedream to draw a cat" or "generate a poster with gpt-image-2". You can also pin a default model via the "Custom Configuration" section below.
-</Note>
+## Model Selection
 
-## Custom Configuration
+By default, "auto routing + automatic fallback" is used:
 
-### API Key Setup
+1. Pick the first configured provider in the order `OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI`
+2. On errors such as 401, model not enabled, or network issues, automatically switch to the next provider
+3. If the user specifies a model in the conversation (e.g. "use seedream to draw a cat"), the corresponding provider is promoted to the front
 
-You need **at least one** provider key. Configuring multiple providers enables automatic fallback. There are three ways to set up keys:
-
-#### Option 1: Automatic Reuse of Existing Keys
-
-If you have already configured model keys in the web console or `config.json` (e.g. `openai_api_key`, `gemini_api_key`, etc.), these keys are **automatically synced** to the corresponding environment variables at startup. In other words, if your chat model works, image generation can use the same key with zero extra configuration.
-
-#### Option 2: Configure in config.json
-
-Add the key fields directly to `config.json`:
+To pin a specific model:
 
 ```json
 {
-  "openai_api_key": "sk-xxx",
-  "openai_api_base": "https://api.openai.com/v1",
-  "gemini_api_key": "AIza-xxx",
-  "ark_api_key": "xxx",
-  "dashscope_api_key": "sk-xxx",
-  "minimax_api_key": "xxx",
-  "linkai_api_key": "xxx"
-}
-```
-
-A restart is required after changes. Each key also has a corresponding `*_api_base` field for custom endpoints.
-
-#### Option 3: Configure via Conversation
-
-Send an API key in the chat and the Agent will save it to `~/cow/.env` using the `env_config` tool — **no restart needed**. For example:
-
-```
-Set OPENAI_API_KEY to sk-xxx
-```
-
-Or:
-
-```
-Configure ARK_API_KEY as xxx
-```
-
-### API Key Reference
-
-| Environment Variable | config.json Field | Provider | Default Base URL |
-| --- | --- | --- | --- |
-| `OPENAI_API_KEY` | `openai_api_key` | OpenAI | `https://api.openai.com/v1` |
-| `GEMINI_API_KEY` | `gemini_api_key` | Gemini | `https://generativelanguage.googleapis.com` |
-| `ARK_API_KEY` | `ark_api_key` | Volcengine Ark (Seedream) | `https://ark.cn-beijing.volces.com/api/v3` |
-| `DASHSCOPE_API_KEY` | `dashscope_api_key` | Alibaba DashScope (Qwen) | `https://dashscope.aliyuncs.com` |
-| `MINIMAX_API_KEY` | `minimax_api_key` | MiniMax | `https://api.minimaxi.com` |
-| `LINKAI_API_KEY` | `linkai_api_key` | LinkAI | `https://api.link-ai.tech` |
-
-### Pinning a Default Model
-
-To force all image generation through a specific provider's model, add this to `config.json`:
-
-```json
-"skill": {
-  "image-generation": {
-    "model": "seedream-5.0-lite"
+  "skills": {
+    "image-generation": {
+      "model": "seedream-5.0-lite"
+    }
   }
 }
 ```
 
-At startup, this is automatically converted to the environment variable `SKILL_IMAGE_GENERATION_MODEL`, and the script will always use this model's provider for generation.
+## Configuring API Keys
+
+<Tip>
+  It is recommended to configure providers from the "Model Management" page in the [Web console](/en/channels/web). Chat model keys configured there are automatically reused by the image generation skill — no need to set them twice. You can also edit the configuration file manually or temporarily set keys in a conversation using the `env_config` tool.
+</Tip>
+
+Credentials are shared with the main model providers:
+
+| Field | Provider |
+| --- | --- |
+| `openai_api_key` | OpenAI |
+| `gemini_api_key` | Gemini |
+| `ark_api_key` | Volcengine Ark (Seedream) |
+| `dashscope_api_key` | Alibaba DashScope (Qwen) |
+| `minimax_api_key` | MiniMax |
+| `linkai_api_key` | LinkAI |
+
 
 ## Enabling and Disabling
 
-`image-generation` is a built-in skill that **automatically adjusts its status based on API keys**:
+The skill automatically adjusts its status based on API keys:
 
-- **Key configured**: the skill is active — the Agent will invoke it when asked to draw
-- **Key not configured**: the skill still appears in context (marked as "needs configuration") — the Agent will guide the user to set up a key rather than failing silently
+- **Key configured**: the Agent calls the skill directly when it receives a drawing request
+- **Key not configured**: the skill still appears in context (marked as "needs configuration") — the Agent will guide the user to set up a key
 
 To control it manually:
 
 ```text
-/skill disable image-generation    # Disable (won't be invoked even if keys are present)
+/skill disable image-generation    # Disable
 /skill enable image-generation     # Re-enable
 ```
 
-In the terminal: `cow skill disable image-generation` / `cow skill enable image-generation`.
+Equivalent terminal commands: `cow skill disable image-generation` / `cow skill enable image-generation`.
 
 ## Parameters
 
 | Parameter | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
 | `prompt` | string | Yes | — | Image description |
-| `image_url` | string / list | No | null | Input image(s) for editing — local path or URL. Pass multiple for multi-image fusion |
-| `quality` | string | No | auto | `low` / `medium` / `high` — only some providers support this |
+| `image_url` | string / list | No | null | Input image for editing — local path or URL; pass a list for multi-image fusion |
+| `quality` | string | No | auto | `low` / `medium` / `high`, supported only by some providers |
 | `size` | string | No | auto | `512` / `1K` / `2K` / `3K` / `4K`, or pixel value like `1024x1024` |
 | `aspect_ratio` | string | No | null | `1:1` / `3:2` / `2:3` / `16:9` / `9:16` / `21:9`; Gemini also supports `1:4` / `4:1` / `1:8` / `8:1` |
 
 <Warning>
-**Higher quality and larger size cost more and take longer.**
-
-- For everyday conversations and quick previews, use the defaults (`auto`) or `quality=low` + `size=1K` — roughly 20 seconds
-- For posters or when the user explicitly asks for high resolution, use `quality=high` + `size=2K/4K` — may take 1–5 minutes depending on the model
+  **Higher quality and larger size cost more and take longer.** For everyday conversations, use the defaults (`auto`) or `quality=low` + `size=1K` — about 20 seconds per image. For posters or when high resolution is explicitly requested, use `quality=high` + `size=2K/4K` — may take 1–5 minutes.
 </Warning>
 
-## Output
-
-On success:
-
-```json
-{
-  "model": "doubao-seedream-5-0-260128",
-  "images": [
-    {"url": "/path/to/output.png"}
-  ]
-}
-```
-
-On failure: `{ "error": "..." }`. After an error, **do not retry directly** — it is almost always a configuration issue (wrong key, incorrect API base, model not enabled). Have the user fix the configuration first.
-
 ## Common Use Cases
 
 - **Text-to-image**: generate illustrations, posters, icons, avatars, storyboards, etc. from a description
@@ -151,8 +91,8 @@ On failure: `{ "error": "..." }`. After an error, **do not retry directly** —
 - **Multi-image fusion**: combine multiple reference images into one (outfit swaps, character group photos, etc.)
 
 <Note>
-- Bash timeout should be set to 600 seconds. Each provider has a 300-second HTTP timeout, but the script may try multiple providers sequentially
+- Bash timeout should be set to 600 seconds: each provider has a 300-second HTTP timeout, and the script may try multiple providers sequentially
 - Input images are automatically compressed to ≤ 4 MB with the longest edge ≤ 4096 px
-- Gemini / Seedream / Qwen / MiniMax do not support the `quality` parameter — passing it has no effect
+- Gemini / Seedream / Qwen / MiniMax do not support the `quality` parameter
 - Seedream defaults to 2K; `seedream-5.0-lite` supports up to 3K; `seedream-4.5` supports up to 4K
 </Note>
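Note on the "Model Selection" rules in `image-generation.mdx` above: the three routing steps (fixed order, fallback on error, promotion of a user-specified provider) can be sketched as follows. This is an illustration of the documented behavior, not the skill's actual script; the lowercase provider labels, `routing_order`, `generate_with_fallback`, and the `call` callback are all hypothetical names.

```python
from typing import Callable, Optional

# Documented priority order; the lowercase labels are illustrative.
PROVIDER_ORDER = ["openai", "gemini", "seedream", "qwen", "minimax", "linkai"]


def routing_order(configured: set[str], preferred: Optional[str] = None) -> list[str]:
    """Providers to try, in order: only configured ones, preferred one first."""
    order = [p for p in PROVIDER_ORDER if p in configured]
    if preferred in order:
        order.remove(preferred)
        order.insert(0, preferred)
    return order


def generate_with_fallback(
    prompt: str,
    configured: set[str],
    call: Callable[[str, str], str],
    preferred: Optional[str] = None,
) -> tuple[str, str]:
    """Try each provider in turn; return (provider, result) from the first success."""
    errors = {}
    for provider in routing_order(configured, preferred):
        try:
            return provider, call(provider, prompt)
        except Exception as exc:  # e.g. 401, model not enabled, network error
            errors[provider] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")
```

Because unconfigured providers never enter the list, configuring a single key yields a one-element order and no fallback, which matches the page's opening paragraph.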
diff --git a/docs/en/skills/install.mdx b/docs/en/skills/install.mdx
index f52b30ce..7a70205f 100644
--- a/docs/en/skills/install.mdx
+++ b/docs/en/skills/install.mdx
@@ -3,11 +3,11 @@ title: Install Skills
 description: Install skills from multiple sources with a single command
 ---
 
-CowAgent supports installing skills from **Cow Skill Hub, GitHub, ClawHub**, and any URL with a unified `install` command. Use `/skill install` in chat or `cow skill install` in the terminal.
+CowAgent supports installing skills from [Cow Skill Hub](https://skills.cowagent.ai/), GitHub, ClawHub, LinkAI, and any URL via a unified `install` command. Use `/skill install` in chat or `cow skill install` in the terminal.
 
-## From Skill Hub
+## From the Skill Hub
 
-Browse the Skill Hub and install:
+Browse all available skills at [skills.cowagent.ai](https://skills.cowagent.ai/) and install by name:
 
 ```text
 /skill list --remote
@@ -16,7 +16,7 @@ Browse the Skill Hub and install:
 
 ## From GitHub
 
-Supports batch install from repositories and single skill from subdirectories:
+Any GitHub-hosted skill can be installed directly. Supports both repository-level batch install and subdirectory-level single install:
 
 ```text
 /skill install larksuite/cli
@@ -25,10 +25,22 @@ Supports batch install from repositories and single skill from subdirectories:
 
 ## From ClawHub
 
+All [ClawHub](https://clawhub.ai/) skills (40k+) can be installed with a single command:
+
 ```text
-/skill install clawhub:baidu-search
+/skill install clawhub:<name>
 ```
 
+## From LinkAI
+
+All public resources on [LinkAI](https://link-ai.tech/console) (10k+ apps / workflows / plugins), as well as your own resources (apps, workflows, knowledge bases, databases, plugins), can be installed via:
+
+```text
+/skill install linkai:<code>
+```
+
+> Every resource created on the LinkAI platform has a unique `code`. Find it on each resource's page in the [console](https://link-ai.tech/console).
+
 ## From URL
 
 Supports zip archives and SKILL.md file links:
diff --git a/docs/en/skills/knowledge-wiki.mdx b/docs/en/skills/knowledge-wiki.mdx
index 9b54aad0..14ae9c90 100644
--- a/docs/en/skills/knowledge-wiki.mdx
+++ b/docs/en/skills/knowledge-wiki.mdx
@@ -1,5 +1,5 @@
 ---
-title: knowledge-wiki - Knowledge Base
+title: knowledge-wiki
 description: Maintain a local structured knowledge base with automatic archiving, categorisation, and cross-referencing
 ---
 
diff --git a/docs/en/skills/skill-creator.mdx b/docs/en/skills/skill-creator.mdx
index 2753cd45..58853f52 100644
--- a/docs/en/skills/skill-creator.mdx
+++ b/docs/en/skills/skill-creator.mdx
@@ -1,5 +1,5 @@
 ---
-title: skill-creator - Skill Creator
+title: skill-creator
 description: Create, install, and update skills — standardises SKILL.md format and directory structure
 ---
 
diff --git a/docs/en/tools/mcp.mdx b/docs/en/tools/mcp.mdx
index 9978c46e..fc320fe0 100644
--- a/docs/en/tools/mcp.mdx
+++ b/docs/en/tools/mcp.mdx
@@ -34,7 +34,9 @@ Fully compatible with the MCP community standard, identical to Claude Desktop /
 | `command` | stdio | Executable to launch the server (e.g. `npx`, `python`, `uvx`) |
 | `args` | No | Arguments passed to `command` |
 | `env` | No | Environment variables for the subprocess, commonly used for API keys |
-| `url` | SSE | SSE endpoint URL (alternative to `command`) |
+| `url` | SSE / Streamable HTTP | Remote endpoint URL (alternative to `command`) |
+| `type` | Remote | Remote transport type: `sse` or `streamable-http` (defaults to `sse`) |
+| `headers` | No | Extra HTTP headers for remote requests (e.g. `Authorization`); Streamable HTTP only |
 | `disabled` | No | When `true`, this server is skipped — handy for temporary disabling |
 
 ### Full Example
@@ -88,7 +90,8 @@ The Agent will:
 | Transport | Description | Config Field |
 | --- | --- | --- |
 | **stdio** | Subprocess communication. The most common option, with the richest community ecosystem. | `command` + `args` |
-| **SSE** | HTTP Server-Sent Events, suitable for remotely hosted MCP services. | `url` |
+| **SSE** | HTTP Server-Sent Events. Legacy remote transport. | `url` (default) |
+| **Streamable HTTP** | New unified remote transport, gradually replacing SSE. | `type: "streamable-http"` + `url` |
 
 ## Troubleshooting
 
@@ -106,4 +109,4 @@ You can browse third-party MCP marketplaces and copy a JSON config to use direct
 - [mcp.so](https://mcp.so) — Global MCP service index
 - [ModelScope MCP Hub](https://modelscope.cn/mcp) — ModelScope's MCP hub, more reliable from mainland China
 
-Any MCP server that follows the standard protocol (stdio / SSE) integrates with CowAgent out of the box.
+Any MCP server that follows the standard protocol (stdio / SSE / Streamable HTTP) integrates with CowAgent out of the box.
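Note on the MCP field table in `mcp.mdx` above: the way `command`, `url`, `type`, and `disabled` combine to select a transport can be summarized in a few lines. This is a sketch of the documented rules only; `resolve_transport` is a hypothetical helper, and the precedence of `command` over `url` when both are present is an assumption.

```python
def resolve_transport(server: dict) -> str:
    """Return the transport implied by one MCP server entry.

    Hypothetical helper that mirrors the documented fields; the real loader
    may validate more (for example, `headers` on non-HTTP transports).
    """
    if server.get("disabled"):
        return "disabled"  # entry is skipped entirely
    if "command" in server:
        return "stdio"  # subprocess transport (assumed to win over `url`)
    if "url" in server:
        transport = server.get("type", "sse")  # `type` defaults to `sse`
        if transport not in ("sse", "streamable-http"):
            raise ValueError(f"unsupported remote transport: {transport!r}")
        return transport
    raise ValueError("server entry needs either `command` or `url`")
```

For example, an entry with only `url` resolves to `sse`, while adding `"type": "streamable-http"` switches it to the newer transport.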
diff --git a/docs/en/tools/scheduler.mdx b/docs/en/tools/scheduler.mdx
index 0a5520c9..18c211bf 100644
--- a/docs/en/tools/scheduler.mdx
+++ b/docs/en/tools/scheduler.mdx
@@ -38,3 +38,43 @@ Create and manage scheduled tasks with natural language:
 <Frame>
   <img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
 </Frame>
+
+## Results injected into the conversation
+
+Scheduled tasks run inside an isolated session (so internal planning and tool calls do not pollute the user's chat), but the **final output** is written back to the user's real session as a message pair. You can directly follow up — e.g. "expand on point 2 from earlier".
+
+**Default policy**
+
+- Output of Agent dynamic tasks is injected into the conversation
+- Fixed-message tasks are not injected by default (configurable)
+- Each session keeps the most recent **3 pairs** of scheduler messages; older pairs are pruned automatically. Regular user messages are unaffected
+
+**Configuration**
+
+| Key | Default | Description |
+| --- | --- | --- |
+| `scheduler_inject_to_session` | `true` | Master switch |
+| `scheduler_inject_max_per_session` | `3` | Max scheduler message pairs kept per session |
+| `scheduler_inject_send_message` | `false` | Whether to also inject fixed-message tasks |
+
+```json
+{
+  "scheduler_inject_to_session": true,
+  "scheduler_inject_max_per_session": 3,
+  "scheduler_inject_send_message": false
+}
+```
+
+## Context inside scheduled task execution
+
+The isolated session for scheduled tasks retains a few recent runs of conversation history, so you can naturally do "compare with last time" or "continue from previous conclusion". To prevent prompts from growing unbounded for high-frequency tasks (e.g. a 5-minute monitor), history is auto-trimmed:
+
+```
+scheduler_keep_turns = max(1, agent_max_context_turns / 5)
+```
+
+`agent_max_context_turns` defaults to `20`, so each scheduled run keeps the most recent **4 turns** of history by default. Increase `agent_max_context_turns` if you need longer memory.
+
+<Note>
+For group-chat scenarios (Feishu / WeCom group bots / DingTalk, etc.), the user's real `session_id` looks like `user_id:group_id`, which differs from `receiver`. The scheduler records the correct `session_id` when a task is created. For older `tasks.json` entries missing this field, the runtime falls back to `receiver`, matching legacy behavior.
+</Note>
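Note on the trimming rules in `scheduler.mdx` above: both limits are simple arithmetic and can be sketched as follows. The integer division and the message shape (a `scheduler_pair` id on injected messages) are assumptions made for illustration; the doc writes a plain `/` and does not describe how messages are stored.

```python
def scheduler_keep_turns(agent_max_context_turns: int = 20) -> int:
    """History turns kept inside the isolated scheduler session.

    Assumes integer division; the documented formula uses a plain "/".
    """
    return max(1, agent_max_context_turns // 5)


def prune_scheduler_pairs(messages: list[dict], max_pairs: int = 3) -> list[dict]:
    """Keep only the newest `max_pairs` scheduler pairs in a user session.

    Hypothetical message shape: injected messages carry a `scheduler_pair`
    id, regular user messages do not and are never removed.
    """
    pair_ids = []
    for msg in messages:
        pid = msg.get("scheduler_pair")
        if pid is not None and pid not in pair_ids:
            pair_ids.append(pid)
    keep = set(pair_ids[-max_pairs:]) if max_pairs > 0 else set()
    return [
        m for m in messages
        if m.get("scheduler_pair") is None or m["scheduler_pair"] in keep
    ]
```

With the default `agent_max_context_turns` of `20` the first function returns `4`, matching the documented default, and the `max(1, ...)` floor means even a very small context setting still keeps one turn.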
diff --git a/docs/en/tools/vision.mdx b/docs/en/tools/vision.mdx
index 9e9da7f5..4db6bec0 100644
--- a/docs/en/tools/vision.mdx
+++ b/docs/en/tools/vision.mdx
@@ -1,5 +1,5 @@
 ---
-title: vision - Image Analysis
+title: vision - Image Understanding
 description: Analyze image content (recognition, description, OCR, etc.)
 ---
 
@@ -9,33 +9,49 @@ Analyze local images or image URLs using Vision API. Supports content descriptio
 
 The vision tool uses a multi-level auto-selection strategy with automatic fallback — no manual configuration required:
 
-1. **Main model** — uses the currently configured main model for image recognition (zero extra cost)
-2. **Other configured models** — auto-discovers other models with configured API keys as alternatives
-3. **OpenAI** — uses `open_ai_api_key` to call gpt-4.1-mini
-4. **LinkAI** — uses `linkai_api_key` to call LinkAI vision service
-
-When `use_linkai=true`, LinkAI is promoted to the highest priority.
+1. **Main model** — uses the currently configured main model for image recognition (must be a multimodal model)
+2. **Other configured models** — auto-discovers other multimodal models with configured API keys as alternatives
 
 If the current provider fails, the tool automatically tries the next one until it succeeds or all fail.
 
 ### Supported Models
 
-| Vendor | Vision Model | Notes |
+| Provider | Vision Model | Notes |
 | --- | --- | --- |
-| OpenAI / Compatible | Main model | All OpenAI-compatible multimodal models |
-| Baidu Qianfan | Main model | Multimodal main models (e.g. `ernie-5.1`) handle images directly; falls back to `ernie-4.5-turbo-vl` for text-only main models |
-| Qwen (DashScope) | Main model | Via MultiModalConversation API |
+| OpenAI / Compatible | Main model | All OpenAI-protocol-compatible multimodal models |
+| Qwen (DashScope) | Main model | e.g. `qwen3.6-plus` |
 | Claude | Main model | Anthropic native image format |
 | Gemini | Main model | inlineData format |
 | Doubao | Main model | doubao-seed-2-0 series natively supported |
 | Kimi (Moonshot) | Main model | kimi-k2.6, kimi-k2.5 natively supported |
-| ZhipuAI | glm-5v-turbo | Always uses dedicated vision model |
-| MiniMax | MiniMax-Text-01 | Always uses dedicated vision model |
+| ERNIE | Main model | Defaults to the multimodal main model (e.g. `ernie-5.1`); falls back to `ernie-4.5-turbo-vl` when the main model is not multimodal |
+| ZhipuAI | glm-5v-turbo | Always uses the dedicated vision model |
+| MiniMax | MiniMax-Text-01 | Always uses the dedicated vision model |
 
 <Note>
   ZhipuAI and MiniMax text models do not support image understanding, so their dedicated vision models are always used automatically.
 </Note>
 
+> When `use_linkai=true`, LinkAI's multimodal model is used by default.
+
+## Custom Configuration
+
+To specify the model used by Vision, configure it in `config.json`, for example:
+
+```json
+{
+    "tools": {
+        "vision": {
+            "model": "gpt-4.1"
+        }
+    }
+}
+```
+
+The specified model is **used first**, and the tool automatically routes to the corresponding provider based on the model name; on failure, it falls back to other configured providers.
+
+In most cases no configuration is needed — the tool works automatically as long as the main model supports multimodal input or any vision-capable API key is configured.
+
 ## Parameters
 
 | Parameter | Type | Required | Description |
@@ -45,21 +61,7 @@ If the current provider fails, the tool automatically tries the next one until i
 
 Supported image formats: jpg, jpeg, png, gif, webp
 
-## Custom Configuration
 
-To specify a particular model for the vision tool, add to `config.json`:
-
-```json
-{
-    "tool": {
-        "vision": {
-            "model": "ernie-4.5-turbo-vl"
-        }
-    }
-}
-```
-
-In most cases no configuration is needed. The tool works automatically as long as the main model supports multimodal input or any vision-capable API key is configured.
 
 ## Use Cases
 
@@ -69,5 +71,5 @@ In most cases no configuration is needed. The tool works automatically as long a
 - Analyze screenshots and scanned documents
 
 <Note>
-  Images larger than 1MB are automatically compressed (max edge 1536px). All images (including remote URLs) are converted to base64 for transmission to ensure compatibility with all model backends.
+  Images larger than 1MB are automatically compressed before upload. All images (including remote URLs) are converted to base64 for transmission to ensure compatibility with all model backends.
 </Note>
diff --git a/docs/en/tools/web-fetch.mdx b/docs/en/tools/web-fetch.mdx
new file mode 100644
index 00000000..0a0349b9
--- /dev/null
+++ b/docs/en/tools/web-fetch.mdx
@@ -0,0 +1,32 @@
+---
+title: web_fetch - Web Fetch
+description: Fetch web pages and document content
+---
+
+Fetch the content of an HTTP/HTTPS URL. Web pages are extracted as readable text; document files (PDF, Word, Excel, etc.) are downloaded and parsed automatically.
+
+## Parameters
+
+| Parameter | Type | Required | Description |
+| --- | --- | --- | --- |
+| `url` | string | Yes | HTTP/HTTPS URL (web page or document) |
+
+## Supported file types
+
+| Type | Formats |
+| --- | --- |
+| PDF | `.pdf` |
+| Word | `.docx` |
+| Text | `.txt`, `.md`, `.csv`, `.log` |
+| Spreadsheet | `.xls`, `.xlsx` |
+| Presentation | `.ppt`, `.pptx` |
+
+## Use cases
+
+- Extract readable text from a web page
+- Download and parse remote documents
+- Inspect API response bodies
+
+<Note>
+  `web_fetch` only retrieves static HTML. For pages that require JavaScript rendering (such as SPAs), use the `browser` tool instead.
+</Note>
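Note on the "Supported file types" table in `web-fetch.mdx` above: one simple way to decide between "extract readable text" and "download and parse a document" is to look at the URL's file extension. This is only a sketch; `classify_url` is a hypothetical helper, and the real tool may well rely on the response's `Content-Type` header instead.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension -> type label, copied from the documented table.
DOCUMENT_TYPES = {
    ".pdf": "PDF",
    ".docx": "Word",
    ".txt": "Text", ".md": "Text", ".csv": "Text", ".log": "Text",
    ".xls": "Spreadsheet", ".xlsx": "Spreadsheet",
    ".ppt": "Presentation", ".pptx": "Presentation",
}


def classify_url(url: str) -> str:
    """Return the documented type label for a URL, or "web page" otherwise."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("only HTTP/HTTPS URLs are supported")
    # Use the path only, so query strings and fragments do not hide the suffix.
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return DOCUMENT_TYPES.get(suffix, "web page")
```

Anything that does not match a listed extension is treated as a web page here, which lines up with the note that only static HTML is retrieved for pages.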
diff --git a/docs/en/tools/web-search.mdx b/docs/en/tools/web-search.mdx
index d9d088c7..80c1eac1 100644
--- a/docs/en/tools/web-search.mdx
+++ b/docs/en/tools/web-search.mdx
@@ -1,32 +1,51 @@
 ---
 title: web_search - Web Search
-description: Search the internet for real-time information
+description: Search the internet for real-time information, with support for multiple search providers
 ---
 
-Search the internet for real-time information, news, research, and more. Supports two search backends with automatic fallback.
+Search the internet for real-time information, news, research, and more. Supports four backends (Bocha, ERNIE, Zhipu GLM, and LinkAI) and works once any one of them is configured.
 
-## Dependencies
+<Tip>
+  It is recommended to configure providers and routing strategy visually from the "Model Management → Search" panel in the [Web console](/en/channels/web), without manually editing the configuration file.
+</Tip>
 
-Requires at least one search API key (configured via `env_config` tool or workspace `.env` file):
+## Providers
 
-| Backend | Environment Variable | Priority | How to Get |
-| --- | --- | --- | --- |
-| Bocha Search | `BOCHA_API_KEY` | Primary | [Bocha Open Platform](https://open.bochaai.com/) |
-| LinkAI Search | `LINKAI_API_KEY` | Fallback | [LinkAI Console](https://link-ai.tech/console/interface) |
+| Provider | Credential | Apply |
+| --- | --- | --- |
+| Bocha | `tools.web_search.bocha_api_key` | [Bocha Open Platform](https://open.bochaai.com/) |
+| ERNIE | Reuses `qianfan_api_key` | [Qianfan Console](https://cloud.baidu.com/doc/qianfan/s/2mh4su4uy) |
+| Zhipu | Reuses `zhipu_ai_api_key` | [Zhipu Open Platform](https://docs.bigmodel.cn/cn/guide/tools/web-search) |
+| LinkAI | Reuses `linkai_api_key` | [LinkAI Console](https://link-ai.tech/console/interface) |
 
-## Parameters
+Bocha requires a dedicated `bocha_api_key`; the other three reuse the corresponding model provider's API key, so configuring the model also enables search.
+
+## Routing Strategy
+
+```json
+{
+  "tools": {
+    "web_search": {
+      "strategy": "auto",
+      "provider": ""
+    }
+  }
+}
+```
+
+- `auto` (default): the Agent picks among the configured providers and may call several of them in a single task to gather more comprehensive results; when no provider is specified, it falls back through `bocha → qianfan → zhipu → linkai`.
+- `fixed`: always use the provider specified in `provider`; falls back to the auto order if that provider's credentials are missing.
+
+## Tool Parameters
 
 | Parameter | Type | Required | Description |
 | --- | --- | --- | --- |
 | `query` | string | Yes | Search keywords |
-| `count` | integer | No | Number of results (1-50, default 10) |
-| `freshness` | string | No | Time range: `noLimit`, `oneDay`, `oneWeek`, `oneMonth`, `oneYear`, or date range like `2025-01-01..2025-02-01` |
-| `summary` | boolean | No | Return page summaries (default false) |
-
-## Use Cases
-
-When the user asks about latest information, needs fact-checking, or real-time data, the Agent automatically invokes this tool.
+| `count` | integer | No | Number of results (1–50, default 10) |
+| `freshness` | string | No | Time range: `noLimit` (default), `oneDay`, `oneWeek`, `oneMonth`, `oneYear`, or date range like `2025-01-01..2025-02-01` |
+| `summary` | boolean | No | Whether to return page summaries (default false) |
+| `provider` | string | No | Available when multiple providers are configured under the `auto` strategy; used to switch provider for a single call |
 
 <Note>
-  If no search API key is configured, this tool will not be loaded.
+  If none of the four credentials are configured, this tool is not registered with the Agent.
 </Note>
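Note on the "Routing Strategy" section in `web-search.mdx` above: the interaction between `strategy`, the configured `provider`, and the per-call `provider` parameter can be sketched as a single selection function. This is an interpretation of the documented rules, not the tool's code; `pick_provider` and its argument names are hypothetical.

```python
from typing import Optional

# Documented fallback order for the `auto` strategy.
AUTO_ORDER = ["bocha", "qianfan", "zhipu", "linkai"]


def pick_provider(
    configured: set[str],
    strategy: str = "auto",
    fixed_provider: str = "",
    requested: Optional[str] = None,
) -> Optional[str]:
    """Choose the search provider for one call.

    Returns None when no credentials exist (the tool is not registered).
    """
    available = [p for p in AUTO_ORDER if p in configured]
    if not available:
        return None
    # `fixed` uses the configured provider; `auto` honors a per-call request.
    wanted = fixed_provider if strategy == "fixed" else requested
    if wanted in available:
        return wanted
    # Missing credentials or no preference: fall back through the auto order.
    return available[0]
```

Note that under `fixed`, a per-call request is ignored in this sketch, since the parameter table describes the per-call `provider` as available under the `auto` strategy.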
diff --git a/docs/guide/manual-install.mdx b/docs/guide/manual-install.mdx
index 305a355c..18aecd1d 100644
--- a/docs/guide/manual-install.mdx
+++ b/docs/guide/manual-install.mdx
@@ -97,7 +97,7 @@ nohup python3 app.py & tail -f nohup.out
 ```
 
 <Tip>
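-  When deploying on a server, open port `9899` in the firewall or security group to reach the Web console from a browser; for safety, allow only specific IPs.
+  **Accessing the Web console over the public network from a server**: by default `web_host` listens only on `127.0.0.1` (local access). For public access, set `web_host` to `0.0.0.0` in `config.json`, and it is strongly recommended to set `web_password` to enable authentication. You also need to open port `9899` in the firewall or security group, ideally only for specific IPs.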
 </Tip>
 
 ## Docker Deployment
@@ -129,7 +129,7 @@ sudo docker logs -f chatgpt-on-wechat
 ```
 
 <Tip>
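-  When deploying on a server, open port `9899` in the firewall or security group to reach the Web console from a browser; for safety, allow only specific IPs.
+  **Accessing the Web console over the public network with Docker**: set `WEB_HOST` to `0.0.0.0` in `docker-compose.yml` (the default `127.0.0.1` binding inside the container cannot be reached from outside the host), and it is strongly recommended to set `WEB_PASSWORD` to enable authentication. Also make sure port `9899` is mapped to the host and opened in the firewall or security group.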
 </Tip>
 
 ## Core Configuration Options
diff --git a/docs/guide/quick-start.mdx b/docs/guide/quick-start.mdx
index 9a2b55f7..dd71ee88 100644
--- a/docs/guide/quick-start.mdx
+++ b/docs/guide/quick-start.mdx
@@ -33,6 +33,10 @@ description: 使用脚本一键安装和管理 CowAgent
 
 After startup the Web console launches by default; visit `http://localhost:9899` to start chatting and manage the Agent.
 
+<Note>
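+  **When a server deployment needs public access to the console**, set `web_host` to `0.0.0.0` in `config.json` (by default it listens only on `127.0.0.1` for local access), and it is strongly recommended to set `web_password` to enable authentication. Then visit `http://<server-ip>:9899` and make sure the firewall or security group allows port `9899`.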
+</Note>
+
 ## Management Commands
 
 After installation, use the `cow` CLI to manage the service:
diff --git a/docs/intro/architecture.mdx b/docs/intro/architecture.mdx
index ffea7a6b..d7aa3e7a 100644
--- a/docs/intro/architecture.mdx
+++ b/docs/intro/architecture.mdx
@@ -9,7 +9,7 @@ CowAgent 2.0 从简单的聊天机器人全面升级为超级智能助理，采
 
 The overall CowAgent architecture consists of the following core modules:
 
-<img src="https://cdn.link-ai.tech/doc/cow-agent-arch-zh.jpg" alt="CowAgent Architecture" />
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/architecture/zh/architecture.jpg" alt="CowAgent Architecture" />
 
 | Module | Description |
 | --- | --- |
diff --git a/docs/intro/features.mdx b/docs/intro/features.mdx
index 3f32d14b..ae0820a3 100644
--- a/docs/intro/features.mdx
+++ b/docs/intro/features.mdx
@@ -84,7 +84,7 @@ Agent 会在对话中自动将有价值的信息整理为知识页面，维护
 
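 The skill system gives the Agent unlimited extensibility. Each Skill consists of an instruction file, optional runtime scripts, and optional resources, and describes how to complete a specific type of task. With Skills, the Agent can follow instructions to complete complex workflows, call various tools, or integrate with third-party systems.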
 
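-- **[Skill Hub](https://skills.cowagent.ai/):** An open skill marketplace that brings together official picks, community contributions, and third-party skills, with one-click install.
+- [Skill Hub](https://skills.cowagent.ai/): An open skill marketplace that brings together official picks, community contributions, and third-party skills, with one-click install.
 - **Built-in skills:** Located in the project's `skills/` directory, including the skill creator, image recognition, LinkAI agents, web fetching, and more. Built-in Skills are enabled automatically based on their dependencies (API keys, system commands, etc.).
 - **Custom skills:** Created by the user through conversation and stored in the workspace (`~/cow/skills/`); they can implement any complex business workflow or third-party integration.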
 
diff --git a/docs/intro/index.mdx b/docs/intro/index.mdx
index 3029abfb..10e64813 100644
--- a/docs/intro/index.mdx
+++ b/docs/intro/index.mdx
@@ -3,7 +3,9 @@ title: 项目介绍
 description: CowAgent - 基于大模型的超级AI助理
 ---
 
-<img src="https://cdn.link-ai.tech/doc/78c5dd674e2c828642ecc0406669fed7.png" alt="CowAgent" width="450px"/>
+<div align="center">
+  <img src="https://cdn.link-ai.tech/doc/78c5dd674e2c828642ecc0406669fed7.png" alt="CowAgent" width="450px"/>
+</div>
 
 **CowAgent** 是基于大模型的超级AI助理，能够主动思考和任务规划、操作计算机和外部资源、创造和执行Skills、拥有长期记忆和知识库并不断成长。
 
diff --git a/docs/ja/README.md b/docs/ja/README.md
index 6da81796..0e4327c8 100644
--- a/docs/ja/README.md
+++ b/docs/ja/README.md
@@ -1,250 +1,257 @@
-<p align="center"><img src="https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="CowAgent" width="550" /></p>
+<p align="center"><img src="https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="CowAgent" width="420" /></p>
 
 <p align="center">
   <a href="https://github.com/zhayujie/CowAgent/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/CowAgent" alt="Latest release"></a>
   <a href="https://github.com/zhayujie/CowAgent/blob/master/LICENSE"><img src="https://img.shields.io/github/license/zhayujie/CowAgent" alt="License: MIT"></a>
   <a href="https://github.com/zhayujie/CowAgent"><img src="https://img.shields.io/github/stars/zhayujie/CowAgent?style=flat-square" alt="Stars"></a> <br/>
-  [<a href="https://github.com/zhayujie/CowAgent/blob/master/README.md">中文</a>] | [<a href="https://github.com/zhayujie/CowAgent/blob/master/docs/en/README.md">English</a>] | [日本語]
+  [<a href="../../README.md">English</a>] | [<a href="../zh/README.md">中文</a>] | [日本語]
 </p>
 
-**CowAgent** はLLMを搭載したAIスーパーアシスタントです。自律的なタスク計画、コンピュータや外部リソースの操作、Skillの作成・実行、長期記憶とパーソナルナレッジベースによる継続的な成長が可能です。柔軟なモデル切り替えに対応し、テキスト・音声・画像・ファイルを処理でき、WeChat、Web、Feishu（飛書）、DingTalk（釘釘）、WeCom Bot（企業微信ボット）、WeComアプリ、WeChat公式アカウントに統合可能で、個人のPCやサーバー上で24時間365日稼働できます。
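+**CowAgent** is an open-source super AI assistant that autonomously plans tasks, operates your computer and external resources, creates and runs Skills, and grows alongside you through a personal knowledge base and long-term memory. It is also a reference implementation of an end-to-end Agent Harness.
+
+CowAgent is lightweight, easy to deploy, and highly extensible. Plug in any major LLM provider and run it on the web and all major IM platforms, 24/7 on a personal computer or server.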
 
 <p align="center">
   <a href="https://cowagent.ai/">🌐 ウェブサイト</a> &nbsp;·&nbsp;
-  <a href="https://docs.cowagent.ai/en/intro/index">📖 ドキュメント</a> &nbsp;·&nbsp;
-  <a href="https://docs.cowagent.ai/en/guide/quick-start">🚀 クイックスタート</a> &nbsp;·&nbsp;
+  <a href="https://docs.cowagent.ai/ja/intro/index">📖 ドキュメント</a> &nbsp;·&nbsp;
+  <a href="https://docs.cowagent.ai/ja/guide/quick-start">🚀 クイックスタート</a> &nbsp;·&nbsp;
   <a href="https://skills.cowagent.ai/">🧩 Skill Hub</a> &nbsp;·&nbsp;
   <a href="https://link-ai.tech/cowagent/create">☁️ オンラインで試す</a>
 </p>
 
-## はじめに
+<br/>
 
-> CowAgentは、すぐに使えるAIスーパーアシスタントであると同時に、高い拡張性を持つAgentフレームワークでもあります。新しいモデルインターフェース、チャネル、組み込みツール、Skillシステムを拡張することで、さまざまなカスタマイズニーズに柔軟に対応できます。
+## 🌟 主な機能
 
-- ✅ **自律的タスク計画**: 複雑なタスクを理解し、自律的に実行計画を立て、目標達成までツールを呼び出しながら継続的に思考します。
-- ✅ **長期記憶**: 会話の記憶をローカルファイルやデータベースに自動的に永続化します。コアメモリ、デイリーメモリ、Deep Dream 蒸留を含み、キーワード検索やベクトル検索に対応しています。
-- ✅ **パーソナルナレッジベース**: 構造化された知識を自動整理し、相互参照によるナレッジグラフを構築。Web での可視化ブラウジングと対話による管理をサポートします。
-- ✅ **Skillシステム**: Skillの作成・実行エンジンを実装。[Skill Hub](https://skills.cowagent.ai)、GitHubなどからSkillをインストールでき、会話を通じたカスタムSkill作成もサポートしています。
-- ✅ **ツールシステム**: ファイル読み書き、ターミナル実行、ブラウザ操作、スケジュールタスク、メッセージ送信などの組み込みツールを提供。Agentが自律的に呼び出して複雑なタスクを完了します。
-- ✅ **CLIシステム**: ターミナルコマンドとチャットコマンドを提供し、プロセス管理、Skillインストール、設定変更などの操作をサポートします。
-- ✅ **マルチモーダルメッセージ**: テキスト、画像、音声、ファイルなど、さまざまなメッセージタイプの解析・処理・生成・送信に対応しています。
-- ✅ **複数モデル対応**: DeepSeek、MiniMax、Claude、Gemini、OpenAI、GLM、Qwen、Doubao、Kimiなど、主要なモデルプロバイダーに対応しています。
-- ✅ **マルチプラットフォームデプロイ**: ローカルPCやサーバー上で実行でき、WeChat、Web、Feishu、DingTalk、WeChat公式アカウント、WeComアプリケーションに統合可能です。
+| 機能 | 説明 |
+| :--- | :--- |
+| [タスク計画](https://docs.cowagent.ai/ja/intro/architecture) | 複雑なタスクを分解し、目標達成までツールを繰り返し呼び出して段階的に実行 |
+| [長期記憶](https://docs.cowagent.ai/ja/memory/index) | 三層構造（コンテキスト → デイリー → コア）、Deep Dream による自動蒸留、キーワードとベクトルのハイブリッド検索 |
+| [ナレッジベース](https://docs.cowagent.ai/ja/knowledge/index) | 構造化された知識を Markdown Wiki として自動整理し、進化し続けるナレッジグラフを可視化ブラウジング |
+| [Skill](https://docs.cowagent.ai/ja/skills/index) | [Skill Hub](https://skills.cowagent.ai/)、GitHub、ClawHub からワンクリックでインストール；対話によるカスタム Skill 作成にも対応 |
+| [ツール](https://docs.cowagent.ai/ja/tools/index) | ファイル I/O、ターミナル、ブラウザ、スケジューラ、記憶検索、Web 検索など 10+ の組み込みツール — MCP プロトコルに完全対応 |
+| [チャネル](https://docs.cowagent.ai/ja/channels/index) | 一つの Agent で Web、WeChat、Feishu、DingTalk、WeCom、QQ、公式アカウント、Telegram、Slack を同時にサポート |
+| マルチモーダル | テキスト・画像・音声・ファイルをフルサポート — 認識・生成・双方向送受信 |
+| [モデル](https://docs.cowagent.ai/ja/models/index) | Claude、GPT、Gemini、DeepSeek、GLM、Qwen、Kimi、MiniMax、Doubao など、設定 1 行で切り替え可能 |
+| [デプロイ](https://docs.cowagent.ai/ja/guide/quick-start) | ワンラインインストーラー、統合された Web コンソール、複数のデプロイモード（ローカル / Docker / サーバー） |
 
-## 免責事項
+<br/>
 
-1. 本プロジェクトは [MIT License](/LICENSE) に基づいており、技術研究・学習を目的としています。利用者は現地の法律、規制、ポリシー、企業の社則を遵守する必要があります。違法行為や権利侵害となる利用は禁止されています。
-2. Agentモードは通常のチャットモードよりも多くのトークンを消費します。効果とコストに基づいてモデルを選択してください。AgentはホストOSにアクセスできるため、信頼できる環境にデプロイしてください。
-3. CowAgentはオープンソース開発に注力しており、いかなる暗号通貨の発行・参加・承認も行っていません。
+## 🏗️ アーキテクチャ
 
-## デモ
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/architecture/en/architecture.jpg" alt="CowAgent Architecture" width="750"/>
 
-オンラインで試す（デプロイ不要）: [CowAgent](https://link-ai.tech/cowagent/create)
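+CowAgent is a complete **Agent Harness**: messages flow in from the various **channels**; the **Agent Core** combines memory, knowledge, and the available tools and Skills to plan and decide on tasks; the **model** generates the response; and the result is returned to the originating channel. Each layer is loosely coupled and can be extended independently.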
 
-## 更新履歴
-
-> **2026.04.14:** [v2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6) — ナレッジベース、Deep Dream 記憶蒸留、スマートコンテキスト圧縮、Web コンソールアップグレード。
-
-> **2026.04.01:** [v2.0.5](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5) — Cow CLI、Skill Hubオープンソース化、ブラウザツール、WeCom Botスキャン作成など。
-
-> **2026.02.27:** [v2.0.2](https://github.com/zhayujie/CowAgent/releases/tag/2.0.2) — Webコンソールの全面刷新（ストリーミングチャット、モデル/Skill/メモリ/チャネル/スケジューラ/ログ管理）、マルチチャネル同時実行、セッション永続化、Gemini 3.1 Pro / Claude 4.6 Sonnet / Qwen3.5 Plusなど新モデル追加。
-
-> **2026.02.13:** [v2.0.1](https://github.com/zhayujie/CowAgent/releases/tag/2.0.1) — 組み込みWeb検索ツール、スマートコンテキストトリミング、ランタイム情報の動的更新、Windows互換性、スケジューラのメモリ喪失やFeishu接続問題などの修正。
-
-> **2026.02.03:** [v2.0.0](https://github.com/zhayujie/CowAgent/releases/tag/2.0.0) — マルチステップタスク計画、長期記憶、組み込みツール、Skillフレームワーク、新モデル、チャネル最適化を備えたAIスーパーアシスタントへの全面アップグレード。
-
-> **2025.05.23:** [v1.7.6](https://github.com/zhayujie/CowAgent/releases/tag/1.7.6) — Webチャネル最適化、AgentMeshマルチエージェントプラグイン、Baidu TTS、claude-4-sonnet/opus対応。
-
-> **2025.04.11:** [v1.7.5](https://github.com/zhayujie/CowAgent/releases/tag/1.7.5) — wechatferryプロトコル、DeepSeekモデル、Tencent Cloud音声、ModelScope・Gitee-AI対応。
-
-> **2024.12.13:** [v1.7.4](https://github.com/zhayujie/CowAgent/releases/tag/1.7.4) — Gemini 2.0モデル、Webチャネル、メモリリーク修正。
-
-全更新履歴: [リリースノート](https://docs.cowagent.ai/en/releases/overview)
+詳細は [アーキテクチャ](https://docs.cowagent.ai/ja/intro/architecture) を参照してください。
 
 <br/>
 
 ## 🚀 クイックスタート
 
-本プロジェクトは、インストール・設定・起動・管理をワンクリックで行えるスクリプトを提供しています：
+依存関係のインストール、設定、起動を自動で行うワンラインインストーラーを提供しています：
 
 **Linux / macOS:**
+
 ```bash
 bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
 ```
 
 **Windows (PowerShell):**
+
 ```powershell
 irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
 ```
 
-実行後、デフォルトでWebサービスが起動します。`http://localhost:9899/chat` にアクセスしてチャットを開始できます。
-
-スクリプトの使い方: [ワンクリックインストール](https://docs.cowagent.ai/ja/guide/quick-start)。インストール後は `cow start`、`cow stop` などの [CLI コマンド](https://docs.cowagent.ai/ja/cli/index)でサービスを管理できます。
-
-### 手動インストール
-
-**1. プロジェクトのクローン**
-
-```bash
-git clone https://github.com/zhayujie/CowAgent
-cd CowAgent/
-```
-
-**2. 依存関係のインストール**
-
-```bash
-pip3 install -r requirements.txt
-pip3 install -r requirements-optional.txt   # 任意ですが推奨
-```
-
-**3. Cow CLI のインストール（推奨）**
-
-```bash
-pip3 install -e .
-```
-
-インストール後、`cow` コマンドでサービス管理（起動、停止、更新など）やSkill管理ができます。[コマンドドキュメント](https://docs.cowagent.ai/ja/cli/index)を参照してください。
-
-**4. ブラウザのインストール（任意）**
-
-Agentにブラウザ操作（Webページへのアクセス、フォーム入力など）が必要な場合：
-
-```bash
-cow install-browser
-```
-
-`playwright` と Chromium を自動インストールします。[ブラウザツールドキュメント](https://docs.cowagent.ai/ja/tools/browser)を参照してください。
-
-**5. 設定**
-
-```bash
-cp config-template.json config.json
-```
-
-`config.json` にモデルのAPIキーとチャネルタイプを記入してください。詳細は[設定ドキュメント](https://docs.cowagent.ai/en/guide/manual-install)を参照してください。
-
-**6. 実行**
-
-```bash
-cow start              # 推奨、Cow CLI が必要
-python3 app.py         # または直接実行
-```
-
-サーバーデプロイでは、`cow` コマンドでサービスを管理できます：
-
-```bash
-cow start              # バックグラウンドで起動
-cow stop               # サービス停止
-cow restart            # サービス再起動
-cow status             # 実行状態を確認
-cow logs               # ログを表示
-cow update             # 最新コードを取得して再起動
-```
-
-または従来の方法で実行：
-
-```bash
-nohup python3 app.py & tail -f nohup.out
-```
-
-### Dockerデプロイ
+**Docker:**
 
 ```bash
 curl -O https://cdn.link-ai.tech/code/cow/docker-compose.yml
-# docker-compose.yml を編集して設定を記入
-sudo docker compose up -d
-sudo docker logs -f chatgpt-on-wechat
+docker compose up -d
+```
+
+After startup, open `http://localhost:9899` to reach the **Web console**, where model configuration, channel connection, and Skill installation can all be done in one place.
+
+> For a server deployment where the console must be publicly reachable, set `web_host` to `0.0.0.0` in `config.json` (setting `web_password` as well is strongly recommended). Then access it at `http://<server-ip>:9899`, and remember to open port `9899` in the firewall or security group.
+
+> 📖 Detailed guides: [Quick Start](https://docs.cowagent.ai/ja/guide/quick-start) · [Install from Source](https://docs.cowagent.ai/ja/guide/manual-install) · [Upgrade](https://docs.cowagent.ai/ja/guide/upgrade)
+
+After installation, manage the service with the [`cow` CLI](https://docs.cowagent.ai/ja/cli/index):
+
+```bash
+cow start | stop | restart        # service control
+cow status | logs                  # status and logs
+cow update                         # pull the latest code, then restart
+cow skill install <name>           # install a Skill
+cow install-browser                # install the browser tool
 ```
 
 <br/>
 
-## モデル
+## 🤖 モデル
 
-主要なモデルプロバイダーに対応しています。Agentモードの推奨モデル：
+CowAgent は主要な LLM プロバイダーすべてに対応しています。**チャット、画像認識、画像生成、ASR/TTS、埋め込み（Embedding）** の各機能はそれぞれ別のベンダーで設定可能です。
 
-| プロバイダー | 推奨モデル |
-| --- | --- |
-| DeepSeek | `deepseek-v4-flash` |
-| MiniMax | `MiniMax-M2.7` |
-| Claude | `claude-sonnet-4-6` |
-| Gemini | `gemini-3.1-pro-preview` |
-| OpenAI | `gpt-5.4` |
-| GLM | `glm-5.1` |
-| Qwen | `qwen3.6-plus` |
-| Doubao | `doubao-seed-2-0-code-preview-260215` |
-| Kimi | `kimi-k2.6` |
+| プロバイダー | 代表的なモデル | チャット | 画像認識 | 画像生成 | ASR | TTS | Embedding |
+| --- | --- | :-: | :-: | :-: | :-: | :-: | :-: |
+| [Claude](https://docs.cowagent.ai/ja/models/claude) | claude-opus-4-8 | ✅ | ✅ | | | | |
+| [OpenAI](https://docs.cowagent.ai/ja/models/openai) | gpt-5.5、o シリーズ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Gemini](https://docs.cowagent.ai/ja/models/gemini) | gemini-3.5-flash | ✅ | ✅ | ✅ | | | |
+| [DeepSeek](https://docs.cowagent.ai/ja/models/deepseek) | deepseek-v4-flash / pro | ✅ | | | | | |
+| [Qwen](https://docs.cowagent.ai/ja/models/qwen) | qwen3.7-max | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [GLM](https://docs.cowagent.ai/ja/models/glm) | glm-5.1、glm-5v-turbo | ✅ | ✅ | | ✅ | | ✅ |
+| [Doubao](https://docs.cowagent.ai/ja/models/doubao) | doubao-seed-2.0 シリーズ | ✅ | ✅ | ✅ | | | ✅ |
+| [Kimi](https://docs.cowagent.ai/ja/models/kimi) | kimi-k2.6 | ✅ | ✅ | | | | |
+| [MiniMax](https://docs.cowagent.ai/ja/models/minimax) | MiniMax-M2.7 | ✅ | ✅ | ✅ | | ✅ | |
+| [ERNIE](https://docs.cowagent.ai/ja/models/qianfan) | ernie-5.1 | ✅ | ✅ | | | | |
+| [MiMo](https://docs.cowagent.ai/ja/models/mimo) | mimo-v2.5-pro / v2.5 | ✅ | ✅ | | | ✅ | |
+| [LinkAI](https://docs.cowagent.ai/ja/models/linkai) | 1 つの Key で 100+ モデルに接続 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [カスタム](https://docs.cowagent.ai/ja/models/custom) | ローカルモデル / サードパーティプロキシ | ✅ | | | | | |
 
-各モデルの詳細設定については、[モデルドキュメント](https://docs.cowagent.ai/en/models/index)を参照してください。
+> Web コンソールでの設定が推奨されており、ファイルを手動編集する必要はありません。手動設定については各プロバイダーのドキュメントおよび [モデル概要](https://docs.cowagent.ai/ja/models/index) を参照してください。
 
-### Coding Plan
+<br/>
 
-Coding Planは各プロバイダーが提供する月額サブスクリプションパッケージで、高頻度のAgent利用に最適です。すべてのプロバイダーはOpenAI互換モードでアクセスできます：
+## 💬 チャネル
 
-```json
-{
-  "bot_type": "openai",
-  "model": "MODEL_NAME",
-  "open_ai_api_base": "PROVIDER_CODING_PLAN_API_BASE",
-  "open_ai_api_key": "YOUR_API_KEY"
-}
+一つの Agent インスタンスで複数のチャネルを同時に提供できます。`channel_type` 設定で切り替えるか、複数のチャネルを並列実行できます。
+
+| チャネル | テキスト | 画像 | ファイル | 音声 | グループ |
+| --- | :-: | :-: | :-: | :-: | :-: |
+| [Web コンソール](https://docs.cowagent.ai/ja/channels/web)（デフォルト） | ✅ | ✅ | ✅ | ✅ | |
+| [WeChat](https://docs.cowagent.ai/ja/channels/weixin) | ✅ | ✅ | ✅ | ✅ | |
+| [Feishu / Lark](https://docs.cowagent.ai/ja/channels/feishu) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [DingTalk](https://docs.cowagent.ai/ja/channels/dingtalk) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [WeCom Bot](https://docs.cowagent.ai/ja/channels/wecom-bot) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [QQ](https://docs.cowagent.ai/ja/channels/qq) | ✅ | ✅ | ✅ | | ✅ |
+| [WeCom App](https://docs.cowagent.ai/ja/channels/wecom) | ✅ | ✅ | ✅ | ✅ | |
+| [WeChat 公式アカウント](https://docs.cowagent.ai/ja/channels/wechatmp) | ✅ | ✅ | | ✅ | |
+| [Telegram](https://docs.cowagent.ai/ja/channels/telegram) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Slack](https://docs.cowagent.ai/ja/channels/slack) | ✅ | ✅ | ✅ | | ✅ |
+
+> Feishu と WeCom Bot は **Web コンソール内で QR コードをスキャンするだけで接続**できます — パブリック IP は不要です。詳細は [チャネル概要](https://docs.cowagent.ai/ja/channels/index) を参照してください。
+
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-chat.png" alt="CowAgent Web Console" width="800"/>
+
+*Web コンソールはデフォルトのチャネルであると同時に、Agent の設定・管理を統一的に行う場でもあります。*
+
+<br/>
+
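+## 🧠 Memory and Knowledge Base
+
+**Long-term memory** has three tiers: conversation context (short-term) → daily memory (mid-term) → MEMORY.md (long-term). The nightly **Deep Dream** distills scattered memories into refined long-term memory and a narrative diary. See [Long-term Memory](https://docs.cowagent.ai/ja/memory/index) · [Deep Dream](https://docs.cowagent.ai/ja/memory/deep-dream) for details.
+
+The **personal knowledge base**, unlike chronological memory, organizes structured knowledge **by topic**. The Agent automatically curates useful information during conversations, maintains cross-references and an index, and lets you visualize the knowledge graph in the Web console. See [Personal Knowledge Base](https://docs.cowagent.ai/ja/knowledge/index) for details.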
+
+<table>
+  <tr>
+    <td width="50%">
+      <img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-memory.png" alt="長期記憶" />
+      <p align="center"><em>長期記憶 · 三層構造 + Deep Dream</em></p>
+    </td>
+    <td width="50%">
+      <img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-knowledge.png" alt="パーソナルナレッジベース" />
+      <p align="center"><em>ナレッジベース · 自動キュレーションされた Markdown Wiki</em></p>
+    </td>
+  </tr>
+</table>
+
+<br/>
+
+## 🔧 Tools and Skills
+
+**Tools** are atomic capabilities the Agent uses to operate system resources. **Skills** are higher-level workflows defined by a manifest file that combine multiple tools to complete complex tasks.
+
+### Tool System
+
+**Built-in tools** include file I/O (`read` / `write` / `edit` / `ls`), terminal (`bash`), file sending (`send`), memory search (`memory`), environment variables (`env_config`), web fetch (`web_fetch`), scheduler (`scheduler`), web search (`web_search`), image recognition (`vision`), browser automation (`browser`), and more.
+
+The **MCP protocol** integrates the open [Model Context Protocol](https://modelcontextprotocol.io) ecosystem. Configure `mcp.json` once and it is ready to use, with support for stdio / SSE transports, hot reload, and no-code integration.
+
+Details: [Tools Overview](https://docs.cowagent.ai/ja/tools/index) · [MCP Integration](https://docs.cowagent.ai/ja/tools/mcp).
+
+### Skill システム
+
+- **[Skill Hub](https://skills.cowagent.ai/)** — オープン Skill マーケットプレイス：閲覧、検索、ワンクリックインストール
+- **GitHub / ClawHub / URL など** — 任意のソースからワンクリックでインストール
+- **対話による作成** — `skill-creator` を使って対話でカスタム Skill を生成；ワークフローやサードパーティ API を再利用可能な Skill に変換
+
+```bash
+/skill list                   # インストール済み Skill の一覧
+/skill search <キーワード>     # マーケットプレイスで検索
+/skill install <名前>          # ワンクリックインストール
 ```
 
-- `bot_type`: `openai` を指定
-- `model`: プロバイダーがサポートするモデル名
-- `open_ai_api_base`: プロバイダーのCoding Plan API Base（標準の従量課金とは異なります）
-- `open_ai_api_key`: プロバイダーのCoding Plan APIキー
-
-> 注意：Coding PlanのAPI BaseとAPIキーは、通常の従量課金のものとは別です。各プロバイダーのプラットフォームから取得してください。
-
-対応プロバイダーには、Alibaba Cloud、MiniMax、Zhipu GLM、Kimi、Volcengineなどがあります。各プロバイダーの詳細設定については、[Coding Planドキュメント](https://docs.cowagent.ai/en/models/coding-plan)を参照してください。
+詳細: [Skill 概要](https://docs.cowagent.ai/ja/skills/index) · [Skill 作成](https://docs.cowagent.ai/ja/skills/create)。
 
 <br/>
 
-## チャネル
+## 🏷 更新履歴
 
-複数のプラットフォームに対応しています。`config.json` の `channel_type` を設定して切り替えます：
+> **2026.05.22:** [v2.0.9](https://github.com/zhayujie/CowAgent/releases/tag/2.0.9) — モデル管理、MCP プロトコル対応、ブラウザセッション永続化、新モデル（gpt-5.5、gemini-3.5-flash、qwen3.7-max）、デプロイのセキュリティ強化。
 
-| チャネル | `channel_type` | ドキュメント |
-| --- | --- | --- |
-| WeChat | `weixin` | [WeChat設定](https://docs.cowagent.ai/ja/channels/weixin) |
-| Web（デフォルト） | `web` | [Webチャネル](https://docs.cowagent.ai/en/channels/web) |
-| Feishu（飛書） | `feishu` | [Feishu設定](https://docs.cowagent.ai/en/channels/feishu) |
-| DingTalk（釘釘） | `dingtalk` | [DingTalk設定](https://docs.cowagent.ai/en/channels/dingtalk) |
-| WeCom Bot | `wecom_bot` | [WeCom Bot設定](https://docs.cowagent.ai/en/channels/wecom-bot) |
-| WeComアプリ | `wechatcom_app` | [WeCom設定](https://docs.cowagent.ai/en/channels/wecom) |
-| WeChat公式アカウント | `wechatmp` / `wechatmp_service` | [WeChat公式アカウント設定](https://docs.cowagent.ai/en/channels/wechatmp) |
-| ターミナル | `terminal` | — |
+> **2026.05.06:** [v2.0.8](https://github.com/zhayujie/CowAgent/releases/tag/2.0.8) — Feishu チャネル全面アップグレード（音声、ストリーミング、QR 接続）、DeepSeek V4 と Baidu Qianfan 対応、スケジューラツール強化。
 
-複数チャネルを同時に有効化できます。カンマ区切りで指定してください：`"channel_type": "feishu,dingtalk"`
+> **2026.04.22:** [v2.0.7](https://github.com/zhayujie/CowAgent/releases/tag/2.0.7) — 組み込み画像生成（GPT Image 2、Nano Banana）、新モデル（Kimi K2.6、Claude Opus 4.7、GLM 5.1）、ナレッジベースと記憶の強化。
+
+> **2026.04.14:** [v2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6) — ナレッジベース、Deep Dream 記憶蒸留、スマートコンテキスト圧縮、マルチセッション Web コンソール。
+
+> **2026.04.01:** [v2.0.5](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5) — Cow CLI、Skill Hub オープンソース化、ブラウザツール、WeCom Bot QR 接続。
+
+> **2026.02.03:** [v2.0.0](https://github.com/zhayujie/CowAgent/releases/tag/2.0.0) — マルチステップタスク計画、長期記憶、Skill フレームワークを備えたスーパー Agent アシスタントへの全面アップグレード。
+
+完全な履歴: [リリースノート](https://docs.cowagent.ai/ja/releases/overview)
 
 <br/>
 
-## エンタープライズサービス
+## 🤝 コミュニティとサポート
 
-<a href="https://link-ai.tech" target="_blank"><img width="720" src="https://cdn.link-ai.tech/image/link-ai-intro.jpg"></a>
+GitHub で [Issue を報告](https://github.com/zhayujie/CowAgent/issues) するか、下記 QR コードをスキャンして WeChat コミュニティに参加してください：
 
-> [LinkAI](https://link-ai.tech/) は、企業や開発者向けのワンストップAIエージェントプラットフォームです。マルチモーダルLLM、ナレッジベース、Agentプラグイン、ワークフローを統合しています。主要プラットフォームへのワンクリック統合、SaaSおよびプライベートデプロイに対応しています。
+<img width="130" src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/open-community.png">
 
 <br/>
 
 ## 🔗 関連プロジェクト
 
-- [Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub): AIエージェント向けのオープンSkillマーケットプレイス。CowAgent、OpenClaw、Claude Codeなどで利用可能なSkillの閲覧・検索・インストール・公開が可能。
-- [bot-on-anything](https://github.com/zhayujie/bot-on-anything): 軽量で高い拡張性を持つLLMアプリケーションフレームワーク。Slack、Telegram、Discord、Gmailなどに対応。
-- [AgentMesh](https://github.com/MinimalFuture/AgentMesh): エージェントチームの協調による複雑な問題解決のためのオープンソースのマルチエージェントフレームワーク。
+- **[Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub)** — AI エージェント向けのオープン Skill マーケットプレイス；CowAgent、OpenClaw、Claude Code などに対応
+- **[bot-on-anything](https://github.com/zhayujie/bot-on-anything)** — 軽量な LLM アプリケーションフレームワーク；Slack、Telegram、Discord、Gmail などに対応
+- **[AgentMesh](https://github.com/MinimalFuture/AgentMesh)** — チーム協調による複雑な問題解決のためのオープンソースのマルチエージェントフレームワーク
 
-## 🔎 よくある質問
+<br/>
 
-FAQ: <https://github.com/zhayujie/CowAgent/wiki/FAQs>
+## 🏢 エンタープライズサービス
 
-## 🛠️ コントリビューション
+[**LinkAI**](https://link-ai.tech/) は企業や開発者向けのワンストップ AI Agent プラットフォームで、CowAgent にマネージドホスティングとエンタープライズグレードのサポートを提供します：
 
-新しいチャネルの追加を歓迎します。[Feishuチャネル](https://github.com/zhayujie/CowAgent/blob/master/channel/feishu/feishu_channel.py)を参考にしてください。また、新しいSkillのコントリビューションも歓迎します。[Skill作成ドキュメント](https://docs.cowagent.ai/ja/skills/create)を参照するか、[Skill Hub](https://skills.cowagent.ai/submit)に提出してください。
+- **🚀 デプロイ不要のホスト型ランタイム** — [CowAgent オンラインアシスタント](https://link-ai.tech/cowagent/create) を 1 分以内に起動、サーバー不要
+- **🧠 Agent インフラ** — 主要 LLM・ナレッジベース・データベース・Skill・ワークフローへの統一アクセス。CowAgent の機能を拡張する、すぐに使えるビルディングブロック
+- **🏢 チーム & エンタープライズ機能** — ワークスペース、ロールベースのアクセス制御、監査ログ、本番運用向けプライベートデプロイ
 
-## ✉ お問い合わせ
+エンタープライズに関するお問い合わせ：**sales@simple-future.tech** または [QR コードをスキャン](https://cdn.link-ai.tech/consultant.jpg) して WeChat でお問い合わせください。
 
-PRやIssueの提出を歓迎します。🌟 Starでプロジェクトをサポートしてください。ご質問がある場合は、[FAQリスト](https://github.com/zhayujie/CowAgent/wiki/FAQs)を確認するか、[Issues](https://github.com/zhayujie/CowAgent/issues)を検索してください。
+<br/>
+
+## 🛠️ 開発とコントリビューション
+
+新しいチャネルの追加を歓迎します — [Feishu チャネル](https://github.com/zhayujie/CowAgent/blob/master/channel/feishu/feishu_channel.py) を参考にカスタムチャネルを実装できます。新しい Skill のコントリビューションも [Skill Hub](https://skills.cowagent.ai/submit) で受け付けています。
+
+⭐ Star でプロジェクトの更新をフォローしてください。PR や Issue の提出も歓迎します。
 
 ## 🌟 コントリビューター
 
 ![cow contributors](https://contrib.rocks/image?repo=zhayujie/CowAgent&max=1000)
+
+<br/>
+
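+## ⚠️ Disclaimer
+
+1. This project is released under the [MIT License](/LICENSE) and is intended for technical research and learning. Users must comply with the laws and regulations of their jurisdiction, and the maintainers accept no responsibility for any consequences arising from use of this project.
+2. **Cost and safety:** Agent mode consumes significantly more tokens than ordinary chat, so choose a model with the balance of quality and cost in mind. The Agent can access the local OS, so deploy it only in trusted environments.
+3. CowAgent is a purely open-source project and does not issue, participate in, or endorse any cryptocurrency.
+
+<br/>
+
+## 📌 Project Rename Notice
+
+This project was officially renamed from `chatgpt-on-wechat` to **CowAgent** on 2026.04.13. The original GitHub URL redirects automatically. Existing users can update their local remote with `git remote set-url origin https://github.com/zhayujie/CowAgent.git`.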
diff --git a/docs/ja/channels/index.mdx b/docs/ja/channels/index.mdx
new file mode 100644
index 00000000..05a540d2
--- /dev/null
+++ b/docs/ja/channels/index.mdx
@@ -0,0 +1,43 @@
+---
+title: Channel Overview
+description: Channels supported by CowAgent and their feature matrix
+---
+
+CowAgent can connect to multiple chat channels, selected at startup via `channel_type`. The Web console is enabled by default and runs alongside any other connected channels.
+
+## Feature Matrix
+
+The table below summarizes the incoming message types, bot reply types, and group-chat capability each channel supports. Choose the channel that fits your scenario.
+
+| Channel | Text | Image | File | Voice | Group chat |
+| --- | :-: | :-: | :-: | :-: | :-: |
+| [WeChat](/ja/channels/weixin) | ✅ | ✅ | ✅ | ✅ |  |
+| [Web Console](/ja/channels/web) | ✅ | ✅ | ✅ | ✅ | |
+| [Feishu](/ja/channels/feishu) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [DingTalk](/ja/channels/dingtalk) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [WeCom Smart Bot](/ja/channels/wecom-bot) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [QQ](/ja/channels/qq) | ✅ | ✅ | ✅ | | ✅ |
+| [WeCom App](/ja/channels/wecom) | ✅ | ✅ | ✅ | ✅ | |
+| [WeChat Official Account](/ja/channels/wechatmp) | ✅ | ✅ | | ✅ | |
+| [Telegram](/ja/channels/telegram) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Slack](/ja/channels/slack) | ✅ | ✅ | ✅ | | ✅ |
+
+- The **Image / File / Voice** columns indicate that sending and receiving the corresponding message type is supported. See each channel's documentation for details
+- The **Group chat** column indicates that the bot can recognize and respond to group messages
+
+<Tip>
+  Voice and image capabilities in each channel depend on the configuration of the corresponding model provider. See the [model list](/ja/models) for details.
+</Tip>
+
+## Channel List
+
+- [Web Console](/ja/channels/web): built-in browser chat and admin panel, enabled by default
+- [WeChat](/ja/channels/weixin): QR-code login for a personal WeChat account
+- [Feishu](/ja/channels/feishu): Feishu self-built bot
+- [DingTalk](/ja/channels/dingtalk): DingTalk self-built bot
+- [WeCom Smart Bot](/ja/channels/wecom-bot): WeCom smart bot
+- [QQ](/ja/channels/qq): QQ official bot open platform
+- [WeCom App](/ja/channels/wecom): WeCom self-built app integration
+- [WeChat Official Account](/ja/channels/wechatmp): WeChat Official Account (subscription account / service account)
+- [Telegram](/ja/channels/telegram): global IM, connect in 5 minutes, no public IP required
+- [Slack](/ja/channels/slack): team collaboration IM, Socket Mode connection, no public IP required
diff --git a/docs/ja/channels/slack.mdx b/docs/ja/channels/slack.mdx
new file mode 100644
index 00000000..d871bd00
--- /dev/null
+++ b/docs/ja/channels/slack.mdx
@@ -0,0 +1,118 @@
+---
+title: Slack
+description: Slack App 経由で CowAgent を接続
+---
+
+> Slack App の **Socket Mode** を通じて CowAgent を接続します。ダイレクトメッセージ（DM）およびチャンネル（@メンションまたはスレッド内の返信で起動）に対応。Socket Mode は WebSocket の常時接続を使うため公開 IP やコールバック URL は不要で、すぐに利用できます。
+
+## 1. 接続手順
+
+### ステップ 1: Slack App を作成
+
+1. [Slack API アプリ管理ページ](https://api.slack.com/apps) を開き、**Create New App** → **From scratch** をクリックします。
+2. **App Name**（例: `CowAgent`）を入力し、インストール先の **Workspace** を選択して作成します。
+
+### ステップ 2: Socket Mode を有効化し App Token を取得
+
+1. 左メニューの **Settings → Socket Mode** で **Enable Socket Mode** をオンにします。
+2. `connections:write` スコープを持つ **App-Level Token** の生成を求められます。`xapp-` で始まるこの Token を保存してください。
+
+<Tip>
+  Socket Mode は WebSocket 接続でイベントを受信するため、公開コールバック URL を公開する必要がありません。ローカルやイントラネットでの運用に最適です。
+</Tip>
+
+### Step 3: Configure bot permissions and install
+
+1. Open **Features → OAuth & Permissions** and, under **Bot Token Scopes**, click **Add an OAuth Scope** to add the following one by one:
+
+   ```
+   app_mentions:read
+   channels:history
+   chat:write
+   commands
+   files:read
+   files:write
+   groups:history
+   im:history
+   mpim:history
+   users:read
+   ```
+
+   <Note>
+     `files:read` / `files:write` are used for sending and receiving images and files. They can be omitted for text-only conversations.
+   </Note>
+
+2. Open **Features → Event Subscriptions**, turn on **Enable Events**, then under **Subscribe to bot events** click **Add Bot User Event** and add the following:
+
+   ```
+   app_mention
+   message.im
+   message.channels
+   ```
+
+   <Note>
+     To use the bot in private channels, also add `message.groups`.
+   </Note>
+3. Open **Features → App Home**, enable **Messages Tab** under **Show Tabs**, and tick **Allow users to send Slash commands and messages from the messages tab** below it. Otherwise the DM input box is disabled and you cannot message the bot.
+4. Return to **OAuth & Permissions** and click **Install to Workspace**. After installation, copy the **Bot User OAuth Token**, which starts with `xoxb-`.
+
+<Tip>
+  If the Slack client reports that sending messages to this app is disabled, check that the App Home setting above is complete, then reload or restart the Slack client (if necessary, remove the app from the conversation list and open it again).
+</Tip>
+
+### Step 4: Connect to CowAgent
+
+<Tabs>
+  <Tab title="Web Console (recommended)">
+    Open the Web console (default `http://127.0.0.1:9899`), go to the **Channels** menu → **Add channel** → **Slack**, paste the Bot Token (`xoxb-`) and App Token (`xapp-`), and click Connect.
+  </Tab>
+  <Tab title="Config file">
+    Add the following to `config.json` and start Cow:
+
+    ```json
+    {
+      "channel_type": "slack",
+      "slack_bot_token": "xoxb-xxxxxxxxxxxx",
+      "slack_app_token": "xapp-xxxxxxxxxxxx",
+      "slack_group_trigger": "mention_or_reply"
+    }
+    ```
+
+    | Parameter | Description | Default |
+    | --- | --- | --- |
+    | `slack_bot_token` | Bot User OAuth Token, in the form `xoxb-...` | - |
+    | `slack_app_token` | App-Level Token (generated after enabling Socket Mode), in the form `xapp-...` | - |
+    | `slack_group_trigger` | Channel trigger mode: `mention_or_reply` (@mention or thread reply) / `mention_only` (@mention only) / `all` (all messages) | `mention_or_reply` |
+  </Tab>
+</Tabs>
+
+If the log shows output like the following, the connection succeeded:
+
+```
+[Slack] Bot logged in as user_id=U0XXXXXXX, team=Txxxxxxxx
+[Slack] ✅ Slack bot ready, listening for events
+```
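The three `slack_group_trigger` values described in the parameter table can be read as a small decision rule. The sketch below is illustrative only: `should_respond` and its boolean inputs are hypothetical names introduced here, not CowAgent's actual implementation. It simply encodes the behaviour the table describes.

```python
# Hypothetical helper: encodes the three trigger modes from the table above.
VALID_TRIGGERS = {"mention_or_reply", "mention_only", "all"}


def should_respond(trigger: str, mentioned: bool, is_thread_reply: bool) -> bool:
    """Decide whether a channel message should wake the bot."""
    if trigger not in VALID_TRIGGERS:
        raise ValueError(f"unknown slack_group_trigger: {trigger!r}")
    if trigger == "all":
        # Every channel message triggers the bot.
        return True
    if trigger == "mention_only":
        # Only an explicit @mention triggers the bot.
        return mentioned
    # mention_or_reply: an @mention or a reply inside an existing thread.
    return mentioned or is_thread_reply
```

With the default `mention_or_reply`, a plain channel message that neither mentions the bot nor sits in a thread is ignored, which is usually the least noisy choice for busy channels.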
+
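+## 2. Features
+
+| Feature | Support |
+| --- | --- |
+| Direct messages (DM) | ✅ |
+| Channels (@bot / thread reply) | ✅ |
+| Text messages | ✅ Send and receive |
+| Image messages | ✅ Send and receive |
+| File messages | ✅ Send and receive (PDF / Word / Excel, etc.) |
+| Thread replies | ✅ Replies are sent in the thread of the triggering message |
+
+<Note>
+  Slack organizes conversations in threads. The bot replies in the thread of the triggering message, which keeps channels tidy.
+</Note>
+
+## 3. Usage
+
+Once connected:
+
+- **Direct messages (DM)**: open the app from **Apps** in the Slack left sidebar and message it directly.
+- **Channels**: invite the app to a channel (`/invite @your-app`) and trigger it with `@your-app hello`. After that, reply within the same thread to continue the conversation.
+
+When sending an image or file, you can write a **text description** (a description or question) in the attachment's input box and send them together. The bot answers using both the attachment and the description. If you send the attachment first and the question afterwards, the two messages are automatically merged and processed together.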
diff --git a/docs/ja/channels/telegram.mdx b/docs/ja/channels/telegram.mdx
new file mode 100644
index 00000000..eab1d3b0
--- /dev/null
+++ b/docs/ja/channels/telegram.mdx
@@ -0,0 +1,111 @@
+---
+title: Telegram
+description: Connect CowAgent via the Telegram Bot API
+---
+
+> Connect CowAgent through the official Telegram Bot API. One-on-one chats and group chats (triggered by an @mention or a reply to the bot) are supported. It uses long polling, so no public IP is required and it works out of the box.
+
+
+## 1. Setup
+
+### Step 1: Create a bot with BotFather
+
+1. Open the official [@BotFather](https://t.me/BotFather) account in Telegram.
+2. Send `/newbot` and follow the prompts to enter:
+   - **Bot name** (display name, e.g. `My CowAgent Bot`)
+   - **Bot username** (must end with `bot`, e.g. `my_cowagent_bot`)
+3. When creation completes, BotFather returns an **HTTP API Token** (e.g. `123456789:ABCdefGhIJKlmNoPQRsTUVwxyZ`). Keep it safe.
+
+<Tip>
+  The token is effectively the bot's password, so take care not to leak it. If it does leak, send `/revoke` to `@BotFather` to issue a new one.
+</Tip>
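Before pasting the token into the configuration, a quick shape check can catch copy-paste mistakes such as a truncated or wrong-platform token. The sketch below assumes the token looks like the example above (a numeric bot ID, a colon, then a secret made of letters, digits, `_` or `-`). The pattern and the helper name `looks_like_bot_token` are illustrative assumptions, not an official validation rule.

```python
import re

# Assumed shape, inferred from the example token: "<digits>:<secret>".
_TOKEN_RE = re.compile(r"\d+:[A-Za-z0-9_-]+")


def looks_like_bot_token(token: str) -> bool:
    """Return True if the string has the general shape of a bot token."""
    # Surrounding whitespace is ignored because it is a common copy-paste artifact.
    return _TOKEN_RE.fullmatch(token.strip()) is not None
```

Passing this check only means the string has a plausible shape. Whether the token is actually valid can only be confirmed by Telegram when the bot connects.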
+
+### Step 2: (For group use) Disable Privacy Mode
+
+You can skip this if you only use one-on-one chats. Telegram bots have **Privacy Mode** enabled by default, so in groups they only receive commands with the `@bot` suffix (e.g. `/start@your_bot`) and replies to the bot's own messages. **Ordinary text messages such as `@bot hello` are not delivered.** As a result the bot will not respond in groups, so configure the following if needed.
+
+Send the following to `@BotFather`:
+
+1. Send `/setprivacy`
+2. Select the bot you created
+3. Select `Disable`
+
+<Note>
+  If the bot still does not respond in a group after this change, try removing it from the group and adding it again.
+</Note>
+
+### Step 3: Connect to CowAgent
+
+<Tabs>
+  <Tab title="Web Console (recommended)">
+    Open the Web console (default `http://127.0.0.1:9899`), go to the **Channels** menu → **Add channel** → **Telegram**, paste the Bot Token, and click Connect.
+  </Tab>
+  <Tab title="Config file">
+    Add the following to `config.json` and start Cow:
+
+    ```json
+    {
+      "channel_type": "telegram",
+      "telegram_token": "123456789:ABCdefGhIJKlmNoPQRsTUVwxyZ",
+      "telegram_group_trigger": "mention_or_reply"
+    }
+    ```
+
+    | Parameter | Description | Default |
+    | --- | --- | --- |
+    | `telegram_token` | HTTP API Token issued by BotFather | - |
+    | `telegram_group_trigger` | Group trigger mode: `mention_or_reply` (@mention or reply) / `mention_only` (@mention only) / `all` (all messages) | `mention_or_reply` |
+    | `telegram_register_commands` | Whether to register the command menu with Telegram at startup | `true` |
+  </Tab>
+</Tabs>
+
+If the log shows output like the following, the connection succeeded:
+
+```
+[Telegram] Bot logged in as @my_cowagent_bot (id=123456789)
+[Telegram] Registered 10 bot commands
+[Telegram] ✅ Telegram bot ready, polling for updates
+```
+
+## 2. Features
+
+| Feature | Support |
+| --- | --- |
+| One-on-one chat | ✅ |
+| Group chat (@bot / reply to the bot) | ✅ |
+| Text messages | ✅ Send and receive |
+| Image messages | ✅ Send and receive |
+| Voice messages | ✅ Send and receive (OGG/Opus) |
+| Video messages | ✅ Send and receive |
+| File messages | ✅ Send and receive (PDF / Word / Excel, etc.) |
+| Command menu | ✅ Matches the slash commands of the Web console |
+
+### Command menu
+
+The command menu is registered automatically at startup. Typing `/` in the Telegram input box shows the suggestions:
+
+| Command | Description |
+| --- | --- |
+| `/help` | Show command help |
+| `/status` | Check the running status |
+| `/context` | Show the conversation context (`/context clear` clears it) |
+| `/skill` | Skill management (`/skill list`, `/skill install`, etc.) |
+| `/memory` | Memory management (`/memory dream`) |
+| `/knowledge` | Knowledge base management (`/knowledge list` / `on` / `off`) |
+| `/config` | Show the current configuration |
+| `/cancel` | Interrupt the running Agent task |
+| `/logs` | Show recent logs |
+| `/version` | Show the version |
+
+<Note>
+  The Telegram command menu lists top-level commands only. Enter subcommands separated by a space (e.g. `/skill list`, `/context clear`).
+</Note>
+
+## 3. Usage
+
+Once connected:
+
+- **One-on-one chat**: search for the bot's username in Telegram (e.g. `@my_cowagent_bot`) and tap `Start` to begin the conversation.
+- **Group chat**: add the bot to a group and trigger it with `@bot hello` or by **replying to one of the bot's messages**. If it does not respond in the group, check the Privacy Mode setting in [Step 2](#step-2-for-group-use-disable-privacy-mode).
+
+When sending an image or file, you can write a **caption** (a description or question) in the input box shown with the attachment and send them together. The bot answers using both the attachment and the caption. If you send the attachment first and the question afterwards, the two messages are automatically merged and processed together.
diff --git a/docs/ja/channels/web.mdx b/docs/ja/channels/web.mdx
index 84ab18dd..922627fc 100644
--- a/docs/ja/channels/web.mdx
+++ b/docs/ja/channels/web.mdx
@@ -3,56 +3,65 @@ title: Web コンソール
 description: Use CowAgent through the Web console
 ---
 
-Web コンソールは CowAgent のデフォルトチャネルです。起動後に自動的に開始され、ブラウザを通じて Agent とチャットしたり、モデル、Skill、メモリ、チャネルなどの設定をオンラインで管理できます。
+The Web console is CowAgent's default channel. It starts automatically at launch and lets you chat with the Agent in a browser as well as manage models, Skills, memory, channels, and other settings online.
 
 ## Configuration
 
 ```json
 {
   "channel_type": "web",
-  "web_port": 9899
+  "web_host": "0.0.0.0",
+  "web_port": 9899,
+  "web_password": "",
+  "enable_thinking": false
 }
 ```
 
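 | Parameter | Description | Default |
 | --- | --- | --- |
 | `channel_type` | Set to `web` | `web` |
+| `web_host` | Listen address of the Web service. Defaults to `127.0.0.1` (local only). For public access, change it to `0.0.0.0` and set a password | `""` |
 | `web_port` | Listen port of the Web service | `9899` |
+| `web_password` | Access password. If empty, password protection is disabled. Recommended when listening on `0.0.0.0` | `""` |
+| `web_session_expire_days` | Number of days a login session stays valid | `30` |
+| `enable_thinking` | Whether to enable deep thinking mode | `false` |
+
+Once a password is set, you must log in with it before accessing the console. The login state is kept for 30 days by default, and during that time no re-login is needed even if the service restarts. The password can also be changed online from the console's "Settings" page.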
 
 ## アクセス URL
 
 プロジェクト起動後、以下にアクセスしてください：
 
-- ローカル: `http://localhost:9899`
-- サーバー: `http://<server-ip>:9899`
+- ローカル実行: `http://localhost:9899`
+- サーバー実行: `http://<server-ip>:9899`
 
 <Note>
   サーバーのファイアウォールとセキュリティグループで該当ポートが許可されていることを確認してください。
 </Note>
 
-## 機能
+## 機能紹介
 
 ### チャット画面
 
-ストリーミング出力に対応しており、Agent の推論プロセスやツール呼び出しをリアルタイムで表示し、Agent の意思決定を直感的に観察できます：
+ストリーミング出力に対応しており、Agent の思考プロセス（Reasoning）とツール呼び出しプロセス（Tool Calls）をリアルタイムで表示でき、Agent の意思決定をより直感的に観察できます。深い思考機能は設定またはコンソールの「Agent 設定」スイッチで制御できます。
 
 <img width="850" src="https://cdn.link-ai.tech/doc/20260227180120.png" />
 
 #### マルチセッション管理
 
-チャット画面はマルチセッション管理に対応しています。すべてのセッション記録は SQLite データベースに永続的に保存されます：
+チャット画面はマルチセッション（Session）管理に対応しています。すべてのセッション記録はデータベースに永続化されます：
 
-- **セッション一覧**：左側の履歴アイコンをクリックしてセッション一覧パネルを展開/折りたたみでき、スクロールですべての履歴セッションを読み込めます
-- **AI によるタイトル生成**：新しいセッションの最初のやり取りが完了すると、自動的にモデルを呼び出して短い要約タイトルを生成します
-- **新規セッション**：セッション一覧上部の「新しい会話」ボタン、または入力エリアの `+` ボタンをクリックして新しいセッションを作成します
+- **セッション一覧**：左側の履歴セッションアイコンをクリックするとセッション一覧パネルを展開/折りたたみでき、スクロールですべての履歴セッションを読み込めます
+- **AI によるタイトル生成**：新しいセッションの初回対話完了後、自動的にモデルを呼び出して短いセッション要約タイトルを生成します
+- **新規セッション**：セッション一覧上部の「新しい会話」ボタンまたは入力エリアの `+` ボタンをクリックして新しいセッションを作成します
 - **セッション削除**：セッション項目の削除ボタンをクリックし、確認後にそのセッションとすべてのメッセージを完全に削除します
-- **コンテキストクリア**：入力エリアのクリアボタンをクリックすると、現在のセッションに区切り線が挿入されます。区切り線より上のメッセージは表示されたままですが、モデルのコンテキストには含まれなくなります
+- **コンテキストクリア**：入力エリアのクリアボタンをクリックすると、現在のセッションに区切り線が挿入されます。区切り線より上のメッセージは表示されたままですが、モデルのコンテキスト入力には含まれなくなります
 
 ### モデル管理
 
-設定ファイルを手動で編集せずに、オンラインでモデル設定を管理できます：
+設定ファイルを手動で編集することなく、異なるモデルプロバイダーのテキスト、画像、音声、埋め込みモデル設定をオンラインで管理できます：
 
-<img width="850" src="https://cdn.link-ai.tech/doc/20260227173811.png" />
+<img width="850" src="https://cdn.link-ai.tech/doc/20260521212949.png" />
 
 ### Skill 管理
 
@@ -68,18 +77,18 @@ Agent のメモリをオンラインで閲覧・管理できます：
 
 ### チャネル管理
 
-接続中のチャネルをオンラインで管理し、リアルタイムで接続・切断操作を行えます：
+接続中のチャネルをオンラインで管理でき、リアルタイムでの接続・切断操作に対応しています：
 
 <img width="850" src="https://cdn.link-ai.tech/doc/20260227173331.png" />
 
 ### スケジュールタスク
 
-スケジュールタスクをオンラインで閲覧・管理できます。一回限りのタスク、固定間隔、Cron 式に対応しています：
+スケジュールタスクをオンラインで閲覧・管理できます。一回限りのタスク、固定間隔、Cron 式など複数のスケジューリング方式に対応しています：
 
 <img width="850" src="https://cdn.link-ai.tech/doc/20260227173704.png" />
 
 ### ログ
 
-Agent のランタイムログをリアルタイムで確認でき、監視やトラブルシューティングに活用できます：
+Agent のランタイムログをオンラインでリアルタイムに確認でき、実行状態の監視やトラブルシューティングに便利です：
 
 <img width="850" src="https://cdn.link-ai.tech/doc/20260227173514.png" />
diff --git a/docs/ja/cli/general.mdx b/docs/ja/cli/general.mdx
index ab24a14e..e31d7cf3 100644
--- a/docs/ja/cli/general.mdx
+++ b/docs/ja/cli/general.mdx
@@ -25,6 +25,14 @@ description: ステータスの確認、設定管理、コンテキスト制御
 /status
 ```
 
+## cancel
+
+現在のセッションで実行中の Agent タスクを中止します。Agent が長時間のタスク（マルチターンのツール呼び出しや長いストリーミング応答など）を実行している間に `/cancel` を送信すると、次のツール実行の前に停止します。Web、WeChat、WeCom、Feishu など、すべてのチャネルで利用可能です。
+
+```text
+/cancel
+```
+
 ## config
 
 実行時設定の表示または変更を行います。変更は即座に反映され、再起動は不要です。
diff --git a/docs/ja/cli/index.mdx b/docs/ja/cli/index.mdx
index f8e28eff..4d00a654 100644
--- a/docs/ja/cli/index.mdx
+++ b/docs/ja/cli/index.mdx
@@ -57,6 +57,7 @@ Web コンソールや接続されたチャネルの会話で `/` を入力す
 | --- | --- |
 | `/help` | コマンドヘルプを表示 |
 | `/status` | サービスの状態と設定を表示 |
+| `/cancel` | 実行中の Agent タスクを中止 |
 | `/config` | 実行時設定の表示・変更 |
 | `/skill` | スキル管理（インストール、アンインストール、有効化、無効化など） |
 | `/memory dream [N]` | 記憶蒸留を手動トリガー（デフォルト 3 日、最大 30） |
@@ -80,6 +81,7 @@ Web コンソールや接続されたチャネルの会話で `/` を入力す
 | version | ✓ | ✓ |
 | status | ✓ | ✓ |
 | logs | ✓ | ✓ |
+| cancel | ✗ | ✓ |
 | config | ✗ | ✓ |
 | context | — | ✓ |
 | memory（サブコマンド） | ✗ | ✓ |
diff --git a/docs/ja/intro/architecture.mdx b/docs/ja/intro/architecture.mdx
index 8ecf4a6f..e6aa6e1d 100644
--- a/docs/ja/intro/architecture.mdx
+++ b/docs/ja/intro/architecture.mdx
@@ -9,7 +9,7 @@ CowAgent 2.0 は、シンプルなチャットボットから、自律的な思
 
 CowAgent のアーキテクチャは以下のコアモジュールで構成されています：
 
-<img src="https://cdn.link-ai.tech/doc/cow-agent-arch-en.jpg.jpg" alt="CowAgent Architecture" />
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/architecture/en/architecture.jpg" alt="CowAgent Architecture" />
 
 | モジュール | 説明 |
 | --- | --- |
diff --git a/docs/ja/intro/features.mdx b/docs/ja/intro/features.mdx
index f1a79d91..e5c65685 100644
--- a/docs/ja/intro/features.mdx
+++ b/docs/ja/intro/features.mdx
@@ -84,7 +84,7 @@ Skill が必要とするシークレットキーは環境変数ファイルに
 
 Skill システムは Agent に無限の拡張性を提供します。各 Skill は説明ファイル、実行スクリプト（任意）、リソース（任意）で構成され、特定のタイプのタスクを完了する方法を記述します。Skill により Agent は複雑なワークフローの指示に従い、ツールを呼び出し、サードパーティシステムと連携できます。
 
-- **[Skill Hub](https://skills.cowagent.ai/)：** オープンな Skill マーケットプレイス。公式推奨、コミュニティ、サードパーティの Skill を収録。ワンコマンドでインストール可能。
+- [Skill Hub](https://skills.cowagent.ai/)：オープンな Skill マーケットプレイス。公式推奨、コミュニティ、サードパーティの Skill を収録。ワンコマンドでインストール可能。
 - **組み込み Skill：** プロジェクトの `skills/` ディレクトリにあり、Skill クリエイター、画像認識、LinkAI Agent、Web フェッチなどが含まれます。組み込み Skill は依存条件（API キー、システムコマンドなど）に基づいて自動的に有効化されます。
 - **カスタム Skill：** ユーザーが会話を通じて作成し、ワークスペース（`~/cow/skills/`）に保存されます。あらゆる複雑なビジネスプロセスやサードパーティ連携を実装できます。
 
diff --git a/docs/ja/memory/index.mdx b/docs/ja/memory/index.mdx
index b83520e1..d01028f5 100644
--- a/docs/ja/memory/index.mdx
+++ b/docs/ja/memory/index.mdx
@@ -27,7 +27,7 @@ Agent は以下のメカニズムにより、会話内容を長期記憶に自
 
 - **コンテキストトリミング時** — 会話ターン数またはトークン数が設定上限を超えた場合、最も古い半分のコンテキストがトリミングされ、LLM によって要約されて日次記憶ファイルに書き込まれます。要約は保持されたコンテキストにも非同期で注入され、会話の連続性を維持します
 - **毎日のスケジュール要約** — 毎日 23:55 に自動的にフル要約がトリガーされ、アクティビティが少ない日でも記憶が保存されます（内容が変更されていない場合はスキップ）
-- **[夢境蒸留（Deep Dream）](/ja/memory/deep-dream)** — 毎日の要約完了後に自動実行され、日次記憶を MEMORY.md に蒸留し、夢日記を生成します
+- [夢境蒸留（Deep Dream）](/ja/memory/deep-dream) — 毎日の要約完了後に自動実行され、日次記憶を MEMORY.md に蒸留し、夢日記を生成します
 - **API コンテキストオーバーフロー時** — モデル API がコンテキストオーバーフローエラーを返した場合、緊急措置として現在の会話要約が保存されます
 
 すべての記憶書き込みはバックグラウンドスレッドで非同期に実行され（LLM の要約 + ファイル書き込み）、通常の会話応答をブロックしません。
diff --git a/docs/ja/models/claude.mdx b/docs/ja/models/claude.mdx
index eddd24fa..2b951e34 100644
--- a/docs/ja/models/claude.mdx
+++ b/docs/ja/models/claude.mdx
@@ -1,17 +1,50 @@
 ---
 title: Claude
-description: Claudeモデルの設定
+description: Anthropic Claude モデル設定（テキスト対話 + 画像理解）
 ---
 
+Claude は Anthropic が提供するモデルで、テキスト対話と画像理解をサポートします。主流の Sonnet / Opus モデルはネイティブにビジョンをサポートしており、別途 Vision モデルを指定する必要はありません。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
+
 ```json
 {
-  "model": "claude-sonnet-4-6",
+  "model": "claude-opus-4-8",
   "claude_api_key": "YOUR_API_KEY"
 }
 ```
 
 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `claude-sonnet-4-6`、`claude-opus-4-7`、`claude-opus-4-6`、`claude-sonnet-4-5`、`claude-sonnet-4-0`、`claude-3-5-sonnet-latest`などから選択可能。[公式モデル一覧](https://docs.anthropic.com/en/docs/about-claude/models/overview)を参照 |
-| `claude_api_key` | [Claude Console](https://console.anthropic.com/settings/keys)で作成 |
-| `claude_api_base` | 任意。デフォルトは`https://api.anthropic.com/v1`。サードパーティプロキシを使用する場合に変更 |
+| `model` | `claude-opus-4-8`、`claude-opus-4-7`、`claude-sonnet-4-6`、`claude-opus-4-6`、`claude-sonnet-4-5`、`claude-sonnet-4-0`、`claude-3-5-sonnet-latest` などをサポート。詳細は [公式モデル一覧](https://docs.anthropic.com/en/docs/about-claude/models/overview) を参照 |
+| `claude_api_key` | [Claude コンソール](https://console.anthropic.com/settings/keys) で作成 |
+| `claude_api_base` | 任意。デフォルトは `https://api.anthropic.com/v1`。サードパーティのプロキシに変更可能 |
+
+### モデル選択
+
+| モデル | 用途 |
+| --- | --- |
+| `claude-opus-4-8` | デフォルト推奨。最新フラッグシップ。複雑な推論や長いタスクチェーンに最適 |
+| `claude-opus-4-7` | 前世代の Opus フラッグシップ |
+| `claude-sonnet-4-6` | 性能と速度のバランスが良く、低コスト |
+| `claude-opus-4-6` / `claude-sonnet-4-5` / `claude-sonnet-4-0` | 旧世代のモデル。より低価格 |
+
+## 画像理解
+
+`claude_api_key` を設定すると、Agent の Vision ツールは Claude のメインモデルを使用して自動的に画像を認識します。追加設定は不要です。
+
+Vision モデルを手動で指定したい場合は、設定ファイルで明示的に指定できます：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "claude-sonnet-4-6"
+    }
+  }
+}
+```
diff --git a/docs/ja/models/custom.mdx b/docs/ja/models/custom.mdx
index 047f3f42..c2a3cfa9 100644
--- a/docs/ja/models/custom.mdx
+++ b/docs/ja/models/custom.mdx
@@ -1,26 +1,26 @@
 ---
 title: カスタム
-description: サードパーティAPIやローカルモデル向けのカスタムプロバイダー設定
+description: カスタムベンダー設定。サードパーティ API プロキシやローカルモデル向け
 ---
 
-OpenAI互換プロトコルでアクセスするモデルサービスに適用します：
+OpenAI 互換プロトコルで接続するサードパーティのモデルサービスや、ローカルにデプロイしたモデルに適しています。例えば：
 
-- **サードパーティAPIプロキシ**：統一APIベースで複数モデルを呼び出し
-- **ローカルモデル**：Ollama、vLLM、LocalAIなどでローカルにデプロイされたモデル
-- **プライベートデプロイ**：組織内でホストされたモデルサービス
+- **サードパーティ API プロキシ**：統一された API Base から複数のモデルを呼び出す
+- **ローカルモデル**：Ollama、vLLM、LocalAI などのツールでローカルにデプロイしたモデル
+- **プライベートデプロイ**：企業内部にデプロイしたモデルサービス
 
 <Note>
-  `openai` プロバイダーとの違い：カスタムプロバイダーでは `/config model` でモデルを切り替えてもプロバイダータイプは自動切り替えされず、カスタムAPIアドレスが常に保持されます。
+  `openai` ベンダーとの違い：カスタムベンダーを選択した場合、`/config model` でモデルを切り替えてもベンダータイプは自動で切り替わらず、常にカスタムの API アドレスを使用します。
 </Note>
 
-## 設定方法
+## テキスト対話
 
-### サードパーティAPIプロキシ
+### サードパーティ API プロキシ
 
 ```json
 {
   "bot_type": "custom",
-  "model": "deepseek-v4-flash",
+  "model": "",
   "custom_api_key": "YOUR_API_KEY",
   "custom_api_base": "https://{your-proxy.com}/v1"
 }
@@ -28,14 +28,14 @@ OpenAI互換プロトコルでアクセスするモデルサービスに適用
 
 | パラメータ | 説明 |
 | --- | --- |
-| `bot_type` | `custom` に設定必須 |
-| `model` | モデル名、プロキシサービスがサポートする任意のモデル名 |
-| `custom_api_key` | プロキシサービスが提供するAPIキー |
-| `custom_api_base` | APIアドレス、OpenAI互換プロトコルが必要 |
+| `bot_type` | `custom` に設定する必要があります |
+| `model` | モデル名。プロキシサービスがサポートする任意のモデル名を指定 |
+| `custom_api_key` | API キー。プロキシサービスから提供されます |
+| `custom_api_base` | API アドレス。プロキシサービスから提供され、OpenAI プロトコル互換である必要があります |
 
 ### ローカルモデル
 
-ローカルモデルは通常APIキー不要で、APIベースのみ設定します：
+ローカルモデルは通常 API Key が不要で、API Base のみ設定します：
 
 ```json
 {
@@ -47,15 +47,15 @@ OpenAI互換プロトコルでアクセスするモデルサービスに適用
 
 一般的なローカルデプロイツールとデフォルトアドレス：
 
-| ツール | デフォルトAPIベース |
+| ツール | デフォルト API Base |
 | --- | --- |
 | [Ollama](https://ollama.com) | `http://localhost:11434/v1` |
 | [vLLM](https://docs.vllm.ai) | `http://localhost:8000/v1` |
 | [LocalAI](https://localai.io) | `http://localhost:8080/v1` |
 
-## モデル切り替え
+### モデル切り替え
 
-カスタムプロバイダーではモデル切り替え時に `model` のみ変更され、`bot_type` やAPIアドレスは変わりません：
+カスタムベンダーでモデルを切り替える際は `model` のみが変更され、`bot_type` と API アドレスは変わりません：
 
 ```
 /config model qwen3.5:27b
diff --git a/docs/ja/models/deepseek.mdx b/docs/ja/models/deepseek.mdx
index 018931f0..726520fd 100644
--- a/docs/ja/models/deepseek.mdx
+++ b/docs/ja/models/deepseek.mdx
@@ -1,9 +1,11 @@
 ---
 title: DeepSeek
-description: DeepSeekモデルの設定
+description: DeepSeek モデル設定（テキスト対話 + 思考モード）
 ---
 
-方法1：公式接続（推奨）：
+DeepSeek は現在 Agent モードでデフォルト推奨されているベンダーの 1 つで、コストパフォーマンスの高いテキスト対話とタスクプランニング能力を強みとしています。
+
+## テキスト対話
 
 ```json
 {
@@ -15,23 +17,23 @@ description: DeepSeekモデルの設定
 | パラメータ | 説明 |
 | --- | --- |
 | `model` | `deepseek-v4-flash`（デフォルト）、`deepseek-v4-pro` をサポート |
-| `deepseek_api_key` | [DeepSeek Platform](https://platform.deepseek.com/api_keys) で作成 |
-| `deepseek_api_base` | オプション、デフォルトは `https://api.deepseek.com/v1`。サードパーティプロキシに変更可能 |
+| `deepseek_api_key` | [DeepSeek プラットフォーム](https://platform.deepseek.com/api_keys) で作成 |
+| `deepseek_api_base` | 任意。デフォルトは `https://api.deepseek.com/v1`。サードパーティのプロキシアドレスに変更可能 |
 
-## モデルの選び方
+### モデル選択
 
-| モデル | 適用シーン |
+| モデル | 用途 |
 | --- | --- |
-| `deepseek-v4-flash` | デフォルト推奨、高速・低コスト |
-| `deepseek-v4-pro` | 複雑なタスクでより強力 |
+| `deepseek-v4-flash` | デフォルト推奨。高速かつ低コスト |
+| `deepseek-v4-pro` | より高性能で、複雑なタスクに適しています |
 
 ## 思考モード
 
-V4シリーズ（`deepseek-v4-flash` / `deepseek-v4-pro`）は明示的な「思考モード」をサポートします。最終回答の前に思考内容（`reasoning_content`）を出力することで、回答品質を高めます。
+V4 シリーズ（`deepseek-v4-flash` / `deepseek-v4-pro`）は明示的な「思考モード」をサポートしています：モデルは最終回答を出力する前に、まず思考連鎖（`reasoning_content`）を出力することで、回答の品質を向上させます。
 
 ### スイッチ
 
-グローバル設定 `enable_thinking` で制御します：
+グローバル設定 `enable_thinking` で制御し、Web コンソールの設定ページからも切り替えできます：
 
 ```json
 {
@@ -39,12 +41,12 @@ V4シリーズ（`deepseek-v4-flash` / `deepseek-v4-pro`）は明示的な「思
 }
 ```
 
-- `true`：すべてのチャネルで思考モードがオン。Webコンソールでは思考過程を表示し、IMチャネル（WeChat / WeCom / DingTalk / Feishu）では表示されないものの、回答品質の向上というメリットを得られます。
-- `false`：思考オフ、応答が速く、初回トークンの遅延も低くなります。
+- `true`：すべてのチャネルでモデルが先に思考してから回答します。Web コンソールでは思考過程が表示され、IM チャネル（WeChat / WeCom / DingTalk / Feishu）では表示されませんが、同様により良い回答が得られます。
+- `false`：思考をオフにし、レスポンスが速くなり、初回トークン遅延が短くなります。
 
 ### 推論強度
 
-思考モード下では `reasoning_effort` で推論の深さを制御できます：
+思考モードでは `reasoning_effort` で推論の強さを制御できます：
 
 ```json
 {
@@ -53,29 +55,18 @@ V4シリーズ（`deepseek-v4-flash` / `deepseek-v4-pro`）は明示的な「思
 }
 ```
 
-| 値 | 適用シーン |
+| 値 | 用途 |
 | --- | --- |
-| `high`（デフォルト） | 通常の Agent タスク、思考の深さとレスポンス速度のバランス |
-| `max` | 複雑なコーディング、長いプランニング、厳密な制約のあるタスク。より深い推論と引き換えに出力トークンとレイテンシが増加 |
+| `high`（デフォルト） | 日常的な Agent タスク。思考と速度のバランス |
+| `max` | 複雑なコーディング、長いプランニング、厳しい制約を伴うタスク。推論はより深いが、所要時間と出力トークンが増える |
 
-`reasoning_effort` は `enable_thinking` が `true` の場合のみ有効になります。思考モードをサポートしないモデルでは自動的に無視されます。
+`reasoning_effort` は `enable_thinking` が `true` の場合のみ有効です。モデルが思考モードに対応していない場合、このフィールドは自動的に無視されます。
 
-### 注意事項
+### 動作の補足
 
-- **サンプリングパラメータ**：思考モード時は `temperature`、`top_p`、`presence_penalty`、`frequency_penalty` がサーバ側で無視されます（エラーにはなりません）。CowAgentは自動的に送信をスキップします。
-- **マルチターンのツール呼び出し**：履歴にツール呼び出しが含まれる場合、DeepSeekはすべてのassistantメッセージに `reasoning_content` を返送するよう要求します。CowAgentが自動でラウンドトリップ処理を行うため、セッション途中で思考スイッチを切り替えてもエラーになりません。
+- **サンプリングパラメータ**：思考モードでは `temperature`、`top_p`、`presence_penalty`、`frequency_penalty` がサーバ側で無視されます（エラーにはなりません）。CowAgent は自動的にこれらの送信をスキップします。
+- **マルチターンのツール呼び出し**：履歴にツール呼び出しが含まれる場合、DeepSeek はすべての assistant メッセージに `reasoning_content` を含めて API へ送り返すことを要求します。CowAgent はこの送り返し処理を自動的に行うため、ターンをまたいで思考スイッチを切り替えてもエラーにはなりません。
 
 <Tip>
-  通常は `deepseek-v4-flash` を使い、難しいタスクでは `deepseek-v4-pro` に切り替え、深い思考が必要な時は `enable_thinking` を有効にしてください。
+  デフォルトでは `deepseek-v4-flash` を使用します。複雑なタスクには `deepseek-v4-pro` を使用でき、深い推論が必要な場合は `enable_thinking` をオンにできます。
 </Tip>
-
-方法2：OpenAI互換方式：
-
-```json
-{
-  "model": "deepseek-v4-flash",
-  "bot_type": "openai",
-  "open_ai_api_key": "YOUR_API_KEY",
-  "open_ai_api_base": "https://api.deepseek.com/v1"
-}
-```
diff --git a/docs/ja/models/doubao.mdx b/docs/ja/models/doubao.mdx
index d8ebc9fa..8ed039f7 100644
--- a/docs/ja/models/doubao.mdx
+++ b/docs/ja/models/doubao.mdx
@@ -1,17 +1,66 @@
 ---
-title: Doubao (ByteDance)
-description: Doubao (火山方舟) モデルの設定
+title: Doubao
+description: Doubao（火山方舟）モデル設定（テキスト / 画像理解 / 画像生成 / ベクトル）
 ---
 
+Doubao（火山方舟）はテキスト対話、画像理解、画像生成（Seedream）、ベクトル機能をサポートしており、1 つの `ark_api_key` ですべての機能を有効化できます。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
+
 ```json
 {
-  "model": "doubao-seed-2-0-code-preview-260215",
+  "model": "doubao-seed-2-0-pro-260215",
   "ark_api_key": "YOUR_API_KEY"
 }
 ```
 
 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `doubao-seed-2-0-code-preview-260215`、`doubao-seed-2-0-pro-260215`、`doubao-seed-2-0-lite-260215`などから選択可能 |
-| `ark_api_key` | [火山方舟 Console](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey)で作成 |
-| `ark_base_url` | 任意。デフォルトは`https://ark.cn-beijing.volces.com/api/v3` |
+| `model` | `doubao-seed-2-0-pro-260215`、`doubao-seed-2-0-code-preview-260215`、`doubao-seed-2-0-lite-260215` などを指定可能 |
+| `ark_api_key` | [火山方舟コンソール](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) で作成 |
+| `ark_base_url` | 任意。デフォルトは `https://ark.cn-beijing.volces.com/api/v3` |
+
+## 画像理解
+
+`ark_api_key` を設定すると、Agent の Vision ツールは自動的に `doubao-seed-2-0-pro-260215` を使用して画像を認識します。追加設定は不要です。
+
+Vision モデルを手動で指定したい場合は：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "doubao-seed-2-0-pro-260215"
+    }
+  }
+}
+```
+
+## 画像生成
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "seedream-5.0-lite"
+    }
+  }
+}
+```
+
+選択可能なモデル：`seedream-5.0-lite`、`seedream-4.5`。
+
+## ベクトル
+
+```json
+{
+  "embedding_provider": "doubao",
+  "embedding_model": "doubao-embedding-vision-251215"
+}
+```
+
+デフォルトモデルは `doubao-embedding-vision-251215`（マルチモーダル embedding）です。設定ファイルの `embedding_dimensions` で 1024 または 2048 次元を指定できます。embedding を変更した後は `/memory rebuild-index` コマンドを実行してインデックスを再構築する必要があります。
diff --git a/docs/ja/models/gemini.mdx b/docs/ja/models/gemini.mdx
index d59f7309..18f11250 100644
--- a/docs/ja/models/gemini.mdx
+++ b/docs/ja/models/gemini.mdx
@@ -1,16 +1,59 @@
 ---
 title: Gemini
-description: Google Geminiモデルの設定
+description: Google Gemini モデル設定（テキスト対話 + 画像理解 + 画像生成）
 ---
 
+Google Gemini はテキスト対話、画像理解、画像生成（Nano Banana シリーズ）をサポートしており、1 つの `gemini_api_key` ですべての機能を有効化できます。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
+
 ```json
 {
-  "model": "gemini-3.1-pro-preview",
+  "model": "gemini-3.5-flash",
   "gemini_api_key": "YOUR_API_KEY"
 }
 ```
 
 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `gemini-3.1-flash-lite-preview`、`gemini-3.1-pro-preview`、`gemini-3-flash-preview`、`gemini-3-pro-preview`などから選択可能。[公式ドキュメント](https://ai.google.dev/gemini-api/docs/models)を参照 |
-| `gemini_api_key` | [Google AI Studio](https://aistudio.google.com/app/apikey)で作成 |
+| `model` | 推奨は `gemini-3.5-flash`。`gemini-3.1-pro-preview`、`gemini-3.1-flash-lite-preview`、`gemini-3-flash-preview`、`gemini-3-pro-preview` などもサポート。詳細は [公式ドキュメント](https://ai.google.dev/gemini-api/docs/models) を参照 |
+| `gemini_api_key` | [Google AI Studio](https://aistudio.google.com/app/apikey) で作成 |
+| `gemini_api_base` | 任意。デフォルトは `https://generativelanguage.googleapis.com`。サードパーティのプロキシに変更可能 |
+
+## 画像理解
+
+Gemini の全シリーズモデルはネイティブにビジョンをサポートしています。`gemini_api_key` を設定すると、Agent の Vision ツールは自動的にメインモデルを使用して画像を認識します。追加設定は不要です。
+
+Vision モデルを手動で指定したい場合：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gemini-3.1-flash-lite-preview"
+    }
+  }
+}
+```
+
+## 画像生成
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gemini-3.1-flash-image-preview"
+    }
+  }
+}
+```
+
+| モデル ID | エイリアス |
+| --- | --- |
+| `gemini-3.1-flash-image-preview` | Nano Banana 2 |
+| `gemini-3-pro-image-preview` | Nano Banana Pro |
+| `gemini-2.5-flash-image` | Nano Banana |
diff --git a/docs/ja/models/glm.mdx b/docs/ja/models/glm.mdx
index 9455e56b..b8ace28f 100644
--- a/docs/ja/models/glm.mdx
+++ b/docs/ja/models/glm.mdx
@@ -1,8 +1,16 @@
 ---
-title: GLM (智谱AI)
-description: 智谱AI GLMモデルの設定
+title: Zhipu GLM
+description: Zhipu AI GLM モデル設定（テキスト / 画像理解 / 音声認識 / ベクトル）
 ---
 
+Zhipu AI はテキスト対話、画像理解、音声認識（ASR）、ベクトル（Embedding）をサポートしており、1 つの `zhipu_ai_api_key` ですべての機能を有効化できます。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
+
 ```json
 {
   "model": "glm-5.1",
@@ -12,16 +20,37 @@ description: 智谱AI GLMモデルの設定
 
 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `glm-5.1`、`glm-5-turbo`、`glm-5`、`glm-4.7`、`glm-4-plus`、`glm-4-flash`、`glm-4-air`などから選択可能。[モデルコード](https://bigmodel.cn/dev/api/normal-model/glm-4)を参照 |
-| `zhipu_ai_api_key` | [智谱AI Console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys)で作成 |
+| `model` | `glm-5.1`、`glm-5-turbo`、`glm-5`、`glm-4.7`、`glm-4-plus`、`glm-4-flash`、`glm-4-air` などを指定可能。詳細は [モデルコード](https://bigmodel.cn/dev/api/normal-model/glm-4) を参照 |
+| `zhipu_ai_api_key` | [Zhipu AI コンソール](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) で作成 |
+| `zhipu_ai_api_base` | 任意。デフォルトは `https://open.bigmodel.cn/api/paas/v4` |
 
-OpenAI互換の設定もサポートしています:
+## 画像理解
+
+Zhipu の chat 系モデル（`glm-5.1`、`glm-5-turbo` など）はビジョンに対応していないため、ビジョン呼び出しは `glm-5v-turbo` に統一的にルーティングされます。`zhipu_ai_api_key` を設定すると、Agent の Vision ツールは自動的にこのモデルを使用するため、設定ファイルで明示的に指定する必要はありません。
+
+## 音声認識
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "glm-5.1",
-  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "voice_to_text": "zhipu",
+  "voice_to_text_model": "glm-asr-2512"
 }
 ```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `voice_to_text` | `zhipu` に設定すると Zhipu ASR が有効になります |
+| `voice_to_text_model` | 任意。デフォルトは `glm-asr-2512` |
+
+認証情報は `zhipu_ai_api_key` を自動的に再利用します。音声ファイルは 25MB 未満を推奨します。サイズが大きすぎるファイルはサーバ側で拒否される可能性があります。
+
+## ベクトル
+
+```json
+{
+  "embedding_provider": "zhipu",
+  "embedding_model": "embedding-3"
+}
+```
+
+選択可能なモデル：`embedding-3`、`embedding-2`。embedding を変更した後は `/memory rebuild-index` コマンドを実行してインデックスを再構築する必要があります。
diff --git a/docs/ja/models/index.mdx b/docs/ja/models/index.mdx
index d86ada77..0f3916a3 100644
--- a/docs/ja/models/index.mdx
+++ b/docs/ja/models/index.mdx
@@ -1,58 +1,45 @@
 ---
 title: モデル概要
-description: CowAgentがサポートするモデルとおすすめの選択肢
+description: CowAgent がサポートするモデルベンダーと機能マトリクス
 ---
 
-CowAgentは国内外の主要なLLMをサポートしています。モデルインターフェースはプロジェクトの`models/`ディレクトリに実装されています。
+CowAgent は国内外の主要ベンダーの大規模言語モデルをサポートしており、モデル接続の実装はプロジェクトの `models/` ディレクトリにあります。テキスト対話に加えて、一部のベンダーは画像理解、画像生成、音声認識、音声合成、ベクトルなどの機能も提供しており、Agent フローの中で必要に応じて呼び出すことができます。
 
 <Note>
-  Agent モードでは、品質とコストのバランスから以下のモデルをおすすめします: deepseek-v4-flash、MiniMax-M2.7、claude-sonnet-4-6、gemini-3.1-pro-preview、glm-5.1、qwen3.6-plus、kimi-k2.6、ernie-5.1
+  Agent モードでは、品質とコストのバランスを考慮して以下のモデルの利用を推奨します：deepseek-v4-flash、MiniMax-M2.7、claude-sonnet-4-6、gemini-3.5-flash、glm-5.1、qwen3.6-plus、kimi-k2.6、ernie-5.1。
+
+  同時に [LinkAI](https://link-ai.tech) プラットフォームの API もサポートしており、1 つの Key で複数ベンダーを柔軟に切り替えられ、ナレッジベース、ワークフロー、プラグインなどの機能も付属しています。
 </Note>
 
-## 設定
 
-選択したモデルに応じて、`config.json`にモデル名とAPI Keyを設定してください。各モデルは`bot_type`を`openai`に設定し、`open_ai_api_base`と`open_ai_api_key`を設定することで、OpenAI互換アクセスもサポートしています。
+## モデル機能の全体像
 
-また、[LinkAI](https://link-ai.tech)プラットフォームインターフェースを使用すると、ナレッジベース、ワークフロー、その他のAgent機能をサポートしながら、複数のモデルを柔軟に切り替えることができます。
+各ベンダーが提供する機能の一覧です。「テキスト」はメインの対話モデルを指し、その他の列はそのベンダーが対応する Agent 機能を担えるかを示します。
 
-## サポートモデル
-
-<CardGroup cols={2}>
-  <Card title="DeepSeek" href="/ja/models/deepseek">
-    deepseek-v4-flash、deepseek-v4-pro など
-  </Card>
-  <Card title="Baidu Qianfan / ERNIE" href="/ja/models/qianfan">
-    ernie-5.1、ernie-5.0、ernie-4.5-turbo-128k など
-  </Card>
-  <Card title="MiniMax" href="/ja/models/minimax">
-    MiniMax-M2.7およびその他のシリーズモデル
-  </Card>
-  <Card title="Claude" href="/ja/models/claude">
-    claude-sonnet-4-6など
-  </Card>
-  <Card title="Gemini" href="/ja/models/gemini">
-    gemini-3.1-pro-previewなど
-  </Card>
-  <Card title="OpenAI" href="/ja/models/openai">
-    gpt-5.4、gpt-4.1、oシリーズなど
-  </Card>
-  <Card title="GLM (智谱AI)" href="/ja/models/glm">
-    glm-5.1、glm-5-turbo、glm-5およびその他のシリーズモデル
-  </Card>
-  <Card title="Qwen (通义千问)" href="/ja/models/qwen">
-    qwen3.6-plus、qwen3-maxなど
-  </Card>
-  <Card title="Doubao (ByteDance)" href="/ja/models/doubao">
-    doubao-seedシリーズモデル
-  </Card>
-  <Card title="Kimi" href="/ja/models/kimi">
-    kimi-k2.6、kimi-k2.5、kimi-k2など
-  </Card>
-  <Card title="LinkAI" href="/ja/models/linkai">
-    統合マルチモデルインターフェース + ナレッジベース
-  </Card>
-</CardGroup>
+| ベンダー | 代表モデル | テキスト | 画像理解 | 画像生成 | 音声認識 | 音声合成 | ベクトル |
+| --- | --- | :-: | :-: | :-: | :-: | :-: | :-: |
+| [DeepSeek](/ja/models/deepseek) | deepseek-v4-flash / pro | ✅ | | | | | |
+| [MiniMax](/ja/models/minimax) | MiniMax-M2.7 | ✅ | ✅ | ✅ | | ✅ | |
+| [Claude](/ja/models/claude) | claude-opus-4-8 | ✅ | ✅ | | | | |
+| [Gemini](/ja/models/gemini) | gemini-3.5-flash | ✅ | ✅ | ✅ | | | |
+| [OpenAI](/ja/models/openai) | gpt-5.5、o シリーズ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Zhipu GLM](/ja/models/glm) | glm-5.1、glm-5v-turbo | ✅ | ✅ | | ✅ | | ✅ |
+| [Tongyi Qianwen](/ja/models/qwen) | qwen3.7-max | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Doubao](/ja/models/doubao) | doubao-seed-2.0 シリーズ | ✅ | ✅ | ✅ | | | ✅ |
+| [Kimi](/ja/models/kimi) | kimi-k2.6 | ✅ | ✅ | | | | |
+| [Baidu Qianfan](/ja/models/qianfan) | ernie-5.1 | ✅ | ✅ | | | | |
+| [LinkAI](/ja/models/linkai) | 複数ベンダー 100+ モデルを統一接続 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [カスタム](/ja/models/custom) | ローカルモデル / サードパーティプロキシ | ✅ | | | | | |
 
 <Tip>
-  モデル名の完全なリストについては、プロジェクトの[`common/const.py`](https://github.com/zhayujie/CowAgent/blob/master/common/const.py)ファイルを参照してください。
+  Web コンソール上では各機能（ビジョン / 画像 / 音声認識 / 音声合成 / ベクトル / Web 検索）ごとに独立してベンダーとモデルを設定でき、同じベンダーに揃える必要はありません。
 </Tip>
+
+
+## 設定方法
+
+**方法 1（推奨）:** [Web コンソール](/ja/channels/web) からオンラインでモデルや各機能を管理でき、設定ファイルを手動で編集する必要はありません：
+
+<img width="900" src="https://cdn.link-ai.tech/doc/20260521212527.png" />
+
+**方法 2:** `config.json` を手動で編集し、選択したモデルに応じてモデル名と API Key を設定します。各モデルは OpenAI 互換方式での接続もサポートしており、`bot_type` を `openai` に設定し、`open_ai_api_base` と `open_ai_api_key` を設定すれば利用できます。
diff --git a/docs/ja/models/kimi.mdx b/docs/ja/models/kimi.mdx
index fb80153c..ce09efea 100644
--- a/docs/ja/models/kimi.mdx
+++ b/docs/ja/models/kimi.mdx
@@ -1,8 +1,16 @@
 ---
-title: Kimi (Moonshot)
-description: Kimi (Moonshot) モデルの設定
+title: Kimi
+description: Kimi（Moonshot）モデル設定（テキスト対話 + 画像理解）
 ---
 
+Kimi は Moonshot が提供するモデルで、テキスト対話と画像理解をサポートします。`kimi-k2.x` シリーズはネイティブにビジョンをサポートしています。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
+
 ```json
 {
   "model": "kimi-k2.6",
@@ -12,16 +20,22 @@ description: Kimi (Moonshot) モデルの設定
 
 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `kimi-k2.6`、`kimi-k2.5`、`kimi-k2`、`moonshot-v1-8k`、`moonshot-v1-32k`、`moonshot-v1-128k`から選択可能 |
-| `moonshot_api_key` | [Moonshot Console](https://platform.moonshot.cn/console/api-keys)で作成 |
+| `model` | `kimi-k2.6`、`kimi-k2.5`、`kimi-k2`、`moonshot-v1-8k`、`moonshot-v1-32k`、`moonshot-v1-128k` を指定可能 |
+| `moonshot_api_key` | [Moonshot コンソール](https://platform.moonshot.cn/console/api-keys) で作成 |
+| `moonshot_base_url` | 任意。デフォルトは `https://api.moonshot.cn/v1` |
 
-OpenAI互換の設定もサポートしています:
+## 画像理解
+
+`moonshot_api_key` を設定すると、Agent の Vision ツールは自動的に `kimi-k2.6` を使用して画像を認識します。追加設定は不要です。
+
+Vision モデルを手動で指定したい場合：
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "kimi-k2.6",
-  "open_ai_api_base": "https://api.moonshot.cn/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "tools": {
+    "vision": {
+      "model": "kimi-k2.6"
+    }
+  }
 }
 ```
diff --git a/docs/ja/models/linkai.mdx b/docs/ja/models/linkai.mdx
index 23986b1b..a19c943c 100644
--- a/docs/ja/models/linkai.mdx
+++ b/docs/ja/models/linkai.mdx
@@ -1,9 +1,15 @@
 ---
 title: LinkAI
-description: LinkAIプラットフォームで複数モデルに統合アクセス
+description: LinkAI プラットフォーム経由でテキスト、ビジョン、画像、音声、ベクトル機能を統一接続
 ---
 
-[LinkAI](https://link-ai.tech)プラットフォームでは、OpenAI、Claude、Gemini、DeepSeek、MiniMax、Qwen、Kimiなどのモデルを柔軟に切り替えることができ、ナレッジベース、ワークフロー、プラグイン、その他のAgent機能をサポートしています。
+1 つの `linkai_api_key` で、OpenAI、Claude、Gemini、DeepSeek、MiniMax、Qwen、Kimi、Doubao など主要ベンダーのすべての機能にアクセスできます。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
 
 ```json
 {
@@ -14,8 +20,84 @@ description: LinkAIプラットフォームで複数モデルに統合アクセ
 
 | パラメータ | 説明 |
 | --- | --- |
-| `use_linkai` | `true`に設定してLinkAIインターフェースを有効化 |
-| `linkai_api_key` | [LinkAI Console](https://link-ai.tech/console/interface)で作成 |
-| `model` | 空のままにするとAgentのデフォルトモデルを使用。プラットフォーム上で柔軟に切り替え可能。[モデル一覧](https://link-ai.tech/console/models)のすべてのモデルをサポート |
+| `use_linkai` | `true` に設定すると有効になります |
+| `linkai_api_key` | [コンソール](https://link-ai.tech/console/interface) で作成 |
+| `model` | [モデル一覧](https://link-ai.tech/console/models) の任意のコードを指定可能 |
 
-詳細は[APIドキュメント](https://docs.link-ai.tech/platform/api)を参照してください。
+詳細は [モデルサービス](https://link-ai.tech/console/models) を参照してください。
+
+## 画像理解
+
+設定が完了すると、Agent の Vision ツールは自動的にゲートウェイ上のマルチモーダルモデルを呼び出します。追加設定は不要です。Vision モデルを手動で指定したい場合：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gpt-5.4-mini"
+    }
+  }
+}
+```
+
+選択可能なモデル：`gpt-4.1-mini`、`gpt-5.4-mini`、`qwen3.6-plus`、`doubao-seed-2-0-pro-260215`、`kimi-k2.6`、`claude-sonnet-4-6`、`gemini-3.1-flash-lite-preview` など。
+
+## 画像生成
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gpt-image-2"
+    }
+  }
+}
+```
+
+| モデル ID | エイリアス |
+| --- | --- |
+| `gpt-image-2` | OpenAI |
+| `gemini-3.1-flash-image-preview` | Nano Banana 2 |
+| `gemini-3-pro-image-preview` | Nano Banana Pro |
+| `seedream-5.0-lite` | ByteDance Doubao Seedream |
+
+## 音声認識
+
+```json
+{
+  "voice_to_text": "linkai"
+}
+```
+
+ASR は固定で Whisper を使用します。認証情報は `linkai_api_key` を自動的に再利用します。
+
+## 音声合成
+
+音声合成ゲートウェイは複数の TTS エンジンをサポートしており、`text_to_voice_model` でエンジンを選択し、音色はエンジンに応じて切り替わります。
+
+```json
+{
+  "text_to_voice": "linkai",
+  "text_to_voice_model": "doubao",
+  "tts_voice_id": "BV001_streaming"
+}
+```
+
+| `text_to_voice_model` | エンジンの説明 |
+| --- | --- |
+| `tts-1` | OpenAI · 多言語汎用（音色 `alloy` / `nova` / `echo` など） |
+| `doubao` | ByteDance Doubao · 中国語の音色が豊富 |
+| `baidu` | Baidu · 中国語のアナウンサー音色 |
+
+エンジンによって対応する音色が異なるため、Web コンソールの「モデル管理 → 音声合成」から視覚的に選択することをおすすめします。
+
+## ベクトル
+
+```json
+{
+  "embedding_provider": "linkai",
+  "embedding_model": "text-embedding-3-small"
+}
+```
+
+デフォルトモデルは `text-embedding-3-small`（OpenAI 互換）です。embedding を変更した後は `/memory rebuild-index` コマンドを実行してインデックスを再構築する必要があります。
diff --git a/docs/ja/models/mimo.mdx b/docs/ja/models/mimo.mdx
new file mode 100644
index 00000000..c677810f
--- /dev/null
+++ b/docs/ja/models/mimo.mdx
@@ -0,0 +1,135 @@
+---
+title: Xiaomi MiMo
+description: Xiaomi MiMo モデル設定（テキスト対話 + 画像理解 + 音声合成）
+---
+
+Xiaomi MiMo はネイティブ全モーダル大規模言語モデルです。1 つの `mimo_api_key` でテキスト対話、画像理解、音声合成を同時に有効化できます。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
+
+```json
+{
+  "model": "mimo-v2.5-pro",
+  "mimo_api_key": "YOUR_API_KEY",
+  "mimo_api_base": "https://api.xiaomimimo.com/v1"
+}
+```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `model` | 推奨は `mimo-v2.5-pro`。`mimo-v2.5` も使用可能 |
+| `mimo_api_key` | [MiMo Open Platform](https://platform.xiaomimimo.com/console/api-keys) で作成 |
+| `mimo_api_base` | 任意。デフォルトは `https://api.xiaomimimo.com/v1` |
+
+### モデル選択
+
+| モデル | ユースケース |
+| --- | --- |
+| `mimo-v2.5-pro` | フラッグシップ。ネイティブ全モーダル + Agent 能力、最大 100 万トークンのコンテキスト |
+| `mimo-v2.5` | 汎用版。ネイティブ全モーダル（テキスト / 画像 / 動画 / 音声） |
+
+## 思考モード
+
+MiMo V2.5 シリーズはデフォルトで「思考モード」が有効です。最終回答の前に `reasoning_content`（思考過程）を出力することで、複雑なタスクのパフォーマンスを高めます。
+
+表示の有無はグローバル設定 `enable_thinking` で切り替え可能です（Web コンソールの設定ページからも変更できます）：
+
+```json
+{
+  "enable_thinking": true
+}
+```
+
+## 画像理解
+
+`mimo_api_key` を設定すると、Agent の Vision ツールは自動的に MiMo のビジョンモデルを利用します：
+
+- メインモデル自体がマルチモーダル（`mimo-v2.5-pro` / `mimo-v2.5`）の場合は、画像はメインモデルが直接処理し、追加設定は不要です。
+- メインモデルが他社製の場合、Vision ツールは順序に従い `mimo-v2.5-pro` にフォールバックします。
+
+特定の Vision モデルを強制したい場合は、設定ファイルで明示的に指定してください：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "provider": "mimo",
+      "model": "mimo-v2.5-pro"
+    }
+  }
+}
+```
+
+## 音声合成
+
+```json
+{
+  "text_to_voice": "mimo",
+  "text_to_voice_model": "mimo-v2.5-tts",
+  "tts_voice_id": "冰糖"
+}
+```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `text_to_voice_model` | 現在は `mimo-v2.5-tts` のみ対応（プリセット音色 + 歌唱モード） |
+| `tts_voice_id` | プリセット音色名（中国語の音色は中国語名がそのまま ID） |
+
+### プリセット音色
+
+| 音色 ID | 説明 |
+| --- | --- |
+| `冰糖` | 中国語 · 女声（デフォルト） |
+| `茉莉` | 中国語 · 女声 |
+| `苏打` | 中国語 · 男声 |
+| `白桦` | 中国語 · 男声 |
+| `Mia` | 英語 · 女声 |
+| `Chloe` | 英語 · 女声 |
+| `Milo` | 英語 · 男声 |
+| `Dean` | 英語 · 男声 |
+
+Web コンソールの「モデル管理 → 音声合成」のドロップダウンから視覚的に選択することもできます。
+
+### スタイル制御
+
+MiMo TTS は合成テキスト内に **音声タグ** を埋め込むことで、感情、語調、方言、キャラクター、さらには歌唱まで制御できます。タグは **最終的に音声合成されるテキスト（つまり Agent の返信内容）** に含める必要があり、全体スタイルのタグは先頭に置きます：
+
+```
+(スタイル)合成するテキスト
+```
+
+半角 `()`、全角 `（）`、`[]` の 3 種類の括弧に対応しています。スタイル記述は中国語・英語のどちらでも指定でき、最も的確に表現できる言語を選んでください。代表的なスタイル例：
+
+| 種類 | サンプルタグ |
+| --- | --- |
+| 基本感情 | `happy` `sad` `angry` `fear` `surprised` `excited` `aggrieved` `calm` `indifferent` |
+| 複合感情 | `wistful` `relieved` `helpless` `guilty` `at ease` `uneasy` `touched` |
+| 全体トーン | `gentle` `aloof` `lively` `serious` `languid` `playful` `deep` `sharp` `cutting` |
+| 声質 | `magnetic` `mellow` `bright` `ethereal` `childlike` `aged` `sweet` `husky` |
+| キャラクター調 | `squeaky` `mature lady` `young boy` `uncle` `Taiwanese accent` |
+| 方言 | `Northeastern` `Sichuan` `Henan` `Cantonese` |
+| ロールプレイ | `Sun Wukong` `Lin Daiyu` |
+| 歌唱 | `sing` / `singing` |
+
+例：
+
+- `(magnetic)夜が深まり、街はまだ呼吸している。`
+- `(gentle)深呼吸して。きっと大丈夫。`
+- `(serious)これがシステム再起動前の最後の警告です。`
+- `(singing)Twinkle, twinkle, little star, how I wonder what you are…`
+
+テキストの任意の位置に細かい音声タグを挿入して、呼吸、笑い声、間などを制御することもできます。例：
+
+```
+(nervous, deep breath) ふぅ……落ち着いて、落ち着いて。(faster pace) 自己紹介は五十回練習したから大丈夫。
+```
+
+タグの完全な一覧は [MiMo 音声合成ドキュメント](https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/speech-synthesis-v2.5) を参照してください。
+
+<Tip>
+  CowAgent は TTS 呼び出し時、Agent の返信原文（`(...)` タグを含む）をそのまま MiMo に送信します。ペルソナ / システムプロンプトで「返信の冒頭に `(スタイル)` タグを付けて口調を指定する」よう指示すれば、IM チャネル（WeChat / Feishu / DingTalk / WeCom）の音声返信に感情・方言・歌唱などの効果を付与できます。
+</Tip>
diff --git a/docs/ja/models/minimax.mdx b/docs/ja/models/minimax.mdx
index c1e7477c..66d0024a 100644
--- a/docs/ja/models/minimax.mdx
+++ b/docs/ja/models/minimax.mdx
@@ -1,8 +1,16 @@
 ---
 title: MiniMax
-description: MiniMaxモデルの設定
+description: MiniMax モデル設定（テキスト / 画像理解 / 画像生成 / 音声合成）
 ---
 
+MiniMax はテキスト対話、画像理解、画像生成、音声合成をサポートしており、1 つの `minimax_api_key` ですべての機能を有効化できます。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
+
 ```json
 {
   "model": "MiniMax-M2.7",
@@ -12,16 +20,52 @@ description: MiniMaxモデルの設定
 
 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `MiniMax-M2.7`、`MiniMax-M2.5`、`MiniMax-M2.1`、`MiniMax-M2.1-lightning`、`MiniMax-M2`などから選択可能 |
-| `minimax_api_key` | [MiniMax Console](https://platform.minimaxi.com/user-center/basic-information/interface-key)で作成 |
+| `model` | `MiniMax-M2.7`、`MiniMax-M2.7-highspeed`、`MiniMax-M2.5`、`MiniMax-M2.1`、`MiniMax-M2.1-lightning`、`MiniMax-M2` などを指定可能 |
+| `minimax_api_key` | [MiniMax コンソール](https://platform.minimaxi.com/user-center/basic-information/interface-key) で作成 |
 
-OpenAI互換の設定もサポートしています:
+## 画像理解
+
+MiniMax の M2.x シリーズの chat モデル自体はビジョンに対応していないため、ビジョン呼び出しは `MiniMax-Text-01` に統一的にルーティングされます。`minimax_api_key` を設定すると、Agent の Vision ツールは自動的にこのモデルを使用するため、設定ファイルで明示的に指定する必要はありません。
+
+## 画像生成
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "MiniMax-M2.7",
-  "open_ai_api_base": "https://api.minimaxi.com/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "skills": {
+    "image-generation": {
+      "model": "image-01"
+    }
+  }
 }
 ```
+
+選択可能なモデル：`image-01`。
+
+## 音声合成
+
+```json
+{
+  "text_to_voice": "minimax",
+  "text_to_voice_model": "speech-2.8-hd",
+  "tts_voice_id": "female-shaonv"
+}
+```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `text_to_voice_model` | `speech-2.8-hd`（感情表現、自然な聴感）、`speech-2.8-turbo`（高速）、`speech-2.6-hd`、`speech-2.6-turbo` |
+| `tts_voice_id` | 音色 ID。中国語 / 広東語 / 英語 / 日本語 / 韓国語をサポートし、合計 70 種類以上 |
+
+よく使われる音色の例：
+
+| 音色 ID | 説明 |
+| --- | --- |
+| `female-shaonv` | 中国語 · 少女（女性） |
+| `female-yujie` | 中国語 · お姉さま（女性） |
+| `female-tianmei` | 中国語 · 甘い女性（女性） |
+| `male-qn-jingying` | 中国語 · エリート青年（男性） |
+| `male-qn-badao` | 中国語 · 強気な青年（男性） |
+| `Cantonese_GentleLady` | 広東語 · 優しい女声 |
+| `English_Graceful_Lady` | 英語 · Graceful Lady |
+
+完全な音色リスト（中国語 / 広東語 / 英語 / 日本語 / 韓国語の合計 70 種類以上）は [システム音色一覧](https://platform.minimaxi.com/docs/faq/system-voice-id) を参照してください。Web コンソールの「モデル管理 → 音声合成」のドロップダウンから視覚的に選択することもできます。
diff --git a/docs/ja/models/openai.mdx b/docs/ja/models/openai.mdx
index 0b26d9e5..801c1fd3 100644
--- a/docs/ja/models/openai.mdx
+++ b/docs/ja/models/openai.mdx
@@ -1,11 +1,20 @@
 ---
 title: OpenAI
-description: OpenAIモデルの設定
+description: OpenAI モデル設定（テキスト / ビジョン / 画像 / 音声 / ベクトル）
 ---
 
+OpenAI は最も広範な機能をカバーするベンダーで、テキスト対話、画像理解、画像生成、音声認識（ASR）、音声合成（TTS）、ベクトル（Embedding）の各機能を同時に担えます。1 つの `open_ai_api_key` で Agent はすべての機能を利用できます。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+
+## テキスト対話
+
 ```json
 {
-  "model": "gpt-5.4",
+  "model": "gpt-5.5",
   "open_ai_api_key": "YOUR_API_KEY",
   "open_ai_api_base": "https://api.openai.com/v1"
 }
@@ -13,7 +22,82 @@ description: OpenAIモデルの設定
 
 | パラメータ | 説明 |
 | --- | --- |
-| `model` | OpenAI APIの[modelパラメータ](https://platform.openai.com/docs/models)に対応。oシリーズ、gpt-5.4、gpt-5シリーズ、gpt-4.1などをサポート。Agentモードでは`gpt-5.4`を推奨 |
-| `open_ai_api_key` | [OpenAI Platform](https://platform.openai.com/api-keys)で作成 |
-| `open_ai_api_base` | 任意。サードパーティプロキシを使用する場合に変更 |
-| `bot_type` | 公式OpenAIモデルでは不要。Claudeなど非OpenAIモデルをプロキシ経由で使用する場合は`openai`に設定 |
+| `model` | OpenAI API の [model パラメータ](https://platform.openai.com/docs/models) と同じです。`gpt-5.5`、`gpt-5.4`、`gpt-5.4-mini`、`gpt-5.4-nano`、`gpt-5` シリーズ、`gpt-4.1`、o シリーズなどをサポート。Agent モードのデフォルトは `gpt-5.5`、コストパフォーマンスを重視する場合は `gpt-5.4` に変更可能 |
+| `open_ai_api_key` | [OpenAI プラットフォーム](https://platform.openai.com/api-keys) で作成 |
+| `open_ai_api_base` | 任意。サードパーティのプロキシに接続するために変更可能 |
+| `bot_type` | OpenAI 公式モデルを使用する場合は不要。互換プロトコルでベンダーモデルに接続する場合は `openai` に設定 |
+
+## 画像理解
+
+`gpt-5.5`、`gpt-5.4`、`gpt-4o`、`gpt-4.1` などの OpenAI モデルはネイティブにビジョンをサポートしています。`open_ai_api_key` を設定すると、Agent の Vision ツールは自動的にメインモデルを使用して画像を認識します。メインモデルがビジョンに対応していない場合や明示的に指定したい場合は、設定ファイルで指定できます：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gpt-5.4-mini"
+    }
+  }
+}
+```
+
+サポートする Vision モデル：`gpt-5.5`、`gpt-5.4`、`gpt-5.4-mini`、`gpt-5.4-nano`、`gpt-5`、`gpt-4.1`、`gpt-4.1-mini`、`gpt-4o`。
+
+## 画像生成
+
+設定ファイルで画像生成モデルを指定すると、Agent が画像生成スキルを呼び出す際に自動的に OpenAI にルーティングされます：
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gpt-image-2"
+    }
+  }
+}
+```
+
+サポートする画像生成モデル：`gpt-image-2`、`gpt-image-1`。
+
+## 音声認識
+
+```json
+{
+  "voice_to_text": "openai",
+  "voice_to_text_model": "gpt-4o-mini-transcribe"
+}
+```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `voice_to_text` | `openai` に設定すると OpenAI 音声認識が有効になります |
+| `voice_to_text_model` | 任意。デフォルトは `gpt-4o-mini-transcribe`。`gpt-4o-transcribe`、`whisper-1` も指定可能 |
+
+認証情報は `open_ai_api_key` を自動的に再利用します。
+
+## 音声合成
+
+```json
+{
+  "text_to_voice": "openai",
+  "text_to_voice_model": "tts-1",
+  "tts_voice_id": "alloy"
+}
+```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `text_to_voice_model` | `tts-1`、`tts-1-hd`、`gpt-4o-mini-tts` |
+| `tts_voice_id` | 音色：`alloy`、`echo`、`fable`、`onyx`、`nova`、`shimmer`、`ash`、`ballad`、`coral`、`sage`、`verse` |
+
+## ベクトル
+
+```json
+{
+  "embedding_provider": "openai",
+  "embedding_model": "text-embedding-3-small"
+}
+```
+
+選択可能なモデル：`text-embedding-3-small`、`text-embedding-3-large`、`text-embedding-ada-002`。embedding を変更した後は `/memory rebuild-index` コマンドを実行してインデックスを再構築する必要があります。
+
diff --git a/docs/ja/models/qianfan.mdx b/docs/ja/models/qianfan.mdx
index 6e5fde15..4c3651f8 100644
--- a/docs/ja/models/qianfan.mdx
+++ b/docs/ja/models/qianfan.mdx
@@ -40,7 +40,7 @@ description: Baidu Qianfan ERNIE モデル設定
 
 ```json
 {
-  "tool": {
+  "tools": {
     "vision": {
       "model": "ernie-4.5-turbo-vl"
     }
diff --git a/docs/ja/models/qwen.mdx b/docs/ja/models/qwen.mdx
index c491b5e3..a9e03ad5 100644
--- a/docs/ja/models/qwen.mdx
+++ b/docs/ja/models/qwen.mdx
@@ -1,8 +1,16 @@
 ---
-title: Qwen (通義千問)
-description: 通義千問モデルの設定
+title: Tongyi Qianwen Qwen
+description: Tongyi Qianwen モデル設定（テキスト / 画像理解 / 画像生成 / 音声認識 / 音声合成 / ベクトル）
 ---
 
+Tongyi Qianwen（DashScope / Bailian）は中国国内で最も広範な機能をカバーするベンダーの 1 つで、テキスト、画像理解、画像生成、音声認識、音声合成、ベクトルの各機能を 1 つの `dashscope_api_key` で有効化できます。
+
+<Tip>
+  Web コンソールの「モデル管理」ページから、以下のすべての機能をワンストップで設定でき、設定ファイルを手動で編集する必要はありません。
+</Tip>
+
+## テキスト対話
+
 ```json
 {
   "model": "qwen3.6-plus",
@@ -12,16 +20,93 @@ description: 通義千問モデルの設定
 
 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `qwen3.6-plus`、`qwen3.5-plus`、`qwen3-max`、`qwen-max`、`qwen-plus`、`qwen-turbo`、`qwq-plus`などから選択可能 |
-| `dashscope_api_key` | [百炼 Console](https://bailian.console.aliyun.com/?tab=model#/api-key)で作成。[公式ドキュメント](https://bailian.console.aliyun.com/?tab=api#/api)を参照 |
+| `model` | `qwen3.6-plus`、`qwen3.7-max`、`qwen3.5-plus`、`qwen3-max`、`qwen-max`、`qwen-plus`、`qwen-turbo`、`qwq-plus` などを指定可能 |
+| `dashscope_api_key` | [Bailian コンソール](https://bailian.console.aliyun.com/?tab=model#/api-key) で作成。詳細は [公式ドキュメント](https://bailian.console.aliyun.com/?tab=api#/api) を参照 |
 
-OpenAI互換の設定もサポートしています:
+## 画像理解
+
+`dashscope_api_key` を設定すると、Agent の Vision ツールは自動的に Qwen のビジョンモデルを呼び出して画像を認識します。`qwen3-max` / `qwen3.5-plus` / `qwen3.6-plus` などのモデルはそのままマルチモーダルです。メインモデルがテキスト専用（`qwen-turbo` など）の場合は、自動的に `qwen-vl-max` にフォールバックします。
+
+Vision モデルを手動で指定したい場合：
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "qwen3.6-plus",
-  "open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "tools": {
+    "vision": {
+      "model": "qwen3.6-plus"
+    }
+  }
 }
 ```
+
+サポートするモデル：`qwen3.6-plus`、`qwen3.5-plus`、`qwen3-max`。
+
+## 画像生成
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "qwen-image-2.0"
+    }
+  }
+}
+```
+
+選択可能なモデル：`qwen-image-2.0`、`qwen-image-2.0-pro`。
+
+## 音声認識
+
+```json
+{
+  "voice_to_text": "dashscope",
+  "voice_to_text_model": "qwen3-asr-flash"
+}
+```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `voice_to_text` | `dashscope` に設定すると Tongyi Qianwen ASR が有効になります |
+| `voice_to_text_model` | 任意。デフォルトは `qwen3-asr-flash` |
+
+認証情報は `dashscope_api_key` を自動的に再利用します。1 ファイルあたり 10MB 未満、長さ 300 秒以内を推奨します。
+
+## 音声合成
+
+```json
+{
+  "text_to_voice": "dashscope",
+  "text_to_voice_model": "qwen3-tts-flash",
+  "tts_voice_id": "Cherry"
+}
+```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `text_to_voice_model` | 任意。デフォルトは `qwen3-tts-flash`。普通話、方言、主要な外国語をカバー |
+| `tts_voice_id` | 音色 ID。下記のよく使われる一覧を参照 |
+
+よく使われる音色の例：
+
+| 音色 ID | 説明 |
+| --- | --- |
+| `Cherry` | 芊悦 · 明るい女声 |
+| `Serena` | 苏瑶 · 優しい女声 |
+| `Ethan` | 晨煦 · 明るい男声 |
+| `Chelsie` | 千雪 · 二次元少女 |
+| `Dylan` | 北京語 · 晓东 |
+| `Rocky` | 広東語 · 阿强 |
+| `Sunny` | 四川語 · 晴儿 |
+
+すべての音色（普通話 / 各地の方言 / バイリンガルなど）は、Web コンソールの「モデル管理 → 音声合成」のドロップダウンから視覚的に選択できます。
+
+## ベクトル
+
+```json
+{
+  "embedding_provider": "dashscope",
+  "embedding_model": "text-embedding-v4"
+}
+```
+
+デフォルトモデルは `text-embedding-v4` です。embedding を変更した後は `/memory rebuild-index` コマンドを実行してインデックスを再構築する必要があります。
diff --git a/docs/ja/releases/overview.mdx b/docs/ja/releases/overview.mdx
index 483d557f..cc51eeb4 100644
--- a/docs/ja/releases/overview.mdx
+++ b/docs/ja/releases/overview.mdx
@@ -1,27 +1,32 @@
 ---
 title: 変更履歴
-description: CowAgent バージョン履歴
+description: CowAgent バージョン更新履歴
 ---
 
 | バージョン | 日付 | 説明 |
 | --- | --- | --- |
-| [2.0.7](/ja/releases/v2.0.7) | 2026.04.22 | 画像生成スキル（6プロバイダー自動ルーティング）、新モデル（Kimi K2.6、Claude Opus 4.7、GLM 5.1）、ナレッジベースと Web コンソールの改善 |
-| [2.0.6](/ja/releases/v2.0.6) | 2026.04.14 | ナレッジベース、Deep Dream 記憶蒸留、スマートコンテキスト圧縮、Web コンソールアップグレード |
-| [2.0.5](/ja/releases/v2.0.5) | 2026.04.01 | Cow CLI、Skill Hub オープンソース、ブラウザツール、企業微信スキャン作成、その他改善 |
-| [2.0.4](/ja/releases/v2.0.4) | 2026.03.22 | 個人WeChatチャネル追加、新モデルサポート、日本語ドキュメント、スクリプトリファクタリングおよび複数修正 |
-| [2.0.2](/ja/releases/v2.0.2) | 2026.02.27 | Web Console アップグレード、マルチチャネル同時実行、セッション永続化 |
-| [2.0.1](/en/releases/v2.0.1) | 2026.02.27 | 組み込み Web Search ツール、スマートコンテキスト管理、複数の修正 |
-| [2.0.0](/en/releases/v2.0.0) | 2026.02.03 | AI スーパーアシスタントへの全面アップグレード |
-| 1.7.6 | 2025.05.23 | Web Channel 最適化、AgentMesh プラグイン |
+| [2.0.9](/ja/releases/v2.0.9) | 2026.05.22 | モデル管理機能の追加、MCP プロトコル対応、ブラウザログイン状態の永続化、新モデル追加（gpt-5.5、gemini-3.5-flash、qwen3.7-max など）、デプロイ・セキュリティ強化 |
+| [2.0.8](/ja/releases/v2.0.8) | 2026.05.06 | Feishu チャネル全面アップグレード（音声、ストリーミング出力と Markdown、QR コードによるワンクリック接続）、DeepSeek V4 と百度モデルの追加、スケジュールタスクツールの強化 |
+| [2.0.7](/ja/releases/v2.0.7) | 2026.04.22 | 画像生成スキル（6 プロバイダー自動ルーティング）、新モデル対応（Kimi K2.6、Claude Opus 4.7、GLM 5.1）、ナレッジベース強化、Web コンソール最適化 |
+| [2.0.6](/ja/releases/v2.0.6) | 2026.04.14 | プロジェクト名変更、ナレッジベースシステム、Deep Dream 記憶蒸留、コンテキストの賢い圧縮、Web コンソールのマルチセッションおよび複数の最適化 |
+| [2.0.5](/ja/releases/v2.0.5) | 2026.04.01 | Cow CLI、Skill Hub オープンソース化、ブラウザツール、WeCom QR コード作成、複数の最適化と修正 |
+| [2.0.4](/ja/releases/v2.0.4) | 2026.03.22 | 個人 WeChat チャネル追加、新モデル対応、日本語ドキュメント、スクリプトリファクタリングおよび複数の修正 |
+| [2.0.3](/ja/releases/v2.0.3) | 2026.03.18 | WeCom スマートボットおよび QQ チャネル追加、Coding Plan 対応、複数モデル追加、Web 側のファイル処理、メモリシステムアップグレード |
+| [2.0.2](/ja/releases/v2.0.2) | 2026.02.27 | Web コンソールアップグレード、マルチチャネル同時実行、セッション永続化 |
+| [2.0.1](/ja/releases/v2.0.1) | 2026.02.13 | Web Search ツール組み込み、スマートコンテキスト管理、複数の修正 |
+| [2.0.0](/ja/releases/v2.0.0) | 2026.02.03 | スーパー Agent アシスタントへの全面アップグレード |
+| 1.7.6 | 2025.05.23 | Web Channel 最適化、AgentMesh マルチエージェントプラグイン |
 | 1.7.5 | 2025.04.11 | DeepSeek モデル |
 | 1.7.4 | 2024.12.13 | Gemini 2.0 モデル、Web Channel |
-| 1.7.3 | 2024.10.31 | 安定性の改善、データベース機能 |
+| 1.7.3 | 2024.10.31 | 安定性向上、データベース機能 |
 | 1.7.2 | 2024.09.26 | ワンクリックインストールスクリプト、o1 モデル |
 | 1.7.0 | 2024.08.02 | 讯飞 4.0 モデル、ナレッジベース参照 |
-| 1.6.9 | 2024.07.19 | gpt-4o-mini、阿里音声認識 |
+| 1.6.9 | 2024.07.19 | gpt-4o-mini、アリババ音声認識 |
 | 1.6.8 | 2024.07.05 | Claude 3.5、Gemini 1.5 Pro |
-| 1.6.0 | 2024.04.26 | Kimi 統合、gpt-4-turbo アップグレード |
+| 1.6.0 | 2024.04.26 | Kimi 接続、gpt-4-turbo アップグレード |
+| 1.5.8 | 2024.03.26 | GLM-4、Claude-3、edge-tts |
+| 1.5.2 | 2023.11.10 | Feishu チャネル、画像認識対話 |
 | 1.5.0 | 2023.11.10 | gpt-4-turbo、dall-e-3、tts マルチモーダル |
-| 1.0.0 | 2022.12.12 | プロジェクト作成、初の ChatGPT 統合 |
+| 1.0.0 | 2022.12.12 | プロジェクト作成、初の ChatGPT モデル接続 |
 
-完全な履歴は [GitHub Releases](https://github.com/zhayujie/CowAgent/releases) をご覧ください。
+その他の過去バージョンは [GitHub Releases](https://github.com/zhayujie/CowAgent/releases) をご覧ください。
diff --git a/docs/ja/releases/v2.0.7.mdx b/docs/ja/releases/v2.0.7.mdx
index 81390dd0..bcf46778 100644
--- a/docs/ja/releases/v2.0.7.mdx
+++ b/docs/ja/releases/v2.0.7.mdx
@@ -11,7 +11,7 @@ description: CowAgent 2.0.7 - 画像生成スキル（6プロバイダー自動
 - **モデル選択不要**：API Key を設定するだけで使用可能、モデルを手動で指定する必要なし。会話で特定モデルを指名することも可能（例：「seedream で猫を描いて」）
 - **柔軟な制御**：`quality`（画質）、`size`（解像度、512/1K〜4K）、`aspect_ratio`（アスペクト比）パラメータ対応、各プロバイダーが自動的に有効な値にマッピング
 - **画像編集**：既存の画像を渡して編集・スタイル変換・複数画像融合が可能（Seedream は最大 14 枚の参照画像をサポート）
-- **スキルレベル設定**：`config.json` の `skill.image-generation.model` でデフォルトモデルを固定可能
+- **スキルレベル設定**：`config.json` の `skills.image-generation.model` でデフォルトモデルを固定可能
 - **画像ライトボックス**：Web コンソールのすべての画像がクリックで拡大プレビュー対応
 
 ドキュメント：[画像生成スキル](https://docs.cowagent.ai/ja/skills/image-generation)
diff --git a/docs/ja/releases/v2.0.8.mdx b/docs/ja/releases/v2.0.8.mdx
index 4456fb70..310d98b2 100644
--- a/docs/ja/releases/v2.0.8.mdx
+++ b/docs/ja/releases/v2.0.8.mdx
@@ -51,7 +51,7 @@ description: CowAgent 2.0.8 - 飛書チャネル全面アップグレード（
 
 ## 🔧 ツールと安全性
 
-- **Vision モデル選択**：`tool.vision.model` 設定が実際に反映されるようになり、未設定時は自動フォールバック #2792
+- **Vision モデル選択**：`tools.vision.model` 設定が実際に反映されるようになり、未設定時は自動フォールバック #2792
 - **Bash セーフティ確認**：破壊的削除の確認プロンプトをワークスペース外のパスに限定。ワークスペース内の通常操作は中断されません
 
 ## 🐛 その他の修正
diff --git a/docs/ja/releases/v2.0.9.mdx b/docs/ja/releases/v2.0.9.mdx
new file mode 100644
index 00000000..003ae6e3
--- /dev/null
+++ b/docs/ja/releases/v2.0.9.mdx
@@ -0,0 +1,65 @@
+---
+title: v2.0.9
+description: CowAgent 2.0.9 - モデル管理機能、MCP プロトコル対応、ブラウザログイン状態の永続化、新モデル追加とデプロイ・セキュリティ強化
+---
+
+## 🖥️ モデル管理機能の追加
+
+Web コンソールに「モデル」ページを新設。**モデルプロバイダー × モデル機能** の軸で管理し、対話、画像、音声、ベクトル、検索の各能力を一元的に設定可能になりました：
+
+- **プロバイダー単位の設定**：各プロバイダーの API Key / API Base はページ上部で一度だけ設定すれば、下部の各機能が自動で参照。再入力は不要
+- **画像モデル**：画像理解・画像生成それぞれで独立にプロバイダーとモデルを選択可能。未指定時はメインモデルに自動で追従
+- **音声モデル**：音声認識（ASR）と音声合成（TTS）を独立に設定可能。Qwen・Zhipu の ASR/TTS モデルを新たに追加
+- **ベクトルモデル**：埋め込み（Embedding）モデルを設定可能（記憶およびナレッジベース検索で利用）。OpenAI、Tongyi、Doubao、Zhipu などに対応。モデル切り替え後は `/memory rebuild-index` でインデックスをオンライン再構築してください
+- **検索機能**：ウェブ検索機能を強化、Bocha・Baidu・Zhipu など複数プロバイダーに対応。自動モードでは Agent が複数ソースの結果を統合してより深いリサーチを実行可能
+
+ドキュメント：[モデル概要](https://docs.cowagent.ai/ja/models)
+
+<img width="720" alt="20260522113305" src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/en/web-console-models-config.png" />
+
+
+## 🧩 MCP プロトコル対応
+
+**MCP（Model Context Protocol）** に対応しました。固定のツールセットから、オープンでプラグイン可能なツールエコシステムへと拡張され、MCP 互換のあらゆるサービスを Agent のツールとして直接接続できます。
+
+- ネイティブ JSON-RPC 実装、追加依存ゼロ。`stdio` と `sse` の両伝送方式に対応
+- Claude Desktop / Cursor などの主流形式の `mcpServers` 設定に互換、`~/cow/mcp.json` を優先的に読み込み
+
+ドキュメント：[MCP ツール](https://docs.cowagent.ai/ja/tools/mcp)。Thanks [@yangluxin613](https://github.com/yangluxin613) (#2801)
+
+## 🌐 ブラウザログイン状態の永続化
+
+ログインが必要なサイトやクローリング対策が施されたサイトに対して、ブラウザツールが一度ログインした状態を長期的に再利用できるようになりました。さらに、ユーザー自身の実際の Chrome に接続することで、フィンガープリント検出も回避可能です：
+
+- **永続化ユーザープロファイル（デフォルト）**：`~/.cow/browser_profile` をブラウザのユーザーディレクトリとしてデフォルト使用、一度ログインすれば次回以降は自動で復元
+- **CDP モード**：`tools.browser.cdp_endpoint` を設定すると実際の Chrome ブラウザに接続して制御でき、完全なブラウザ権限で操作可能
+
+ドキュメント：[ブラウザツール](https://docs.cowagent.ai/ja/tools/browser)。Thanks [@leafmove](https://github.com/leafmove) (#2809)
+
+## 🤖 モデル追加と最適化
+
+- **モデル新規追加**：`gpt-5.5`、`gemini-3.5-flash`、`qwen3.7-max`、`ernie-5.1`
+- **モデル最適化**：DeepSeek V4 が `reasoning_effort` 思考深度パラメータをサポート。MiMo などの思考モデルが OpenAI 互換プロトコル経由で接続できない問題を修正
+
+## 🔒 デプロイとセキュリティ
+
+- **デフォルトでローカルアクセスのみ**：Web コンソールの `web_host` をデフォルトで `127.0.0.1` にバインド。サーバーデプロイ時は手動で `0.0.0.0` に変更しパスワードを設定してください。Thanks @August829、@yidaozhongqing、@YLChen-007、@icysun
+- **フロントエンドリソースの完全ローカル化**：サードパーティ CSS / JS をすべてローカル配信に変更し、オフライン / イントラネット環境でもコンソールが正常に動作。Thanks [@gitlayzer](https://github.com/gitlayzer) (#2816)
+
+## 🛠 体験改善と修正
+
+- **TTS のチャネル拡充**：Web 対話、個人 WeChat、Feishu、DingTalk、WeCom スマートボットのすべてが音声返信に対応。詳細は [チャネル概要](https://docs.cowagent.ai/ja/channels) を参照
+- **ログパネル強化**：ログレベルに応じたハイライト表示と、レベル別フィルタリングをサポート。Thanks [@yangluxin613](https://github.com/yangluxin613) (#2807)
+- **Web コンソールの自動起動**：プログラム起動後に Web コンソールが自動で開きます。Thanks [@yangluxin613](https://github.com/yangluxin613) (#2804)
+- **Ctrl+C のクリーン終了**：長い `KeyboardInterrupt` スタックトレースが表示されなくなりました。Thanks [@yangluxin613](https://github.com/yangluxin613) (#2806)
+- **フォルダアップロード**：Web 側でディレクトリのアップロードに対応し、Windows のパス検証にも対応。Thanks [@TryToMakeUsBetter](https://github.com/TryToMakeUsBetter) (#2814)
+- 特定条件下でスケジュールタスクが重複実行される問題を修正。Thanks [@CNXudiandian](https://github.com/CNXudiandian) (#2820)
+- タイムゾーン付きの単発スケジュールタスクが発火しない問題を修正。Thanks @AethericSpace
+- 実行失敗したツール呼び出しがページ更新後に表示されない問題を修正。Thanks [@a1094174619](https://github.com/a1094174619) (#2822)
+- WeCom ボットメッセージに不正な制御文字が含まれる場合に配信が失敗する問題を修正。Thanks [@Jacques-Zhao](https://github.com/Jacques-Zhao) (#2810)
+
+## 📦 アップグレード方法
+
+ソースコードデプロイは `cow update` でワンクリックアップグレード、または最新コードを手動で pull して再起動してください。詳細は [アップグレードガイド](https://docs.cowagent.ai/ja/guide/upgrade) を参照。
+
+**リリース日**：2026.05.22 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.8...2.0.9)
diff --git a/docs/ja/skills/hub.mdx b/docs/ja/skills/hub.mdx
new file mode 100644
index 00000000..c8116e3f
--- /dev/null
+++ b/docs/ja/skills/hub.mdx
@@ -0,0 +1,65 @@
+---
+title: スキルハブ
+description: AI Agent スキルの閲覧、検索、インストール
+---
+
+[Cow Skill Hub](https://skills.cowagent.ai/) は、公式推奨・コミュニティ貢献・サードパーティ（GitHub、ClawHub など）のスキルを集約した、オープンソースの AI Agent スキルマーケットプレイスです。
+
+ソースコード: [github.com/zhayujie/cow-skill-hub](https://github.com/zhayujie/cow-skill-hub)
+
+<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="800" />
+
+## 機能
+
+- **スキル閲覧** — カテゴリ（公式推奨 / コミュニティ / サードパーティ）とタグでフィルタ
+- **スキル検索** — 名前または説明で検索
+- **詳細表示** — スキルマニフェスト、ファイル内容、インストールコマンド、必要な環境変数を確認
+- **ワンクリックインストール** — インストールコマンドをコピーして CowAgent で実行
+
+## スキルのインストール
+
+チャット内またはターミナルでインストールコマンドを実行:
+
+<CodeGroup>
+```text チャット
+/skill install <name>
+```
+
+```bash ターミナル
+cow skill install <name>
+```
+</CodeGroup>
+
+チャットからスキルハブを直接閲覧することもできます:
+
+```text
+/skill list --remote
+/skill search <キーワード>
+```
+
+リスト表示されている厳選スキル以外にも、**GitHub、ClawHub、LinkAI、任意の URL** からサードパーティスキルを CLI 経由でインストールできます。詳しくは [スキルのインストール](/ja/skills/install) を参照してください。
+
+## スキルの貢献
+
+ご自身のスキルを投稿するには:
+
+1. [skills.cowagent.ai/submit](https://skills.cowagent.ai/submit) にアクセス
+2. GitHub または Google でログイン
+3. `SKILL.md` を含むフォルダまたは zip ファイルをアップロード
+4. スキル名・表示名・説明は自動検出されます。必要に応じて編集してください
+5. 提出後、セキュリティ・品質チェックを経て公開されます
+
+<img src="https://cdn.link-ai.tech/doc/20260401111904.png" width="800" />
+
+スキルのファイル構成:
+
+```
+your-skill/
+├── SKILL.md        # 必須、ルートに配置
+├── scripts/        # 任意、実行スクリプト
+└── resources/      # 任意、その他リソース
+```
+
+<Tip>
+  スキルは `SKILL.md` マニフェストを中心に構築されます。スキル詳細ページから `SKILL.md` をダウンロードし、カスタム指示に対応した任意の Agent（OpenClaw、Cursor、Claude Code など）でも利用できます。
+</Tip>
diff --git a/docs/ja/skills/image-generation.mdx b/docs/ja/skills/image-generation.mdx
index cafc9eb3..6267d088 100644
--- a/docs/ja/skills/image-generation.mdx
+++ b/docs/ja/skills/image-generation.mdx
@@ -1,158 +1,98 @@
 ---
 title: image-generation - 画像生成
-description: テキストから画像生成 / 画像編集 / 複数画像の融合、複数プロバイダーの自動ルーティングとフォールバック対応
+description: テキストから画像生成 / 画像編集 / 複数画像融合に対応。複数プロバイダーの自動ルーティングとフォールバックをサポート
 ---
 
-汎用の画像生成・編集スキルです。OpenAI、Gemini、Seedream（Volcengine Ark）、Qwen（DashScope）、MiniMax、LinkAI の 6 社に対応。モデルを手動で選ぶ必要はなく、固定の優先順位に従って、設定済みのプロバイダーを自動的に選択します。
+汎用の画像生成・編集スキルです。OpenAI、Gemini、Seedream（Volcengine Ark）、Qwen（DashScope）、MiniMax、LinkAI の 6 つのプロバイダーに対応しています。いずれか 1 社の Key を設定すれば利用でき、複数社を設定すると自動フォールバックが有効になります。
 
-## モデル選択
-
-`image-generation` は「固定優先度 + 自動フォールバック」のストラテジーを採用しています。API Key を設定するだけで使えます：
-
-1. **優先順位**: `OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI`
-2. **未設定のプロバイダーはスキップ**: API Key が設定されているプロバイダーのみが参加
-3. **失敗時は自動で次へ**: 401、モデル未開通、ネットワークエラーなどの場合、次のプロバイダーを試行
-4. **モデル指定時は前置**: 特定のモデル名を渡すと、そのプロバイダーが最前列に昇格
-
-### 対応モデル
+## 対応モデル
 
 | プロバイダー | モデル / エイリアス | 特徴 |
 | --- | --- | --- |
-| OpenAI | `gpt-image-2`、`gpt-image-1` | 汎用テキスト→画像、高品質、`quality` パラメータ対応 |
+| OpenAI | `gpt-image-2`、`gpt-image-1` | 汎用テキスト→画像、高品質、`quality` で画質制御に対応 |
 | Gemini Nano Banana | `nano-banana-2`、`nano-banana-pro`、`nano-banana` | `gemini-3.1-flash`、`gemini-3-pro`、`gemini-2.5-flash` の画像バージョン |
-| Seedream（Volcengine Ark） | `seedream-5.0-lite`、`seedream-4.5` | ネイティブ 2K–4K、最大 14 枚の参照画像を融合 |
-| Qwen（DashScope） | `qwen-image-2.0`、`qwen-image-2.0-pro` | 中国語テキスト描画やテキスト・画像レイアウトに強い |
-| MiniMax | `image-01` | シンプルで高速な画像生成 |
-| LinkAI | 任意のモデル | 汎用プロキシ、フォールバック用 |
+| Seedream（Volcengine Ark） | `seedream-5.0-lite`、`seedream-4.5` | ネイティブ 2K–4K、最大 14 枚の画像融合 |
+| Qwen（DashScope） | `qwen-image-2.0`、`qwen-image-2.0-pro` | 中国語のレイアウトや画像とテキストの融合に強い |
+| MiniMax | `image-01` | シンプルで高速 |
+| LinkAI | 任意のモデル | 統一ゲートウェイ、フォールバック用途 |
 
-<Note>
-デフォルトでは Agent はモデルを選ばず、自動ルーティングを使用します。特定のモデルを使いたい場合は、会話で直接指定してください（例：「seedream で猫を描いて」「gpt-image-2 でポスターを作って」）。下記の「カスタム設定」でデフォルトモデルを固定することもできます。
-</Note>
+## モデル選択
 
-## カスタム設定
+デフォルトでは「自動ルーティング + 失敗時フォールバック」で動作します：
 
-### API Key の設定
+1. `OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI` の優先順位で、Key が設定済みの最初のプロバイダーを選択
+2. 401、モデル未開通、ネットワークエラーなどに遭遇した場合、自動的に次のプロバイダーへ切り替え
+3. ユーザーが対話内でモデルを指定した場合（例：「seedream で猫を描いて」）、該当プロバイダーが優先候補に繰り上がります
 
-**少なくとも 1 つ**のプロバイダーの Key が必要です。複数設定すると自動フォールバックが有効になります。設定方法は 3 通り：
-
-#### 方法 1：既存のモデル Key を自動再利用
-
-Web コンソールや `config.json` で対話モデルの Key（`openai_api_key`、`gemini_api_key` など）を設定済みの場合、起動時にこれらの Key は対応する環境変数に**自動同期**されます。つまり、対話モデルが使えていれば、画像生成も同じ Key で追加設定なしに利用できます。
-
-#### 方法 2：config.json で設定
-
-`config.json` に Key フィールドを直接記述：
+特定のモデルに固定したい場合：
 
 ```json
 {
-  "openai_api_key": "sk-xxx",
-  "openai_api_base": "https://api.openai.com/v1",
-  "gemini_api_key": "AIza-xxx",
-  "ark_api_key": "xxx",
-  "dashscope_api_key": "sk-xxx",
-  "minimax_api_key": "xxx",
-  "linkai_api_key": "xxx"
-}
-```
-
-変更後は再起動が必要です。各 Key には対応する `*_api_base` フィールドがあり、カスタムエンドポイントを指定できます。
-
-#### 方法 3：会話で直接設定
-
-チャットで API Key を送信すると、Agent が `env_config` ツールで `~/cow/.env` に保存します。**再起動不要**でただちに反映されます。例：
-
-```
-OPENAI_API_KEY を sk-xxx に設定して
-```
-
-または：
-
-```
-ARK_API_KEY を xxx に設定して
-```
-
-### API Key 一覧
-
-| 環境変数 | config.json フィールド | プロバイダー | デフォルト Base URL |
-| --- | --- | --- | --- |
-| `OPENAI_API_KEY` | `openai_api_key` | OpenAI | `https://api.openai.com/v1` |
-| `GEMINI_API_KEY` | `gemini_api_key` | Gemini | `https://generativelanguage.googleapis.com` |
-| `ARK_API_KEY` | `ark_api_key` | Volcengine Ark（Seedream） | `https://ark.cn-beijing.volces.com/api/v3` |
-| `DASHSCOPE_API_KEY` | `dashscope_api_key` | Alibaba DashScope（Qwen） | `https://dashscope.aliyuncs.com` |
-| `MINIMAX_API_KEY` | `minimax_api_key` | MiniMax | `https://api.minimaxi.com` |
-| `LINKAI_API_KEY` | `linkai_api_key` | LinkAI | `https://api.link-ai.tech` |
-
-### デフォルトモデルの固定
-
-すべての画像生成を特定のプロバイダーのモデルで固定したい場合、`config.json` に以下を追加：
-
-```json
-"skill": {
-  "image-generation": {
-    "model": "seedream-5.0-lite"
+  "skills": {
+    "image-generation": {
+      "model": "seedream-5.0-lite"
+    }
   }
 }
 ```
 
-起動時にこの設定は環境変数 `SKILL_IMAGE_GENERATION_MODEL` に自動変換され、スクリプトはこのモデルのプロバイダーを常に使用します。
+## API Key の設定
+
+<Tip>
+  [Web コンソール](/ja/channels/web) の「モデル管理」ページから設定することを推奨します。設定済みの対話モデル Key は画像生成スキルでも自動的に再利用されるため、重複した設定は不要です。設定ファイルを手動編集するか、対話中に `env_config` ツールで一時的に設定することもできます。
+</Tip>
+
+認証情報はメインモデルプロバイダーの Key を統一的に再利用します：
+
+| フィールド | 対応プロバイダー |
+| --- | --- |
+| `openai_api_key` | OpenAI |
+| `gemini_api_key` | Gemini |
+| `ark_api_key` | Volcengine Ark（Seedream） |
+| `dashscope_api_key` | Alibaba DashScope（Qwen） |
+| `minimax_api_key` | MiniMax |
+| `linkai_api_key` | LinkAI |
+
 
 ## 有効化と無効化
 
-`image-generation` は内蔵スキルで、**API Key に基づいてステータスが自動調整**されます：
+スキルは API Key に応じて自動的にステータスが調整されます：
 
-- **Key 設定済み**：スキルはアクティブ — Agent は画像生成リクエストを受けると呼び出す
-- **Key 未設定**：スキルはコンテキストに表示される（「設定が必要」とマーク）— Agent は呼び出し失敗の代わりに Key の設定を案内する
+- **Key 設定済み**：Agent は画像生成リクエストを受けると直接呼び出します
+- **Key 未設定**：スキルはコンテキストに表示されますが（「設定が必要」とマーク）、Agent はユーザーに Key の設定を案内します
 
 手動で制御する場合：
 
 ```text
-/skill disable image-generation    # 無効化（Key があっても呼び出されない）
+/skill disable image-generation    # 無効化
 /skill enable image-generation     # 再有効化
 ```
 
-ターミナルでは `cow skill disable image-generation` / `cow skill enable image-generation`。
+ターミナルでの等価コマンド：`cow skill disable image-generation` / `cow skill enable image-generation`。
 
 ## パラメータ
 
 | パラメータ | 型 | 必須 | デフォルト | 説明 |
 | --- | --- | --- | --- | --- |
 | `prompt` | string | はい | — | 画像の説明 |
-| `image_url` | string / list | いいえ | null | 編集用の入力画像。ローカルパスまたは URL。複数指定で複数画像融合 |
-| `quality` | string | いいえ | auto | `low` / `medium` / `high` — 一部のプロバイダーのみ対応 |
-| `size` | string | いいえ | auto | `512` / `1K` / `2K` / `3K` / `4K`、またはピクセル値（例: `1024x1024`） |
-| `aspect_ratio` | string | いいえ | null | `1:1` / `3:2` / `2:3` / `16:9` / `9:16` / `21:9`；Gemini は `1:4` / `4:1` / `1:8` / `8:1` にも対応 |
+| `image_url` | string / list | いいえ | null | 編集用の入力画像。ローカルパスまたは URL。リスト指定で複数画像融合 |
+| `quality` | string | いいえ | auto | `low` / `medium` / `high`、一部のプロバイダーのみ対応 |
+| `size` | string | いいえ | auto | `512` / `1K` / `2K` / `3K` / `4K`、またはピクセル値（例：`1024x1024`） |
+| `aspect_ratio` | string | いいえ | null | `1:1` / `3:2` / `2:3` / `16:9` / `9:16` / `21:9`。Gemini は `1:4` / `4:1` / `1:8` / `8:1` にも対応 |
 
 <Warning>
-**品質が高いほど・解像度が大きいほど、コストが高く、時間がかかります。**
-
-- 日常の会話やプレビューにはデフォルト（`auto`）、または `quality=low` + `size=1K` を使用 — 約 20 秒で生成
-- ポスターやユーザーが高解像度を明示的に要求した場合は `quality=high` + `size=2K/4K` — モデルによって 1〜5 分かかる場合があります
+  **品質が高いほど、解像度が大きいほど、時間とコストが高くなります。** 日常の対話ではデフォルト（`auto`）または `quality=low` + `size=1K` で十分で、約 20 秒で生成されます。ポスター制作や明示的に高解像度が必要な場合のみ `high` + `2K/4K` を使用してください。1〜5 分かかる場合があります。
 </Warning>
 
-## 出力
-
-成功時：
-
-```json
-{
-  "model": "doubao-seedream-5-0-260128",
-  "images": [
-    {"url": "/path/to/output.png"}
-  ]
-}
-```
-
-失敗時：`{ "error": "..." }`。エラー後は**直接リトライしないでください** — ほぼ確実に設定の問題です（Key の誤り、API ベース URL の不一致、モデル未開通など）。まず設定を修正してから再試行してください。
-
 ## よくある使い方
 
-- **テキスト→画像**：説明からイラスト、ポスター、アイコン、アバター、絵コンテなどを生成
-- **画像→画像**：既存の画像のスタイル変更、要素の入れ替え、装飾やテキストの追加
-- **複数画像の融合**：複数の参照画像を 1 枚に合成（着せ替え、キャラクター集合写真など）
+- **テキスト→画像**：説明文からイラスト、ポスター、アイコン、アバター、絵コンテなどを生成
+- **画像→画像**：既存の画像のスタイル変更、要素差し替え、装飾や文字の追加など
+- **複数画像融合**：複数の参考画像を 1 枚に合成（着せ替え、キャラクター集合写真など）
 
 <Note>
-- bash タイムアウトは 600 秒に設定してください。各プロバイダーの HTTP タイムアウトは 300 秒ですが、スクリプトが複数のプロバイダーを順番に試行する場合があります
-- 入力画像は自動的に 4 MB 以下・最長辺 4096 px 以下に圧縮されます
-- Gemini / Seedream / Qwen / MiniMax は `quality` パラメータに対応していません（渡しても無視されます）
+- bash タイムアウトは 600 秒に設定することを推奨：単一プロバイダーの HTTP タイムアウトは 300 秒、スクリプトは複数社を順に試行する場合があります
+- 入力画像は自動的に 4 MB 以内・最長辺 4096 px 以内に圧縮されます
+- Gemini / Seedream / Qwen / MiniMax は `quality` パラメータに対応していません
 - Seedream のデフォルトは 2K。`seedream-5.0-lite` は 3K まで、`seedream-4.5` は 4K まで対応
 </Note>
diff --git a/docs/ja/tools/mcp.mdx b/docs/ja/tools/mcp.mdx
index e450a099..efd4a514 100644
--- a/docs/ja/tools/mcp.mdx
+++ b/docs/ja/tools/mcp.mdx
@@ -34,7 +34,9 @@ MCP コミュニティ標準に完全準拠しており、Claude Desktop / Curso
 | `command` | stdio | サーバーを起動する実行コマンド（`npx`、`python`、`uvx` など） |
 | `args` | 任意 | `command` に渡す引数 |
 | `env` | 任意 | サブプロセスの環境変数。API Key などに利用 |
-| `url` | SSE | SSE エンドポイントの URL（`command` と二者択一） |
+| `url` | SSE / Streamable HTTP | リモートエンドポイントの URL（`command` と二者択一） |
+| `type` | リモート | リモートトランスポート種別：`sse` または `streamable-http`（既定は `sse`） |
+| `headers` | 任意 | リモートリクエストの追加 HTTP ヘッダ（`Authorization` など）。Streamable HTTP のみ |
 | `disabled` | 任意 | `true` のとき該当サーバーをスキップ。一時的に無効化したいときに便利 |
 
 ### 完全な例
@@ -88,7 +90,8 @@ Agent は次のように動作します：
 | トランスポート | 説明 | 設定フィールド |
 | --- | --- | --- |
 | **stdio** | サブプロセス通信。最も一般的で、コミュニティのエコシステムが最も豊富 | `command` + `args` |
-| **SSE** | HTTP Server-Sent Events。リモートホスト型の MCP サービス向け | `url` |
+| **SSE** | HTTP Server-Sent Events。従来のリモート用トランスポート | `url`（既定） |
+| **Streamable HTTP** | 新しい単一エンドポイント方式。SSE を段階的に置き換え | `type: "streamable-http"` + `url` |
 
 ## トラブルシューティング
 
@@ -106,4 +109,4 @@ Agent は次のように動作します：
 - [mcp.so](https://mcp.so) — グローバル MCP サービスインデックス
 - [ModelScope MCP 広場](https://modelscope.cn/mcp) — 魔搭コミュニティの MCP 広場、中国本土からのアクセスが安定
 
-MCP 標準プロトコル（stdio / SSE）に準拠していれば、コードを一切変更せずに CowAgent に統合できます。
+MCP 標準プロトコル（stdio / SSE / Streamable HTTP）に準拠していれば、コードを一切変更せずに CowAgent に統合できます。
diff --git a/docs/ja/tools/vision.mdx b/docs/ja/tools/vision.mdx
index 0c3c9d9a..06eb287d 100644
--- a/docs/ja/tools/vision.mdx
+++ b/docs/ja/tools/vision.mdx
@@ -1,41 +1,57 @@
 ---
-title: vision - 画像分析
-description: 画像コンテンツの分析（認識、説明、OCR など）
+title: vision - 画像理解
+description: 画像コンテンツを分析（認識、説明、OCR など）
 ---
 
 Vision API を使用してローカル画像や画像 URL を分析します。コンテンツの説明、テキスト抽出（OCR）、オブジェクト認識などに対応しています。
 
 ## モデル選択
 
-Vision ツールは多段階の自動選択＋自動フォールバック戦略を採用しており、手動設定なしで利用可能です：
+Vision ツールは多段階の自動選択 + 自動フォールバック戦略を採用しており、手動設定なしで利用できます：
 
-1. **メインモデル** — 現在設定されているメインモデルで画像認識を実行（追加コストなし）
-2. **その他の設定済みモデル** — API キーが設定されている他のマルチモーダルモデルを自動検出
-3. **OpenAI** — `open_ai_api_key` を使用して gpt-4.1-mini を呼び出し
-4. **LinkAI** — `linkai_api_key` を使用して LinkAI ビジョンサービスを呼び出し
+1. **メインモデル** — 現在設定されているメインモデルを優先的に使用して画像認識を行います（マルチモーダルモデルである必要があります）
+2. **その他の設定済みモデル** — API Key が設定済みのその他のマルチモーダルモデルを自動的に検出して候補とします
 
-`use_linkai=true` の場合、LinkAI が最優先になります。
-
-現在のプロバイダーが失敗した場合、成功するかすべて失敗するまで自動的に次のプロバイダーを試行します。
+現在のプロバイダーで呼び出しに失敗した場合、成功するかすべて失敗するまで自動的に次のプロバイダーを試行します。
 
 ### 対応モデル
 
-| ベンダー | ビジョンモデル | 説明 |
+| プロバイダー | ビジョンモデル | 説明 |
 | --- | --- | --- |
-| OpenAI / 互換プロトコル | メインモデル | すべての OpenAI 互換マルチモーダルモデルに対応 |
-| Baidu Qianfan | メインモデル | 多モーダルの主モデル（`ernie-5.1` など）は直接画像を処理。テキスト専用主モデルの場合は `ernie-4.5-turbo-vl` に自動フォールバック |
-| 通義千問 (DashScope) | メインモデル | MultiModalConversation API 経由 |
-| Claude | メインモデル | Anthropic ネイティブ画像形式 |
-| Gemini | メインモデル | inlineData 形式 |
-| 豆包 (Doubao) | メインモデル | doubao-seed-2-0 シリーズがネイティブ対応 |
-| Kimi (Moonshot) | メインモデル | kimi-k2.6、kimi-k2.5 がネイティブ対応 |
+| OpenAI / 互換プロトコル | メインモデルを使用 | すべての OpenAI 互換マルチモーダルモデルに対応 |
+| 通義千問 (DashScope) | メインモデルを使用 | qwen3.6-plus など |
+| Claude | メインモデルを使用 | Anthropic ネイティブ画像形式 |
+| Gemini | メインモデルを使用 | inlineData 形式 |
+| 豆包 (Doubao) | メインモデルを使用 | doubao-seed-2-0 シリーズがネイティブ対応 |
+| Kimi (Moonshot) | メインモデルを使用 | kimi-k2.6、kimi-k2.5 がネイティブ対応 |
+| 百度 Qianfan | メインモデルを使用 | デフォルトでマルチモーダルメインモデル（`ernie-5.1` など）を使用。メインモデルが非対応の場合は `ernie-4.5-turbo-vl` にフォールバック |
 | 智谱 AI | glm-5v-turbo | 常にビジョン専用モデルを使用 |
 | MiniMax | MiniMax-Text-01 | 常にビジョン専用モデルを使用 |
 
 <Note>
-  智谱 AI と MiniMax のテキストモデルは画像理解に対応していないため、対応するビジョン専用モデルが自動的に使用されます。
+  智谱と MiniMax のテキストモデルは画像理解に対応していないため、常に対応するビジョン専用モデルが使用されます。手動で指定する必要はありません。
 </Note>
 
+> `use_linkai=true` の場合、デフォルトで LinkAI のマルチモーダルモデルが使用されます。
+
+## カスタム設定
+
+Vision で使用するモデルを指定したい場合は、`config.json` に以下のように設定できます：
+
+```json
+{
+    "tools": {
+        "vision": {
+            "model": "gpt-4.1"
+        }
+    }
+}
+```
+
+指定したモデルが**優先的に使用**され、ツールはモデル名に応じて対応するプロバイダーへ自動ルーティングします。呼び出しに失敗した場合は、他の設定済みプロバイダーへ自動的にフォールバックします。
+
+ほとんどの場合、設定は不要です。メインモデルがマルチモーダルに対応しているか、ビジョン対応の API Key が 1 つでも設定されていれば自動的に動作します。
+
 ## パラメータ
 
 | パラメータ | 型 | 必須 | 説明 |
@@ -45,29 +61,15 @@ Vision ツールは多段階の自動選択＋自動フォールバック戦略
 
 対応画像形式：jpg、jpeg、png、gif、webp
 
-## カスタム設定
 
-Vision ツールで使用するモデルを指定するには、`config.json` に以下を追加します：
-
-```json
-{
-    "tool": {
-        "vision": {
-            "model": "ernie-4.5-turbo-vl"
-        }
-    }
-}
-```
-
-ほとんどの場合、設定は不要です。メインモデルがマルチモーダルに対応しているか、ビジョン対応の API キーが設定されていれば自動的に動作します。
 
 ## ユースケース
 
 - 画像コンテンツの説明
 - 画像からのテキスト抽出（OCR）
-- オブジェクト、色、シーンの識別
-- スクリーンショットやスキャン文書の分析
+- オブジェクト、色、シーンの認識
+- スクリーンショットやスキャン文書などの分析
 
 <Note>
-  1MB を超える画像は自動的に圧縮されます（最大辺 1536px）。すべての画像（リモート URL を含む）は base64 に変換して送信され、すべてのモデルバックエンドとの互換性を確保します。
+  1MB を超える画像は自動的に圧縮してアップロードされます。すべての画像（リモート URL を含む）は base64 に統一変換して送信され、すべてのモデルバックエンドとの互換性を確保します。
 </Note>
diff --git a/docs/ja/tools/web-fetch.mdx b/docs/ja/tools/web-fetch.mdx
new file mode 100644
index 00000000..f509a181
--- /dev/null
+++ b/docs/ja/tools/web-fetch.mdx
@@ -0,0 +1,32 @@
+---
+title: web_fetch - Web 取得
+description: Web ページやドキュメントのコンテンツを取得
+---
+
+HTTP/HTTPS URL の内容を取得します。Web ページからは可読テキストを抽出し、ドキュメントファイル（PDF、Word、Excel など）は自動でダウンロードして解析します。
+
+## パラメータ
+
+| パラメータ | 型 | 必須 | 説明 |
+| --- | --- | --- | --- |
+| `url` | string | はい | HTTP/HTTPS URL（Web ページまたはドキュメント） |
+
+## 対応ファイル形式
+
+| 種別 | 形式 |
+| --- | --- |
+| PDF | `.pdf` |
+| Word | `.docx` |
+| テキスト | `.txt`、`.md`、`.csv`、`.log` |
+| 表計算 | `.xls`、`.xlsx` |
+| プレゼン | `.ppt`、`.pptx` |
+
+## ユースケース
+
+- Web ページの可読テキストを抽出する
+- リモートドキュメントのダウンロードと解析
+- API レスポンスの確認
+
+<Note>
+  `web_fetch` は静的 HTML のみ取得できます。JavaScript レンダリングが必要なページ（SPA など）は `browser` ツールを使用してください。
+</Note>
diff --git a/docs/ja/tools/web-search.mdx b/docs/ja/tools/web-search.mdx
index e5e04d93..dc49a927 100644
--- a/docs/ja/tools/web-search.mdx
+++ b/docs/ja/tools/web-search.mdx
@@ -1,32 +1,51 @@
 ---
-title: web_search - Web検索
-description: インターネットからリアルタイム情報を検索
+title: web_search - Web 検索
+description: インターネットからリアルタイム情報を検索。複数の検索プロバイダーに対応
 ---
 
-インターネットからリアルタイムの情報、ニュース、リサーチなどを検索します。2つの検索バックエンドに対応し、自動フォールバック機能を備えています。
+インターネットからリアルタイム情報、ニュース、リサーチなどを検索します。Bocha、百度 Qianfan、智谱（Zhipu）、LinkAI の 4 つのバックエンドに対応しており、いずれか 1 社を設定すれば利用可能です。
 
-## 依存関係
+<Tip>
+  [Web コンソール](/ja/channels/web) の「モデル管理 → 検索」パネルから、プロバイダーと戦略を画面上で設定することを推奨します。設定ファイルを手動で編集する必要はありません。
+</Tip>
 
-少なくとも1つの検索APIキーが必要です（`env_config` Toolまたはワークスペースの `.env` ファイルで設定）：
+## プロバイダー
 
-| バックエンド | 環境変数 | 優先度 | 取得方法 |
-| --- | --- | --- | --- |
-| Bocha Search | `BOCHA_API_KEY` | プライマリ | [Bocha Open Platform](https://open.bochaai.com/) |
-| LinkAI Search | `LINKAI_API_KEY` | フォールバック | [LinkAI Console](https://link-ai.tech/console/interface) |
+| プロバイダー | 認証情報 | 申請窓口 |
+| --- | --- | --- |
+| Bocha | `tools.web_search.bocha_api_key` | [Bocha Open Platform](https://open.bochaai.com/) |
+| 百度 Qianfan | `qianfan_api_key` を再利用 | [Qianfan コンソール](https://cloud.baidu.com/doc/qianfan/s/2mh4su4uy) |
+| 智谱 Zhipu | `zhipu_ai_api_key` を再利用 | [Zhipu Open Platform](https://docs.bigmodel.cn/cn/guide/tools/web-search) |
+| LinkAI | `linkai_api_key` を再利用 | [LinkAI コンソール](https://link-ai.tech/console/interface) |
 
-## パラメータ
+Bocha のみ独立した `bocha_api_key` が必要ですが、他の 3 社は対応するモデルの API Key をそのまま再利用するため、モデルを設定すれば検索機能も同時に利用可能になります。
+
+## ルーティング戦略
+
+```json
+{
+  "tools": {
+    "web_search": {
+      "strategy": "auto",
+      "provider": ""
+    }
+  }
+}
+```
+
+- `auto`（デフォルト）：Agent が設定済みのプロバイダーから自動的に選択し、1 回のタスク内で複数回呼び出し、異なるプロバイダーを切り替えてより包括的な結果を取得できます。未指定の場合は `bocha → qianfan → zhipu → linkai` の順でフォールバックします。
+- `fixed`：`provider` で指定したプロバイダーに固定。該当プロバイダーの認証情報が欠けている場合は自動的に auto の順序にフォールバックします。
+
+## ツールパラメータ
 
 | パラメータ | 型 | 必須 | 説明 |
 | --- | --- | --- | --- |
 | `query` | string | はい | 検索キーワード |
-| `count` | integer | いいえ | 結果件数（1-50、デフォルト10） |
-| `freshness` | string | いいえ | 期間指定：`noLimit`、`oneDay`、`oneWeek`、`oneMonth`、`oneYear`、または `2025-01-01..2025-02-01` のような日付範囲 |
-| `summary` | boolean | いいえ | ページ要約を返す（デフォルトfalse） |
-
-## ユースケース
-
-ユーザーが最新情報について質問したり、事実確認やリアルタイムデータが必要な場合、AgentはこのToolを自動的に呼び出します。
+| `count` | integer | いいえ | 返却する結果数（1–50、デフォルト 10） |
+| `freshness` | string | いいえ | 期間指定：`noLimit`（デフォルト）、`oneDay`、`oneWeek`、`oneMonth`、`oneYear`、または `2025-01-01..2025-02-01` のような日付範囲 |
+| `summary` | boolean | いいえ | ページ要約を返すか（デフォルト false） |
+| `provider` | string | いいえ | `auto` 戦略で複数のプロバイダーが設定されている場合にのみ表示されます。その呼び出しに限ってプロバイダーを切り替える際に使用 |
 
 <Note>
-  検索APIキーが設定されていない場合、このToolは読み込まれません。
+  4 社の認証情報がいずれも未設定の場合、このツールは Agent に登録されません。
 </Note>
diff --git a/docs/memory/index.mdx b/docs/memory/index.mdx
index cfcea5a7..c6dc0e65 100644
--- a/docs/memory/index.mdx
+++ b/docs/memory/index.mdx
@@ -27,7 +27,7 @@ Agent 通过以下机制自动将对话内容持久化为长期记忆：
 
 - **上下文裁剪时** — 当对话轮次或 token 超出配置上限时，裁剪最早一半的上下文，使用 LLM 将被裁剪的内容总结为关键信息写入当天记忆文件，并将摘要异步注入到保留的上下文中，帮助模型保持对话连贯性
 - **每日定时总结** — 每天 23:55 自动触发一次全量总结，防止低活跃日无记忆留存（内容无变化时自动跳过）
-- **[梦境蒸馏（Deep Dream）](/memory/deep-dream)** — 每日总结完成后自动执行，将天级记忆蒸馏合并到 MEMORY.md，并生成梦境日记
+- [梦境蒸馏（Deep Dream）](/memory/deep-dream) — 每日总结完成后自动执行，将天级记忆蒸馏合并到 MEMORY.md，并生成梦境日记
 - **API 上下文溢出时** — 当模型 API 返回上下文溢出错误时，紧急保存当前对话摘要
 
 所有记忆写入均在后台异步执行（LLM 总结 + 文件写入），不阻塞正常对话回复。
diff --git a/docs/models/claude.mdx b/docs/models/claude.mdx
index 920f54cd..ee1809d6 100644
--- a/docs/models/claude.mdx
+++ b/docs/models/claude.mdx
@@ -1,17 +1,50 @@
 ---
 title: Claude
-description: Claude 模型配置
+description: Anthropic Claude 模型配置（文本对话 + 图像理解）
 ---
 
+Claude 由 Anthropic 提供，支持文本对话与图像理解，主流 Sonnet / Opus 模型均原生支持视觉，无需额外指定 Vision 模型。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
+
 ```json
 {
-  "model": "claude-sonnet-4-6",
+  "model": "claude-opus-4-8",
   "claude_api_key": "YOUR_API_KEY"
 }
 ```
 
 | 参数 | 说明 |
 | --- | --- |
-| `model` | 支持 `claude-sonnet-4-6`、`claude-opus-4-7`、`claude-opus-4-6`、`claude-sonnet-4-5`、`claude-sonnet-4-0`、`claude-3-5-sonnet-latest` 等，参考 [官方模型](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
+| `model` | 支持 `claude-opus-4-8`、`claude-opus-4-7`、`claude-sonnet-4-6`、`claude-opus-4-6`、`claude-sonnet-4-5`、`claude-sonnet-4-0`、`claude-3-5-sonnet-latest` 等，参考 [官方模型](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
 | `claude_api_key` | 在 [Claude 控制台](https://console.anthropic.com/settings/keys) 创建 |
-| `claude_api_base` | 可选，默认为 `https://api.anthropic.com/v1`，修改可接入第三方代理 |
+| `claude_api_base` | 可选，默认为 `https://api.anthropic.com/v1`，可改为第三方代理 |
+
+### 模型选择
+
+| 模型 | 适用场景 |
+| --- | --- |
+| `claude-opus-4-8` | 默认推荐，最新旗舰，复杂推理与长链路任务效果最佳 |
+| `claude-opus-4-7` | 上一代 Opus 旗舰 |
+| `claude-sonnet-4-6` | 性价比与速度平衡，成本更低 |
+| `claude-opus-4-6` / `claude-sonnet-4-5` / `claude-sonnet-4-0` | 更早一代的模型，价格更低 |
+
+## 图像理解
+
+配置 `claude_api_key` 后 Agent 的 Vision 工具会自动使用 Claude 主模型识别图像，无需额外配置。
+
+如需手动指定 Vision 模型，可在配置文件中显式配置：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "claude-sonnet-4-6"
+    }
+  }
+}
+```
diff --git a/docs/models/custom.mdx b/docs/models/custom.mdx
index 907dbac3..2673a8de 100644
--- a/docs/models/custom.mdx
+++ b/docs/models/custom.mdx
@@ -13,7 +13,7 @@ description: 自定义厂商配置，适用于第三方 API 代理和本地模
   与 `openai` 厂商的区别：选择自定义厂商后，通过 `/config model` 切换模型时，不会自动切换厂商类型，始终使用自定义的 API 地址。
 </Note>
 
-## 配置方式
+## 文本对话
 
 ### 第三方 API 代理
 
@@ -35,7 +35,7 @@ description: 自定义厂商配置，适用于第三方 API 代理和本地模
 
 ### 本地模型
 
-本地模型通常不需要 API Key，只需填写 API Base 即可：
+本地模型通常不需要 API Key，只需填写 API Base：
 
 ```json
 {
@@ -53,7 +53,7 @@ description: 自定义厂商配置，适用于第三方 API 代理和本地模
 | [vLLM](https://docs.vllm.ai) | `http://localhost:8000/v1` |
 | [LocalAI](https://localai.io) | `http://localhost:8080/v1` |
 
-## 切换模型
+### 切换模型
 
 自定义厂商下切换模型时，只会修改 `model`，不会改变 `bot_type` 和 API 地址：
 
diff --git a/docs/models/deepseek.mdx b/docs/models/deepseek.mdx
index a522ce98..57b96d55 100644
--- a/docs/models/deepseek.mdx
+++ b/docs/models/deepseek.mdx
@@ -1,9 +1,11 @@
 ---
 title: DeepSeek
-description: DeepSeek 模型配置
+description: DeepSeek 模型配置（文本对话 + 思考模式）
 ---
 
-方式一：官方接入（推荐）：
+DeepSeek 是当前 Agent 模式默认推荐的厂商之一，主打高性价比的文本对话和任务规划能力。
+
+## 文本对话
 
 ```json
 {
@@ -18,20 +20,20 @@ description: DeepSeek 模型配置
 | `deepseek_api_key` | 在 [DeepSeek 平台](https://platform.deepseek.com/api_keys) 创建 |
 | `deepseek_api_base` | 可选，默认为 `https://api.deepseek.com/v1`，可修改为第三方代理地址 |
 
-## 模型选择
+### 模型选择
 
 | 模型 | 适用场景 |
 | --- | --- |
 | `deepseek-v4-flash` | 默认推荐，速度快、成本低 |
-| `deepseek-v4-pro` | 更智能、复杂任务效果更强 |
+| `deepseek-v4-pro` | 更智能，复杂任务效果更强 |
 
 ## 思考模式
 
-V4 系列（`deepseek-v4-flash` / `deepseek-v4-pro`）支持显式的"思考模式"：模型在输出最终回答前，先输出一段思维链（`reasoning_content`），从而提升答案质量。
+V4 系列（`deepseek-v4-flash` / `deepseek-v4-pro`）支持显式的「思考模式」：模型在输出最终回答前，先输出一段思维链（`reasoning_content`），从而提升答案质量。
 
 ### 开关
 
-通过全局配置 `enable_thinking` 控制：
+通过全局配置 `enable_thinking` 控制，也可在 Web 控制台 - 配置页面中进行切换：
 
 ```json
 {
@@ -66,16 +68,5 @@ V4 系列（`deepseek-v4-flash` / `deepseek-v4-pro`）支持显式的"思考模
 - **多轮工具调用**：当历史中包含工具调用时，DeepSeek 要求所有 assistant 消息必须回传 `reasoning_content`。CowAgent 会自动处理回传逻辑，跨轮次切换思考开关也不会出错。
 
 <Tip>
-  默认使用 `deepseek-v4-flash`；复杂任务可使用 `deepseek-v4-pro`；需要深度思考可开启 `enable_thinking`。
+  默认使用 `deepseek-v4-flash`；复杂任务可使用 `deepseek-v4-pro`；需要深度推理可开启 `enable_thinking`。
 </Tip>
-
-方式二：OpenAI 兼容方式接入：
-
-```json
-{
-  "model": "deepseek-v4-flash",
-  "bot_type": "openai",
-  "open_ai_api_key": "YOUR_API_KEY",
-  "open_ai_api_base": "https://api.deepseek.com/v1"
-}
-```
diff --git a/docs/models/doubao.mdx b/docs/models/doubao.mdx
index e7440434..cfdc5670 100644
--- a/docs/models/doubao.mdx
+++ b/docs/models/doubao.mdx
@@ -1,17 +1,66 @@
 ---
 title: 豆包 Doubao
-description: 豆包 (火山方舟) 模型配置
+description: 豆包（火山方舟）模型配置（文本 / 图像理解 / 图像生成 / 向量）
 ---
 
+豆包（火山方舟）支持文本对话、图像理解、图像生成（Seedream）和向量能力，一份 `ark_api_key` 即可启用全部能力。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
+
 ```json
 {
-  "model": "doubao-seed-2-0-code-preview-260215",
+  "model": "doubao-seed-2-0-pro-260215",
   "ark_api_key": "YOUR_API_KEY"
 }
 ```
 
 | 参数 | 说明 |
 | --- | --- |
-| `model` | 可填 `doubao-seed-2-0-code-preview-260215`、`doubao-seed-2-0-pro-260215`、`doubao-seed-2-0-lite-260215` 等 |
+| `model` | 可填 `doubao-seed-2-0-pro-260215`、`doubao-seed-2-0-code-preview-260215`、`doubao-seed-2-0-lite-260215` 等 |
 | `ark_api_key` | 在 [火山方舟控制台](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) 创建 |
 | `ark_base_url` | 可选，默认为 `https://ark.cn-beijing.volces.com/api/v3` |
+
+## 图像理解
+
+配置 `ark_api_key` 后 Agent 的 Vision 工具会自动使用 `doubao-seed-2-0-pro-260215` 识别图像，无需额外配置。
+
+如需手动指定 Vision 模型：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "doubao-seed-2-0-pro-260215"
+    }
+  }
+}
+```
+
+## 图像生成
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "seedream-5.0-lite"
+    }
+  }
+}
+```
+
+可选模型：`seedream-5.0-lite`、`seedream-4.5`。
+
+## 向量
+
+```json
+{
+  "embedding_provider": "doubao",
+  "embedding_model": "doubao-embedding-vision-251215"
+}
+```
+
+默认模型 `doubao-embedding-vision-251215`（多模态 embedding），可在配置文件中通过 `embedding_dimensions` 指定 1024 或 2048 维。修改 embedding 后需执行 `/memory rebuild-index` 命令重建索引。
diff --git a/docs/models/gemini.mdx b/docs/models/gemini.mdx
index 220e53a2..f1c8991a 100644
--- a/docs/models/gemini.mdx
+++ b/docs/models/gemini.mdx
@@ -1,16 +1,59 @@
 ---
 title: Gemini
-description: Google Gemini 模型配置
+description: Google Gemini 模型配置（文本对话 + 图像理解 + 图像生成）
 ---
 
+Google Gemini 支持文本对话、图像理解和图像生成（Nano Banana 系列），一个 `gemini_api_key` 即可启用全部能力。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
+
 ```json
 {
-  "model": "gemini-3.1-pro-preview",
+  "model": "gemini-3.5-flash",
   "gemini_api_key": "YOUR_API_KEY"
 }
 ```
 
 | 参数 | 说明 |
 | --- | --- |
-| `model` | 支持 `gemini-3.1-flash-lite-preview`、`gemini-3.1-pro-preview`、`gemini-3-flash-preview`、`gemini-3-pro-preview` 等，参考 [官方文档](https://ai.google.dev/gemini-api/docs/models) |
+| `model` | 推荐 `gemini-3.5-flash`，亦支持 `gemini-3.1-pro-preview`、`gemini-3.1-flash-lite-preview`、`gemini-3-flash-preview`、`gemini-3-pro-preview` 等，参考 [官方文档](https://ai.google.dev/gemini-api/docs/models) |
 | `gemini_api_key` | 在 [Google AI Studio](https://aistudio.google.com/app/apikey) 创建 |
+| `gemini_api_base` | 可选，默认为 `https://generativelanguage.googleapis.com`，可改为第三方代理 |
+
+## 图像理解
+
+Gemini 全系列模型均原生支持视觉，配置 `gemini_api_key` 后 Agent 的 Vision 工具会自动使用主模型识别图像，无需额外配置。
+
+如需手动指定 Vision 模型：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gemini-3.1-flash-lite-preview"
+    }
+  }
+}
+```
+
+## 图像生成
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gemini-3.1-flash-image-preview"
+    }
+  }
+}
+```
+
+| 模型 ID | 别名 |
+| --- | --- |
+| `gemini-3.1-flash-image-preview` | Nano Banana 2 |
+| `gemini-3-pro-image-preview` | Nano Banana Pro |
+| `gemini-2.5-flash-image` | Nano Banana |
diff --git a/docs/models/glm.mdx b/docs/models/glm.mdx
index f667efdf..ad5f8fd3 100644
--- a/docs/models/glm.mdx
+++ b/docs/models/glm.mdx
@@ -1,8 +1,16 @@
 ---
 title: 智谱 GLM
-description: 智谱AI GLM 模型配置
+description: 智谱 AI GLM 模型配置（文本 / 图像理解 / 语音识别 / 向量）
 ---
 
+智谱 AI 支持文本对话、图像理解、语音识别（ASR）和向量（Embedding），一份 `zhipu_ai_api_key` 即可启用全部能力。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
+
 ```json
 {
   "model": "glm-5.1",
@@ -13,15 +21,36 @@ description: 智谱AI GLM 模型配置
 | 参数 | 说明 |
 | --- | --- |
 | `model` | 可填 `glm-5.1`、`glm-5-turbo`、`glm-5`、`glm-4.7`、`glm-4-plus`、`glm-4-flash`、`glm-4-air` 等，参考 [模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4) |
-| `zhipu_ai_api_key` | 在 [智谱AI 控制台](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) 创建 |
+| `zhipu_ai_api_key` | 在 [智谱 AI 控制台](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) 创建 |
+| `zhipu_ai_api_base` | 可选，默认为 `https://open.bigmodel.cn/api/paas/v4` |
 
-也支持 OpenAI 兼容方式接入：
+## 图像理解
+
+智谱 chat 系列模型（`glm-5.1`、`glm-5-turbo` 等）不支持视觉，视觉调用统一路由到 `glm-5v-turbo`。配置 `zhipu_ai_api_key` 后 Agent 的 Vision 工具会自动使用该模型，无需在配置文件中显式指定。
+
+## 语音识别
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "glm-5.1",
-  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "voice_to_text": "zhipu",
+  "voice_to_text_model": "glm-asr-2512"
 }
 ```
+
+| 参数 | 说明 |
+| --- | --- |
+| `voice_to_text` | 设为 `zhipu` 启用智谱 ASR |
+| `voice_to_text_model` | 可选，默认 `glm-asr-2512` |
+
+凭证自动复用 `zhipu_ai_api_key`。语音文件建议小于 25MB，超大文件可能被服务端拒绝。
+
+## 向量
+
+```json
+{
+  "embedding_provider": "zhipu",
+  "embedding_model": "embedding-3"
+}
+```
+
+可选模型：`embedding-3`、`embedding-2`。修改 embedding 后需执行 `/memory rebuild-index` 命令重建索引。
diff --git a/docs/models/index.mdx b/docs/models/index.mdx
index afe9798e..02402b6a 100644
--- a/docs/models/index.mdx
+++ b/docs/models/index.mdx
@@ -1,67 +1,40 @@
 ---
 title: 模型概览
-description: CowAgent 支持的模型及推荐选择
+description: CowAgent 支持的模型厂商及能力矩阵
 ---
 
-CowAgent 支持国内外主流厂商的大语言模型，模型接口实现在项目的 `models/` 目录下。
+CowAgent 支持国内外主流厂商的大语言模型，模型接口实现在项目的 `models/` 目录下。除文本对话外，部分厂商还提供视觉理解、图像生成、语音识别、语音合成、向量等能力，可在 Agent 流程中按需调用。
 
-<Note>
-  Agent 模式下推荐使用以下模型，可根据效果及成本综合选择：deepseek-v4-flash、MiniMax-M2.7、claude-sonnet-4-6、gemini-3.1-pro-preview、glm-5.1、qwen3.6-plus、kimi-k2.6、ernie-5.1
 
-  同时支持使用 [LinkAI](https://link-ai.tech) 平台接口，可灵活切换多种模型，并支持知识库、工作流、插件等 Agent 能力。
-</Note>
+## 模型能力总览
+
+各厂商提供的能力一览。「文本」指主对话模型，其余列表示该厂商可承担对应 Agent 能力。
+
+| 厂商 | 代表模型 | 文本 | 图像理解 | 图像生成 | 语音识别 | 语音合成 | 向量 |
+| --- | --- | :-: | :-: | :-: | :-: | :-: | :-: |
+| [DeepSeek](/models/deepseek) | deepseek-v4-flash / pro | ✅ | | | | | |
+| [MiniMax](/models/minimax) | MiniMax-M2.7 | ✅ | ✅ | ✅ | | ✅ | |
+| [Claude](/models/claude) | claude-opus-4-8 | ✅ | ✅ | | | | |
+| [Gemini](/models/gemini) | gemini-3.5-flash | ✅ | ✅ | ✅ | | | |
+| [OpenAI](/models/openai) | gpt-5.5、o 系列 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [智谱 GLM](/models/glm) | glm-5.1、glm-5v-turbo | ✅ | ✅ | | ✅ | | ✅ |
+| [通义千问](/models/qwen) | qwen3.7-max | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [豆包 Doubao](/models/doubao) | doubao-seed-2.0 系列 | ✅ | ✅ | ✅ | | | ✅ |
+| [Kimi](/models/kimi) | kimi-k2.6 | ✅ | ✅ | | | | |
+| [百度千帆](/models/qianfan) | ernie-5.1 | ✅ | ✅ | | | | |
+| [小米 MiMo](/models/mimo) | mimo-v2.5-pro / v2.5 | ✅ | ✅ | | | ✅ | |
+| [LinkAI](/models/linkai) | 多厂商 100+ 模型统一接入 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [自定义](/models/custom) | 本地模型 / 三方代理 | ✅ | | | | | |
+
+<Tip>
+  Web 控制台中各项能力（视觉 / 图像 / 语音识别 / 语音合成 / 向量 / 网络搜索）均可独立配置厂商与模型，互相之间不强制绑定。
+</Tip>
+
 
 ## 配置方式
 
-**方式一（推荐）：** 通过 [Web 控制台](/channels/web) 在线管理模型配置，无需手动编辑配置文件：
+**方式一（推荐）：** 通过 [Web 控制台](/channels/web) 在线管理模型与各项能力，无需手动编辑配置文件：
 
-<img width="850" src="https://cdn.link-ai.tech/doc/20260227173811.png" />
+<img width="900" src="https://cdn.link-ai.tech/doc/20260521212527.png" />
 
 **方式二：** 手动编辑 `config.json`，根据所选模型填写对应的模型名称和 API Key。每个模型也支持 OpenAI 兼容方式接入，将 `bot_type` 设为 `openai`，配置 `open_ai_api_base` 和 `open_ai_api_key` 即可。
-
-
-## 支持的模型
-
-<CardGroup cols={2}>
-  <Card title="DeepSeek" href="/models/deepseek">
-    deepseek-v4-flash、deepseek-v4-pro 等
-  </Card>
-  <Card title="百度千帆 / ERNIE" href="/models/qianfan">
-    ernie-5.1、ernie-5.0、ernie-4.5-turbo-128k 等
-  </Card>
-  <Card title="MiniMax" href="/models/minimax">
-    MiniMax-M2.7 等系列模型
-  </Card>
-  <Card title="Claude" href="/models/claude">
-    claude-sonnet-4-6 等
-  </Card>
-  <Card title="Gemini" href="/models/gemini">
-    gemini-3.1-pro-preview 等
-  </Card>
-  <Card title="OpenAI" href="/models/openai">
-    gpt-5.4、gpt-4.1、o 系列等
-  </Card>
-  <Card title="智谱 GLM" href="/models/glm">
-    glm-5.1、glm-5-turbo、glm-5 等系列模型
-  </Card>
-  <Card title="通义千问 Qwen" href="/models/qwen">
-    qwen3.6-plus、qwen3-max 等
-  </Card>
-  <Card title="豆包 Doubao" href="/models/doubao">
-    doubao-seed 系列模型
-  </Card>
-  <Card title="Kimi" href="/models/kimi">
-    kimi-k2.6、kimi-k2.5、kimi-k2 等
-  </Card>
-  <Card title="LinkAI" href="/models/linkai">
-    多模型统一接口 + 知识库
-  </Card>
-  <Card title="自定义" href="/models/custom">
-    第三方代理、本地模型等
-  </Card>
-</CardGroup>
-
-
-<Tip>
-  全部模型名称可参考项目 [`common/const.py`](https://github.com/zhayujie/CowAgent/blob/master/common/const.py) 文件。
-</Tip>
diff --git a/docs/models/kimi.mdx b/docs/models/kimi.mdx
index a75cadea..beb5beaf 100644
--- a/docs/models/kimi.mdx
+++ b/docs/models/kimi.mdx
@@ -1,8 +1,16 @@
 ---
 title: Kimi
-description: Kimi (Moonshot) 模型配置
+description: Kimi（Moonshot）模型配置（文本对话 + 图像理解）
 ---
 
+Kimi 由 Moonshot 提供，支持文本对话与图像理解，`kimi-k2.x` 系列原生支持视觉。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
+
 ```json
 {
   "model": "kimi-k2.6",
@@ -14,14 +22,20 @@ description: Kimi (Moonshot) 模型配置
 | --- | --- |
 | `model` | 可填 `kimi-k2.6`、`kimi-k2.5`、`kimi-k2`、`moonshot-v1-8k`、`moonshot-v1-32k`、`moonshot-v1-128k` |
 | `moonshot_api_key` | 在 [Moonshot 控制台](https://platform.moonshot.cn/console/api-keys) 创建 |
+| `moonshot_base_url` | 可选，默认为 `https://api.moonshot.cn/v1` |
 
-也支持 OpenAI 兼容方式接入：
+## 图像理解
+
+配置 `moonshot_api_key` 后 Agent 的 Vision 工具会自动使用 `kimi-k2.6` 识别图像，无需额外配置。
+
+如需手动指定 Vision 模型：
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "kimi-k2.6",
-  "open_ai_api_base": "https://api.moonshot.cn/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "tools": {
+    "vision": {
+      "model": "kimi-k2.6"
+    }
+  }
 }
 ```
diff --git a/docs/models/linkai.mdx b/docs/models/linkai.mdx
index 776bc7c9..68647ebc 100644
--- a/docs/models/linkai.mdx
+++ b/docs/models/linkai.mdx
@@ -1,9 +1,15 @@
 ---
 title: LinkAI
-description: 通过 LinkAI 平台统一接入多种模型
+description: 通过 LinkAI 平台统一接入文本、视觉、图像、语音与向量能力
 ---
 
-通过 [LinkAI](https://link-ai.tech) 平台可灵活切换 OpenAI、Claude、Gemini、DeepSeek、MiniMax、Qwen、Kimi 等多种模型，并支持知识库、工作流、插件等 Agent 能力。
+通过一份 `linkai_api_key` 即可访问 OpenAI、Claude、Gemini、DeepSeek、MiniMax、Qwen、Kimi、豆包等主流厂商的全部能力。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
 
 ```json
 {
@@ -14,8 +20,84 @@ description: 通过 LinkAI 平台统一接入多种模型
 
 | 参数 | 说明 |
 | --- | --- |
-| `use_linkai` | 设为 `true` 启用 LinkAI 接口 |
+| `use_linkai` | 设为 `true` 启用 |
 | `linkai_api_key` | 在 [控制台](https://link-ai.tech/console/interface) 创建 |
-| `model` | 留空则使用智能体默认模型，可在平台中灵活切换，[模型列表](https://link-ai.tech/console/models) 中的全部模型均可使用 |
+| `model` | 可填写 [模型列表](https://link-ai.tech/console/models) 中任意编码 |
 
-参考 [接口文档](https://docs.link-ai.tech/platform/api) 了解更多。
+前往 [模型服务](https://link-ai.tech/console/models) 了解更多。
+
+## 图像理解
+
+配置完成后 Agent 的 Vision 工具会自动调用网关上的多模态模型，无需额外配置。如需手动指定 Vision 模型：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gpt-5.4-mini"
+    }
+  }
+}
+```
+
+可选模型：`gpt-4.1-mini`、`gpt-5.4-mini`、`qwen3.6-plus`、`doubao-seed-2-0-pro-260215`、`kimi-k2.6`、`claude-sonnet-4-6`、`gemini-3.1-flash-lite-preview` 等。
+
+## 图像生成
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gpt-image-2"
+    }
+  }
+}
+```
+
+| 模型 ID | 别名 |
+| --- | --- |
+| `gpt-image-2` | OpenAI |
+| `gemini-3.1-flash-image-preview` | Nano Banana 2 |
+| `gemini-3-pro-image-preview` | Nano Banana Pro |
+| `seedream-5.0-lite` | 字节豆包 Seedream |
+
+## 语音识别
+
+```json
+{
+  "voice_to_text": "linkai"
+}
+```
+
+ASR 固定使用 Whisper，凭证自动复用 `linkai_api_key`。
+
+## 语音合成
+
+语音合成网关下支持多个底层 TTS 引擎，按 `text_to_voice_model` 选择引擎，音色随引擎切换。
+
+```json
+{
+  "text_to_voice": "linkai",
+  "text_to_voice_model": "doubao",
+  "tts_voice_id": "BV001_streaming"
+}
+```
+
+| `text_to_voice_model` | 引擎说明 |
+| --- | --- |
+| `tts-1` | OpenAI · 多语种通用（音色 `alloy` / `nova` / `echo` 等） |
+| `doubao` | 字节豆包 · 中文音色丰富 |
+| `baidu` | 百度 · 中文主播音色 |
+
+不同引擎对应的音色不同，建议在 Web 控制台「模型管理 → 语音合成」中可视化选择。
+
+## 向量
+
+```json
+{
+  "embedding_provider": "linkai",
+  "embedding_model": "text-embedding-3-small"
+}
+```
+
+默认模型 `text-embedding-3-small`（OpenAI 兼容）。修改 embedding 后需执行 `/memory rebuild-index` 命令重建索引。
diff --git a/docs/models/mimo.mdx b/docs/models/mimo.mdx
new file mode 100644
index 00000000..ea445df9
--- /dev/null
+++ b/docs/models/mimo.mdx
@@ -0,0 +1,135 @@
+---
+title: 小米 MiMo
+description: 小米 MiMo 模型配置（文本对话 + 图像理解 + 语音合成）
+---
+
+小米 MiMo 是原生全模态大模型，单 `mimo_api_key` 即可同时启用文本对话、图像理解与语音合成。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
+
+```json
+{
+  "model": "mimo-v2.5-pro",
+  "mimo_api_key": "YOUR_API_KEY",
+  "mimo_api_base": "https://api.xiaomimimo.com/v1"
+}
+```
+
+| 参数 | 说明 |
+| --- | --- |
+| `model` | 默认推荐 `mimo-v2.5-pro`，也可使用 `mimo-v2.5` |
+| `mimo_api_key` | 在 [MiMo 开放平台](https://platform.xiaomimimo.com/console/api-keys) 创建 |
+| `mimo_api_base` | 可选，默认为 `https://api.xiaomimimo.com/v1` |
+
+### 模型选择
+
+| 模型 | 适用场景 |
+| --- | --- |
+| `mimo-v2.5-pro` | 旗舰，原生全模态 + Agent 能力，最高 100 万 tokens 上下文 |
+| `mimo-v2.5` | 综合版，原生全模态（文本 / 图像 / 视频 / 音频） |
+
+## 思考模式
+
+MiMo V2.5 系列默认开启「思考模式」：模型在输出最终回答前会先输出 `reasoning_content`（思维链），提升复杂任务表现。
+
+通过全局配置 `enable_thinking` 控制是否展示（也可在 Web 控制台 - 配置页面切换）：
+
+```json
+{
+  "enable_thinking": true
+}
+```
+
+## 图像理解
+
+配置 `mimo_api_key` 后，Agent 的 Vision 工具可以自动使用 MiMo 视觉模型：
+
+- 当主模型本身是多模态时（`mimo-v2.5-pro` / `mimo-v2.5`），直接由主模型识别图像，无需额外配置
+- 当主模型是其他厂商时，Vision 工具会根据顺序自动 fallback 到 `mimo-v2.5-pro`
+
+如需手动指定 Vision 模型，可在配置文件中显式配置：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "provider": "mimo",
+      "model": "mimo-v2.5-pro"
+    }
+  }
+}
+```
+
+## 语音合成
+
+```json
+{
+  "text_to_voice": "mimo",
+  "text_to_voice_model": "mimo-v2.5-tts",
+  "tts_voice_id": "冰糖"
+}
+```
+
+| 参数 | 说明 |
+| --- | --- |
+| `text_to_voice_model` | 当前仅支持 `mimo-v2.5-tts`（预置音色 + 唱歌模式） |
+| `tts_voice_id` | 预置音色名（中文音色直接使用中文名作为 ID） |
+
+### 预置音色
+
+| 音色 ID | 说明 |
+| --- | --- |
+| `冰糖` | 中文 · 女声（默认） |
+| `茉莉` | 中文 · 女声 |
+| `苏打` | 中文 · 男声 |
+| `白桦` | 中文 · 男声 |
+| `Mia` | 英文 · 女声 |
+| `Chloe` | 英文 · 女声 |
+| `Milo` | 英文 · 男声 |
+| `Dean` | 英文 · 男声 |
+
+也可在 Web 控制台的「模型管理 → 语音合成」下拉框中可视化选择。
+
+### 风格控制
+
+MiMo TTS 支持在合成文本中嵌入 **音频标签** 来控制情绪、语调、方言、角色甚至唱歌。标签需出现在 **最终被合成为语音的文本（即 Agent 回复内容）** 中，整体风格标签写在开头：
+
+```
+(风格)待合成内容
+```
+
+支持半角 `()`、全角 `（）` 或 `[]` 三种括号。常见风格示例：
+
+| 类型 | 示例标签 |
+| --- | --- |
+| 基础情绪 | `开心` `悲伤` `愤怒` `恐惧` `惊讶` `兴奋` `委屈` `平静` `冷漠` |
+| 复合情绪 | `怅然` `欣慰` `无奈` `愧疚` `释然` `忐忑` `动情` |
+| 整体语调 | `温柔` `高冷` `活泼` `严肃` `慵懒` `俏皮` `深沉` `干练` `凌厉` |
+| 音色定位 | `磁性` `醇厚` `清亮` `空灵` `稚嫩` `苍老` `甜美` `沙哑` |
+| 人设腔调 | `夹子音` `御姐音` `正太音` `大叔音` `台湾腔` |
+| 方言 | `东北话` `四川话` `河南话` `粤语` |
+| 角色扮演 | `孙悟空` `林黛玉` |
+| 唱歌 | `唱歌`（等价于 `sing` / `singing`） |
+
+示例：
+
+- (磁性)夜已经深了，城市还在呼吸。
+- (东北话)哎呀妈呀，这天儿也忒冷了吧！
+- (粤语)呢个真係好正啊！
+- (唱歌)原谅我这一生不羁放纵爱自由…
+
+也可以在文本任意位置插入细粒度音频标签来控制呼吸、笑声、停顿等，例如：
+
+```
+（紧张，深呼吸）呼……冷静，冷静。（语速加快）自我介绍我背了五十遍了，应该没问题。
+```
+
+完整标签列表参见 [MiMo 语音合成文档](https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/speech-synthesis-v2.5)。
+
+<Tip>
+  CowAgent 在调用 TTS 时会将 Agent 的回复原文（含 `(...)` 标签）直接送入 MiMo 合成。你可以在人设 / 系统提示词里要求模型「在回复开头用 `(风格)` 标签控制语气」，即可让 IM 渠道（微信 / 飞书 / 钉钉 / 企微）的语音回复带上情绪、方言、唱歌等效果。
+</Tip>
diff --git a/docs/models/minimax.mdx b/docs/models/minimax.mdx
index 299a7064..8282f88b 100644
--- a/docs/models/minimax.mdx
+++ b/docs/models/minimax.mdx
@@ -1,8 +1,16 @@
 ---
 title: MiniMax
-description: MiniMax 模型配置
+description: MiniMax 模型配置（文本 / 图像理解 / 图像生成 / 语音合成）
 ---
 
+MiniMax 支持文本对话、图像理解、图像生成与语音合成，一份 `minimax_api_key` 即可启用全部能力。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
+
 ```json
 {
   "model": "MiniMax-M2.7",
@@ -12,16 +20,52 @@ description: MiniMax 模型配置
 
 | 参数 | 说明 |
 | --- | --- |
-| `model` | 可填 `MiniMax-M2.7`、`MiniMax-M2.5`、`MiniMax-M2.1`、`MiniMax-M2.1-lightning`、`MiniMax-M2` 等 |
+| `model` | 可填 `MiniMax-M2.7`、`MiniMax-M2.7-highspeed`、`MiniMax-M2.5`、`MiniMax-M2.1`、`MiniMax-M2.1-lightning`、`MiniMax-M2` 等 |
 | `minimax_api_key` | 在 [MiniMax 控制台](https://platform.minimaxi.com/user-center/basic-information/interface-key) 创建 |
 
-也支持 OpenAI 兼容方式接入：
+## 图像理解
+
+MiniMax 的 M2.x 系列 chat 模型本身不支持视觉，视觉调用统一路由到 `MiniMax-Text-01`。配置 `minimax_api_key` 后 Agent 的 Vision 工具会自动使用该模型，无需在配置文件中显式指定。
+
+## 图像生成
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "MiniMax-M2.7",
-  "open_ai_api_base": "https://api.minimaxi.com/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "skills": {
+    "image-generation": {
+      "model": "image-01"
+    }
+  }
 }
 ```
+
+可选模型：`image-01`。
+
+## 语音合成
+
+```json
+{
+  "text_to_voice": "minimax",
+  "text_to_voice_model": "speech-2.8-hd",
+  "tts_voice_id": "female-shaonv"
+}
+```
+
+| 参数 | 说明 |
+| --- | --- |
+| `text_to_voice_model` | `speech-2.8-hd`（情绪渲染、自然听感）、`speech-2.8-turbo`（极速）、`speech-2.6-hd`、`speech-2.6-turbo` |
+| `tts_voice_id` | 音色 ID，支持中文 / 粤语 / 英 / 日 / 韩，共 70+ 种 |
+
+常用音色示例：
+
+| 音色 ID | 说明 |
+| --- | --- |
+| `female-shaonv` | 中文 · 少女（女） |
+| `female-yujie` | 中文 · 御姐（女） |
+| `female-tianmei` | 中文 · 甜美女性（女） |
+| `male-qn-jingying` | 中文 · 精英青年（男） |
+| `male-qn-badao` | 中文 · 霸道青年（男） |
+| `Cantonese_GentleLady` | 粤语 · 温柔女声 |
+| `English_Graceful_Lady` | 英文 · Graceful Lady |
+
+完整音色（中文 / 粤语 / 英 / 日 / 韩共 70+ 种）可参考 [系统音色列表](https://platform.minimaxi.com/docs/faq/system-voice-id)，也可在 Web 控制台的「模型管理 → 语音合成」下拉框中可视化选择。
diff --git a/docs/models/openai.mdx b/docs/models/openai.mdx
index c3406aca..aad83c8f 100644
--- a/docs/models/openai.mdx
+++ b/docs/models/openai.mdx
@@ -1,11 +1,20 @@
 ---
 title: OpenAI
-description: OpenAI 模型配置
+description: OpenAI 模型配置（文本 / 视觉 / 图像 / 语音 / 向量）
 ---
 
+OpenAI 是覆盖最完整的厂商，可同时承担文本对话、视觉理解、图像生成、语音识别（ASR）、语音合成（TTS）和向量（Embedding）能力。一份 `open_ai_api_key` 即可让 Agent 用到全部能力。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+
+## 文本对话
+
 ```json
 {
-  "model": "gpt-5.4",
+  "model": "gpt-5.5",
   "open_ai_api_key": "YOUR_API_KEY",
   "open_ai_api_base": "https://api.openai.com/v1"
 }
@@ -13,7 +22,82 @@ description: OpenAI 模型配置
 
 | 参数 | 说明 |
 | --- | --- |
-| `model` | 与 OpenAI 接口的 [model 参数](https://platform.openai.com/docs/models) 一致，支持 o 系列、gpt-5.4、gpt-5.4-mini、gpt-5.4-nano、gpt-5 系列、gpt-4.1 等，Agent 模式推荐使用 `gpt-5.4` |
+| `model` | 与 OpenAI 接口的 [model 参数](https://platform.openai.com/docs/models) 一致，支持 `gpt-5.5`、`gpt-5.4`、`gpt-5.4-mini`、`gpt-5.4-nano`、`gpt-5` 系列、`gpt-4.1`、o 系列等；Agent 模式默认 `gpt-5.5`，追求性价比可改为 `gpt-5.4` |
 | `open_ai_api_key` | 在 [OpenAI 平台](https://platform.openai.com/api-keys) 创建 |
-| `open_ai_api_base` | 可选，修改可接入第三方代理接口 |
-| `bot_type` | 使用 OpenAI 官方模型时无需填写。当通过代理接口使用 Claude 等非 OpenAI 模型时，设为 `openai` |
+| `open_ai_api_base` | 可选，修改可接入第三方代理 |
+| `bot_type` | 使用 OpenAI 官方模型时无需填写；通过 OpenAI 兼容协议接入其他厂商模型时需设为 `openai` |
+
+## 图像理解
+
+`gpt-5.5`、`gpt-5.4`、`gpt-4o`、`gpt-4.1` 等 OpenAI 模型均原生支持视觉，配置 `open_ai_api_key` 后 Agent 的 Vision 工具会自动使用主模型识别图像。若主模型不支持视觉或希望显式指定，可在配置文件中配置：
+
+```json
+{
+  "tools": {
+    "vision": {
+      "model": "gpt-5.4-mini"
+    }
+  }
+}
+```
+
+支持的 Vision 模型：`gpt-5.5`、`gpt-5.4`、`gpt-5.4-mini`、`gpt-5.4-nano`、`gpt-5`、`gpt-4.1`、`gpt-4.1-mini`、`gpt-4o`。
+
+## 图像生成
+
+在配置文件中指定图像生成模型，Agent 调用图像生成技能时会自动路由到 OpenAI：
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "gpt-image-2"
+    }
+  }
+}
+```
+
+支持的图像生成模型：`gpt-image-2`、`gpt-image-1`。
+
+## 语音识别
+
+```json
+{
+  "voice_to_text": "openai",
+  "voice_to_text_model": "gpt-4o-mini-transcribe"
+}
+```
+
+| 参数 | 说明 |
+| --- | --- |
+| `voice_to_text` | 设为 `openai` 启用 OpenAI 语音识别 |
+| `voice_to_text_model` | 可选，默认 `gpt-4o-mini-transcribe`；也可填 `gpt-4o-transcribe`、`whisper-1` |
+
+凭证自动复用 `open_ai_api_key`。
+
+## 语音合成
+
+```json
+{
+  "text_to_voice": "openai",
+  "text_to_voice_model": "tts-1",
+  "tts_voice_id": "alloy"
+}
+```
+
+| 参数 | 说明 |
+| --- | --- |
+| `text_to_voice_model` | `tts-1`、`tts-1-hd`、`gpt-4o-mini-tts` |
+| `tts_voice_id` | 音色：`alloy`、`echo`、`fable`、`onyx`、`nova`、`shimmer`、`ash`、`ballad`、`coral`、`sage`、`verse` |
+
+## 向量
+
+```json
+{
+  "embedding_provider": "openai",
+  "embedding_model": "text-embedding-3-small"
+}
+```
+
+可选模型：`text-embedding-3-small`、`text-embedding-3-large`、`text-embedding-ada-002`。修改 embedding 后需执行 `/memory rebuild-index` 命令重建索引。
+
diff --git a/docs/models/qianfan.mdx b/docs/models/qianfan.mdx
index 819713e0..bdd87214 100644
--- a/docs/models/qianfan.mdx
+++ b/docs/models/qianfan.mdx
@@ -1,14 +1,20 @@
 ---
 title: 百度千帆
-description: 百度千帆 ERNIE 模型配置
+description: 百度千帆 ERNIE 模型配置（文本对话 + 图像理解）
 ---
 
-方式一：官方接入（推荐）：
+百度千帆提供 ERNIE 系列模型，支持文本对话与图像理解。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
 
 ```json
 {
   "model": "ernie-5.1",
-  "qianfan_api_key": "",
+  "qianfan_api_key": "YOUR_API_KEY",
   "qianfan_api_base": "https://qianfan.baidubce.com/v2"
 }
 ```
@@ -19,7 +25,7 @@ description: 百度千帆 ERNIE 模型配置
 | `qianfan_api_key` | 千帆 API Key，格式通常以 `bce-v3/` 开头 |
 | `qianfan_api_base` | 可选，默认为 `https://qianfan.baidubce.com/v2` |
 
-## 模型选择
+### 模型选择
 
 | 模型 | 适用场景 |
 | --- | --- |
@@ -29,18 +35,18 @@ description: 百度千帆 ERNIE 模型配置
 | `ernie-4.5-turbo-128k` | 长上下文和通用对话 |
 | `ernie-4.5-turbo-32k` | 通用对话，成本和上下文更均衡 |
 
-## Vision 工具
+## 图像理解
 
 配置 `qianfan_api_key` 后，Agent 的 Vision 工具可以自动使用千帆视觉模型：
 
 - 当主模型本身是多模态时（如 `ernie-5.1`、`ernie-5.0`、`ernie-x1.1`、`ernie-4.5-turbo-vl`），直接由主模型识别图像，无需额外配置
 - 当主模型是纯文本时（如 `ernie-4.5-turbo-128k`），Vision 工具会自动 fallback 到 `ernie-4.5-turbo-vl`
 
-如需手动指定 Vision 模型，可在 `config.json` 中显式配置：
+如需手动指定 Vision 模型，可在配置文件中显式配置：
 
 ```json
 {
-  "tool": {
+  "tools": {
     "vision": {
       "model": "ernie-4.5-turbo-vl"
     }
@@ -48,17 +54,6 @@ description: 百度千帆 ERNIE 模型配置
 }
 ```
 
-方式二：OpenAI 兼容方式接入：
-
-```json
-{
-  "model": "ernie-5.1",
-  "bot_type": "openai",
-  "open_ai_api_key": "",
-  "open_ai_api_base": "https://qianfan.baidubce.com/v2"
-}
-```
-
 <Tip>
   新配置推荐使用 `qianfan_api_key`。旧的 `wenxin`、`wenxin-4`、`baidu_wenxin_api_key`、`baidu_wenxin_secret_key` 配置仍保持兼容。
 </Tip>
diff --git a/docs/models/qwen.mdx b/docs/models/qwen.mdx
index 2bc6517d..765bae64 100644
--- a/docs/models/qwen.mdx
+++ b/docs/models/qwen.mdx
@@ -1,8 +1,16 @@
 ---
 title: 通义千问 Qwen
-description: 通义千问模型配置
+description: 通义千问模型配置（文本 / 图像理解 / 图像生成 / 语音识别 / 语音合成 / 向量）
 ---
 
+通义千问（DashScope / 百炼）是国内覆盖最完整的厂商之一，文本、图像理解、图像生成、语音识别、语音合成与向量能力均可用一份 `dashscope_api_key` 启用。
+
+<Tip>
+  通过 Web 控制台的「模型管理」页面可一站式配置以下全部能力，无需手动改配置文件。
+</Tip>
+
+## 文本对话
+
 ```json
 {
   "model": "qwen3.6-plus",
@@ -12,16 +20,93 @@ description: 通义千问模型配置
 
 | 参数 | 说明 |
 | --- | --- |
-| `model` | 可填 `qwen3.6-plus`、`qwen3.5-plus`、`qwen3-max`、`qwen-max`、`qwen-plus`、`qwen-turbo`、`qwq-plus` 等 |
+| `model` | 可填 `qwen3.6-plus`、`qwen3.7-max`、`qwen3.5-plus`、`qwen3-max`、`qwen-max`、`qwen-plus`、`qwen-turbo`、`qwq-plus` 等 |
 | `dashscope_api_key` | 在 [百炼控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建，参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) |
 
-也支持 OpenAI 兼容方式接入：
+## 图像理解
+
+配置 `dashscope_api_key` 后 Agent 的 Vision 工具会自动调用千问的视觉模型识别图像。`qwen3-max` / `qwen3.5-plus` / `qwen3.6-plus` 等模型本身就是多模态；若主模型是纯文本（如 `qwen-turbo`），会自动回落到 `qwen-vl-max`。
+
+如需手动指定 Vision 模型：
 
 ```json
 {
-  "bot_type": "openai",
-  "model": "qwen3.6-plus",
-  "open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "tools": {
+    "vision": {
+      "model": "qwen3.6-plus"
+    }
+  }
 }
 ```
+
+支持模型：`qwen3.6-plus`、`qwen3.5-plus`、`qwen3-max`。
+
+## 图像生成
+
+```json
+{
+  "skills": {
+    "image-generation": {
+      "model": "qwen-image-2.0"
+    }
+  }
+}
+```
+
+可选模型：`qwen-image-2.0`、`qwen-image-2.0-pro`。
+
+## 语音识别
+
+```json
+{
+  "voice_to_text": "dashscope",
+  "voice_to_text_model": "qwen3-asr-flash"
+}
+```
+
+| 参数 | 说明 |
+| --- | --- |
+| `voice_to_text` | 设为 `dashscope` 启用通义千问 ASR |
+| `voice_to_text_model` | 可选，默认 `qwen3-asr-flash` |
+
+凭证自动复用 `dashscope_api_key`。单段音频建议小于 10MB、时长不超过 300 秒。
+
+## 语音合成
+
+```json
+{
+  "text_to_voice": "dashscope",
+  "text_to_voice_model": "qwen3-tts-flash",
+  "tts_voice_id": "Cherry"
+}
+```
+
+| 参数 | 说明 |
+| --- | --- |
+| `text_to_voice_model` | 可选，默认 `qwen3-tts-flash`，覆盖普通话、方言与主流外语 |
+| `tts_voice_id` | 音色 ID，详见下方常用列表 |
+
+常用音色示例：
+
+| 音色 ID | 说明 |
+| --- | --- |
+| `Cherry` | 芊悦 · 阳光女声 |
+| `Serena` | 苏瑶 · 温柔女声 |
+| `Ethan` | 晨煦 · 阳光男声 |
+| `Chelsie` | 千雪 · 二次元少女 |
+| `Dylan` | 北京话 · 晓东 |
+| `Rocky` | 粤语 · 阿强 |
+| `Sunny` | 四川话 · 晴儿 |
+
+完整音色（普通话 / 各地方言 / 双语等）可在 Web 控制台的「模型管理 → 语音合成」下拉框中可视化选择。
+
+## 向量
+
+```json
+{
+  "embedding_provider": "dashscope",
+  "embedding_model": "text-embedding-v4"
+}
+```
+
+默认模型 `text-embedding-v4`。修改 embedding 后需执行 `/memory rebuild-index` 命令重建索引。
diff --git a/docs/releases/overview.mdx b/docs/releases/overview.mdx
index 6f685799..020265e6 100644
--- a/docs/releases/overview.mdx
+++ b/docs/releases/overview.mdx
@@ -5,6 +5,7 @@ description: CowAgent 版本更新历史
 
 | 版本 | 日期 | 说明 |
 | --- | --- | --- |
+| [2.0.9](/releases/v2.0.9) | 2026.05.22 | 新增模型管理、MCP 协议支持、浏览器登录态持久化、新模型接入（gpt-5.5、gemini-3.5-flash、qwen3.7-max 等）、部署安全加固 |
 | [2.0.8](/releases/v2.0.8) | 2026.05.06 | 飞书渠道全面升级（语音、流式输出和Markdown、扫码一键接入）、DeepSeek V4和百度模型新增、定时任务工具增强 |
 | [2.0.7](/releases/v2.0.7) | 2026.04.22 | 图像生成技能（六厂商自动路由）、新模型支持（Kimi K2.6、Claude Opus 4.7、GLM 5.1）、知识库增强、Web 控制台优化 |
 | [2.0.6](/releases/v2.0.6) | 2026.04.14 | 项目更名、知识库系统、梦境记忆蒸馏、上下文智能压缩、Web 控制台多会话及多项优化 |
diff --git a/docs/releases/v2.0.7.mdx b/docs/releases/v2.0.7.mdx
index d9e2275d..b4b6e27b 100644
--- a/docs/releases/v2.0.7.mdx
+++ b/docs/releases/v2.0.7.mdx
@@ -11,7 +11,7 @@ description: CowAgent 2.0.7 - 图像生成技能（六厂商自动路由）、
 - **开箱即用**：配置 API Key 即可使用，无需手动指定模型。也支持在对话中指定特定模型
 - **灵活控制**：支持 `quality`（画质）、`size`（分辨率，512/1K~4K）、`aspect_ratio`（宽高比）等参数，各厂商自动适配有效值
 - **图片编辑**：传入已有图片即可进行编辑、风格迁移、多图融合
-- **Skill 级配置**：支持通过 `config.json` 中的 `skill.image-generation.model` 固定默认模型
+- **Skill 级配置**：支持通过 `config.json` 中的 `skills.image-generation.model` 固定默认模型
 
 相关文档：[图像生成技能](https://docs.cowagent.ai/skills/image-generation)
 
diff --git a/docs/releases/v2.0.8.mdx b/docs/releases/v2.0.8.mdx
index ccb72827..ced1b967 100644
--- a/docs/releases/v2.0.8.mdx
+++ b/docs/releases/v2.0.8.mdx
@@ -46,7 +46,7 @@ description: CowAgent 2.0.8 - 飞书渠道全面升级（语音、流式打字
 
 ## 🔧 工具与安全
 
-- **图像识别模型**：让 `tool.vision.model` 配置真正生效，未配置时自动 fallback #2792 Thanks CNXudiandian
+- **图像识别模型**：让 `tools.vision.model` 配置真正生效，未配置时自动 fallback #2792 Thanks CNXudiandian
 - **Bash 安全确认**：仅对工作区外的破坏性删除做二次确认，工作区内常规操作不再打扰
 
 ## 🐛 其他修复
diff --git a/docs/releases/v2.0.9.mdx b/docs/releases/v2.0.9.mdx
new file mode 100644
index 00000000..957e0ced
--- /dev/null
+++ b/docs/releases/v2.0.9.mdx
@@ -0,0 +1,65 @@
+---
+title: v2.0.9
+description: CowAgent 2.0.9 - 新增模型管理、MCP 协议支持、浏览器登录态持久化、新模型接入
+---
+
+## 🖥️ 新增模型管理
+
+Web 控制台新增「模型管理」页面，按 **模型厂商 + 模型能力** 进行管理，支持对话、图像、语音、向量模型和搜索能力的配置：
+
+- **多厂商配置**：所有厂商的 API Key / API Base 在顶部统一维护，下方所有能力立即生效，无需重复填写
+- **图像模型**：图像理解与图像生成均可独立选择厂商和模型，未指定时跟随主模型自动选择
+- **语音模型**：语音识别和合成可独立配置，新增千问、智谱 ASR/TTS 模型
+- **向量模型**：支持配置 Embedding 模型（用于记忆及知识库检索），新增支持 OpenAI、通义、豆包、智谱等；切换模型后需执行 `/memory rebuild-index` 在线重建索引
+- **搜索能力**：联网搜索能力升级，支持博查、百度、智谱等多个厂商，自动模式下 Agent 可综合多来源搜索结果进行深度研究
+
+相关文档：[模型概览](https://docs.cowagent.ai/models)
+
+<img width="720" alt="20260522113305" src="https://cdn.link-ai.tech/doc/20260522113305.png" />
+
+
+## 🧩 MCP 协议支持
+
+支持 **MCP（Model Context Protocol）** 协议，从固定工具集扩展为开放可插拔的工具生态，任何兼容 MCP 协议的服务均可作为工具直接接入 Agent。
+
+- 原生 JSON-RPC 实现，零额外依赖，同时支持 `stdio` 和 `sse` 两种传输
+- 兼容 Claude Desktop / Cursor 等主流风格的 `mcpServers` 配置，优先读取 `~/cow/mcp.json`
+
+相关文档：[MCP 工具](https://docs.cowagent.ai/tools/mcp)。Thanks @yangluxin613 (#2801)
+
+## 🌐 浏览器登录态持久化
+
+针对需要登录、有反爬机制的网站，浏览器工具支持登录一次后长期复用登录态，并允许接入用户自己的真实 Chrome 以通过指纹检测：
+
+- **持久化用户配置（默认）**：默认使用 `~/.cow/browser_profile` 作为浏览器用户目录，登录一次后下次自动复用登录态
+- **CDP 模式**：通过 `tools.browser.cdp_endpoint` 接管真实 Chrome 浏览器，享有完整浏览器权限
+
+相关文档：[浏览器工具](https://docs.cowagent.ai/tools/browser)。Thanks @leafmove (#2809)
+
+## 🤖 模型新增与优化
+
+- **模型新增**：`gpt-5.5`、`gemini-3.5-flash`、`qwen3.7-max`、`ernie-5.1`
+- **模型优化**：DeepSeek V4 支持 `reasoning_effort` 思考深度参数；修复 MiMo 等思考模型通过 OpenAI 兼容协议接入的问题
+
+## 🔒 部署与安全
+
+- **默认本机访问**：Web 控制台 `web_host` 配置默认绑定 `127.0.0.1`，服务器部署时可手动设置为 `0.0.0.0` 并设置密码。Thanks @August829、@yidaozhongqing、@YLChen-007、@icysun
+- **前端资源完全本地化**：第三方 CSS / JS 全部本地分发，离线 / 内网环境也能正常加载控制台。Thanks @gitlayzer (#2816)
+
+## 🛠 体验优化与修复
+
+- **TTS 适配更多通道**：Web 对话、个人微信、飞书、钉钉、企微智能机器人均已支持回复语音，详情查看 [通道概览](https://docs.cowagent.ai/channels)
+- **日志面板增强**：根据日志等级差异化高亮展示、支持根据等级筛选。Thanks @yangluxin613 (#2807)
+- **Web 控制台自动启动**：程序启动后自动打开 Web 控制台。Thanks @yangluxin613 (#2804)
+- **Ctrl+C 干净退出**：不再打印一长串 `KeyboardInterrupt` 堆栈。Thanks @yangluxin613 (#2806)
+- **文件夹上传**：Web 端支持目录上传，路径校验适配 Windows。Thanks @TryToMakeUsBetter (#2814)
+- 修复定时任务在某些情况下重复执行的问题。Thanks @CNXudiandian (#2820)
+- 修复定时任务带时区时单次任务不触发的问题。Thanks @AethericSpace
+- 修复执行失败的工具调用在页面刷新后不显示的问题。Thanks @a1094174619 (#2822)
+- 修复企微机器人消息中包含非法控制字符导致投递失败的问题。Thanks @Jacques-Zhao (#2810)
+
+## 📦 升级方式
+
+源码部署可执行 `cow update` 一键升级，或手动拉取代码后重启。详见 [更新升级文档](https://docs.cowagent.ai/guide/upgrade)。
+
+**发布日期**：2026.05.22 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.8...2.0.9)
diff --git a/docs/skills/image-generation.mdx b/docs/skills/image-generation.mdx
index e64cc846..288fd656 100644
--- a/docs/skills/image-generation.mdx
+++ b/docs/skills/image-generation.mdx
@@ -3,149 +3,87 @@ title: image-generation - 图像生成
 description: 文生图 / 图生图 / 多图融合，支持多家厂商自动路由与回退
 ---
 
-通用的图像生成与编辑技能，支持 OpenAI、Gemini、Seedream（火山方舟）、Qwen（百炼）、MiniMax、LinkAI 共六家厂商。不需要手动选模型，脚本会按固定优先级自动挑选已配置的厂商来出图。
+通用的图像生成与编辑技能，支持 OpenAI、Gemini、Seedream（火山方舟）、Qwen（百炼）、MiniMax、LinkAI 共六家厂商。配好任意一家的 Key 即可使用，配多家可享受自动回退。
 
-## 模型选择
-
-`image-generation` 采用「固定优先级 + 自动回退」的策略，配好 Key 就能用：
-
-1. **优先级顺序**：`OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI`
-2. **没配 Key 的跳过**：只有设了 API Key 的厂商才会参与
-3. **失败自动切下一家**：遇到 401、模型未开通、网络异常等错误时，会自动试下一个
-4. **指定模型时前置**：如果明确传了某个模型名，对应厂商会被提到最前面先试
-
-### 支持的模型
+## 支持的模型
 
 | 厂商 | 模型 / 别名 | 特点 |
 | --- | --- | --- |
-| OpenAI | `gpt-image-2`、`gpt-image-1` | 通用文生图，高质量、高智能，支持 `quality` 参数控制画质 |
+| OpenAI | `gpt-image-2`、`gpt-image-1` | 通用文生图，高质量，支持 `quality` 控制画质 |
 | Gemini Nano Banana | `nano-banana-2`、`nano-banana-pro`、`nano-banana` | 对应 `gemini-3.1-flash`、`gemini-3-pro`、`gemini-2.5-flash` 的图像版本 |
 | Seedream（火山方舟） | `seedream-5.0-lite`、`seedream-4.5` | 原生 2K–4K，最多 14 张图融合 |
 | Qwen（百炼） | `qwen-image-2.0`、`qwen-image-2.0-pro` | 擅长中文排版和图文融合 |
-| MiniMax | `image-01` | 简单快速的图片生成 |
-| LinkAI | 任意模型 | 通用代理，兜底用 |
+| MiniMax | `image-01` | 简单快速 |
+| LinkAI | 任意模型 | 统一网关，作为兜底 |
 
-<Note>
-默认情况下 Agent 不会主动选模型，而是走自动路由。如果你想用某个特定模型，直接在对话里说就行，比如「用 seedream 画一只猫」或「用 gpt-image-2 生成海报」。也可以通过下面的「自定义配置」固定默认模型。
-</Note>
+## 模型选择
 
-## 自定义配置
+默认走「自动路由 + 失败回退」：
 
-### API Key 配置
+1. 按 `OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI` 顺序选第一个已配置的厂商
+2. 遇到 401、模型未开通、网络异常等错误时，自动切到下一家
+3. 用户在对话里指定模型时（如「用 seedream 画一只猫」），对应厂商会被提到最前优先尝试
 
-至少需要配**一个**厂商的 Key，配多个就能享受自动回退能力。有三种配置方式：
-
-#### 方式一：已有模型 Key 自动复用
-
-如果你在 web控制台 或 `config.json` 中配置了对话模型的 Key（比如 `openai_api_key`、`gemini_api_key` 等），启动时这些 Key 会被**自动同步**到对应的环境变量。也就是说，只要你的对话模型能用，图像生成就能直接用同一个 Key，不需要额外配置。
-
-#### 方式二：在 config.json 中配置
-
-在 `config.json` 中直接写对应的 Key 字段即可，支持的字段如下：
+如需固定使用某个模型：
 
 ```json
 {
-  "openai_api_key": "sk-xxx",
-  "openai_api_base": "https://api.openai.com/v1",
-  "gemini_api_key": "AIza-xxx",
-  "ark_api_key": "xxx",
-  "dashscope_api_key": "sk-xxx",
-  "minimax_api_key": "xxx",
-  "linkai_api_key": "xxx"
-}
-```
-
-修改后需要重启生效。每个 Key 还有对应的 `*_api_base` 字段可以自定义接口地址。
-
-#### 方式三：对话中直接配置
-
-在对话里发送 API Key，Agent 会通过 `env_config` 工具自动保存到 `~/cow/.env`，**不需要重启**就能生效。例如：
-
-```
-帮我配置 OPENAI_API_KEY 为 sk-xxx
-```
-
-或者：
-
-```
-设置 ARK_API_KEY 为 xxx
-```
-
-### API Key 一览
-
-| 环境变量 | config.json 字段 | 对应厂商 | 默认 Base URL |
-| --- | --- | --- | --- |
-| `OPENAI_API_KEY` | `openai_api_key` | OpenAI | `https://api.openai.com/v1` |
-| `GEMINI_API_KEY` | `gemini_api_key` | Gemini | `https://generativelanguage.googleapis.com` |
-| `ARK_API_KEY` | `ark_api_key` | 火山方舟（Seedream） | `https://ark.cn-beijing.volces.com/api/v3` |
-| `DASHSCOPE_API_KEY` | `dashscope_api_key` | 阿里百炼（Qwen） | `https://dashscope.aliyuncs.com` |
-| `MINIMAX_API_KEY` | `minimax_api_key` | MiniMax | `https://api.minimaxi.com` |
-| `LINKAI_API_KEY` | `linkai_api_key` | LinkAI | `https://api.link-ai.tech` |
-
-
-### 指定默认模型
-
-如果想让所有图像生成固定走某个厂商的模型，可以在 `config.json` 里加：
-
-```json
-"skill": {
-  "image-generation": {
-    "model": "seedream-5.0-lite"
+  "skills": {
+    "image-generation": {
+      "model": "seedream-5.0-lite"
+    }
   }
 }
 ```
 
-启动时这段配置会被自动转成环境变量 `SKILL_IMAGE_GENERATION_MODEL`，脚本读到后会固定使用这个模型所在的厂商进行生成。
+## 配置 API Key
+
+<Tip>
+  推荐通过 [Web 控制台](/channels/web) 的「模型管理」页面配置，配好的对话模型 Key 会被图像生成技能自动复用，无需重复配置。也可手动编辑配置文件或在对话中通过 `env_config` 工具临时设置。
+</Tip>
+
+凭证统一复用主模型厂商的 Key：
+
+| 字段 | 对应厂商 |
+| --- | --- |
+| `openai_api_key` | OpenAI |
+| `gemini_api_key` | Gemini |
+| `ark_api_key` | 火山方舟（Seedream） |
+| `dashscope_api_key` | 阿里百炼（Qwen） |
+| `minimax_api_key` | MiniMax |
+| `linkai_api_key` | LinkAI |
 
 
 ## 开启和关闭
 
-`image-generation` 是内置技能，**会根据 API Key 自动调整状态**：
+技能会根据 API Key 自动调整状态：
 
-- **Key 已配置**：技能正常可用，Agent 收到画图请求时会直接调用
-- **Key 未配置**：技能仍然会出现在上下文中（标记为「需要配置」），Agent 会引导用户去配 Key，而不是直接调用失败
+- **已配置 Key**：Agent 收到画图请求时直接调用
+- **未配置 Key**：技能仍会出现在上下文中（标记为「需要配置」），Agent 会引导用户去配 Key
 
-如果想手动控制，也可以用命令：
+如需手动控制：
 
 ```text
-/skill disable image-generation    # 手动关闭（即使有 Key 也不会被调用）
+/skill disable image-generation    # 关闭
 /skill enable image-generation     # 重新开启
 ```
 
-终端里对应的命令是 `cow skill disable image-generation` / `cow skill enable image-generation`。
+终端等价命令：`cow skill disable image-generation` / `cow skill enable image-generation`。
 
 ## 参数
 
 | 参数 | 类型 | 必填 | 默认 | 说明 |
 | --- | --- | --- | --- | --- |
 | `prompt` | string | 是 | — | 图像描述 |
-| `image_url` | string / list | 否 | null | 编辑用的输入图，支持本地路径或 URL。传多个就是多图融合 |
-| `quality` | string | 否 | auto | `low` / `medium` / `high`，只有部分厂商支持 |
-| `size` | string | 否 | auto | `512` / `1K` / `2K` / `3K` / `4K`，也可以写像素值如 `1024x1024` |
+| `image_url` | string / list | 否 | null | 编辑用的输入图，本地路径或 URL；传列表为多图融合 |
+| `quality` | string | 否 | auto | `low` / `medium` / `high`，仅部分厂商支持 |
+| `size` | string | 否 | auto | `512` / `1K` / `2K` / `3K` / `4K`，或像素值如 `1024x1024` |
 | `aspect_ratio` | string | 否 | null | `1:1` / `3:2` / `2:3` / `16:9` / `9:16` / `21:9`；Gemini 还支持 `1:4` / `4:1` / `1:8` / `8:1` |
 
 <Warning>
-**质量越高、分辨率越大，花的钱越多、等的时间越长。**
-
-- 日常对话和快速预览直接用默认（`auto`），或者 `quality=low` + `size=1K`，大概 20 秒出图
-- 做海报、用户明确要高清的时候再上 `quality=high` + `size=2K/4K`，可能要等 1～5 分钟，取决于不同模型的速度
+  **质量越高、分辨率越大，耗时和成本越高。** 日常对话用默认（`auto`）或 `quality=low` + `size=1K` 即可，约 20 秒出图；做海报或明确要高清时再上 `high` + `2K/4K`，可能需要 1–5 分钟。
 </Warning>
 
-## 输出
-
-成功时返回：
-
-```json
-{
-  "model": "doubao-seedream-5-0-260128",
-  "images": [
-    {"url": "/path/to/output.png"}
-  ]
-}
-```
-
-失败时返回 `{ "error": "..." }`。出错后**不要直接重试**——大概率是配置问题（Key 填错、API 地址不对、模型没开通），让用户修好配置再试。
-
 ## 常见用法
 
 - **文生图**：根据描述生成插画、海报、图标、头像、分镜图等
@@ -153,8 +91,8 @@ description: 文生图 / 图生图 / 多图融合，支持多家厂商自动路
 - **多图融合**：把多张参考图合成一张（换装、角色合影等）
 
 <Note>
-- bash 超时建议设 600 秒。单个厂商的 HTTP 超时是 300 秒，但脚本可能依次尝试多个厂商
-- 输入的图片会自动压缩到 4MB 以内、最长边不超过 4096px
-- Gemini / Seedream / Qwen / MiniMax 不支持 `quality` 参数，传了也没用
-- Seedream 默认出 2K 图，`seedream-5.0-lite` 支持到 3K，`seedream-4.5` 支持到 4K
+- bash 超时建议设 600 秒：单厂商 HTTP 超时 300 秒，脚本可能依次尝试多家
+- 输入图片自动压缩到 4MB 以内、最长边不超过 4096px
+- Gemini / Seedream / Qwen / MiniMax 不支持 `quality` 参数
+- Seedream 默认出 2K 图；`seedream-5.0-lite` 支持到 3K，`seedream-4.5` 支持到 4K
 </Note>
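
Reviewer note on the `image-generation` hunks above: the routing rules (fixed priority, skip unconfigured vendors, fall back on failure, hoist the vendor of an explicitly requested model) can be sketched as follows. This is a minimal illustration, not the skill's actual script, which this diff does not show. The names `PROVIDER_ORDER`, `MODEL_PREFIXES`, `candidate_providers`, and `generate_with_fallback` are hypothetical, and matching a model to its vendor by name prefix is an assumption.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Priority order as documented; the labels are illustrative, not real config fields.
PROVIDER_ORDER = ["openai", "gemini", "seedream", "qwen", "minimax", "linkai"]

# Hypothetical mapping from model-name prefix to provider label.
MODEL_PREFIXES = {
    "gpt-image": "openai",
    "nano-banana": "gemini",
    "seedream": "seedream",
    "qwen-image": "qwen",
    "image-01": "minimax",
}


def candidate_providers(configured: List[str], model: Optional[str] = None) -> List[str]:
    """Return configured providers in priority order, hoisting the owner of `model`."""
    ordered = [p for p in PROVIDER_ORDER if p in configured]
    if model:
        for prefix, provider in MODEL_PREFIXES.items():
            if model.startswith(prefix) and provider in ordered:
                ordered.remove(provider)
                ordered.insert(0, provider)
                break
    return ordered


def generate_with_fallback(
    prompt: str,
    backends: Dict[str, Callable[[str], str]],
    model: Optional[str] = None,
) -> Tuple[str, str]:
    """Try each candidate in turn and move on to the next one when a backend raises."""
    errors = {}
    for name in candidate_providers(list(backends), model):
        try:
            return name, backends[name](prompt)
        except Exception as exc:  # e.g. 401, model not enabled, network error
            errors[name] = str(exc)
    raise RuntimeError(f"all providers failed: {errors}")
```

Keeping the ordering in one pure function makes the precedence rules easy to unit-test without calling any vendor API.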
diff --git a/docs/skills/index.mdx b/docs/skills/index.mdx
index 6e90aba8..795cebc0 100644
--- a/docs/skills/index.mdx
+++ b/docs/skills/index.mdx
@@ -11,7 +11,7 @@ Skill 与 Tool 的区别：Tool 是由代码实现的原子操作（如读写文
 
 CowAgent 提供多种方式获取技能：
 
-- **[Cow 技能广场](https://skills.cowagent.ai/)** — 在线浏览所有可用技能，或通过 `/skill list --remote` 在对话中浏览和安装
+- [Cow 技能广场](https://skills.cowagent.ai/) — 在线浏览所有可用技能，或通过 `/skill list --remote` 在对话中浏览和安装
 - **GitHub** — 直接从 GitHub 仓库安装，支持批量安装
 - **ClawHub** — 通过 `/skill install clawhub:名称` 安装 ClawHub 上的技能 (4w+个)
 - **LinkA** — 通过 `/skill install linkai:编码` 安装 LinkAI 上的公开资源和创建的知识库/数据库/工作流/插件等资源
diff --git a/docs/skills/install.mdx b/docs/skills/install.mdx
index dd876f15..84395d95 100644
--- a/docs/skills/install.mdx
+++ b/docs/skills/install.mdx
@@ -3,7 +3,7 @@ title: 安装技能
 description: 通过命令一键安装来自多种来源的技能
 ---
 
-CowAgent 支持通过统一的 `install` 命令安装来自 **[Cow 技能广场](https://skills.cowagent.ai/)、GitHub、ClawHub、LinkAI** 以及任意 URL 上的技能。在对话中使用 `/skill install`，在终端中使用 `cow skill install`。
+CowAgent 支持通过统一的 `install` 命令安装来自 [Cow 技能广场](https://skills.cowagent.ai/)、GitHub、ClawHub、LinkAI 以及任意 URL 上的技能。在对话中使用 `/skill install`，在终端中使用 `cow skill install`。
 
 ## 从Cow技能广场安装
 
diff --git a/docs/tools/mcp.mdx b/docs/tools/mcp.mdx
index 0973f25a..8b7670c1 100644
--- a/docs/tools/mcp.mdx
+++ b/docs/tools/mcp.mdx
@@ -34,7 +34,9 @@ Docker 部署时，官方 `docker-compose.yml` 已经把宿主机 `./cow` 挂载
 | `command` | stdio | 启动 server 的可执行命令（如 `npx`、`python`、`uvx`） |
 | `args` | 否 | 传给 command 的参数列表 |
 | `env` | 否 | 子进程的环境变量，常用于 API Key |
-| `url` | SSE | SSE 端点 URL（与 `command` 二选一） |
+| `url` | SSE / Streamable HTTP | 远程端点 URL（与 `command` 二选一） |
+| `type` | 远程 | 远程传输类型，可选 `sse` 或 `streamable-http`，默认 `sse` |
+| `headers` | 否 | 远程请求附加 HTTP 头（如 `Authorization`），仅 Streamable HTTP 使用 |
 | `disabled` | 否 | `true` 时跳过该 server，便于临时关闭 |
 
 ### 完整示例
@@ -88,7 +90,8 @@ Agent 会：
 | 协议 | 说明 | 配置字段 |
 | --- | --- | --- |
 | **stdio** | 子进程通信，最常见，社区生态最丰富 | `command` + `args` |
-| **SSE** | HTTP Server-Sent Events，适合远程托管的 MCP 服务 | `url` |
+| **SSE** | HTTP Server-Sent Events，旧版远程协议 | `url`（默认） |
+| **Streamable HTTP** | 新版远程协议，单端点收发，逐步取代 SSE | `type: "streamable-http"` + `url` |
 
 ## 排错
 
@@ -106,4 +109,4 @@ Agent 会：
 - [mcp.so](https://mcp.so) — 全球 MCP 服务索引
 - [ModelScope MCP 广场](https://modelscope.cn/mcp) — 魔搭社区 MCP 广场，国内访问更稳定
 
-只要遵循 MCP 标准协议（stdio / SSE），都可以直接接入 CowAgent。
+只要遵循 MCP 标准协议（stdio / SSE / Streamable HTTP），都可以直接接入 CowAgent。
diff --git a/docs/tools/vision.mdx b/docs/tools/vision.mdx
index 66cfdebf..675afe41 100644
--- a/docs/tools/vision.mdx
+++ b/docs/tools/vision.mdx
@@ -40,7 +40,7 @@ Vision 工具采用多级自动选择 + 自动兜底策略，无需手动配置
 
 ```json
 {
-    "tool": {
+    "tools": {
         "vision": {
             "model": "gpt-4.1"
         }
diff --git a/docs/tools/web-search.mdx b/docs/tools/web-search.mdx
index 2622d0c5..928eb633 100644
--- a/docs/tools/web-search.mdx
+++ b/docs/tools/web-search.mdx
@@ -1,32 +1,51 @@
 ---
 title: web_search - 联网搜索
-description: 搜索互联网获取实时信息
+description: 搜索互联网获取实时信息，支持多个搜索厂商
 ---
 
-搜索互联网获取实时信息、新闻、研究等内容。支持两个搜索后端，自动选择可用的后端。
+搜索互联网获取实时信息、新闻、研究等内容。支持博查、百度千帆、智谱、LinkAI 四个后端，配置任意一家即可使用。
 
-## 依赖
+<Tip>
+  推荐通过 [Web 控制台](/channels/web) 的「模型管理 → 搜索」面板可视化配置厂商与策略，无需手动编辑配置文件。
+</Tip>
 
-需要配置至少一个搜索 API Key（通过 `env_config` 工具或工作空间 `.env` 文件配置）：
+## 厂商
 
-| 后端 | 环境变量 | 优先级 | 获取方式 |
-| --- | --- | --- | --- |
-| 博查搜索 | `BOCHA_API_KEY` | 优先使用 | [博查开放平台](https://open.bochaai.com/) |
-| LinkAI 搜索 | `LINKAI_API_KEY` | 可选 | [LinkAI 控制台](https://link-ai.tech/console/interface) |
+| 厂商 | 凭证 | 申请入口 |
+| --- | --- | --- |
+| 博查 Bocha | `tools.web_search.bocha_api_key` | [博查开放平台](https://open.bochaai.com/) |
+| 百度千帆 | 复用 `qianfan_api_key` | [千帆控制台](https://cloud.baidu.com/doc/qianfan/s/2mh4su4uy) |
+| 智谱 Zhipu | 复用 `zhipu_ai_api_key` | [智谱开放平台](https://docs.bigmodel.cn/cn/guide/tools/web-search) |
+| LinkAI | 复用 `linkai_api_key` | [LinkAI 控制台](https://link-ai.tech/console/interface) |
 
-## 参数
+除博查需要单独的 `bocha_api_key` 外，其他三家直接复用对应模型的 API Key，配好模型即同时获得搜索能力。
+
+## 路由策略
+
+```json
+{
+  "tools": {
+    "web_search": {
+      "strategy": "auto",
+      "provider": ""
+    }
+  }
+}
+```
+
+- `auto`（默认）：由 Agent 在已配置的厂商中智能选择，并可在一次任务中多次调用、切换不同厂商以获取更全面的结果；未指定时按 `bocha → qianfan → zhipu → linkai` 顺序兜底。
+- `fixed`：固定使用 `provider` 指定的厂商；该厂商凭证缺失时自动回落到 auto 顺序。
+
+## 工具参数
 
 | 参数 | 类型 | 必填 | 说明 |
 | --- | --- | --- | --- |
 | `query` | string | 是 | 搜索关键词 |
-| `count` | integer | 否 | 返回结果数量（1-50，默认 10） |
-| `freshness` | string | 否 | 时间范围：`noLimit`、`oneDay`、`oneWeek`、`oneMonth`、`oneYear`，或日期范围如 `2025-01-01..2025-02-01` |
+| `count` | integer | 否 | 返回结果数量（1–50，默认 10） |
+| `freshness` | string | 否 | 时间范围：`noLimit`（默认）、`oneDay`、`oneWeek`、`oneMonth`、`oneYear`，或日期范围如 `2025-01-01..2025-02-01` |
 | `summary` | boolean | 否 | 是否返回页面摘要（默认 false） |
-
-## 使用场景
-
-当用户询问最新信息、需要事实核查或获取实时数据时，Agent 会自动调用此工具。
+| `provider` | string | 否 | `auto` 策略下配置了多个厂商时可见，用于单次切换厂商 |
 
 <Note>
-  如果未配置任何搜索 API Key，该工具不会被加载。
+  四家凭证均未配置时，该工具不会注册到 Agent。
 </Note>
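
Reviewer note on the `web_search` routing documented above: a minimal sketch of how `strategy` / `provider` and the `bocha → qianfan → zhipu → linkai` fallback order might interact. The config field names are taken from the doc. The helper names `available_providers` and `pick_provider` are hypothetical, and the real tool (where the Agent itself chooses among vendors in `auto` mode) may behave differently.

```python
from typing import Dict, List, Optional

# Fallback order as documented for the auto strategy.
FALLBACK_ORDER = ["bocha", "qianfan", "zhipu", "linkai"]


def available_providers(config: Dict) -> List[str]:
    """Providers whose credential is present, in the documented fallback order."""
    search_conf = config.get("tools", {}).get("web_search", {})
    credentials = {
        "bocha": search_conf.get("bocha_api_key"),
        "qianfan": config.get("qianfan_api_key"),
        "zhipu": config.get("zhipu_ai_api_key"),
        "linkai": config.get("linkai_api_key"),
    }
    return [name for name in FALLBACK_ORDER if credentials[name]]


def pick_provider(config: Dict, requested: Optional[str] = None) -> Optional[str]:
    """Resolve the provider for one call; None means the tool would not be registered."""
    available = available_providers(config)
    if not available:
        return None
    search_conf = config.get("tools", {}).get("web_search", {})
    strategy = search_conf.get("strategy") or "auto"
    if strategy == "fixed":
        fixed = search_conf.get("provider")
        # A fixed provider without credentials falls back to the auto order.
        return fixed if fixed in available else available[0]
    # auto: honor a per-call provider when usable, otherwise take the first available.
    return requested if requested in available else available[0]
```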
diff --git a/docs/zh/README.md b/docs/zh/README.md
new file mode 100644
index 00000000..bc8000b3
--- /dev/null
+++ b/docs/zh/README.md
@@ -0,0 +1,269 @@
+<p align="center"><img src="https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="CowAgent" width="420" /></p>
+
+<p align="center">
+  <a href="https://github.com/zhayujie/CowAgent/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/CowAgent" alt="Latest release"></a>
+  <a href="https://github.com/zhayujie/CowAgent/blob/master/LICENSE"><img src="https://img.shields.io/github/license/zhayujie/CowAgent" alt="License: MIT"></a>
+  <a href="https://github.com/zhayujie/CowAgent"><img src="https://img.shields.io/github/stars/zhayujie/CowAgent?style=flat-square" alt="Stars"></a> <br/>
+  [<a href="../../README.md">English</a>] | [中文] | [<a href="../ja/README.md">日本語</a>]
+</p>
+
+**CowAgent** 是一个开源的超级 AI 助理，能够主动思考和规划任务、操作计算机和外部资源、创造和执行 Skills、构建知识库与长期记忆，与你一同成长，是 Agent Harness 工程的最佳实践之一。
+
+CowAgent 轻量、易部署、可扩展，自由接入主流大模型，覆盖微信、飞书、钉钉、企微、QQ、Telegram、Slack、网页等多渠道，7×24 运行于个人电脑或服务器中。
+
+<p align="center">
+  <a href="https://cowagent.ai/?lang=zh">🌐 官网</a> &nbsp;·&nbsp;
+  <a href="https://docs.cowagent.ai/">📖 文档中心</a> &nbsp;·&nbsp;
+  <a href="https://docs.cowagent.ai/guide/quick-start">🚀 快速开始</a> &nbsp;·&nbsp;
+  <a href="https://skills.cowagent.ai/">🧩 技能广场</a> &nbsp;·&nbsp;
+  <a href="https://link-ai.tech/cowagent/create">☁️ 在线体验</a>
+</p>
+
+<br/>
+
+## 🌟 核心能力
+
+| 能力 | 说明 |
+| :--- | :--- |
+| [任务规划](https://docs.cowagent.ai/intro/architecture) | 理解复杂任务并自主分解执行，循环调用工具直到完成目标 |
+| [长期记忆](https://docs.cowagent.ai/memory) | 三层记忆架构（上下文 → 天级 → 核心），梦境蒸馏自动整理，支持关键词与向量混合检索 |
+| [知识库](https://docs.cowagent.ai/knowledge) | 自动整理结构化知识为 Markdown Wiki，构建持续增长的知识图谱，可视化浏览 |
+| [技能](https://docs.cowagent.ai/skills) | 从 [Skill Hub](https://skills.cowagent.ai/)、GitHub、ClawHub 等一键安装；也可通过对话创造自定义技能 |
+| [工具](https://docs.cowagent.ai/tools) | 内置文件读写、终端、浏览器、定时任务、记忆检索、联网搜索等 10+ 工具，支持 MCP 协议 |
+| [通道](https://docs.cowagent.ai/channels) | 一个 Agent 同时接入 Web、微信、飞书、钉钉、企微、QQ、公众号、Telegram、Slack 等多个渠道 |
+| 多模态 | 文本、图片、语音、文件全消息类型支持，覆盖识别、生成、收发 |
+| [模型](https://docs.cowagent.ai/models) | DeepSeek、Claude、Gemini、GPT、GLM、Qwen、Kimi、MiniMax、Doubao 等主流厂商，配置一行切换 |
+| [部署](https://docs.cowagent.ai/guide/quick-start) | 一键脚本安装，Web 控制台统一管理；本地、Docker、服务器多种部署方式 |
+
+<br/>
+
+## 🏗️ 架构总览
+
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/architecture/zh/architecture.jpg" alt="CowAgent Architecture" width="750"/>
+
+CowAgent 是一个完整的 **Agent Harness**：消息从各类**通道**进入，**Agent Core** 结合记忆、知识库与可用工具/技能进行任务规划与决策，调用**模型**生成结果，再回传至原通道。各模块解耦清晰，按需扩展。
+
+详见 [项目架构](https://docs.cowagent.ai/intro/architecture)。
+
+<br/>
+
+## 🚀 快速开始
+
+项目提供一键安装脚本，自动完成依赖安装、配置和启动：
+
+**Linux / macOS：**
+
+```bash
+bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
+```
+
+**Windows（PowerShell）：**
+
+```powershell
+irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
+```
+
+**Docker：**
+
+```bash
+curl -O https://cdn.link-ai.tech/code/cow/docker-compose.yml
+docker compose up -d
+```
+
+启动成功后访问 `http://localhost:9899` 进入 **Web 控制台**，在控制台内即可完成模型配置、渠道接入、技能安装等全部操作。
+
+> 服务器部署且需要公网访问控制台时，请在 `config.json` 中将 `web_host` 设为 `0.0.0.0`（同时强烈建议设置 `web_password` 启用鉴权），然后访问 `http://<server-ip>:9899`，并确保防火墙/安全组放行 `9899` 端口。
+
+> 📖 详细安装指南：[快速开始](https://docs.cowagent.ai/guide/quick-start) · [源码安装](https://docs.cowagent.ai/guide/manual-install) · [升级](https://docs.cowagent.ai/guide/upgrade)
+
+安装后可使用 `cow` [CLI 命令](https://docs.cowagent.ai/cli) 管理服务：
+
+```bash
+cow start | stop | restart        # 服务管理
+cow status | logs                  # 状态和日志
+cow update                         # 拉取最新代码并重启
+cow skill install <名称>           # 安装技能
+cow install-browser                # 安装浏览器工具
+```
+
+<br/>
+
+## 🤖 模型支持
+
+CowAgent 支持国内外主流厂商的大语言模型。**文本对话、图像理解、图像生成、语音识别/合成、向量** 等能力均可独立配置厂商。
+
+| 厂商 | 代表模型 | 文本 | 图像理解 | 图像生成 | 语音识别 | 语音合成 | 向量 |
+| --- | --- | :-: | :-: | :-: | :-: | :-: | :-: |
+| [DeepSeek](https://docs.cowagent.ai/models/deepseek) | deepseek-v4-flash / pro | ✅ | | | | | |
+| [MiniMax](https://docs.cowagent.ai/models/minimax) | MiniMax-M2.7 | ✅ | ✅ | ✅ | | ✅ | |
+| [Claude](https://docs.cowagent.ai/models/claude) | claude-opus-4-8 | ✅ | ✅ | | | | |
+| [Gemini](https://docs.cowagent.ai/models/gemini) | gemini-3.5-flash | ✅ | ✅ | ✅ | | | |
+| [OpenAI](https://docs.cowagent.ai/models/openai) | gpt-5.5、o 系列 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [智谱 GLM](https://docs.cowagent.ai/models/glm) | glm-5.1、glm-5v-turbo | ✅ | ✅ | | ✅ | | ✅ |
+| [通义千问](https://docs.cowagent.ai/models/qwen) | qwen3.7-max | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [豆包 Doubao](https://docs.cowagent.ai/models/doubao) | doubao-seed-2.0 系列 | ✅ | ✅ | ✅ | | | ✅ |
+| [Kimi](https://docs.cowagent.ai/models/kimi) | kimi-k2.6 | ✅ | ✅ | | | | |
+| [百度 ERNIE](https://docs.cowagent.ai/models/qianfan) | ernie-5.1 | ✅ | ✅ | | | | |
+| [小米 MiMo](https://docs.cowagent.ai/models/mimo) | mimo-v2.5-pro / v2.5 | ✅ | ✅ | | | ✅ | |
+| [LinkAI](https://docs.cowagent.ai/models/linkai) | 一个 Key 接入 100+ 模型 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [自定义](https://docs.cowagent.ai/models/custom) | 本地模型 / 三方代理 | ✅ | | | | | |
+
+> 推荐通过 Web 控制台在线配置，无需手动编辑文件。手动配置请参考各厂商文档，详见 [模型概览](https://docs.cowagent.ai/models)。
+
+<br/>
+
+## 💬 通道接入
+
+一个 Agent 实例可同时接入多个渠道，启动时通过 `channel_type` 切换或并行运行。
+
+| 通道 | 文本 | 图片 | 文件 | 语音 | 群聊 |
+| --- | :-: | :-: | :-: | :-: | :-: |
+| [Web 控制台](https://docs.cowagent.ai/channels/web)（默认） | ✅ | ✅ | ✅ | ✅ | |
+| [微信](https://docs.cowagent.ai/channels/weixin) | ✅ | ✅ | ✅ | ✅ | |
+| [飞书](https://docs.cowagent.ai/channels/feishu) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [钉钉](https://docs.cowagent.ai/channels/dingtalk) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [企微智能机器人](https://docs.cowagent.ai/channels/wecom-bot) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [QQ](https://docs.cowagent.ai/channels/qq) | ✅ | ✅ | ✅ | | ✅ |
+| [企业微信应用](https://docs.cowagent.ai/channels/wecom) | ✅ | ✅ | ✅ | ✅ | |
+| [微信公众号](https://docs.cowagent.ai/channels/wechatmp) | ✅ | ✅ | | ✅ | |
+| [Telegram](https://docs.cowagent.ai/channels/telegram) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [Slack](https://docs.cowagent.ai/channels/slack) | ✅ | ✅ | ✅ | | ✅ |
+
+> 飞书、企微智能机器人支持在 Web 控制台内**扫码一键接入**，无需公网 IP。详见 [通道概览](https://docs.cowagent.ai/channels)。
+
+<img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/zh/web-console-chat.png" alt="CowAgent Web 控制台" width="800"/>
+
+*Web 控制台是默认通道，也是统一的 Agent 配置和管理入口*
+
+<br/>
+
+## 🧠 记忆与知识库
+
+**长期记忆**采用三层架构：对话上下文（短期）→ 天级记忆（中期）→ MEMORY.md（长期）。每日自动执行**梦境蒸馏（Deep Dream）**，将分散记忆整合为精炼的长期记忆并生成叙事日记。详见 [长期记忆](https://docs.cowagent.ai/memory) · [梦境蒸馏](https://docs.cowagent.ai/memory/deep-dream)。
+
+**个人知识库** 与按时间记录的记忆不同，以**主题为维度**组织结构化知识。Agent 在对话中自动整理有价值信息，维护交叉引用与索引，可在 Web 控制台中可视化浏览知识图谱。详见 [个人知识库](https://docs.cowagent.ai/knowledge)。
+
+<table>
+  <tr>
+    <td width="50%">
+      <img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/zh/web-console-memory.png" alt="长期记忆" />
+      <p align="center"><em>长期记忆 · 三层记忆 + 梦境蒸馏</em></p>
+    </td>
+    <td width="50%">
+      <img src="https://cdn.jsdelivr.net/gh/zhayujie/cowagent-assets@main/screenshots/zh/web-console-knowledge.png" alt="个人知识库" />
+      <p align="center"><em>个人知识库 · 自动整理的 Markdown Wiki</em></p>
+    </td>
+  </tr>
+</table>
+
+<br/>
+
+
+## 🔧 工具与技能
+
+**工具（Tools）** 是 Agent 操作系统资源的原子能力，**技能（Skills）** 是基于说明文件的高级工作流，可组合多个工具完成复杂任务。
+
+### 工具系统
+
+**内置工具** 涵盖文件读写（`read` / `write` / `edit` / `ls`）、终端（`bash`）、文件发送（`send`）、记忆检索（`memory`）、环境变量（`env_config`）、网页获取（`web_fetch`）、定时任务（`scheduler`）、联网搜索（`web_search`）、图像识别（`vision`）、浏览器自动化（`browser`）等常用能力。
+
+**MCP 协议** 通过 [Model Context Protocol](https://modelcontextprotocol.io) 接入开放生态中的各种 MCP 服务，配置一次 `mcp.json` 即用即得，支持 stdio / SSE 协议、热更新、零代码接入。
+
+详见 [工具概览](https://docs.cowagent.ai/tools) · [MCP 集成](https://docs.cowagent.ai/tools/mcp)。
+
+### 技能系统
+
+- **[Skill Hub](https://skills.cowagent.ai/)** — 开源的技能广场，浏览、搜索、一键安装
+- **GitHub / ClawHub / URL 等** — 任意来源一键安装
+- **对话创造** — 通过 `skill-creator` 用对话快速生成自定义技能，可将工作流程或第三方接口直接固化为技能
+
+```bash
+/skill list                   # 查看当前技能
+/skill search <关键词>         # 在技能广场搜索
+/skill install <名称>          # 一键安装
+```
+
+详见 [技能概览](https://docs.cowagent.ai/skills) · [创建技能](https://docs.cowagent.ai/skills/create)。
+
+<br/>
+
+## 🏷 更新日志
+
+> **2026.05.22：** [v2.0.9](https://github.com/zhayujie/CowAgent/releases/tag/2.0.9) — 模型管理、MCP 协议支持、浏览器登录态持久化、新模型接入（gpt-5.5、gemini-3.5-flash、qwen3.7-max）、部署安全加固
+
+> **2026.05.06：** [v2.0.8](https://github.com/zhayujie/CowAgent/releases/tag/2.0.8) — 飞书渠道全面升级（语音、流式输出、扫码接入）、新模型支持（DeepSeek V4、百度千帆）、定时任务工具增强
+
+> **2026.04.22：** [v2.0.7](https://github.com/zhayujie/CowAgent/releases/tag/2.0.7) — 图像生成内置技能（GPT Image 2、Nano Banana）、新模型支持（Kimi K2.6、Claude Opus 4.7、GLM 5.1）、知识库和记忆增强
+
+> **2026.04.14：** [v2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6) — 知识库系统、梦境记忆模块、上下文智能压缩、Web 控制台多会话
+
+> **2026.04.01：** [v2.0.5](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5) — Cow CLI 命令系统、Skill Hub 开源、浏览器工具、企微扫码创建
+
+> **2026.03.22：** [v2.0.4](https://github.com/zhayujie/CowAgent/releases/tag/2.0.4) — 新增个人微信通道，支持文本/图片/文件/语音消息
+
+> **2026.02.03：** [v2.0.0](https://github.com/zhayujie/CowAgent/releases/tag/2.0.0) — 正式升级为超级 Agent 助理，支持多轮任务决策、长期记忆、Skills 框架
+
+完整更新历史：[Release Notes](https://docs.cowagent.ai/releases)
+
+<br/>
+
+## 🤝 社区与支持
+
+扫码加入微信开源交流群：
+
+<img width="130" src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/open-community.png">
+
+也可通过以下方式获取支持：
+
+- 🐛 [提交 Issue](https://github.com/zhayujie/CowAgent/issues)
+- 🤖 在线 AI 助手：[项目小助手](https://link-ai.tech/app/Kv2fXJcH)（基于项目知识库）
+
+<br/>
+
+## 🔗 相关项目
+
+- **[Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub)** — 开源的 AI Agent 技能广场，支持 CowAgent、OpenClaw、Claude Code 等多种 Agent
+- **[bot-on-anything](https://github.com/zhayujie/bot-on-anything)** — 轻量大模型应用框架，支持 Slack、Telegram、Discord、Gmail 等海外平台
+- **[AgentMesh](https://github.com/MinimalFuture/AgentMesh)** — 开源多智能体（Multi-Agent）框架，通过团队协同解决复杂问题
+
+<br/>
+
+## 🏢 企业服务
+
+<a href="https://link-ai.tech" target="_blank"><img width="650" src="https://cdn.link-ai.tech/image/link-ai-intro.jpg"></a>
+
+> [LinkAI](https://link-ai.tech/) 是面向企业和个人的一站式 AI 智能体平台，为 CowAgent 提供云端托管和企业级支持：
+>
+> - **🚀 免部署在线运行**：无需服务器即可创建 [CowAgent 在线助理](https://link-ai.tech/cowagent/create)，1 分钟拥有专属 Agent
+> - **🧠 Agent 基础设施**：聚合主流大模型、知识库、数据库、技能、工作流，提供开箱即用的 Agent 能力扩展
+> - **🏢 企业级协作**：提供团队协作、权限分级、审计日志、私有化部署等能力，让 Agent 安全落地企业场景
+
+**产品咨询和企业服务** 可联系产品客服：
+
+<img width="130" src="https://cdn.link-ai.tech/portal/linkai-customer-service.png">
+
+<br/>
+
+## 🛠️ 开发与贡献
+
+欢迎接入更多应用通道，参考 [飞书通道实现](https://github.com/zhayujie/CowAgent/blob/master/channel/feishu/feishu_channel.py) 新增自定义通道；同时欢迎贡献新技能，向 [Skill Hub](https://skills.cowagent.ai/submit) 提交。
+
+通过 ⭐ Star 关注项目更新，欢迎提交 PR、Issue 进行反馈。
+
+## 🌟 贡献者
+
+![cow contributors](https://contrib.rocks/image?repo=zhayujie/CowAgent&max=1000)
+
+<br/>
+
+## ⚠️ 声明
+
+1. 本项目遵循 [MIT 开源协议](/LICENSE)，主要用于技术研究和学习。使用时请遵守所在地法律法规及相关政策，因使用本项目所产生的一切后果由使用者自行承担。
+2. **成本与安全：** Agent 模式 Token 消耗显著高于普通对话，请根据效果与成本权衡选择模型；Agent 具备访问本地操作系统的能力，请谨慎选择部署环境。
+3. CowAgent 项目专注于开源技术开发，不会参与、授权或发行任何加密货币。
+
+<br/>
+
+## 📌 项目更名说明
+
+本项目原名 `chatgpt-on-wechat`，于 2026.04.13 正式更名为 **CowAgent**。原 GitHub 地址已自动重定向，老用户可选择执行 `git remote set-url origin https://github.com/zhayujie/CowAgent.git` 更新本地远程地址。
diff --git a/models/bot_factory.py b/models/bot_factory.py
index 824aed04..5d07a236 100644
--- a/models/bot_factory.py
+++ b/models/bot_factory.py
@@ -25,6 +25,10 @@ def create_bot(bot_type):
         from models.qianfan.qianfan_bot import QianfanBot
         return QianfanBot()
 
+    elif bot_type == const.MIMO:
+        from models.mimo.mimo_bot import MimoBot
+        return MimoBot()
+
     elif bot_type in (const.OPENAI, const.CHATGPT, const.CUSTOM):  # OpenAI-compatible API
         from models.chatgpt.chat_gpt_bot import ChatGPTBot
         return ChatGPTBot()
diff --git a/models/chatgpt/chat_gpt_bot.py b/models/chatgpt/chat_gpt_bot.py
index 0ec95a25..d5b7703d 100644
--- a/models/chatgpt/chat_gpt_bot.py
+++ b/models/chatgpt/chat_gpt_bot.py
@@ -60,7 +60,7 @@ class ChatGPTBot(Bot, OpenAIImage, OpenAICompatibleBot):
             "timeout": conf().get("request_timeout", None),  # 重试超时时间，在这个时间内，将会自动重试
         }
         # 部分模型暂不支持一些参数，特殊处理
-        if conf_model in [const.O1, const.O1_MINI, const.GPT_5, const.GPT_5_MINI, const.GPT_5_NANO]:
+        if conf_model in [const.O1, const.O1_MINI, const.GPT_5, const.GPT_5_MINI, const.GPT_5_NANO, const.GPT_55]:
             remove_keys = ["temperature", "top_p", "frequency_penalty", "presence_penalty"]
             for key in remove_keys:
                 self.args.pop(key, None)  # 如果键不存在，使用 None 来避免抛出错、
diff --git a/models/gemini/google_gemini_bot.py b/models/gemini/google_gemini_bot.py
index 6716e971..3c9ac9ae 100644
--- a/models/gemini/google_gemini_bot.py
+++ b/models/gemini/google_gemini_bot.py
@@ -38,9 +38,9 @@ class GoogleGeminiBot(Bot):
 
     @property
     def model(self):
-        model_name = conf().get("model") or "gemini-3.1-pro-preview"
+        model_name = conf().get("model") or "gemini-3.5-flash"
         if model_name == "gemini":
-            model_name = "gemini-3.1-pro-preview"
+            model_name = "gemini-3.5-flash"
         return model_name
 
     @property
diff --git a/models/mimo/__init__.py b/models/mimo/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/models/mimo/mimo_bot.py b/models/mimo/mimo_bot.py
new file mode 100644
index 00000000..a815e9f0
--- /dev/null
+++ b/models/mimo/mimo_bot.py
@@ -0,0 +1,668 @@
+# encoding:utf-8
+
+"""
+小米 MiMo Bot —— OpenAI 兼容协议，使用独立 API key / base 配置。
+
+支持模型：
+- mimo-v2.5-pro     (旗舰，长上下文，默认开启思考)
+- mimo-v2.5         (多模态：文/图/音/视频，默认开启思考)
+- mimo-v2-pro       (V2 Pro，默认开启思考)
+- mimo-v2-omni      (V2 多模态，默认开启思考)
+- mimo-v2-flash     (V2 极速版，默认关闭思考)
+
+思考模式说明：
+- 开关参数：``{"thinking": {"type": "enabled" | "disabled"}}``
+- mimo-v2.5-pro / mimo-v2.5 在思考模式下 ``temperature`` 会被强制为 1.0，
+  本地直接剥离 ``temperature`` / ``top_p`` 等参数避免歧义。
+- 多轮工具调用过程中，若历史包含 tool_calls，所有后续 assistant 消息必须回传
+  ``reasoning_content``，否则 API 返回 400 错误。
+- 文档：https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/passing-back-reasoning_content
+"""
+
+import json
+import time
+from typing import Optional
+
+import requests
+
+from bridge.context import ContextType
+from bridge.reply import Reply, ReplyType
+from common import const
+from common.log import logger
+from config import conf, load_config
+from models.bot import Bot
+from models.openai_compatible_bot import OpenAICompatibleBot
+from models.session_manager import SessionManager
+from .mimo_session import MimoSession
+
+DEFAULT_API_BASE = "https://api.xiaomimimo.com/v1"
+DEFAULT_MODEL = const.MIMO_V2_5_PRO
+
+# 支持多模态输入（图/音/视频）的模型
+MULTIMODAL_MODELS = {const.MIMO_V2_5_PRO, const.MIMO_V2_5, const.MIMO_V2_OMNI}
+
+
+class MimoBot(Bot, OpenAICompatibleBot):
+    def __init__(self):
+        super().__init__()
+        conf_model = conf().get("model") or DEFAULT_MODEL
+        self.sessions = SessionManager(
+            MimoSession,
+            model=conf_model,
+        )
+        self.args = {
+            "model": conf_model,
+            "temperature": conf().get("temperature", 1.0),
+            "top_p": conf().get("top_p", 0.95),
+        }
+
+    # ---------- config helpers ----------
+
+    @property
+    def api_key(self):
+        return conf().get("mimo_api_key")
+
+    @property
+    def api_base(self):
+        url = conf().get("mimo_api_base") or DEFAULT_API_BASE
+        return url.rstrip("/")
+
+    def get_api_config(self):
+        """OpenAICompatibleBot interface, used by call_with_tools()."""
+        return {
+            "api_key": self.api_key,
+            "api_base": self.api_base,
+            "model": conf().get("model") or DEFAULT_MODEL,
+            "default_temperature": conf().get("temperature", 1.0),
+            "default_top_p": conf().get("top_p", 0.95),
+        }
+
+    @property
+    def supports_vision(self) -> bool:
+        """Let the vision tool go through the main bot channel when the main model is multimodal."""
+        model_name = (conf().get("model") or "").lower()
+        return model_name in MULTIMODAL_MODELS
+
+    @staticmethod
+    def _model_supports_thinking(model_name: str) -> bool:
+        """All mimo-series models support the thinking toggle."""
+        if not model_name:
+            return False
+        return model_name.lower().startswith("mimo-")
+
+    @staticmethod
+    def _thinking_default_enabled(model_name: str) -> bool:
+        """Per-model default for thinking mode: off for mimo-v2-flash, on for all others."""
+        if not model_name:
+            return False
+        return model_name.lower() != const.MIMO_V2_FLASH
+
+    def _build_headers(self) -> dict:
+        return {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {self.api_key}",
+        }
+
+    # ---------- simple chat (non-agent mode) ----------
+
+    def reply(self, query, context=None):
+        if context.type == ContextType.TEXT:
+            logger.info("[MIMO] query={}".format(query))
+
+            session_id = context["session_id"]
+            reply = None
+            clear_memory_commands = conf().get("clear_memory_commands", ["#清除记忆"])
+            if query in clear_memory_commands:
+                self.sessions.clear_session(session_id)
+                reply = Reply(ReplyType.INFO, "记忆已清除")
+            elif query == "#清除所有":
+                self.sessions.clear_all_session()
+                reply = Reply(ReplyType.INFO, "所有人记忆已清除")
+            elif query == "#更新配置":
+                load_config()
+                reply = Reply(ReplyType.INFO, "配置已更新")
+            if reply:
+                return reply
+
+            session = self.sessions.session_query(query, session_id)
+            logger.debug("[MIMO] session query={}".format(session.messages))
+
+            new_args = self.args.copy()
+            reply_content = self.reply_text(session, args=new_args)
+            logger.debug(
+                "[MIMO] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
+                    session.messages, session_id,
+                    reply_content["content"], reply_content["completion_tokens"],
+                )
+            )
+            if reply_content["completion_tokens"] == 0 and len(reply_content["content"]) > 0:
+                reply = Reply(ReplyType.ERROR, reply_content["content"])
+            elif reply_content["completion_tokens"] > 0:
+                self.sessions.session_reply(
+                    reply_content["content"], session_id, reply_content["total_tokens"],
+                )
+                reply = Reply(ReplyType.TEXT, reply_content["content"])
+            else:
+                reply = Reply(ReplyType.ERROR, reply_content["content"])
+                logger.debug("[MIMO] reply {} used 0 tokens.".format(reply_content))
+            return reply
+        else:
+            reply = Reply(ReplyType.ERROR, "Bot不支持处理{}类型的消息".format(context.type))
+            return reply
+
+    def reply_text(self, session, args=None, retry_count: int = 0) -> dict:
+        try:
+            headers = self._build_headers()
+            body = dict(args) if args else dict(self.args)
+            body["messages"] = session.messages
+
+            model_name = str(body.get("model", ""))
+            # In thinking mode, mimo-v2.5-pro / mimo-v2.5 do not support custom temperature/top_p.
+            # For simplicity, every model with thinking on by default uses defaults: strip these params.
+            if self._model_supports_thinking(model_name) and self._thinking_default_enabled(model_name):
+                for k in ("temperature", "top_p", "presence_penalty", "frequency_penalty"):
+                    body.pop(k, None)
+
+            res = requests.post(
+                f"{self.api_base}/chat/completions",
+                headers=headers,
+                json=body,
+                timeout=180,
+            )
+            if res.status_code == 200:
+                response = res.json()
+                return {
+                    "total_tokens": response["usage"]["total_tokens"],
+                    "completion_tokens": response["usage"]["completion_tokens"],
+                    "content": response["choices"][0]["message"]["content"],
+                }
+            else:
+                try:
+                    response = res.json()
+                    error = response.get("error", {})
+                except Exception:
+                    error = {"message": res.text[:300]}
+                logger.error(
+                    f"[MIMO] chat failed, status_code={res.status_code}, "
+                    f"msg={error.get('message')}, type={error.get('type')}"
+                )
+                result = {"completion_tokens": 0, "content": "提问太快啦，请休息一下再问我吧"}
+                need_retry = False
+                if res.status_code >= 500:
+                    need_retry = retry_count < 2
+                elif res.status_code == 401:
+                    result["content"] = "授权失败，请检查API Key是否正确"
+                elif res.status_code == 429:
+                    result["content"] = "请求过于频繁，请稍后再试"
+                    need_retry = retry_count < 2
+
+                if need_retry:
+                    time.sleep(3)
+                    return self.reply_text(session, args, retry_count + 1)
+                return result
+        except Exception as e:
+            logger.exception(e)
+            if retry_count < 2:
+                return self.reply_text(session, args, retry_count + 1)
+            return {"completion_tokens": 0, "content": "我现在有点累了，等会再来吧"}
+
+    # ==================== Agent mode support ====================
+
+    def call_with_tools(self, messages, tools=None, stream: bool = False, **kwargs):
+        """
+        MiMo API call with tool-calling support (used by the agent integration).
+
+        Handles:
+        - Claude format → OpenAI format conversion (reasoning_content passed back in full)
+        - System prompt injection
+        - SSE streaming responses (including tool_calls and reasoning_content deltas)
+        - Passing through the thinking-mode toggle
+        """
+        try:
+            converted_messages = self._convert_messages_to_openai_format(messages)
+
+            system_prompt = kwargs.pop("system", None)
+            if system_prompt:
+                if not converted_messages or converted_messages[0].get("role") != "system":
+                    converted_messages.insert(0, {"role": "system", "content": system_prompt})
+                else:
+                    converted_messages[0] = {"role": "system", "content": system_prompt}
+
+            converted_tools = None
+            if tools:
+                converted_tools = self._convert_tools_to_openai_format(tools)
+
+            model = kwargs.pop("model", None) or self.args["model"]
+            max_tokens = kwargs.pop("max_tokens", None)
+
+            request_body = {
+                "model": model,
+                "messages": converted_messages,
+                "stream": stream,
+            }
+            if max_tokens is not None:
+                # MiMo names this max_completion_tokens (covers visible output + reasoning tokens)
+                request_body["max_completion_tokens"] = max_tokens
+
+            if converted_tools:
+                request_body["tools"] = converted_tools
+                request_body["tool_choice"] = kwargs.pop("tool_choice", "auto")
+
+            # Thinking mode: follow each model's official default unless the caller overrides it explicitly
+            thinking_param = kwargs.pop("thinking", None)
+            thinking_active = False
+
+            if self._model_supports_thinking(model):
+                if thinking_param is None:
+                    default_on = self._thinking_default_enabled(model)
+                    thinking_param = {"type": "enabled" if default_on else "disabled"}
+                request_body["thinking"] = thinking_param
+                thinking_active = thinking_param.get("type") == "enabled"
+
+            # In thinking mode, v2.5-pro / v2.5 do not support a custom temperature; strip them all so none is silently ignored
+            if thinking_active:
+                for k in ("temperature", "top_p", "presence_penalty", "frequency_penalty"):
+                    request_body.pop(k, None)
+                    kwargs.pop(k, None)
+            else:
+                temperature = kwargs.pop("temperature", None)
+                if temperature is not None:
+                    request_body["temperature"] = temperature
+                top_p = kwargs.pop("top_p", None)
+                if top_p is not None:
+                    request_body["top_p"] = top_p
+
+            logger.debug(
+                f"[MIMO] API call: model={model}, "
+                f"tools={len(converted_tools) if converted_tools else 0}, "
+                f"stream={stream}, thinking={thinking_active}"
+            )
+
+            if stream:
+                return self._handle_stream_response(request_body)
+            else:
+                return self._handle_sync_response(request_body)
+
+        except Exception as e:
+            logger.error(f"[MIMO] call_with_tools error: {e}")
+            import traceback
+            logger.error(traceback.format_exc())
+            # Bind the message now: `e` is unbound once this except block exits, but the generator body runs later.
+            def error_generator(message=str(e)):
+                yield {"error": True, "message": message, "status_code": 500}
+            return error_generator()
+
+    # -------------------- streaming --------------------
+
+    def _handle_stream_response(self, request_body: dict):
+        """Convert SSE stream chunks into standard OpenAI delta output (including reasoning_content)."""
+        try:
+            headers = self._build_headers()
+            url = f"{self.api_base}/chat/completions"
+            response = requests.post(url, headers=headers, json=request_body, stream=True, timeout=180)
+
+            if response.status_code != 200:
+                error_msg = response.text
+                logger.error(f"[MIMO] API error: status={response.status_code}, msg={error_msg}")
+                yield {"error": True, "message": error_msg, "status_code": response.status_code}
+                return
+
+            current_tool_calls = {}
+            finish_reason = None
+
+            for line in response.iter_lines():
+                if not line:
+                    continue
+
+                line = line.decode("utf-8")
+                if line.startswith("data: "):
+                    data_str = line[6:]
+                elif line.startswith("data:"):
+                    data_str = line[5:]
+                else:
+                    continue
+                if data_str.strip() == "[DONE]":
+                    break
+
+                try:
+                    chunk = json.loads(data_str)
+                except json.JSONDecodeError as e:
+                    logger.warning(f"[MIMO] JSON decode error: {e}, data: {data_str[:200]}")
+                    continue
+
+                if chunk.get("error"):
+                    error_data = chunk["error"]
+                    error_msg = error_data.get("message", "Unknown error") if isinstance(error_data, dict) else str(error_data)
+                    logger.error(f"[MIMO] stream error: {error_msg}")
+                    yield {"error": True, "message": error_msg, "status_code": 500}
+                    return
+
+                if not chunk.get("choices"):
+                    continue
+                choice = chunk["choices"][0]
+                delta = choice.get("delta", {})
+
+                if choice.get("finish_reason"):
+                    finish_reason = choice["finish_reason"]
+
+                # Reasoning content (thinking mode): passed through to agent_stream as its own delta
+                if delta.get("reasoning_content"):
+                    yield {
+                        "choices": [{
+                            "index": 0,
+                            "delta": {
+                                "role": "assistant",
+                                "reasoning_content": delta["reasoning_content"],
+                            },
+                            "finish_reason": None,
+                        }]
+                    }
+
+                if delta.get("content"):
+                    yield {
+                        "choices": [{
+                            "index": 0,
+                            "delta": {
+                                "role": "assistant",
+                                "content": delta["content"],
+                            },
+                        }]
+                    }
+
+                if "tool_calls" in delta and delta["tool_calls"]:
+                    for tool_call_chunk in delta["tool_calls"]:
+                        index = tool_call_chunk.get("index", 0)
+                        if index not in current_tool_calls:
+                            current_tool_calls[index] = {
+                                "id": tool_call_chunk.get("id", ""),
+                                "name": tool_call_chunk.get("function", {}).get("name", ""),
+                                "arguments": "",
+                            }
+                        if "function" in tool_call_chunk and "arguments" in tool_call_chunk["function"]:
+                            current_tool_calls[index]["arguments"] += tool_call_chunk["function"]["arguments"]
+
+                        yield {
+                            "choices": [{
+                                "index": 0,
+                                "delta": {"tool_calls": [tool_call_chunk]},
+                            }]
+                        }
+
+            yield {
+                "choices": [{
+                    "index": 0,
+                    "delta": {},
+                    "finish_reason": finish_reason,
+                }]
+            }
+
+        except requests.exceptions.Timeout:
+            logger.error("[MIMO] Request timeout")
+            yield {"error": True, "message": "Request timeout", "status_code": 500}
+        except Exception as e:
+            logger.error(f"[MIMO] stream response error: {e}")
+            import traceback
+            logger.error(traceback.format_exc())
+            yield {"error": True, "message": str(e), "status_code": 500}
+
+    # -------------------- sync --------------------
+
+    def _handle_sync_response(self, request_body: dict):
+        """Non-streaming response; yields a single Claude-format dict to stay aligned with the streaming path."""
+        try:
+            headers = self._build_headers()
+            request_body.pop("stream", None)
+            url = f"{self.api_base}/chat/completions"
+            response = requests.post(url, headers=headers, json=request_body, timeout=180)
+
+            if response.status_code != 200:
+                error_msg = response.text
+                logger.error(f"[MIMO] API error: status={response.status_code}, msg={error_msg}")
+                yield {"error": True, "message": error_msg, "status_code": response.status_code}
+                return
+
+            result = response.json()
+            message = result["choices"][0]["message"]
+            finish_reason = result["choices"][0]["finish_reason"]
+
+            response_data = {"role": "assistant", "content": []}
+
+            # Wrap reasoning content in a thinking block so the agent layer can persist it and pass it back on tool calls
+            if message.get("reasoning_content"):
+                response_data["content"].append({
+                    "type": "thinking",
+                    "thinking": message["reasoning_content"],
+                })
+
+            if message.get("content"):
+                response_data["content"].append({
+                    "type": "text",
+                    "text": message["content"],
+                })
+
+            if message.get("tool_calls"):
+                for tool_call in message["tool_calls"]:
+                    try:
+                        tool_input = json.loads(tool_call["function"]["arguments"])
+                    except (json.JSONDecodeError, TypeError):
+                        tool_input = {}
+                    response_data["content"].append({
+                        "type": "tool_use",
+                        "id": tool_call["id"],
+                        "name": tool_call["function"]["name"],
+                        "input": tool_input,
+                    })
+
+            if finish_reason == "tool_calls":
+                response_data["stop_reason"] = "tool_use"
+            elif finish_reason == "stop":
+                response_data["stop_reason"] = "end_turn"
+            else:
+                response_data["stop_reason"] = finish_reason
+
+            yield response_data
+
+        except requests.exceptions.Timeout:
+            logger.error("[MIMO] Request timeout")
+            yield {"error": True, "message": "Request timeout", "status_code": 500}
+        except Exception as e:
+            logger.error(f"[MIMO] sync response error: {e}")
+            import traceback
+            logger.error(traceback.format_exc())
+            yield {"error": True, "message": str(e), "status_code": 500}
+
+    # -------------------- format conversion --------------------
+
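+    def _convert_messages_to_openai_format(self, messages):
+        """
+        Convert Claude format (content blocks) to OpenAI format.
+
+        Key constraint: in MiMo thinking mode, once the history contains an assistant turn
+        with tool_calls, every subsequent assistant message (tool-call turns included) must
+        pass back reasoning_content, otherwise the API returns 400. With no local trace, an
+        empty string is backfilled; MiMo only requires the field to be present.
+        """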
+        if not messages:
+            return []
+
+        has_tool_call_history = False
+        for msg in messages:
+            if msg.get("role") != "assistant":
+                continue
+            if msg.get("tool_calls"):
+                has_tool_call_history = True
+                break
+            content = msg.get("content")
+            if isinstance(content, list) and any(
+                isinstance(b, dict) and b.get("type") == "tool_use" for b in content
+            ):
+                has_tool_call_history = True
+                break
+
+        converted = []
+
+        for msg in messages:
+            role = msg.get("role")
+            content = msg.get("content")
+
+            if not isinstance(content, list):
+                if (
+                    role == "assistant"
+                    and isinstance(msg, dict)
+                    and has_tool_call_history
+                    and "reasoning_content" not in msg
+                ):
+                    patched = dict(msg)
+                    patched["reasoning_content"] = ""
+                    converted.append(patched)
+                else:
+                    converted.append(msg)
+                continue
+
+            if role == "user":
+                has_tool_result = any(
+                    isinstance(b, dict) and b.get("type") == "tool_result" for b in content
+                )
+                if has_tool_result:
+                    text_parts = []
+                    tool_results = []
+
+                    for block in content:
+                        if not isinstance(block, dict):
+                            continue
+                        if block.get("type") == "text":
+                            text_parts.append(block.get("text", ""))
+                        elif block.get("type") == "tool_result":
+                            tool_call_id = block.get("tool_use_id") or ""
+                            result_content = block.get("content", "")
+                            if not isinstance(result_content, str):
+                                result_content = json.dumps(result_content, ensure_ascii=False)
+                            tool_results.append({
+                                "role": "tool",
+                                "tool_call_id": tool_call_id,
+                                "content": result_content,
+                            })
+
+                    converted.extend(tool_results)
+
+                    if text_parts:
+                        converted.append({"role": "user", "content": "\n".join(text_parts)})
+                else:
+                    # Keep multimodal content as-is (image_url / input_audio / video_url blocks, etc.)
+                    converted.append(msg)
+
+            elif role == "assistant":
+                openai_msg = {"role": "assistant"}
+                text_parts = []
+                tool_calls = []
+                reasoning_parts = []
+
+                for block in content:
+                    if not isinstance(block, dict):
+                        continue
+                    btype = block.get("type")
+                    if btype == "text":
+                        text_parts.append(block.get("text", ""))
+                    elif btype == "tool_use":
+                        tool_calls.append({
+                            "id": block.get("id"),
+                            "type": "function",
+                            "function": {
+                                "name": block.get("name"),
+                                "arguments": json.dumps(block.get("input", {})),
+                            },
+                        })
+                    elif btype == "thinking":
+                        reasoning_parts.append(block.get("thinking", ""))
+
+                if text_parts:
+                    openai_msg["content"] = "\n".join(text_parts)
+                elif not tool_calls:
+                    openai_msg["content"] = ""
+
+                if tool_calls:
+                    openai_msg["tool_calls"] = tool_calls
+                    if not text_parts:
+                        openai_msg["content"] = None
+
+                if reasoning_parts:
+                    openai_msg["reasoning_content"] = "\n".join(reasoning_parts)
+                elif has_tool_call_history:
+                    openai_msg["reasoning_content"] = ""
+
+                converted.append(openai_msg)
+            else:
+                converted.append(msg)
+
+        return converted
+
+    def _convert_tools_to_openai_format(self, tools):
+        """Tool definitions: Claude format → OpenAI format."""
+        if not tools:
+            return None
+
+        converted = []
+        for tool in tools:
+            if "type" in tool and tool["type"] == "function":
+                converted.append(tool)
+            else:
+                converted.append({
+                    "type": "function",
+                    "function": {
+                        "name": tool.get("name"),
+                        "description": tool.get("description"),
+                        "parameters": tool.get("input_schema", {}),
+                    },
+                })
+        return converted
+
+    # -------------------- vision --------------------
+
+    def call_vision(self, image_url: str, question: str,
+                    model: Optional[str] = None,
+                    max_tokens: int = 1000) -> dict:
+        """Image understanding via MiMo's OpenAI-compatible /chat/completions endpoint."""
+        try:
+            # If the main model lacks vision support (e.g. mimo-v2-flash), fall back to mimo-v2.5-pro
+            vision_model = model
+            if not vision_model:
+                cur = self.args.get("model") or DEFAULT_MODEL
+                vision_model = cur if cur in MULTIMODAL_MODELS else const.MIMO_V2_5_PRO
+
+            payload = {
+                "model": vision_model,
+                "max_completion_tokens": max_tokens,
+                "messages": [{
+                    "role": "user",
+                    "content": [
+                        {"type": "text", "text": question},
+                        {"type": "image_url", "image_url": {"url": image_url}},
+                    ],
+                }],
+            }
+            headers = self._build_headers()
+            resp = requests.post(
+                f"{self.api_base}/chat/completions",
+                headers=headers, json=payload, timeout=60,
+            )
+            if resp.status_code != 200:
+                return {"error": True, "message": f"HTTP {resp.status_code}: {resp.text[:300]}"}
+            data = resp.json()
+            if "error" in data:
+                return {"error": True, "message": data["error"].get("message", str(data["error"]))}
+            choice = data.get("choices", [{}])[0].get("message", {})
+            # Some models put the answer in reasoning_content instead of content for multimodal input
+            content = choice.get("content") or choice.get("reasoning_content") or ""
+            usage = data.get("usage", {})
+            return {
+                "model": vision_model,
+                "content": content,
+                "usage": {
+                    "prompt_tokens": usage.get("prompt_tokens", 0),
+                    "completion_tokens": usage.get("completion_tokens", 0),
+                    "total_tokens": usage.get("total_tokens", 0),
+                },
+            }
+        except Exception as e:
+            logger.error(f"[MIMO] call_vision error: {e}")
+            return {"error": True, "message": str(e)}
diff --git a/models/mimo/mimo_session.py b/models/mimo/mimo_session.py
new file mode 100644
index 00000000..76483f11
--- /dev/null
+++ b/models/mimo/mimo_session.py
@@ -0,0 +1,57 @@
+from common.log import logger
+from models.session_manager import Session
+
+
+class MimoSession(Session):
+    def __init__(self, session_id, system_prompt=None, model="mimo-v2.5-pro"):
+        super().__init__(session_id, system_prompt)
+        self.model = model
+        self.reset()
+
+    def discard_exceeding(self, max_tokens, cur_tokens=None):
+        precise = True
+        try:
+            cur_tokens = self.calc_tokens()
+        except Exception as e:
+            precise = False
+            if cur_tokens is None:
+                raise e
+            logger.debug("Exception when counting tokens precisely for query: {}".format(e))
+        while cur_tokens > max_tokens:
+            if len(self.messages) > 2:
+                self.messages.pop(1)
+            elif len(self.messages) == 2 and self.messages[1]["role"] == "assistant":
+                self.messages.pop(1)
+                if precise:
+                    cur_tokens = self.calc_tokens()
+                else:
+                    cur_tokens = cur_tokens - max_tokens
+                break
+            elif len(self.messages) == 2 and self.messages[1]["role"] == "user":
+                logger.warning("user message exceed max_tokens. total_tokens={}".format(cur_tokens))
+                break
+            else:
+                logger.debug("max_tokens={}, total_tokens={}, len(messages)={}".format(
+                    max_tokens, cur_tokens, len(self.messages)))
+                break
+            if precise:
+                cur_tokens = self.calc_tokens()
+            else:
+                cur_tokens = cur_tokens - max_tokens
+        return cur_tokens
+
+    def calc_tokens(self):
+        return num_tokens_from_messages(self.messages, self.model)
+
+
+def num_tokens_from_messages(messages, model):
+    tokens = 0
+    for msg in messages:
+        content = msg.get("content", "")
+        if isinstance(content, str):
+            tokens += len(content)
+        elif isinstance(content, list):
+            for block in content:
+                if isinstance(block, dict):
+                    tokens += len(block.get("text", ""))
+    return tokens
diff --git a/models/openai_compatible_bot.py b/models/openai_compatible_bot.py
index aba5b327..e669fed2 100644
--- a/models/openai_compatible_bot.py
+++ b/models/openai_compatible_bot.py
@@ -89,8 +89,9 @@ class OpenAICompatibleBot:
                     messages[0] = {"role": "system", "content": system_prompt}
             
             # Build request parameters
+            model_name = kwargs.get("model", api_config.get('model', 'gpt-5.4'))
             request_params = {
-                "model": kwargs.get("model", api_config.get('model', 'gpt-3.5-turbo')),
+                "model": model_name,
                 "messages": messages,
                 "temperature": kwargs.get("temperature", api_config.get('default_temperature', 0.9)),
                 "top_p": kwargs.get("top_p", api_config.get('default_top_p', 1.0)),
@@ -98,6 +99,10 @@ class OpenAICompatibleBot:
                 "presence_penalty": kwargs.get("presence_penalty", api_config.get('default_presence_penalty', 0.0)),
                 "stream": stream
             }
+            # GPT-5 / GPT-5.5 / o1 series only accept default temperature/top_p and reject penalty params
+            if model_name in ("gpt-5", "gpt-5-mini", "gpt-5-nano", "gpt-5.5", "o1", "o1-mini"):
+                for key in ("temperature", "top_p", "frequency_penalty", "presence_penalty"):
+                    request_params.pop(key, None)
             
             # Add max_tokens if specified
             if kwargs.get("max_tokens"):
diff --git a/plugins/agent/README.md b/plugins/agent/README.md
deleted file mode 100644
index 744a3d61..00000000
--- a/plugins/agent/README.md
+++ /dev/null
@@ -1,66 +0,0 @@
-# Agent插件
-
-## 插件说明
-
-基于 [AgentMesh](https://github.com/MinimalFuture/AgentMesh) 多智能体框架实现的Agent插件，可以让机器人快速获得Agent能力，通过自然语言对话来访问 **终端、浏览器、文件系统、搜索引擎** 等各类工具。
-同时还支持通过 **多智能体协作** 来完成复杂任务，例如多智能体任务分发、多智能体问题讨论、协同处理等。
-
-AgentMesh项目地址：https://github.com/MinimalFuture/AgentMesh
-
-## 安装
-
-1. 确保已安装依赖：
-
-```bash
-pip install agentmesh-sdk>=0.1.3
-```
-
-2. 如需使用浏览器工具，还需安装：
-
-```bash
-pip install browser-use>=0.1.40
-playwright install
-```
-
-## 配置
-
-插件配置文件是 `plugins/agent`目录下的 `config.yaml`，包含智能体团队的配置以及工具的配置，可以从模板文件 `config-template.yaml`中复制：
-
-```bash
-cp config-template.yaml config.yaml
-```
-
-说明：
-
- - `team`配置是默认选中的 agent team
- - `teams` 下是Agent团队配置，团队的model默认为`gpt-4.1-mini`，可根据需要进行修改，模型对应的 `api_key` 需要在项目根目录的 `config.json` 全局配置中进行配置。例如openai模型需要配置 `open_ai_api_key`
- - 支持为 `agents` 下面的每个agent添加model字段来设置不同的模型
-
-
-## 使用方法
-
-在对机器人发送的消息中使用 `$agent` 前缀来触发插件，支持以下命令：
-
-- `$agent [task]`: 使用默认团队执行任务 (默认团队可通 config.yaml 中的team配置修改)
-- `$agent teams`: 列出可用的团队
-- `$agent use [team_name] [task]`: 使用指定的团队执行任务
-
-
-### 示例
-
-```bash
-$agent 帮我查看当前目录下有哪些文件夹
-$agent teams
-$agent use software_team 帮我写一个产品预约体验的表单页面
-```
-
-## 工具支持
-
-目前支持多种内置工具，包括但不限于：
-
-- `calculator`: 数学计算工具
-- `current_time`: 获取当前时间
-- `browser`: 浏览器操作工具，注意需安装`browser-use`依赖
-- `google_search`: 搜索引擎，注意需在`config.yaml`中配置 `api_key`
-- `file_save`: 文件保存工具，开启后智能体输出的内容将保存在 `workspace` 目录下
-- `terminal`: 终端命令执行工具
diff --git a/plugins/agent/__init__.py b/plugins/agent/__init__.py
deleted file mode 100644
index 75642e00..00000000
--- a/plugins/agent/__init__.py
+++ /dev/null
@@ -1,3 +0,0 @@
-from .agent import AgentPlugin
-
-__all__ = ["AgentPlugin"]
\ No newline at end of file
diff --git a/plugins/agent/agent.py b/plugins/agent/agent.py
deleted file mode 100644
index 700f134e..00000000
--- a/plugins/agent/agent.py
+++ /dev/null
@@ -1,282 +0,0 @@
-import os
-import yaml
-from typing import Dict, List, Optional
-
-from agentmesh import AgentTeam, Agent, LLMModel
-from agentmesh.models import ClaudeModel
-from agentmesh.tools import ToolManager
-from config import conf
-
-import plugins
-from plugins import Plugin, Event, EventContext, EventAction
-from bridge.context import ContextType
-from bridge.reply import Reply, ReplyType
-from common.log import logger
-
-
-@plugins.register(
-    name="agent",
-    desc="Use AgentMesh framework to process tasks with multi-agent teams",
-    version="0.1.0",
-    author="Saboteur7",
-    desire_priority=1,
-)
-class AgentPlugin(Plugin):
-    """Plugin for integrating AgentMesh framework."""
-    
-    def __init__(self):
-        super().__init__()
-        self.handlers[Event.ON_HANDLE_CONTEXT] = self.on_handle_context
-        self.name = "agent"
-        self.description = "Use AgentMesh framework to process tasks with multi-agent teams"
-        self.config = self._load_config()
-        self.tool_manager = ToolManager()
-        self.tool_manager.load_tools(config_dict=self.config.get("tools"))
-        logger.debug("[agent] inited")
-    
-    def _load_config(self) -> Dict:
-        """Load configuration from config.yaml file."""
-        config_path = os.path.join(self.path, "config.yaml")
-        if not os.path.exists(config_path):
-            logger.debug(f"Config file not found at {config_path}")
-            return {}
-            
-        with open(config_path, 'r', encoding='utf-8') as f:
-            return yaml.safe_load(f)
-    
-    def get_help_text(self, verbose=False, **kwargs):
-        """Return help message for the agent plugin."""
-        help_text = "通过AgentMesh实现对终端、浏览器、文件系统、搜索引擎等工具的执行，并支持多智能体协作。"
-        trigger_prefix = conf().get("plugin_trigger_prefix", "$")
-        
-        if not verbose:
-            return help_text
-            
-        teams = self.get_available_teams()
-        teams_str = ", ".join(teams) if teams else "未配置任何团队"
-        
-        help_text += "\n\n使用说明：\n"
-        help_text += f"{trigger_prefix}agent [task] - 使用默认团队执行任务\n"
-        help_text += f"{trigger_prefix}agent teams - 列出可用的团队\n"
-        help_text += f"{trigger_prefix}agent use [team_name] [task] - 使用特定团队执行任务\n\n"
-        help_text += f"可用团队: \n{teams_str}\n\n"
-        help_text += f"示例:\n"
-        help_text += f"{trigger_prefix}agent 帮我查看当前文件夹路径\n"
-        help_text += f"{trigger_prefix}agent use software_team 帮我写一个产品预约体验的表单页面"
-        return help_text
-    
-    def get_available_teams(self) -> List[str]:
-        """Get list of available teams from configuration."""
-        teams_config = self.config.get("teams", {})
-        return list(teams_config.keys())
-
-
-    def create_team_from_config(self, team_name: str) -> Optional[AgentTeam]:
-        """Create a team from configuration."""
-        # Get teams configuration
-        teams_config = self.config.get("teams", {})
-
-        # Check if the specified team exists
-        if team_name not in teams_config:
-            logger.error(f"Team '{team_name}' not found in configuration.")
-            available_teams = list(teams_config.keys())
-            logger.info(f"Available teams: {', '.join(available_teams)}")
-            return None
-
-        # Get team configuration
-        team_config = teams_config[team_name]
-
-        # Get team's model
-        team_model_name = team_config.get("model", "gpt-4.1-mini")
-        team_model = self.create_llm_model(team_model_name)
-
-        # Get team's max_steps (default to 20 if not specified)
-        team_max_steps = team_config.get("max_steps", 20)
-
-        # Create team with the model
-        team = AgentTeam(
-            name=team_name,
-            description=team_config.get("description", ""),
-            rule=team_config.get("rule", ""),
-            model=team_model,
-            max_steps=team_max_steps
-        )
-
-        # Create and add agents to the team
-        agents_config = team_config.get("agents", [])
-        for agent_config in agents_config:
-            # Check if agent has a specific model
-            if agent_config.get("model"):
-                agent_model = self.create_llm_model(agent_config.get("model"))
-            else:
-                agent_model = team_model
-
-            # Get agent's max_steps
-            agent_max_steps = agent_config.get("max_steps")
-
-            agent = Agent(
-                name=agent_config.get("name", ""),
-                system_prompt=agent_config.get("system_prompt", ""),
-                model=agent_model,  # Use agent's model if specified, otherwise will use team's model
-                description=agent_config.get("description", ""),
-                max_steps=agent_max_steps
-            )
-
-            # Add tools to the agent if specified
-            tool_names = agent_config.get("tools", [])
-            for tool_name in tool_names:
-                tool = self.tool_manager.create_tool(tool_name)
-                if tool:
-                    agent.add_tool(tool)
-                else:
-                    if tool_name == "browser":
-                        logger.warning(
-                            "Tool 'Browser' loaded failed, "
-                            "please install the required dependency with: \n"
-                            "'pip install browser-use>=0.1.40' or 'pip install agentmesh-sdk[full]'\n"
-                        )
-                    else:
-                        logger.warning(f"Tool '{tool_name}' not found for agent '{agent.name}'\n")
-
-            # Add agent to team
-            team.add(agent)
-
-        return team
-    
-    def on_handle_context(self, e_context: EventContext):
-        """Handle the message context."""
-        if e_context['context'].type != ContextType.TEXT:
-            return
-        content = e_context['context'].content
-        trigger_prefix = conf().get("plugin_trigger_prefix", "$")
-        
-        if not content.startswith(f"{trigger_prefix}agent "):
-            e_context.action = EventAction.CONTINUE
-            return
-
-        if not self.config:
-            reply = Reply()
-            reply.type = ReplyType.ERROR
-            reply.content = "未找到插件配置，请在 plugins/agent 目录下创建 config.yaml 配置文件，可根据 config-template.yml 模板文件复制"
-            e_context['reply'] = reply
-            e_context.action = EventAction.BREAK_PASS
-            return
-
-        # Extract the actual task
-        task = content[len(f"{trigger_prefix}agent "):].strip()
-        
-        # If task is empty, return help message
-        if not task:
-            reply = Reply()
-            reply.type = ReplyType.TEXT
-            reply.content = self.get_help_text(verbose=True)
-            e_context['reply'] = reply
-            e_context.action = EventAction.BREAK_PASS
-            return
-            
-        # Check if task is asking for available teams
-        if task.lower() in ["teams", "list teams", "show teams"]:
-            teams = self.get_available_teams()
-            reply = Reply()
-            reply.type = ReplyType.TEXT
-            
-            if not teams:
-                reply.content = "未配置任何团队。请检查 config.yaml 文件。"
-            else:
-                reply.content = f"可用团队: {', '.join(teams)}"
-                
-            e_context['reply'] = reply
-            e_context.action = EventAction.BREAK_PASS
-            return
-        
-        # Check if task specifies a team
-        team_name = None
-        if task.startswith("use "):
-            parts = task[4:].split(" ", 1)
-            if len(parts) > 0:
-                team_name = parts[0]
-                if len(parts) > 1:
-                    task = parts[1].strip()
-                else:
-                    reply = Reply()
-                    reply.type = ReplyType.TEXT
-                    reply.content = f"已选择团队 '{team_name}'。请输入您想执行的任务。"
-                    e_context['reply'] = reply
-                    e_context.action = EventAction.BREAK_PASS
-                    return
-        if not team_name:
-            team_name = self.config.get("team")
-
-        # If no team specified, use default or first available
-        if not team_name:
-            teams = self.configself.get_available_teams()
-            if not teams:
-                reply = Reply()
-                reply.type = ReplyType.TEXT
-                reply.content = "未配置任何团队。请检查 config.yaml 文件。"
-                e_context['reply'] = reply
-                e_context.action = EventAction.BREAK_PASS
-                return
-            team_name = teams[0]
-            
-        # Create team
-        team = self.create_team_from_config(team_name)
-        if not team:
-            reply = Reply()
-            reply.type = ReplyType.TEXT
-            reply.content = f"创建团队 '{team_name}' 失败。请检查配置。"
-            e_context['reply'] = reply
-            e_context.action = EventAction.BREAK_PASS
-            return
-        
-        # Run the task
-        try:
-            logger.info(f"[agent] Running task '{task}' with team '{team_name}', team_model={team.model.model}")
-            result = team.run_async(task=task)
-            for agent_result in result:
-                res_text = f"🤖 {agent_result.get('agent_name')}\n\n{agent_result.get('final_answer')}"
-                _send_text(e_context, content=res_text)
-            
-            reply = Reply()
-            reply.type = ReplyType.TEXT
-            reply.content = ""
-            e_context['reply'] = reply
-            e_context.action = EventAction.BREAK_PASS
-            
-        except Exception as e:
-            logger.exception(f"Error running task with team '{team_name}'")
-            
-            reply = Reply()
-            reply.type = ReplyType.ERROR
-            reply.content = f"执行任务时出错: {str(e)}"
-            e_context['reply'] = reply
-            e_context.action = EventAction.BREAK_PASS
-        return
-
-    def create_llm_model(self, model_name) -> LLMModel:
-        if conf().get("use_linkai"):
-            api_base = "https://api.link-ai.tech/v1"
-            api_key = conf().get("linkai_api_key")
-        elif model_name.startswith(("gpt", "text-davinci", "o1", "o3")):
-            api_base = conf().get("open_ai_api_base") or "https://api.openai.com/v1"
-            api_key = conf().get("open_ai_api_key")
-        elif model_name.startswith("claude"):
-            return ClaudeModel(model=model_name, api_key=conf().get("claude_api_key"))
-        elif model_name.startswith("moonshot"):
-            api_base = "https://api.moonshot.cn/v1"
-            api_key = conf().get("moonshot_api_key")
-        elif model_name.startswith("qwen"):
-            api_base = "https://dashscope.aliyuncs.com/compatible-mode/v1"
-            api_key = conf().get("dashscope_api_key")
-        else:
-            api_base = conf().get("open_ai_api_base") or "https://api.openai.com/v1"
-            api_key = conf().get("open_ai_api_key")
-
-        llm_model = LLMModel(model=model_name, api_key=api_key, api_base=api_base)
-        return llm_model
-
-
-def _send_text(e_context: EventContext, content: str):
-    reply = Reply(ReplyType.TEXT, content)
-    channel = e_context["channel"]
-    channel.send(reply, e_context["context"])
diff --git a/plugins/agent/config-template.yaml b/plugins/agent/config-template.yaml
deleted file mode 100644
index db0e69b2..00000000
--- a/plugins/agent/config-template.yaml
+++ /dev/null
@@ -1,52 +0,0 @@
-# Name of the Agent Team selected by default
-team: general_team
-
-tools:
-  google_search:
-    # get your apikey from https://serper.dev/
-    api_key: "YOUR API KEY"
-
-# Agent Team configuration
-teams:
-  # General-purpose agent team
-  general_team:
-    model: "gpt-4.1-mini"        # model used by the team
-    description: "A versatile research and information agent team"
-    max_steps: 5
-    agents:
-      - name: "通用智能助手"
-        description: "Universal assistant specializing in research, information synthesis, and task execution"
-        system_prompt: "You are a versatile assistant who answers questions and completes tasks using available tools. Reply in a clearly structured, attractive and easy to read format."
-        # Tools available to the agent
-        tools:
-          - time
-          - calculator
-          - google_search
-          - browser
-          - terminal
-
-  # Software development agent team
-  software_team:
-    model: "gpt-4.1-mini"
-    description: "A software development team with product manager, developer and tester."
-    rule: "A normal R&D process should be that Product Manager writes PRD, Developer writes code based on PRD, and Finally, Tester performs testing."
-    max_steps: 10
-    agents:
-      - name: "Product-Manager"
-        description: "Responsible for product requirements and documentation"
-        system_prompt: "You are an experienced product manager who creates concise PRDs, focusing on user needs and feature specifications. You always format your responses in Markdown."
-        tools:
-          - time
-          - file_save
-      - name: "Developer"
-        description: "Implements code based on PRD"
-        system_prompt: "You are a skilled developer. When developing web application, you creates single-page website based on user needs, you deliver HTML files with embedded JavaScript and CSS that are visually appealing, responsive, and user-friendly, featuring a grand layout and beautiful background. The HTML, CSS, and JavaScript code should be well-structured and effectively organized."
-        tools:
-          - file_save
-      - name: "Tester"
-        description: "Tests code and verifies functionality"
-        system_prompt: "You are a tester who validates code against requirements. For HTML applications, use browser tools to test functionality. For Python or other client-side applications, use the terminal tool to run and test. You only need to test a few core cases."
-        tools:
-          - file_save
-          - browser
-          - terminal
diff --git a/plugins/cow_cli/cow_cli.py b/plugins/cow_cli/cow_cli.py
index aafa1813..b7dc6371 100644
--- a/plugins/cow_cli/cow_cli.py
+++ b/plugins/cow_cli/cow_cli.py
@@ -26,16 +26,19 @@ from common.log import logger
 from cli import __version__
 
 
-# Known top-level subcommands that cow supports
+# Known top-level subcommands that cow supports.
+# "start" / "stop" / "restart" refer to daemon lifecycle on the host shell;
+# in chat, "/cancel" aborts the in-flight agent run instead.
 KNOWN_COMMANDS = {
     "help", "version", "status", "logs",
     "start", "stop", "restart",
+    "cancel",
     "skill", "context", "config",
     "knowledge", "memory",
     "install-browser",
 }
 
-# Commands that can only run from the CLI (terminal), not in chat
+# Commands that can only run from the CLI (terminal), not in chat.
 CLI_ONLY_COMMANDS = {"start", "stop", "restart"}
 
 # Commands that can only run from chat (need access to in-process memory)
@@ -225,6 +228,7 @@ class CowCliPlugin(Plugin):
             "  /help          显示此帮助",
             "  /version       查看版本",
             "  /status        查看运行状态",
+            "  /cancel        中止当前正在运行的 Agent 任务",
             "  /logs [N]      查看最近N条日志 (默认20)",
             "  /context       查看当前对话上下文信息",
             "  /context clear 清除当前对话上下文",
@@ -250,6 +254,41 @@ class CowCliPlugin(Plugin):
     def _cmd_version(self, args: str, e_context, **_) -> str:
         return f"CowAgent v{__version__}"
 
+    # ------------------------------------------------------------------
+    # cancel — abort the in-flight agent run for the current session.
+    # Fallback handler; in practice chat_channel/web_channel intercept
+    # /cancel earlier so it bypasses the per-session serial queue.
+    # ------------------------------------------------------------------
+
+    def _cmd_cancel(self, args: str, e_context: EventContext, session_id: str = "", **_) -> str:
+        """Signal the running agent to halt at its next checkpoint."""
+        from agent.protocol import get_cancel_registry
+
+        target_session = self._get_session_id(e_context, fallback=session_id)
+        registry = get_cancel_registry()
+
+        # Prefer per-turn request_id (matches the key agent_bridge registered)
+        cancelled = 0
+        request_id = ""
+        if e_context is not None:
+            try:
+                ctx = e_context["context"]
+                request_id = ctx.kwargs.get("request_id") or ctx.get("request_id", "")
+            except Exception:
+                request_id = ""
+
+        if request_id and registry.cancel_request(request_id):
+            cancelled = 1
+
+        # Fall back to session-wide cancel
+        if cancelled == 0 and target_session:
+            cancelled = registry.cancel_session(target_session)
+
+        if cancelled <= 0:
+            return "当前没有可中止的任务。"
+
+        return "🛑 已中止"
+
     # ------------------------------------------------------------------
     # status
     # ------------------------------------------------------------------
@@ -1056,6 +1095,38 @@ class CowCliPlugin(Plugin):
             logger.warning(f"[CowCli] /memory dream sync failed: {e}")
             return f"❌ 记忆蒸馏失败: {e}"
 
+    @staticmethod
+    def _resolve_active_embedding():
+        """
+        Resolve (provider_label, model, dim) from the LATEST config, not the
+        possibly-stale provider instance cached on a running agent. Used by
+        /memory status and rebuild-index hints so they reflect what a rebuild
+        will actually run as after the user changes embedding_provider.
+        Returns (label, model, dim) where any field may be None when unknown.
+        """
+        from agent.memory.embedding import EMBEDDING_VENDORS
+        from config import conf
+
+        provider_key = (conf().get("embedding_provider") or "").strip().lower()
+        cfg_model = (conf().get("embedding_model") or "").strip()
+        try:
+            cfg_dim = int(conf().get("embedding_dimensions") or 0)
+        except (TypeError, ValueError):
+            cfg_dim = 0
+
+        if not provider_key:
+            # Legacy auto path: openai -> linkai, both default to text-embedding-3-small (1536).
+            if (conf().get("open_ai_api_key") or "").strip():
+                return "openai (legacy)", "text-embedding-3-small", 1536
+            if (conf().get("linkai_api_key") or "").strip():
+                return "linkai (legacy)", "text-embedding-3-small", 1536
+            return "(legacy)", None, None
+
+        meta = EMBEDDING_VENDORS.get(provider_key) or {}
+        model = cfg_model or meta.get("default_model")
+        dim = cfg_dim if cfg_dim > 0 else meta.get("default_dimensions")
+        return provider_key, model, dim
+
     def _memory_status(self) -> str:
         """Show current memory index status."""
         from agent.memory.embedding import detect_index_dim
@@ -1078,15 +1149,14 @@ class CowCliPlugin(Plugin):
         lines.append(f"  Chunks  : {chunks} (embedded: {embedded})")
         lines.append("")
 
-        # Active provider (from running config + provider instance).
+        # Resolve from the latest config so users see what /memory rebuild-index
+        # will actually run as — not what the cached agent was initialized with.
+        cfg_provider, cfg_model, cfg_dim = self._resolve_active_embedding()
         provider_obj = memory_manager.embedding_provider
-        cfg_provider = (conf().get("embedding_provider") or "").strip().lower() or "(legacy)"
-        if provider_obj is not None:
-            cfg_model = getattr(provider_obj, "model", "?")
-            cfg_dim = getattr(provider_obj, "_dimensions", None) or "?"
+        if cfg_model:
             lines.append(f"  Provider : {cfg_provider}")
             lines.append(f"  Model    : {cfg_model}")
-            lines.append(f"  Dim      : {cfg_dim}")
+            lines.append(f"  Dim      : {cfg_dim if cfg_dim else '?'}")
         else:
             lines.append("  Provider : (未初始化, keyword-only)")
 
@@ -1105,7 +1175,6 @@ class CowCliPlugin(Plugin):
                 )
 
             index_dim = detect_index_dim(memory_manager.storage)
-            cfg_dim = getattr(provider_obj, "_dimensions", None)
             if index_dim is not None and cfg_dim and index_dim != cfg_dim:
                 warnings.append(
                     f"  ⚠️ 索引中存量向量为 {index_dim} 维，与当前配置 {cfg_dim} 维不一致；"
@@ -1129,15 +1198,27 @@ class CowCliPlugin(Plugin):
             )
 
         memory_manager = agent.memory_manager
-        if memory_manager.embedding_provider is None:
+
+        # Rebuild against the LATEST config: build a fresh provider from
+        # config.json and swap it onto memory_manager. The agent's
+        # conversation_history and other state are untouched.
+        try:
+            from bridge.agent_initializer import AgentInitializer
+            fresh_provider = AgentInitializer(bridge=None, agent_bridge=None) \
+                ._init_embedding_provider(memory_manager.config, session_id=session_id)
+        except Exception as e:
+            logger.exception("[CowCli] /memory rebuild-index: build provider failed")
+            return f"⚠️ 无法根据当前配置构造 embedding provider: {e}"
+
+        if fresh_provider is None:
             return (
                 "⚠️ 当前没有可用的 embedding provider。\n"
                 "请检查 config.json 中的 embedding 相关配置 (provider / api key)。"
             )
+        memory_manager.embedding_provider = fresh_provider
 
-        provider_obj = memory_manager.embedding_provider
-        model_label = getattr(provider_obj, "model", "?")
-        dim_label = getattr(provider_obj, "dimensions", "?")
+        model_label = getattr(fresh_provider, "model", "?")
+        dim_label = getattr(fresh_provider, "dimensions", "?")
 
         # SaaS (e_context is None): run synchronously, return final result
         if e_context is None:
@@ -1168,7 +1249,7 @@ class CowCliPlugin(Plugin):
         threading.Thread(target=_run, daemon=True).start()
         return (
             f"🔧 索引重建已启动 (model={model_label}, dim={dim_label})\n\n"
-            f"将清空现有 chunks 并重新 embed 所有记忆文件，完成后会通知你。"
+            f"将重新向量化所有记忆和知识文件，完成后会通知你。"
         )
 
     @staticmethod
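Review note: the `_resolve_active_embedding` helper added above reads the provider, model and dimension from the latest config rather than from the cached provider instance. The resolution order can be sketched as a pure function, which makes it easy to check in isolation. This is a minimal sketch, not the project's code: `resolve_embedding`, the plain `cfg` dict, and the `VENDORS` table with its `example` entry are hypothetical stand-ins for `conf()` and `EMBEDDING_VENDORS`, whose real contents are not shown in this diff.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for agent.memory.embedding.EMBEDDING_VENDORS.
VENDORS = {
    "example": {"default_model": "example-embed", "default_dimensions": 1024},
}


def resolve_embedding(
    cfg: dict, vendors: dict
) -> Tuple[Optional[str], Optional[str], Optional[int]]:
    """Return (provider_label, model, dim); model and dim may be None."""
    provider = (cfg.get("embedding_provider") or "").strip().lower()
    model = (cfg.get("embedding_model") or "").strip()
    try:
        dim = int(cfg.get("embedding_dimensions") or 0)
    except (TypeError, ValueError):
        dim = 0  # unparsable value: fall back to the vendor default

    if not provider:
        # Legacy path: an OpenAI key wins over a LinkAI key.
        if (cfg.get("open_ai_api_key") or "").strip():
            return "openai (legacy)", "text-embedding-3-small", 1536
        if (cfg.get("linkai_api_key") or "").strip():
            return "linkai (legacy)", "text-embedding-3-small", 1536
        return "(legacy)", None, None

    meta = vendors.get(provider) or {}
    return (
        provider,
        model or meta.get("default_model"),
        dim if dim > 0 else meta.get("default_dimensions"),
    )
```

Explicit config values take precedence over vendor defaults, and an unknown provider yields `None` for model and dimension, which is what lets `/memory status` fall through to the "keyword-only" line.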
diff --git a/requirements-optional.txt b/requirements-optional.txt
index c8cd9a63..7abdc8e5 100644
--- a/requirements-optional.txt
+++ b/requirements-optional.txt
@@ -3,8 +3,8 @@ tiktoken>=0.3.2 # openai calculate token
 #voice
 pydub>=0.25.1 # need ffmpeg
 gTTS>=2.3.1 # google text to speech
-edge-tts # edge-tts
-elevenlabs==1.0.3 # elevenlabs TTS
+# edge-tts: install on demand, see voice/edge/edge_voice.py
+# elevenlabs: install on demand, see voice/elevent/elevent_voice.py
 
 #install plugin
 dulwich
diff --git a/requirements.txt b/requirements.txt
index 88b832b7..7f0ccc71 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,3 +1,4 @@
+numpy>=1.21
 aiohttp>=3.8.6,<3.10
 requests>=2.28.2
 chardet>=5.1.0
@@ -9,6 +10,7 @@ PyYAML>=6.0
 croniter>=2.0.0
 click>=8.0
 qrcode
+json-repair
 
 # wechatcom & wechatmp
 wechatpy
@@ -25,3 +27,7 @@ dingtalk_stream
 # wecom bot websocket mode
 websocket-client>=1.4.0
 pycryptodome
+# telegram bot
+python-telegram-bot
+# slack bot
+slack_bolt
diff --git a/run.sh b/run.sh
index c7ebd0a0..5dff9785 100755
--- a/run.sh
+++ b/run.sh
@@ -311,7 +311,7 @@ select_model() {
     echo -e "${CYAN}${BOLD}=========================================${NC}"
     echo -e "${YELLOW}1) DeepSeek (deepseek-v4-flash, deepseek-v4-pro, etc.)${NC}"
     echo -e "${YELLOW}2) MiniMax (MiniMax-M2.7, MiniMax-M2.5, etc.)${NC}"
-    echo -e "${YELLOW}3) Claude (claude-sonnet-4-6, claude-opus-4-7, claude-opus-4-6, etc.)${NC}"
+    echo -e "${YELLOW}3) Claude (claude-opus-4-8, claude-opus-4-7, claude-sonnet-4-6, etc.)${NC}"
     echo -e "${YELLOW}4) Gemini (gemini-3.1-flash-lite-preview, gemini-3.1-pro-preview, etc.)${NC}"
     echo -e "${YELLOW}5) OpenAI GPT (gpt-5.4, gpt-5.2, gpt-4.1, etc.)${NC}"
     echo -e "${YELLOW}6) Zhipu AI (glm-5.1, glm-5-turbo, glm-5, etc.)${NC}"
@@ -360,7 +360,7 @@ configure_model() {
         1) read_model_config "DeepSeek" "deepseek-v4-flash" "DEEPSEEK_KEY" ;;
         2) read_model_config "MiniMax" "MiniMax-M2.7" "MINIMAX_KEY" ;;
         3)
-            read_model_config "Claude" "claude-sonnet-4-6" "CLAUDE_KEY"
+            read_model_config "Claude" "claude-opus-4-8" "CLAUDE_KEY"
             read_api_base "CLAUDE_BASE" "https://api.anthropic.com/v1"
             ;;
         4)
diff --git a/scripts/run.ps1 b/scripts/run.ps1
index 18300d11..7c5f0b06 100644
--- a/scripts/run.ps1
+++ b/scripts/run.ps1
@@ -175,7 +175,7 @@ $ModelChoices = @{
     "4" = @{ Provider = "Kimi (Moonshot)";          Default = "kimi-k2.6";                              Key = "MOONSHOT_KEY" }
     "5" = @{ Provider = "Doubao (Volcengine Ark)";  Default = "doubao-seed-2-0-code-preview-260215";    Key = "ARK_KEY" }
     "6" = @{ Provider = "Qwen (DashScope)";         Default = "qwen3.6-plus";                           Key = "DASHSCOPE_KEY" }
-    "7" = @{ Provider = "Claude";                   Default = "claude-sonnet-4-6";                      Key = "CLAUDE_KEY";  Base = "https://api.anthropic.com/v1" }
+    "7" = @{ Provider = "Claude";                   Default = "claude-opus-4-8";                        Key = "CLAUDE_KEY";  Base = "https://api.anthropic.com/v1" }
     "8" = @{ Provider = "Gemini";                   Default = "gemini-3.1-pro-preview";                 Key = "GEMINI_KEY";  Base = "https://generativelanguage.googleapis.com" }
     "9" = @{ Provider = "OpenAI GPT";               Default = "gpt-5.4";                                Key = "OPENAI_KEY";  Base = "https://api.openai.com/v1" }
     "10" = @{ Provider = "LinkAI";                  Default = "deepseek-v4-flash";                      Key = "LINKAI_KEY" }
@@ -191,7 +191,7 @@ function Select-Model {
     Write-Host "4) Kimi (kimi-k2.6, kimi-k2.5, kimi-k2, etc.)"
     Write-Host "5) Doubao (doubao-seed-2-0-code-preview-260215, etc.)"
     Write-Host "6) Qwen (qwen3.6-plus, qwen3.5-plus, qwen3-max, qwq-plus, etc.)"
-    Write-Host "7) Claude (claude-sonnet-4-6, claude-opus-4-6, etc.)"
+    Write-Host "7) Claude (claude-opus-4-8, claude-opus-4-7, claude-sonnet-4-6, etc.)"
     Write-Host "8) Gemini (gemini-3.1-flash-lite-preview, gemini-3.1-pro-preview, etc.)"
     Write-Host "9) OpenAI GPT (gpt-5.4, gpt-5.2, gpt-4.1, etc.)"
     Write-Host "10) LinkAI (access multiple models via one API)"
diff --git a/skills/image-generation/scripts/generate.py b/skills/image-generation/scripts/generate.py
index 905390b5..23d05fd4 100644
--- a/skills/image-generation/scripts/generate.py
+++ b/skills/image-generation/scripts/generate.py
@@ -1011,6 +1011,18 @@ _MODEL_PREFERRED_PROVIDER: list[tuple[tuple[str, ...], str]] = [
 # Default global priority when the model has no preferred provider.
 _DEFAULT_PROVIDER_ORDER = ["OpenAI", "Gemini", "Seedream", "Qwen", "MiniMax", "LinkAI"]
 
+# UI provider id (persisted via the Models page) → internal label used by
+# the factory dict in `_build_providers`. Allows pinning a vendor for
+# custom model names that prefix-inference can't recognize.
+_PROVIDER_ID_TO_LABEL = {
+    "openai": "OpenAI",
+    "gemini": "Gemini",
+    "doubao": "Seedream",
+    "dashscope": "Qwen",
+    "minimax": "MiniMax",
+    "linkai": "LinkAI",
+}
+
 
 def _preferred_provider(model: str) -> str | None:
     m = (model or "").lower()
@@ -1020,7 +1032,7 @@ def _preferred_provider(model: str) -> str | None:
     return None
 
 
-def _build_providers(model: str) -> list[tuple[str, ImageProvider]]:
+def _build_providers(model: str, provider_id: str = "") -> list[tuple[str, ImageProvider]]:
     """Build an ordered list of (label, provider) to try.
 
     Behaviour:
@@ -1051,7 +1063,12 @@ def _build_providers(model: str) -> list[tuple[str, ImageProvider]]:
         "LinkAI": os.environ.get("LINKAI_API_BASE", "https://api.link-ai.tech"),
     }
 
-    pref = _preferred_provider(model)
+    # Provider preference resolution priority:
+    #   1. Explicit `provider_id` (UI-persisted, supports custom model names).
+    #   2. Model-name prefix inference.
+    pref = _PROVIDER_ID_TO_LABEL.get(provider_id) if provider_id else None
+    if not pref:
+        pref = _preferred_provider(model)
 
     # If a specific model is requested and its native provider has no key,
     # other backends won't recognise the id → reset to auto routing.
@@ -1110,10 +1127,13 @@ def main():
     # Model resolution priority:
     #   1. Explicit `model` in the call args (agent / user override)
     #   2. SKILL_IMAGE_GENERATION_MODEL env var (synced from
-    #      config["skill"]["image-generation"]["model"] at startup)
+    #      config["skills"]["image-generation"]["model"] at startup)
     #   3. None → fall back to automatic provider routing (try every
     #      provider with a configured API key in global priority order)
     model = args.get("model") or os.environ.get("SKILL_IMAGE_GENERATION_MODEL") or ""
+    # Provider hint persisted by the Models UI; lets users pin a vendor for
+    # custom model names that prefix-inference can't recognize.
+    provider_id = args.get("provider") or os.environ.get("SKILL_IMAGE_GENERATION_PROVIDER") or ""
     quality = args.get("quality")
     size = args.get("size")
     aspect_ratio = args.get("aspect_ratio")
@@ -1121,7 +1141,7 @@ def main():
 
     output_dir = os.environ.get("IMAGE_OUTPUT_DIR", os.path.join(os.getcwd(), "images"))
 
-    providers = _build_providers(model)
+    providers = _build_providers(model, provider_id=provider_id)
     if not providers:
         target = f"model '{model}'" if model else "image generation"
         print(json.dumps({
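Review note: the provider preference introduced in `_build_providers` resolves in two steps, an explicit `provider_id` first and model-name prefix inference second. A minimal sketch of that order follows. `_PROVIDER_ID_TO_LABEL` is copied from the diff; `resolve_preference`, `preferred_by_prefix`, and the `_PREFIX_TABLE` entries are hypothetical, since the real `_MODEL_PREFERRED_PROVIDER` table is not shown here.

```python
from typing import Optional

# Copied from the diff: UI provider id -> internal provider label.
_PROVIDER_ID_TO_LABEL = {
    "openai": "OpenAI",
    "gemini": "Gemini",
    "doubao": "Seedream",
    "dashscope": "Qwen",
    "minimax": "MiniMax",
    "linkai": "LinkAI",
}

# Hypothetical prefix table, for illustration only.
_PREFIX_TABLE = [
    (("gpt-image", "dall-e"), "OpenAI"),
    (("gemini",), "Gemini"),
]


def preferred_by_prefix(model: str) -> Optional[str]:
    """Infer a provider label from the model name prefix, if any matches."""
    m = (model or "").lower()
    for prefixes, label in _PREFIX_TABLE:
        if m.startswith(prefixes):
            return label
    return None


def resolve_preference(model: str, provider_id: str = "") -> Optional[str]:
    """Explicit provider id wins; otherwise fall back to prefix inference."""
    pref = _PROVIDER_ID_TO_LABEL.get(provider_id) if provider_id else None
    return pref or preferred_by_prefix(model)
```

With this order a custom model name such as `my-custom-model` can still be pinned to a vendor through `provider_id`, while an unrecognized id falls back to the prefix rules.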
diff --git a/tests/test_qianfan_provider.py b/tests/test_qianfan_provider.py
index 4c7900e5..99eb4130 100644
--- a/tests/test_qianfan_provider.py
+++ b/tests/test_qianfan_provider.py
@@ -394,7 +394,7 @@ class TestQianfanVisionTool(unittest.TestCase):
             "open_ai_api_key": "",
             "linkai_api_key": "",
             "use_linkai": False,
-            "tool": {},
+            "tools": {},
         }
         if values:
             data.update(values)
@@ -424,7 +424,7 @@ class TestQianfanVisionTool(unittest.TestCase):
     def test_vision_routes_ernie_model_override_to_qianfan(self):
         fake_conf = self._fake_conf({
             "qianfan_api_key": "test-qianfan-key",
-            "tool": {"vision": {"model": "ernie-4.5-turbo-vl-32k"}},
+            "tools": {"vision": {"model": "ernie-4.5-turbo-vl-32k"}},
         })
         fake_bot = MagicMock()
         fake_bot.call_vision = MagicMock()
diff --git a/translate/__init__.py b/translate/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/translate/baidu/__init__.py b/translate/baidu/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/translate/youdao/__init__.py b/translate/youdao/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/voice/dashscope/__init__.py b/voice/dashscope/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/voice/dashscope/dashscope_voice.py b/voice/dashscope/dashscope_voice.py
new file mode 100644
index 00000000..746bb59a
--- /dev/null
+++ b/voice/dashscope/dashscope_voice.py
@@ -0,0 +1,175 @@
+# encoding:utf-8
+"""DashScope voice: qwen3-asr-flash (ASR) + qwen3-tts-flash (TTS)
+via dashscope.MultiModalConversation."""
+import datetime
+import os
+import random
+from typing import Optional
+
+import dashscope
+import requests
+from dashscope import MultiModalConversation
+
+from bridge.reply import Reply, ReplyType
+from common.log import logger
+from config import conf
+from voice import audio_convert
+from voice.voice import Voice
+
+
+DEFAULT_ASR_MODEL = "qwen3-asr-flash"
+DEFAULT_TTS_MODEL = "qwen3-tts-flash"
+DEFAULT_TTS_VOICE = "Cherry"
+MAX_DURATION_SECONDS = 300
+MAX_FILE_BYTES = 10 * 1024 * 1024
+
+
+class DashScopeVoice(Voice):
+    def __init__(self):
+        pass
+
+    def voiceToText(self, voice_file: str):
+        try:
+            voice_file = self._ensure_compatible_format(voice_file)
+
+            try:
+                size = os.path.getsize(voice_file)
+                if size > MAX_FILE_BYTES:
+                    logger.warning(
+                        f"[DashScopeVoice] audio file {size}B exceeds {MAX_FILE_BYTES}B; "
+                        f"qwen3-asr-flash may reject it"
+                    )
+            except OSError:
+                pass
+
+            api_key = conf().get("dashscope_api_key", "")
+            if not api_key:
+                logger.error("[DashScopeVoice] dashscope_api_key is not configured")
+                return Reply(ReplyType.ERROR, "未配置 DashScope API key")
+            dashscope.api_key = api_key
+
+            model = conf().get("voice_to_text_model") or DEFAULT_ASR_MODEL
+            abs_path = os.path.abspath(voice_file)
+            file_uri = f"file://{abs_path}"
+
+            messages = [
+                {"role": "user", "content": [{"audio": file_uri}]},
+            ]
+            response = MultiModalConversation.call(
+                model=model,
+                messages=messages,
+                result_format="message",
+                asr_options={"enable_itn": False, "enable_lid": True},
+            )
+
+            text = self._extract_text(response)
+            if text is None:
+                logger.error(f"[DashScopeVoice] voiceToText failed: {response}")
+                return Reply(ReplyType.ERROR, "我暂时还无法听清您的语音，请稍后再试吧~")
+
+            logger.info(f"[DashScopeVoice] voiceToText model={model} text={text}")
+            return Reply(ReplyType.TEXT, text)
+        except Exception as e:
+            logger.exception(f"[DashScopeVoice] voiceToText exception: {e}")
+            return Reply(ReplyType.ERROR, "我暂时还无法听清您的语音，请稍后再试吧~")
+
+    def textToVoice(self, text: str):
+        try:
+            api_key = conf().get("dashscope_api_key", "")
+            if not api_key:
+                logger.error("[DashScopeVoice] dashscope_api_key is not configured")
+                return Reply(ReplyType.ERROR, "未配置 DashScope API key")
+            dashscope.api_key = api_key
+
+            model = conf().get("text_to_voice_model") or DEFAULT_TTS_MODEL
+            voice = conf().get("tts_voice_id") or DEFAULT_TTS_VOICE
+            response = MultiModalConversation.call(
+                model=model,
+                api_key=api_key,
+                text=text,
+                voice=voice,
+                stream=False,
+            )
+
+            url = self._extract_audio_url(response)
+            if not url:
+                logger.error(f"[DashScopeVoice] textToVoice failed: {response}")
+                return Reply(ReplyType.ERROR, "语音合成失败")
+
+            local_path = self._download_audio(url)
+            if not local_path:
+                return Reply(ReplyType.ERROR, "语音合成失败")
+
+            logger.info(f"[DashScopeVoice] textToVoice model={model} voice={voice} file={local_path}")
+            return Reply(ReplyType.VOICE, local_path)
+        except Exception as e:
+            logger.exception(f"[DashScopeVoice] textToVoice exception: {e}")
+            return Reply(ReplyType.ERROR, "语音合成失败")
+
+    @staticmethod
+    def _extract_audio_url(response) -> Optional[str]:
+        try:
+            if getattr(response, "status_code", 200) != 200:
+                return None
+            audio = response.output.get("audio") if response.output else None
+            if isinstance(audio, dict):
+                return audio.get("url") or None
+            return getattr(audio, "url", None)
+        except Exception:
+            return None
+
+    @staticmethod
+    def _download_audio(url: str) -> Optional[str]:
+        try:
+            tmp_dir = os.path.join(os.getcwd(), "tmp")
+            os.makedirs(tmp_dir, exist_ok=True)
+            ts = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
+            ext = os.path.splitext(url.split("?", 1)[0])[1].lower() or ".wav"
+            if ext not in (".mp3", ".wav", ".m4a", ".aac", ".opus"):
+                ext = ".wav"
+            dst = os.path.join(tmp_dir, f"dashscope_tts_{ts}_{random.randint(0, 9999)}{ext}")
+            resp = requests.get(url, timeout=60)
+            resp.raise_for_status()
+            with open(dst, "wb") as f:
+                f.write(resp.content)
+            return dst
+        except Exception as e:
+            logger.error(f"[DashScopeVoice] download audio failed: {e}")
+            return None
+
+    @staticmethod
+    def _ensure_compatible_format(voice_file: str) -> str:
+        # qwen3-asr-flash doesn't accept AMR/SILK; mp3/wav/m4a/aac/opus pass through.
+        lower = voice_file.lower()
+        if lower.endswith(".amr") or lower.endswith(".silk") or lower.endswith(".slk"):
+            try:
+                mp3_file = os.path.splitext(voice_file)[0] + ".mp3"
+                audio_convert.any_to_mp3(voice_file, mp3_file)
+                return mp3_file
+            except Exception as e:
+                logger.warning(f"[DashScopeVoice] mp3 convert failed: {e}")
+        return voice_file
+
+    @staticmethod
+    def _extract_text(response) -> Optional[str]:
+        try:
+            if getattr(response, "status_code", 200) != 200:
+                return None
+            choices = response.output.get("choices") or []
+            if not choices:
+                return None
+            content = choices[0].get("message", {}).get("content")
+            if isinstance(content, str):
+                return content.strip() or None
+            if isinstance(content, list):
+                parts = []
+                for item in content:
+                    if isinstance(item, dict) and "text" in item:
+                        parts.append(item["text"])
+                    elif isinstance(item, str):
+                        parts.append(item)
+                text = "".join(parts).strip()
+                return text or None
+            return None
+        except Exception:
+            return None
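The content handling in `_extract_text` above can be exercised in isolation. The following is a minimal sketch, assuming the response has already been reduced to the message `content` value (a string, a list of parts, or nothing). `flatten_content` is a hypothetical helper name used for illustration only and is not part of this patch.

```python
from typing import Optional, Union


def flatten_content(content: Union[str, list, None]) -> Optional[str]:
    """Join a message `content` that is either a string or a list of parts.

    Hypothetical standalone helper mirroring the branch logic of
    `_extract_text`; returns None when no non-empty text is found.
    """
    if isinstance(content, str):
        return content.strip() or None
    if isinstance(content, list):
        parts = []
        for item in content:
            # Parts may be dicts carrying a "text" key or bare strings.
            if isinstance(item, dict) and "text" in item:
                parts.append(item["text"])
            elif isinstance(item, str):
                parts.append(item)
        return "".join(parts).strip() or None
    return None
```

Returning None for both "no content" and "only whitespace" lets the caller treat every unusable transcription the same way.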
diff --git a/voice/edge/edge_voice.py b/voice/edge/edge_voice.py
index 7bb8b2e6..1a25a2b4 100644
--- a/voice/edge/edge_voice.py
+++ b/voice/edge/edge_voice.py
@@ -1,3 +1,4 @@
+# Requires: edge-tts  (pip install edge-tts)
 import time
 
 import edge_tts
diff --git a/voice/elevent/elevent_voice.py b/voice/elevent/elevent_voice.py
index 2cfa5a3f..5e274638 100644
--- a/voice/elevent/elevent_voice.py
+++ b/voice/elevent/elevent_voice.py
@@ -1,3 +1,4 @@
+# Requires: elevenlabs==1.0.3  (pip install elevenlabs==1.0.3)
 import time
 
 from elevenlabs.client import ElevenLabs
diff --git a/voice/factory.py b/voice/factory.py
index abe7ba57..2bc356f4 100644
--- a/voice/factory.py
+++ b/voice/factory.py
@@ -58,4 +58,16 @@ def create_voice(voice_type):
         from voice.minimax.minimax_voice import MinimaxVoice
 
         return MinimaxVoice()
+    elif voice_type == "dashscope":
+        from voice.dashscope.dashscope_voice import DashScopeVoice
+
+        return DashScopeVoice()
+    elif voice_type == "zhipu" or voice_type == "zhipuai":
+        from voice.zhipuai.zhipuai_voice import ZhipuAIVoice
+
+        return ZhipuAIVoice()
+    elif voice_type == "mimo":
+        from voice.mimo.mimo_voice import MimoVoice
+
+        return MimoVoice()
     raise RuntimeError
diff --git a/voice/linkai/linkai_voice.py b/voice/linkai/linkai_voice.py
index 739b5f60..ec59812e 100644
--- a/voice/linkai/linkai_voice.py
+++ b/voice/linkai/linkai_voice.py
@@ -1,16 +1,18 @@
-"""
-google voice service
-"""
+"""LinkAI voice: Whisper ASR + multi-vendor TTS (OpenAI / Doubao / Baidu)
+proxied via https://docs.link-ai.tech/platform/api/voice-speech."""
+import datetime
+import os
 import random
+
 import requests
-from voice import audio_convert
+
 from bridge.reply import Reply, ReplyType
+from common import const
 from common.log import logger
 from config import conf
+from voice import audio_convert
 from voice.voice import Voice
-from common import const
-import os
-import datetime
+
 
 class LinkAIVoice(Voice):
     def __init__(self):
@@ -21,63 +23,67 @@ class LinkAIVoice(Voice):
         try:
             url = conf().get("linkai_api_base", "https://api.link-ai.tech") + "/v1/audio/transcriptions"
             headers = {"Authorization": "Bearer " + conf().get("linkai_api_key")}
-            model = None
-            if not conf().get("text_to_voice") or conf().get("voice_to_text") == "openai":
-                model = const.WHISPER_1
+            # Pin whisper-1: gateway ignores any other ASR model id.
+            model = const.WHISPER_1
             if voice_file.endswith(".amr"):
                 try:
                     mp3_file = os.path.splitext(voice_file)[0] + ".mp3"
                     audio_convert.any_to_mp3(voice_file, mp3_file)
                     voice_file = mp3_file
                 except Exception as e:
-                    logger.warn(f"[LinkVoice] amr file transfer failed, directly send amr voice file: {format(e)}")
-            file = open(voice_file, "rb")
-            file_body = {
-                "file": file
-            }
-            data = {
-                "model": model
-            }
-            res = requests.post(url, files=file_body, headers=headers, data=data, timeout=(5, 60))
-            if res.status_code == 200:
-                text = res.json().get("text")
-            else:
-                res_json = res.json()
-                logger.error(f"[LinkVoice] voiceToText error, status_code={res.status_code}, msg={res_json.get('message')}")
+                    logger.warning(f"[LinkVoice] amr conversion failed, sending the amr file directly: {e}")
+            with open(voice_file, "rb") as file:
+                res = requests.post(
+                    url,
+                    files={"file": file},
+                    headers=headers,
+                    data={"model": model},
+                    timeout=(5, 60),
+                )
+            if res.status_code != 200:
+                msg = ""
+                try:
+                    msg = res.json().get("message", "")
+                except Exception:
+                    pass
+                logger.error(f"[LinkVoice] voiceToText error, status_code={res.status_code}, msg={msg}")
                 return None
-            reply = Reply(ReplyType.TEXT, text)
+            text = res.json().get("text")
             logger.info(f"[LinkVoice] voiceToText success, text={text}, file name={voice_file}")
+            return Reply(ReplyType.TEXT, text)
         except Exception as e:
             logger.error(e)
             return None
-        return reply
 
     def textToVoice(self, text):
         try:
             url = conf().get("linkai_api_base", "https://api.link-ai.tech") + "/v1/audio/speech"
             headers = {"Authorization": "Bearer " + conf().get("linkai_api_key")}
-            model = const.TTS_1
-            if not conf().get("text_to_voice") or conf().get("text_to_voice") in ["openai", const.TTS_1, const.TTS_1_HD]:
-                model = conf().get("text_to_voice_model") or const.TTS_1
+            # Gateway routes by `model` (tts-1 / doubao / baidu) + `voice` from
+            # that engine's catalog. `app_code` is optional workspace override.
             data = {
-                "model": model,
                 "input": text,
                 "voice": conf().get("tts_voice_id"),
-                "app_code": conf().get("linkai_app_code")
+                "app_code": conf().get("linkai_app_code"),
             }
+            model = conf().get("text_to_voice_model")
+            if model:
+                data["model"] = model
             res = requests.post(url, headers=headers, json=data, timeout=(5, 120))
-            if res.status_code == 200:
-                tmp_file_name = "tmp/" + datetime.datetime.now().strftime('%Y%m%d%H%M%S') + str(random.randint(0, 1000)) + ".mp3"
-                with open(tmp_file_name, 'wb') as f:
-                    f.write(res.content)
-                reply = Reply(ReplyType.VOICE, tmp_file_name)
-                logger.info(f"[LinkVoice] textToVoice success, input={text}, model={model}, voice_id={data.get('voice')}")
-                return reply
-            else:
-                res_json = res.json()
-                logger.error(f"[LinkVoice] textToVoice error, status_code={res.status_code}, msg={res_json.get('message')}")
+            if res.status_code != 200:
+                msg = ""
+                try:
+                    msg = res.json().get("message", "")
+                except Exception:
+                    pass
+                logger.error(f"[LinkVoice] textToVoice error, status_code={res.status_code}, msg={msg}")
                 return None
+            tmp_file_name = "tmp/" + datetime.datetime.now().strftime('%Y%m%d%H%M%S') + str(random.randint(0, 1000)) + ".mp3"
+            os.makedirs(os.path.dirname(tmp_file_name), exist_ok=True)
+            with open(tmp_file_name, 'wb') as f:
+                f.write(res.content)
+            logger.info(f"[LinkVoice] textToVoice success, input={text}, voice_id={data.get('voice')}")
+            return Reply(ReplyType.VOICE, tmp_file_name)
         except Exception as e:
             logger.error(e)
-            # reply = Reply(ReplyType.ERROR, "遇到了一点小问题，请稍后再问我吧")
             return None
diff --git a/voice/mimo/__init__.py b/voice/mimo/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/voice/mimo/mimo_voice.py b/voice/mimo/mimo_voice.py
new file mode 100644
index 00000000..2ae885f8
--- /dev/null
+++ b/voice/mimo/mimo_voice.py
@@ -0,0 +1,109 @@
+# encoding:utf-8
+"""
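+Xiaomi MiMo TTS: speech synthesis based on the mimo-v2.5-tts model.
+
+Implemented via the /chat/completions endpoint: the assistant message carries
+the text to synthesize, the audio field selects a preset voice (e.g.
+冰糖/茉莉/苏打/Mia/Chloe), and the response returns base64-encoded audio bytes.
+
+Docs: https://platform.xiaomimimo.com/docs/zh-CN/usage-guide/speech-synthesis-v2.5
+Note: MiMo provides no ASR endpoint, so voiceToText is not implemented.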
+"""
+import base64
+import datetime
+import os
+import random
+
+import requests
+
+from bridge.reply import Reply, ReplyType
+from common.log import logger
+from config import conf
+from voice.voice import Voice
+
+DEFAULT_API_BASE = "https://api.xiaomimimo.com/v1"
+DEFAULT_TTS_MODEL = "mimo-v2.5-tts"
+DEFAULT_TTS_VOICE = "冰糖"  # default voice: the de facto default on the China cluster
+REQUEST_TIMEOUT = (5, 120)
+
+
+class MimoVoice(Voice):
+    def __init__(self):
+        pass
+
+    def voiceToText(self, voice_file: str):
+        # MiMo has no standalone ASR endpoint; use another provider (e.g. openai/zhipu/dashscope)
+        logger.warning("[MimoVoice] voiceToText is not supported by MiMo API")
+        return Reply(ReplyType.ERROR, "MiMo 暂不支持语音识别，请配置其他 voice_to_text provider")
+
+    def textToVoice(self, text: str):
+        try:
+            api_key = conf().get("mimo_api_key", "")
+            if not api_key:
+                logger.error("[MimoVoice] mimo_api_key is not configured")
+                return Reply(ReplyType.ERROR, "未配置 MiMo API key")
+
+            api_base = (conf().get("mimo_api_base") or DEFAULT_API_BASE).rstrip("/")
+            model = conf().get("text_to_voice_model") or DEFAULT_TTS_MODEL
+            voice_id = conf().get("tts_voice_id") or DEFAULT_TTS_VOICE
+
+            # The text to synthesize must go in the assistant message; a user message may optionally carry style instructions
+            payload = {
+                "model": model,
+                "messages": [
+                    {"role": "assistant", "content": text},
+                ],
+                "audio": {
+                    "format": "wav",
+                    "voice": voice_id,
+                },
+            }
+            headers = {
+                "Authorization": f"Bearer {api_key}",
+                "Content-Type": "application/json",
+            }
+            url = f"{api_base}/chat/completions"
+            response = requests.post(url, headers=headers, json=payload, timeout=REQUEST_TIMEOUT)
+
+            if response.status_code != 200:
+                logger.error(
+                    f"[MimoVoice] textToVoice failed: status={response.status_code} "
+                    f"body={response.text[:500]} model={model} voice={voice_id}"
+                )
+                return Reply(ReplyType.ERROR, "语音合成失败，请稍后再试")
+
+            data = response.json()
+            if "error" in data:
+                err = data["error"]
+                msg = err.get("message", str(err)) if isinstance(err, dict) else str(err)
+                logger.error(f"[MimoVoice] textToVoice api error: {msg}")
+                return Reply(ReplyType.ERROR, "语音合成失败，请稍后再试")
+
+            message = (data.get("choices") or [{}])[0].get("message", {}) or {}
+            audio_obj = message.get("audio") or {}
+            audio_b64 = audio_obj.get("data")
+            if not audio_b64:
+                logger.error(f"[MimoVoice] textToVoice empty audio in response: {data}")
+                return Reply(ReplyType.ERROR, "语音合成失败，请稍后再试")
+
+            try:
+                audio_bytes = base64.b64decode(audio_b64)
+            except Exception as e:
+                logger.error(f"[MimoVoice] base64 decode failed: {e}")
+                return Reply(ReplyType.ERROR, "语音合成失败，请稍后再试")
+
+            file_name = (
+                "tmp/" + datetime.datetime.now().strftime("%Y%m%d%H%M%S")
+                + str(random.randint(0, 1000)) + ".wav"
+            )
+            os.makedirs(os.path.dirname(file_name), exist_ok=True)
+            with open(file_name, "wb") as f:
+                f.write(audio_bytes)
+            logger.info(
+                f"[MimoVoice] textToVoice model={model} voice={voice_id} "
+                f"file={file_name} bytes={len(audio_bytes)}"
+            )
+            return Reply(ReplyType.VOICE, file_name)
+        except Exception as e:
+            logger.exception(f"[MimoVoice] textToVoice exception: {e}")
+            return Reply(ReplyType.ERROR, "语音合成失败，请稍后再试")
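The response handling in `MimoVoice.textToVoice` above comes down to pulling a base64 string out of `choices[0].message.audio.data` and decoding it. The following is a minimal sketch of that step on a plain dict, assuming the response shape used in the patch. `decode_audio` is a hypothetical helper name, not part of the MiMo API or of this patch.

```python
import base64
from typing import Optional


def decode_audio(data: dict) -> Optional[bytes]:
    """Return decoded audio bytes from a chat-completions style dict.

    Hypothetical helper; the nesting (choices -> message -> audio -> data)
    follows the access pattern in the patch, not a verified API contract.
    """
    choices = data.get("choices") or [{}]
    message = choices[0].get("message") or {}
    audio_b64 = (message.get("audio") or {}).get("data")
    if not audio_b64:
        return None
    try:
        return base64.b64decode(audio_b64)
    except (ValueError, TypeError):
        # binascii.Error is a ValueError subclass (e.g. bad padding).
        return None
```

Each `or {}` guard covers a key that is present but null, which a bare `.get(key, {})` default would not.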
diff --git a/voice/minimax/minimax_voice.py b/voice/minimax/minimax_voice.py
index 1446a3f1..8456c479 100644
--- a/voice/minimax/minimax_voice.py
+++ b/voice/minimax/minimax_voice.py
@@ -1,8 +1,7 @@
 # encoding:utf-8
-"""
-MiniMax TTS voice service
-"""
+"""MiniMax TTS via /v1/t2a_v2 (SSE stream, hex-encoded mp3 chunks)."""
 import datetime
+import json
 import random
 import requests
 
@@ -12,24 +11,12 @@ from config import conf
 from voice.voice import Voice
 
 
-MINIMAX_TTS_VOICES = [
-    "English_Graceful_Lady",
-    "English_Insightful_Speaker",
-    "English_radiant_girl",
-    "English_Persuasive_Man",
-    "English_Lucky_Robot",
-    "English_expressive_narrator",
-    "Chinese_Warm_Woman",
-    "Chinese_Gentle_Man",
-]
-
-
 class MinimaxVoice(Voice):
     def __init__(self):
         self.api_key = conf().get("minimax_api_key")
-        self.api_base = conf().get("minimax_api_base") or "https://api.minimax.io"
-        # Strip trailing /v1 if present so we can always append /v1/t2a_v2
-        self.api_base = self.api_base.rstrip("/")
+        # Mainland endpoint matches `sk-api-0-...` keys; override via
+        # `minimax_api_base` for international (api.minimax.io) workspaces.
+        self.api_base = (conf().get("minimax_api_base") or "https://api.minimaxi.com").rstrip("/")
         if self.api_base.endswith("/v1"):
             self.api_base = self.api_base[:-3]
 
@@ -68,12 +55,14 @@ class MinimaxVoice(Voice):
             response = requests.post(url, headers=headers, json=payload, stream=True, timeout=60)
             response.raise_for_status()
 
-            # Parse SSE stream and collect hex-encoded audio chunks
+            # MiniMax returns HTTP 200 even on errors; capture base_resp for diagnostics.
             audio_chunks = []
-            buffer = ""
+            last_base_resp = None
+            event_count = 0
             for raw in response.iter_lines():
                 if not raw:
                     continue
+                event_count += 1
                 line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
                 if not line.startswith("data:"):
                     continue
@@ -81,16 +70,31 @@ class MinimaxVoice(Voice):
                 if not json_str or json_str == "[DONE]":
                     continue
                 try:
-                    import json
                     event_data = json.loads(json_str)
-                    audio_hex = event_data.get("data", {}).get("audio")
-                    if audio_hex:
-                        audio_chunks.append(bytes.fromhex(audio_hex))
                 except Exception:
                     continue
+                base_resp = event_data.get("base_resp") or {}
+                if base_resp:
+                    last_base_resp = base_resp
+                audio_hex = (event_data.get("data") or {}).get("audio")
+                if audio_hex:
+                    try:
+                        audio_chunks.append(bytes.fromhex(audio_hex))
+                    except Exception as e:
+                        logger.warning(f"[MINIMAX] skip bad audio hex chunk: {e}")
 
             if not audio_chunks:
-                logger.error("[MINIMAX] TTS returned no audio data")
+                ct = response.headers.get("Content-Type", "")
+                if last_base_resp and last_base_resp.get("status_code") not in (None, 0):
+                    logger.error(
+                        f"[MINIMAX] TTS failed: status_code={last_base_resp.get('status_code')}, "
+                        f"status_msg={last_base_resp.get('status_msg')}, model={model}, voice_id={voice_id}"
+                    )
+                else:
+                    logger.error(
+                        f"[MINIMAX] TTS returned no audio data, model={model}, voice_id={voice_id}, "
+                        f"url={url}, http={response.status_code}, content_type={ct!r}, events={event_count}"
+                    )
                 return Reply(ReplyType.ERROR, "语音合成失败，未获取到音频数据")
 
             audio_data = b"".join(audio_chunks)
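The SSE loop above can be checked without a network call by feeding it canned lines. The following is a minimal sketch, assuming each event arrives as a `data:` line whose JSON carries hex-encoded audio under `data.audio` and an optional `base_resp`. `collect_audio` is a hypothetical helper name, and the `data:` prefix stripping is an assumption, because that line falls outside the hunks shown here.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_audio(lines: Iterable[bytes]) -> Tuple[bytes, Optional[dict]]:
    """Collect hex-encoded audio chunks and the last non-empty base_resp."""
    chunks = []
    last_base_resp = None
    for raw in lines:
        if not raw:
            continue
        line = raw.decode("utf-8")
        if not line.startswith("data:"):
            continue
        # Assumed prefix handling: drop "data:" and surrounding whitespace.
        json_str = line[len("data:"):].strip()
        if not json_str or json_str == "[DONE]":
            continue
        try:
            event = json.loads(json_str)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        base_resp = event.get("base_resp") or {}
        if base_resp:
            last_base_resp = base_resp
        audio_hex = (event.get("data") or {}).get("audio")
        if audio_hex:
            try:
                chunks.append(bytes.fromhex(audio_hex))
            except ValueError:
                # Skip a malformed chunk instead of dropping the whole stream.
                continue
    return b"".join(chunks), last_base_resp
```

Returning `last_base_resp` alongside the audio lets the caller report the provider's own status code when no audio was collected.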
diff --git a/voice/openai/openai_voice.py b/voice/openai/openai_voice.py
index 3ffa00aa..f0db53b4 100644
--- a/voice/openai/openai_voice.py
+++ b/voice/openai/openai_voice.py
@@ -31,7 +31,8 @@ class OpenaiVoice(Voice):
                 "file": file,
             }
             data = {
-                "model": "whisper-1",
+                # Override via `voice_to_text_model` (e.g. fall back to whisper-1).
+                "model": conf().get("voice_to_text_model") or "gpt-4o-mini-transcribe",
             }
             response = requests.post(url, headers=headers, files=files, data=data)
             response_data = response.json()
diff --git a/voice/zhipuai/__init__.py b/voice/zhipuai/__init__.py
new file mode 100644
index 00000000..e69de29b
diff --git a/voice/zhipuai/zhipuai_voice.py b/voice/zhipuai/zhipuai_voice.py
new file mode 100644
index 00000000..1fdcdc7c
--- /dev/null
+++ b/voice/zhipuai/zhipuai_voice.py
@@ -0,0 +1,173 @@
+# encoding:utf-8
+"""ZhipuAI voice: glm-asr-2512 (ASR) + glm-tts (TTS) via BigModel REST API."""
+import datetime
+import os
+import random
+
+import requests
+
+from bridge.reply import Reply, ReplyType
+from common.log import logger
+from config import conf
+from voice import audio_convert
+from voice.voice import Voice
+
+
+DEFAULT_ASR_MODEL = "glm-asr-2512"
+DEFAULT_TTS_MODEL = "glm-tts"
+DEFAULT_TTS_VOICE = "tongtong"
+DEFAULT_API_BASE = "https://open.bigmodel.cn/api/paas/v4"
+MAX_FILE_BYTES = 25 * 1024 * 1024
+REQUEST_TIMEOUT = (5, 60)
+
+
+class ZhipuAIVoice(Voice):
+    def __init__(self):
+        pass
+
+    def voiceToText(self, voice_file: str):
+        try:
+            voice_file = self._ensure_compatible_format(voice_file)
+
+            try:
+                size = os.path.getsize(voice_file)
+                if size > MAX_FILE_BYTES:
+                    logger.warning(
+                        f"[ZhipuAIVoice] audio file {size}B exceeds {MAX_FILE_BYTES}B; "
+                        f"glm-asr-2512 may reject it"
+                    )
+            except OSError:
+                pass
+
+            api_key = conf().get("zhipu_ai_api_key", "")
+            if not api_key:
+                logger.error("[ZhipuAIVoice] zhipu_ai_api_key is not configured")
+                return Reply(ReplyType.ERROR, "未配置 ZhipuAI API key")
+
+            api_base = (conf().get("zhipu_ai_api_base") or DEFAULT_API_BASE).rstrip("/")
+            url = f"{api_base}/audio/transcriptions"
+            model = conf().get("voice_to_text_model") or DEFAULT_ASR_MODEL
+
+            with open(voice_file, "rb") as f:
+                files = {"file": (os.path.basename(voice_file), f)}
+                data = {"model": model, "stream": "false"}
+                headers = {"Authorization": f"Bearer {api_key}"}
+                response = requests.post(
+                    url, headers=headers, files=files, data=data, timeout=REQUEST_TIMEOUT
+                )
+
+            if response.status_code != 200:
+                logger.error(
+                    f"[ZhipuAIVoice] voiceToText failed: status={response.status_code} "
+                    f"body={response.text[:500]}"
+                )
+                return Reply(ReplyType.ERROR, "我暂时还无法听清您的语音，请稍后再试吧~")
+
+            payload = response.json()
+            text = (payload.get("text") or "").strip()
+            if not text:
+                logger.error(f"[ZhipuAIVoice] voiceToText empty text: {payload}")
+                return Reply(ReplyType.ERROR, "我暂时还无法听清您的语音，请稍后再试吧~")
+
+            logger.info(f"[ZhipuAIVoice] voiceToText model={model} text={text}")
+            return Reply(ReplyType.TEXT, text)
+        except Exception as e:
+            logger.exception(f"[ZhipuAIVoice] voiceToText exception: {e}")
+            return Reply(ReplyType.ERROR, "我暂时还无法听清您的语音，请稍后再试吧~")
+
+    def textToVoice(self, text: str):
+        try:
+            api_key = conf().get("zhipu_ai_api_key", "")
+            if not api_key:
+                logger.error("[ZhipuAIVoice] zhipu_ai_api_key is not configured")
+                return Reply(ReplyType.ERROR, "未配置 ZhipuAI API key")
+
+            api_base = (conf().get("zhipu_ai_api_base") or DEFAULT_API_BASE).rstrip("/")
+            url = f"{api_base}/audio/speech"
+            model = conf().get("text_to_voice_model") or DEFAULT_TTS_MODEL
+            voice_id = conf().get("tts_voice_id") or DEFAULT_TTS_VOICE
+
+            payload = {
+                "model": model,
+                "input": text,
+                "voice": voice_id,
+                "response_format": "wav",
+                "speed": 1.0,
+                "volume": 1.0,
+            }
+            headers = {
+                "Authorization": f"Bearer {api_key}",
+                "Content-Type": "application/json",
+            }
+            response = requests.post(
+                url, headers=headers, json=payload, timeout=REQUEST_TIMEOUT
+            )
+
+            if response.status_code != 200:
+                logger.error(
+                    f"[ZhipuAIVoice] textToVoice failed: status={response.status_code} "
+                    f"body={response.text[:500]} model={model} voice={voice_id}"
+                )
+                return Reply(ReplyType.ERROR, "语音合成失败，请稍后再试")
+
+            # Some errors come back as JSON / SSE with HTTP 200.
+            ct = response.headers.get("Content-Type", "")
+            if "application/json" in ct or "text/event-stream" in ct:
+                try:
+                    err = response.json()
+                except Exception:
+                    err = {"raw": response.text[:500]}
+                logger.error(
+                    f"[ZhipuAIVoice] textToVoice unexpected text response "
+                    f"(content_type={ct}): {err}"
+                )
+                return Reply(ReplyType.ERROR, "语音合成失败，请稍后再试")
+
+            audio_bytes = response.content
+            ext = self._sniff_audio_ext(audio_bytes) or "wav"
+
+            file_name = (
+                "tmp/" + datetime.datetime.now().strftime("%Y%m%d%H%M%S")
+                + str(random.randint(0, 1000)) + "." + ext
+            )
+            os.makedirs(os.path.dirname(file_name), exist_ok=True)
+            with open(file_name, "wb") as f:
+                f.write(audio_bytes)
+            logger.info(
+                f"[ZhipuAIVoice] textToVoice model={model} voice={voice_id} "
+                f"file={file_name} bytes={len(audio_bytes)} ext={ext}"
+            )
+            return Reply(ReplyType.VOICE, file_name)
+        except Exception as e:
+            logger.exception(f"[ZhipuAIVoice] textToVoice exception: {e}")
+            return Reply(ReplyType.ERROR, "语音合成失败，请稍后再试")
+
+    @staticmethod
+    def _sniff_audio_ext(data: bytes) -> str:
+        """Detect audio container by magic bytes; returns '' on unknown."""
+        if len(data) < 12:
+            return ""
+        head = data[:12]
+        if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
+            return "wav"
+        if head[:3] == b"ID3" or head[:2] == b"\xff\xfb" or head[:2] == b"\xff\xf3" or head[:2] == b"\xff\xf2":
+            return "mp3"
+        if head[:4] == b"OggS":
+            return "ogg"
+        if head[:4] == b"fLaC":
+            return "flac"
+        return ""
+
+    @staticmethod
+    def _ensure_compatible_format(voice_file: str) -> str:
+        # glm-asr-2512 only accepts .wav / .mp3
+        lower = voice_file.lower()
+        if lower.endswith(".mp3") or lower.endswith(".wav"):
+            return voice_file
+        try:
+            mp3_file = os.path.splitext(voice_file)[0] + ".mp3"
+            audio_convert.any_to_mp3(voice_file, mp3_file)
+            return mp3_file
+        except Exception as e:
+            logger.warning(f"[ZhipuAIVoice] mp3 convert failed: {e}")
+            return voice_file
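The magic-byte detection in `_sniff_audio_ext` above is easy to verify on synthetic headers. The following is a minimal standalone sketch of the same checks. `sniff_audio_ext` is a hypothetical module-level copy written for illustration, and the byte strings in the usage lines are hand-built headers, not real audio.

```python
def sniff_audio_ext(data: bytes) -> str:
    """Guess an audio container from its leading bytes; '' when unknown."""
    if len(data) < 12:
        return ""
    # WAV: RIFF container whose form type (bytes 8..11) is WAVE.
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3: ID3 tag, or one of the frame-sync patterns the patch checks.
    if data[:3] == b"ID3" or data[:2] in (b"\xff\xfb", b"\xff\xf3", b"\xff\xf2"):
        return "mp3"
    if data[:4] == b"OggS":
        return "ogg"
    if data[:4] == b"fLaC":
        return "flac"
    return ""


wav_header = b"RIFF\x00\x00\x00\x00WAVEfmt "
mp3_header = b"ID3" + b"\x00" * 9
print(sniff_audio_ext(wav_header), sniff_audio_ext(mp3_header))
```

Sniffing the payload rather than trusting the requested `response_format` keeps the file extension consistent with what the server actually returned.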