feat: release 2.0.9

Merge pull request #2826 from zhayujie/feat-multi-model
feat: multi-provider model console
2026-06-03 02:27:09 +08:00 · 2026-05-22 12:25:22 +08:00 · 2026-05-22 11:08:13 +08:00 · 2026-05-22 11:04:55 +08:00 · 2026-05-22 10:54:56 +08:00 · 2026-05-22 10:39:04 +08:00
299 changed files with 30266 additions and 3509 deletions
--- a/.github/workflows/deploy-image-arm.yml
+++ b/.github/workflows/deploy-image-arm.yml
@@ -19,7 +19,7 @@ env:

 jobs:
  build-and-push-image:
-    if: github.repository == 'zhayujie/chatgpt-on-wechat'
+    if: github.repository == 'zhayujie/CowAgent'
    runs-on: ubuntu-latest
    permissions:
      contents: read
@@ -51,7 +51,12 @@ jobs:
        uses: docker/metadata-action@v4
        with:
          images: |
-            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+            ${{ env.REGISTRY }}/zhayujie/chatgpt-on-wechat
+            ${{ env.REGISTRY }}/zhayujie/cowagent
+          tags: |
+            type=raw,value=latest-arm64,enable={{is_default_branch}}
+            type=ref,event=branch,suffix=-arm64
+            type=ref,event=tag,suffix=-arm64

      - name: Build and push Docker image
        uses: docker/build-push-action@v3
@@ -60,7 +65,7 @@ jobs:
          push: true
          file: ./docker/Dockerfile.latest
          platforms: linux/arm64
-          tags: ${{ steps.meta.outputs.tags }}-arm64
+          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

      - uses: actions/delete-package-versions@v4
--- a/.github/workflows/deploy-image.yml
+++ b/.github/workflows/deploy-image.yml
@@ -16,10 +16,11 @@ on:
 env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
+  DOCKERHUB_IMAGE: zhayujie/chatgpt-on-wechat

 jobs:
  build-and-push-image:
-    if: github.repository == 'zhayujie/chatgpt-on-wechat'
+    if: github.repository == 'zhayujie/CowAgent'
    runs-on: ubuntu-latest
    permissions:
      contents: read
@@ -47,8 +48,14 @@ jobs:
        uses: docker/metadata-action@v4
        with:
          images: |
-            ${{ env.IMAGE_NAME }}
-            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+            zhayujie/chatgpt-on-wechat
+            zhayujie/cowagent
+            ${{ env.REGISTRY }}/zhayujie/chatgpt-on-wechat
+            ${{ env.REGISTRY }}/zhayujie/cowagent
+          tags: |
+            type=raw,value=latest,enable={{is_default_branch}}
+            type=ref,event=branch
+            type=ref,event=tag

      - name: Build and push Docker image
        uses: docker/build-push-action@v3
--- a/README.md
+++ b/README.md
@@ -1,13 +1,13 @@
-<p align="center"><img src= "https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="Chatgpt-on-Wechat" width="550" /></p>
+<p align="center"><img src= "https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="CowAgent" width="550" /></p>

 <p align="center">
-  <a href="https://github.com/zhayujie/chatgpt-on-wechat/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/chatgpt-on-wechat" alt="Latest release"></a>
-  <a href="https://github.com/zhayujie/chatgpt-on-wechat/blob/master/LICENSE"><img src="https://img.shields.io/github/license/zhayujie/chatgpt-on-wechat" alt="License: MIT"></a>
-  <a href="https://github.com/zhayujie/chatgpt-on-wechat"><img src="https://img.shields.io/github/stars/zhayujie/chatgpt-on-wechat?style=flat-square" alt="Stars"></a> <br/>
+  <a href="https://github.com/zhayujie/CowAgent/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/CowAgent" alt="Latest release"></a>
+  <a href="https://github.com/zhayujie/CowAgent/blob/master/LICENSE"><img src="https://img.shields.io/github/license/zhayujie/CowAgent" alt="License: MIT"></a>
+  <a href="https://github.com/zhayujie/CowAgent"><img src="https://img.shields.io/github/stars/zhayujie/CowAgent?style=flat-square" alt="Stars"></a> <br/>
  [中文] | [<a href="docs/en/README.md">English</a>] | [<a href="docs/ja/README.md">日本語</a>]
 </p>

-**CowAgent** 是基于大模型的超级 AI 助理，能够主动思考和任务规划、操作计算机和外部资源、创造和执行 Skills、拥有长期记忆并不断成长，比 OpenClaw 更轻量和便捷。CowAgent 支持灵活切换多种模型，能处理文本、语音、图片、文件等多模态消息，可接入微信、飞书、钉钉、企微智能机器人、QQ、企微自建应用、微信公众号、网页中使用，7*24小时运行于你的个人电脑或服务器中。
+**CowAgent** 是基于大模型的超级 AI 助理，能够主动思考和任务规划、操作计算机和外部资源、创造和执行 Skills、拥有长期记忆和知识库并不断成长，比 OpenClaw 更轻量和便捷。CowAgent 支持灵活切换多种模型，能处理文本、语音、图片、文件等多模态消息，可接入微信、飞书、钉钉、企微智能机器人、QQ、企微自建应用、微信公众号、网页中使用，7*24小时运行于你的个人电脑或服务器中。

 <p align="center">
  <a href="https://cowagent.ai/">🌐 官网</a> &nbsp;·&nbsp;
@@ -23,12 +23,13 @@
 > 该项目既是一个可以开箱即用的超级 AI 助理，也是一个支持高扩展的 Agent 框架，可以通过为项目扩展大模型接口、接入渠道、内置工具、Skills 系统来灵活实现各种定制需求。核心能力如下：

 -  ✅  **自主任务规划**：能够理解复杂任务并自主规划执行，持续思考和调用工具直到完成目标
-  ✅  **长期记忆：** 自动将对话记忆持久化至本地文件和数据库中，包括核心记忆和日级记忆，支持关键词及向量检索
+-  ✅  **长期记忆：** 自动将对话记忆持久化至本地文件和数据库中，包括核心记忆、日级记忆和梦境蒸馏，支持关键词及向量检索
+-  ✅  **个人知识库：** 自动整理结构化知识，通过交叉引用构建知识图谱，支持通过对话管理和可视化浏览知识库
 -  ✅  **技能系统：** Skills 安装和运行的引擎，支持从 [Skill Hub](https://skills.cowagent.ai/)、GitHub 等一键安装技能，或通过对话创造 Skills
-  ✅  **工具系统：** 内置文件读写、终端执行、浏览器操作、定时任务等工具，Agent 自主调用以完成复杂任务
+-  ✅  **工具系统：** 内置文件读写、终端执行、浏览器操作、定时任务等工具，支持 MCP 协议，通过 Agent 自主调用完成复杂任务
 -  ✅  **CLI系统：** 提供终端命令和对话命令，支持进程管理、技能安装、配置修改等操作
 -  ✅  **多模态消息：** 支持对文本、图片、语音、文件等多类型消息进行解析、处理、生成、发送等操作
-  ✅  **多模型支持：** 支持 OpenAI, Claude, Gemini, DeepSeek, MiniMax、GLM、Qwen、Kimi、Doubao 等国内外主流模型厂商
+-  ✅  **多模型支持：** 支持 DeepSeek、MiniMax、Claude、Gemini、OpenAI、GLM、Qwen、Doubao、Kimi 等国内外主流模型厂商
 -  ✅  **多通道接入：** 支持运行在本地计算机或服务器，可集成到微信、飞书、钉钉、企业微信、QQ、微信公众号、网页中使用

 ## 声明
@@ -69,17 +70,25 @@

 # 🏷 更新日志

->**2026.04.01：** [2.0.5版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.5)，Cow CLI 命令系统、Skill Hub 开源、浏览器工具、企微扫码创建、多项优化和修复。
+>**2026.05.22：** [2.0.9版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.9)，新增模型管理、MCP 协议支持、浏览器登录态持久化、新模型接入（gpt-5.5、gemini-3.5-flash、qwen3.7-max 等）、部署安全加固

->**2026.03.22：** [2.0.4版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.4)，新增个人微信通道（微信扫码即用）、新增 MiniMax-M2.7 和 GLM-5-Turbo 模型、run.sh 脚本重构、日文文档及多项修复。
+>**2026.05.06：** [2.0.8版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.8)，飞书渠道全面升级（语音、流式输出和Markdown、一键扫码接入）、新模型支持（DeepSeek V4、百度千帆）、定时任务工具增强等

->**2026.03.18：** [2.0.3版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.3)，新增企微智能机器人和 QQ 通道、支持 Coding Plan、新增多个模型、Web 端文件处理、记忆系统升级。
+>**2026.04.22：** [2.0.7版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.7)，图像生成内置技能（GPT Image 2、Nano Banana 等）、新模型支持（Kimi K2.6、Claude Opus 4.7、GLM 5.1）、知识库和记忆增强、Web 控制台优化

->**2026.02.27：** [2.0.2版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.2)，Web 控制台全面升级（流式对话、模型/技能/记忆/通道/定时任务/日志管理）、支持多通道同时运行、会话持久化存储、新增多个模型。
+>**2026.04.14：** [2.0.6版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6)，知识库系统、梦境记忆模块、上下文智能压缩、Web 控制台多会话及多项优化。

->**2026.02.13：** [2.0.1版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.1)，内置 Web Search 工具、智能上下文裁剪策略、运行时信息动态更新、Windows 兼容性适配，修复定时任务记忆丢失、飞书连接等多项问题。
+>**2026.04.01：** [2.0.5版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5)，Cow CLI 命令系统、Skill Hub 开源、浏览器工具、企微扫码创建、多项优化和修复。

->**2026.02.03：** [2.0.0版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.0)，正式升级为超级 Agent 助理，支持多轮任务决策、具备长期记忆、实现多种系统工具、支持 Skills 框架，新增多种模型并优化了接入渠道。
+>**2026.03.22：** [2.0.4版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.4)，新增个人微信通道（微信扫码即用）、新增 MiniMax-M2.7 和 GLM-5-Turbo 模型、run.sh 脚本重构、日文文档及多项修复。
+
+>**2026.03.18：** [2.0.3版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.3)，新增企微智能机器人和 QQ 通道、支持 Coding Plan、新增多个模型、Web 端文件处理、记忆系统升级。
+
+>**2026.02.27：** [2.0.2版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.2)，Web 控制台全面升级（流式对话、模型/技能/记忆/通道/定时任务/日志管理）、支持多通道同时运行、会话持久化存储、新增多个模型。
+
+>**2026.02.13：** [2.0.1版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.1)，内置 Web Search 工具、智能上下文裁剪策略、运行时信息动态更新、Windows 兼容性适配，修复定时任务记忆丢失、飞书连接等多项问题。
+
+>**2026.02.03：** [2.0.0版本](https://github.com/zhayujie/CowAgent/releases/tag/2.0.0)，正式升级为超级 Agent 助理，支持多轮任务决策、具备长期记忆、实现多种系统工具、支持 Skills 框架，新增多种模型并优化了接入渠道。

 更多更新历史请查看: [更新日志](https://docs.cowagent.ai/releases)

@@ -110,24 +119,24 @@ irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex

 项目支持国内外主流厂商的模型接口，可选模型及配置说明参考：[模型说明](#模型说明)。

-> 注：Agent 模式下推荐使用以下模型，可根据效果及成本综合选择：MiniMax-M2.7、glm-5-turbo、kimi-k2.5、qwen3.5-plus、claude-sonnet-4-6、gemini-3.1-pro-preview、gpt-5.4、gpt-5.4-mini
+> 注：Agent 模式下推荐使用以下模型，可根据效果及成本综合选择：deepseek-v4-flash / pro、MiniMax-M2.7、glm-5.1、kimi-k2.6、qwen3.5-plus、claude-sonnet-4-6、gemini-3.5-flash、gpt-5.5、ernie-5.1 等

 同时支持使用 **LinkAI 平台** 接口，支持上述全部模型，并支持知识库、工作流、插件等 Agent 技能，参考 [接口文档](https://docs.link-ai.tech/platform/api)。

 ### 2.环境安装

-支持 Linux、MacOS、Windows 操作系统，可在个人计算机及服务器上运行，需安装 `Python`，Python 版本需在3.7 ~ 3.12 之间。
+支持 Linux、MacOS、Windows 操作系统，可在个人计算机及服务器上运行，需安装 `Python`，Python 版本需在 3.7 ~ 3.13 之间。

 > 注意：Agent 模式推荐使用源码运行，若选择 Docker 部署则无需安装 python 环境和下载源码，可直接快进到下一节。

 **(1) 克隆项目代码：**

 ```bash
-git clone https://github.com/zhayujie/chatgpt-on-wechat
-cd chatgpt-on-wechat/
+git clone https://github.com/zhayujie/CowAgent
+cd CowAgent/
 ```

-若遇到网络问题可使用国内仓库地址：https://gitee.com/zhayujie/chatgpt-on-wechat
+若遇到网络问题可使用国内仓库地址：https://gitee.com/zhayujie/CowAgent

 **(2) 安装核心依赖 (必选)：**

@@ -177,7 +186,9 @@ cow install-browser
 # config.json 文件内容示例
 {
  "channel_type": "weixin",                                   # 接入渠道类型，默认为 weixin, 支持修改为 feishu,dingtalk,wecom_bot,qq,wechatcom_app,wechatmp_service,wechatmp,terminal
-  "model": "MiniMax-M2.7",                                    # 模型名称
+  "model": "deepseek-v4-flash",                                # 模型名称
+  "deepseek_api_key": "",                                      # DeepSeek API Key
+  "deepseek_api_base": "https://api.deepseek.com/v1",         # DeepSeek API 地址
  "minimax_api_key": "",                                      # MiniMax API Key
  "zhipu_ai_api_key": "",                                     # 智谱 GLM API Key
  "moonshot_api_key": "",                                     # Kimi/Moonshot API Key
@@ -187,8 +198,6 @@ cow install-browser
  "claude_api_base": "https://api.anthropic.com/v1",          # Claude API 地址，修改可接入三方代理平台
  "gemini_api_key": "",                                       # Gemini API Key
  "gemini_api_base": "https://generativelanguage.googleapis.com", # Gemini API 地址
-  "deepseek_api_key": "",                                      # DeepSeek API Key
-  "deepseek_api_base": "https://api.deepseek.com/v1",         # DeepSeek API 地址，可修改为第三方代理
  "open_ai_api_key": "",                                      # OpenAI API Key
  "open_ai_api_base": "https://api.openai.com/v1",            # OpenAI API 地址
  "linkai_api_key": "",                                       # LinkAI API Key
@@ -197,11 +206,13 @@ cow install-browser
  "group_speech_recognition": false,                          # 是否开启群组语音识别
  "voice_reply_voice": false,                                 # 是否使用语音回复语音
  "use_linkai": false,                                        # 是否使用 LinkAI 接口，默认关闭，设置为 true 后可对接 LinkAI 平台模型
+  "web_password": "",                                         # Web 控制台访问密码，留空则不启用密码保护（监听 0.0.0.0 时务必设置）
  "agent": true,                                              # 是否启用 Agent 模式，启用后拥有多轮工具决策、长期记忆、Skills 能力等
  "agent_workspace": "~/cow",                                 # Agent 的工作空间路径，用于存储 memory、skills、系统设定等
-  "agent_max_context_tokens": 40000,                          # Agent 模式下最大上下文 tokens，超出将自动丢弃最早的上下文
-  "agent_max_context_turns": 30,                              # Agent 模式下最大上下文记忆轮次，每轮包括一次用户提问和 AI 回复
-  "agent_max_steps": 15                                       # Agent 模式下单次任务的最大决策步数，超出后将停止继续调用工具
+  "agent_max_context_tokens": 50000,                          # Agent 模式下最大上下文 tokens，超出将自动智能压缩处理
+  "agent_max_context_turns": 20,                              # Agent 模式下最大上下文记忆轮次，一问一答为一轮，超出后智能压缩处理
+  "agent_max_steps": 20,                                      # Agent 模式下单次任务的最大决策步数，超出后将停止继续调用工具
+  "enable_thinking": false                                    # 是否启用深度思考模式
 }
 ```

@@ -213,12 +224,13 @@ cow install-browser
 + 添加 `"speech_recognition": true` 将开启语音识别，默认使用 openai 的 whisper 模型识别为文字，同时以文字回复，该参数仅支持私聊 (注意由于语音消息无法匹配前缀，一旦开启将对所有语音自动回复，支持语音触发画图)；
 + 添加 `"group_speech_recognition": true` 将开启群组语音识别，默认使用 openai 的 whisper 模型识别为文字，同时以文字回复，参数仅支持群聊 (会匹配 group_chat_prefix 和 group_chat_keyword, 支持语音触发画图)；
 + 添加 `"voice_reply_voice": true` 将开启语音回复语音（同时作用于私聊和群聊）
+ 使用 MiniMax TTS：设置 `"text_to_voice": "minimax"`，并配置 `minimax_api_key`；可通过 `"tts_voice_id"` 指定发音人（如 `English_Graceful_Lady`），`"text_to_voice_model"` 指定模型（如 `speech-2.8-hd`、`speech-2.8-turbo`）
 </details>

 <details>
 <summary>2. 其他配置</summary>

-+ `model`: 模型名称，Agent 模式下推荐使用 `MiniMax-M2.7`、`glm-5-turbo`、`kimi-k2.5`、`qwen3.6-plus`、`claude-sonnet-4-6`、`gemini-3.1-pro-preview`，全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `model`: 模型名称，Agent 模式下推荐使用 `deepseek-v4-flash`、`MiniMax-M2.7`、`glm-5.1`、`kimi-k2.6`、`qwen3.6-plus`、`claude-sonnet-4-6`、`gemini-3.1-pro-preview`，全部模型名称参考[common/const.py](https://github.com/zhayujie/CowAgent/blob/master/common/const.py)文件
 + `character_desc`：普通对话模式下的机器人系统提示词。在 Agent 模式下该配置不生效，由工作空间中的文件内容构成。
 + `subscribe_msg`：订阅消息，公众号和企业微信 channel 中请填写，当被订阅时会自动回复， 可使用特殊占位符。目前支持的占位符有{trigger_prefix}，在程序中它会自动替换成 bot 的触发词。
 </details>
@@ -230,7 +242,7 @@ cow install-browser
 + `linkai_api_key`: LinkAI Api Key，可在 [控制台](https://link-ai.tech/console/interface) 创建
 </details>

-注：全部配置项说明可在 [`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py) 文件中查看。
+注：全部配置项说明可在 [`config.py`](https://github.com/zhayujie/CowAgent/blob/master/config.py) 文件中查看。

 ## 三、运行

@@ -305,6 +317,97 @@ sudo docker logs -f chatgpt-on-wechat

 推荐通过 Web 控制台在线管理模型配置，无需手动编辑文件，详见 [模型文档](https://docs.cowagent.ai/models)。以下是手动修改 `config.json` 配置模型的说明：

+<details>
+<summary>DeepSeek</summary>
+
+1. API Key 创建：在 [DeepSeek 平台](https://platform.deepseek.com/api_keys) 创建 API Key
+
+2. 填写配置
+
+方式一：官方接入（推荐）：
+
+```json
+{
+    "model": "deepseek-v4-flash",
+    "deepseek_api_key": "sk-xxxxxxxxxxx"
+}
+```
+
+ - `model`: 推荐填写 `deepseek-v4-flash`、`deepseek-v4-pro`
+ - `deepseek_api_key`: DeepSeek 平台的 API Key
+ - `deepseek_api_base`: 可选，默认为 `https://api.deepseek.com/v1`，可修改为第三方代理地址
+
+方式二：OpenAI 兼容方式接入：
+
+```json
+{
+    "model": "deepseek-v4-flash",
+    "bot_type": "openai",
+    "open_ai_api_key": "sk-xxxxxxxxxxx",
+    "open_ai_api_base": "https://api.deepseek.com/v1"
+}
+```
+
+</details>
+
+<details>
+<summary>MiniMax</summary>
+
+方式一：官方接入，配置如下(推荐)：
+
+```json
+{
+    "model": "MiniMax-M2.7",
+    "minimax_api_key": ""
+}
+```
+ - `model`: 可填写 `MiniMax-M2.7、MiniMax-M2.7-highspeed、MiniMax-M2.5、MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2、abab6.5-chat` 等
+ - `minimax_api_key`：MiniMax 平台的 API-KEY，在 [控制台](https://platform.minimaxi.com/user-center/basic-information/interface-key) 创建
+
+方式二：OpenAI 兼容方式接入，配置如下：
+```json
+{
+  "bot_type": "openai",
+  "model": "MiniMax-M2.7",
+  "open_ai_api_base": "https://api.minimaxi.com/v1",
+  "open_ai_api_key": ""
+}
+```
+- `bot_type`: OpenAI 兼容方式
+- `model`: 可填 `MiniMax-M2.7、MiniMax-M2.7-highspeed、MiniMax-M2.5、MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2`，参考[API文档](https://platform.minimaxi.com/document/%E5%AF%B9%E8%AF%9D?key=66701d281d57f38758d581d0#QklxsNSbaf6kM4j6wjO5eEek)
+- `open_ai_api_base`: MiniMax 平台 API 的 BASE URL
+- `open_ai_api_key`: MiniMax 平台的 API-KEY
+</details>
+
+<details>
+<summary>Claude</summary>
+
+1. API Key 创建：在 [Claude控制台](https://console.anthropic.com/settings/keys) 创建 API Key
+
+2. 填写配置
+
+```json
+{
+    "model": "claude-sonnet-4-6",
+    "claude_api_key": "YOUR_API_KEY"
+}
+```
+ - `model`: 参考 [官方模型ID](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases) ，支持 `claude-sonnet-4-6、claude-opus-4-7、claude-opus-4-6、claude-sonnet-4-5、claude-sonnet-4-0、claude-opus-4-0、claude-3-5-sonnet-latest` 等
+</details>
+
+<details>
+<summary>Gemini</summary>
+
+API Key 创建：在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn) 创建 API Key ，配置如下
+```json
+{
+    "model": "gemini-3.1-flash-lite-preview",
+    "gemini_api_key": ""
+}
+```
+ - `model`: 参考[官方文档-模型列表](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn)，支持 `gemini-3.1-flash-lite-preview、gemini-3.1-pro-preview、gemini-3-flash-preview、gemini-3-pro-preview` 等
+</details>
+
 <details>
 <summary>OpenAI</summary>

@@ -326,55 +429,6 @@ sudo docker logs -f chatgpt-on-wechat
 - `bot_type`: 使用 OpenAI 相关模型时无需填写。当使用第三方代理接口接入 Claude 等非 OpenAI 官方模型时，该参数设为 `openai`
 </details>

-<details>
-<summary>LinkAI</summary>
-
-1. API Key 创建：在 [LinkAI平台](https://link-ai.tech/console/interface) 创建 API Key 
-
-2. 填写配置
-
-```json
-{
-    "model": "gpt-5.4-mini",
-    "use_linkai": true,
-    "linkai_api_key": "YOUR API KEY"
-}
-```
-
-+ `use_linkai`: 是否使用 LinkAI 接口，默认关闭，设置为 true 后可对接 LinkAI 平台的模型，并使用知识库、工作流、数据库、插件等丰富的 Agent 技能
-+ `linkai_api_key`: LinkAI 平台的 API Key，可在 [控制台](https://link-ai.tech/console/interface) 中创建
-+ `model`: [模型列表](https://link-ai.tech/console/models)中的全部模型均可使用
-</details>
-
-<details>
-<summary>MiniMax</summary>
-
-方式一：官方接入，配置如下(推荐)：
-
-```json
-{
-    "model": "MiniMax-M2.7",
-    "minimax_api_key": ""
-}
-```
- - `model`: 可填写 `MiniMax-M2.7、MiniMax-M2.5、MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2、abab6.5-chat` 等
- - `minimax_api_key`：MiniMax 平台的 API-KEY，在 [控制台](https://platform.minimaxi.com/user-center/basic-information/interface-key) 创建
-
-方式二：OpenAI 兼容方式接入，配置如下：
-```json
-{
-  "bot_type": "openai",
-  "model": "MiniMax-M2.7",
-  "open_ai_api_base": "https://api.minimaxi.com/v1",
-  "open_ai_api_key": ""
-}
-```
- `bot_type`: OpenAI 兼容方式
- `model`: 可填 `MiniMax-M2.7、MiniMax-M2.5、MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2`，参考[API文档](https://platform.minimaxi.com/document/%E5%AF%B9%E8%AF%9D?key=66701d281d57f38758d581d0#QklxsNSbaf6kM4j6wjO5eEek)
- `open_ai_api_base`: MiniMax 平台 API 的 BASE URL
- `open_ai_api_key`: MiniMax 平台的 API-KEY
-</details>
-
 <details>
 <summary>智谱AI (GLM)</summary>

@@ -382,24 +436,24 @@ sudo docker logs -f chatgpt-on-wechat

 ```json
 {
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "zhipu_ai_api_key": ""
 }
 ```
- - `model`: 可填 `glm-5-turbo、glm-5、glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等, 参考 [glm 系列模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4)
+ - `model`: 可填 `glm-5.1、glm-5-turbo、glm-5、glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等, 参考 [glm 系列模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4)
 - `zhipu_ai_api_key`: 智谱AI 平台的 API KEY，在 [控制台](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) 创建

 方式二：OpenAI 兼容方式接入，配置如下：
 ```json
 {
  "bot_type": "openai",
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
  "open_ai_api_key": ""
 }
 ```
 - `bot_type`: OpenAI 兼容方式
- `model`: 可填 `glm-5-turbo、glm-5、glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等
+- `model`: 可填 `glm-5.1、glm-5-turbo、glm-5、glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等
 - `open_ai_api_base`: 智谱AI 平台的 BASE URL
 - `open_ai_api_key`: 智谱AI 平台的 API KEY
 </details>
@@ -433,35 +487,6 @@ sudo docker logs -f chatgpt-on-wechat
 - `open_ai_api_key`: 通义千问的 API-KEY
 </details>

-<details>
-<summary>Kimi (Moonshot)</summary>
-
-方式一：官方接入，配置如下：
-
-```json
-{
-    "model": "kimi-k2.5",
-    "moonshot_api_key": ""
-}
-```
- - `model`: 可填写 `kimi-k2.5、kimi-k2、moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- - `moonshot_api_key`: Moonshot 的 API-KEY，在 [控制台](https://platform.moonshot.cn/console/api-keys) 创建
- 
-方式二：OpenAI 兼容方式接入，配置如下：
-```json
-{
-  "bot_type": "openai",
-  "model": "kimi-k2.5",
-  "open_ai_api_base": "https://api.moonshot.cn/v1",
-  "open_ai_api_key": ""
-}
-```
- `bot_type`: OpenAI 兼容方式
- `model`: 可填写 `kimi-k2.5、kimi-k2、moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- `open_ai_api_base`: Moonshot 的 BASE URL
- `open_ai_api_key`: Moonshot 的 API-KEY
-</details>
-
 <details>
 <summary>豆包 (Doubao)</summary>

@@ -481,67 +506,74 @@ sudo docker logs -f chatgpt-on-wechat
 </details>

 <details>
-<summary>Claude</summary>
+<summary>Kimi (Moonshot)</summary>

-1. API Key 创建：在 [Claude控制台](https://console.anthropic.com/settings/keys) 创建 API Key
+方式一：官方接入，配置如下：
+
+```json
+{
+    "model": "kimi-k2.6",
+    "moonshot_api_key": ""
+}
+```
+ - `model`: 可填写 `kimi-k2.6、kimi-k2.5、kimi-k2、moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
+ - `moonshot_api_key`: Moonshot 的 API-KEY，在 [控制台](https://platform.moonshot.cn/console/api-keys) 创建
+
+方式二：OpenAI 兼容方式接入，配置如下：
+```json
+{
+  "bot_type": "openai",
+  "model": "kimi-k2.6",
+  "open_ai_api_base": "https://api.moonshot.cn/v1",
+  "open_ai_api_key": ""
+}
+```
+- `bot_type`: OpenAI 兼容方式
+- `model`: 可填写 `kimi-k2.6、kimi-k2.5、kimi-k2、moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
+- `open_ai_api_base`: Moonshot 的 BASE URL
+- `open_ai_api_key`: Moonshot 的 API-KEY
+</details>
+
+<details>
+<summary>ModelScope</summary>
+
+```json
+{
+  "bot_type": "modelscope",
+  "model": "Qwen/QwQ-32B",
+  "modelscope_api_key": "your_api_key",
+  "modelscope_base_url": "https://api-inference.modelscope.cn/v1/chat/completions",
+  "text_to_image": "MusePublic/489_ckpt_FLUX_1"
+}
+```
+
+- `bot_type`: modelscope 接口格式
+- `model`: 参考[模型列表](https://www.modelscope.cn/models?filter=inference_type&page=1)
+- `modelscope_api_key`: 参考 [官方文档-访问令牌](https://modelscope.cn/docs/accounts/token) ，在 [控制台](https://modelscope.cn/my/myaccesstoken)
+- `modelscope_base_url`: modelscope 平台的 BASE URL
+- `text_to_image`: 图像生成模型，参考[模型列表](https://www.modelscope.cn/models?filter=inference_type&page=1)
+</details>
+
+<details>
+<summary>LinkAI</summary>
+
+1. API Key 创建：在 [LinkAI平台](https://link-ai.tech/console/interface) 创建 API Key

 2. 填写配置

 ```json
 {
-    "model": "claude-sonnet-4-6",
-    "claude_api_key": "YOUR_API_KEY"
+    "model": "gpt-5.4-mini",
+    "use_linkai": true,
+    "linkai_api_key": "YOUR API KEY"
 }
 ```
- - `model`: 参考 [官方模型ID](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases) ，支持 `claude-sonnet-4-6、claude-opus-4-6、claude-sonnet-4-5、claude-sonnet-4-0、claude-opus-4-0、claude-3-5-sonnet-latest` 等
+
+ `use_linkai`: 是否使用 LinkAI 接口，默认关闭，设置为 true 后可对接 LinkAI 平台的模型，并使用知识库、工作流、数据库、插件等丰富的 Agent 技能
+ `linkai_api_key`: LinkAI 平台的 API Key，可在 [控制台](https://link-ai.tech/console/interface) 中创建
+ `model`: [模型列表](https://link-ai.tech/console/models)中的全部模型均可使用
 </details>

-<details>
-<summary>Gemini</summary>
-
-API Key 创建：在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn) 创建 API Key ，配置如下
-```json
-{
-    "model": "gemini-3.1-flash-lite-preview",
-    "gemini_api_key": ""
-}
-```
- - `model`: 参考[官方文档-模型列表](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn)，支持 `gemini-3.1-flash-lite-preview、gemini-3.1-pro-preview、gemini-3-flash-preview、gemini-3-pro-preview` 等
-</details>
-
-<details>
-<summary>DeepSeek</summary>
-
-1. API Key 创建：在 [DeepSeek 平台](https://platform.deepseek.com/api_keys) 创建 API Key 
-
-2. 填写配置
-
-方式一：官方接入（推荐）：
-
-```json
-{
-    "model": "deepseek-chat",
-    "deepseek_api_key": "sk-xxxxxxxxxxx"
-}
-```
-
- - `model`: 可填 `deepseek-chat、deepseek-reasoner`，分别对应的是 DeepSeek-V3.2（非思考模式）和 DeepSeek-R1（思考模式）
- - `deepseek_api_key`: DeepSeek 平台的 API Key
- - `deepseek_api_base`: 可选，默认为 `https://api.deepseek.com/v1`，可修改为第三方代理地址
-
-方式二：OpenAI 兼容方式接入：
-
-```json
-{
-    "model": "deepseek-chat",
-    "bot_type": "openai",
-    "open_ai_api_key": "sk-xxxxxxxxxxx",
-    "open_ai_api_base": "https://api.deepseek.com/v1"
-}
-```
-
- </details>
-
 <details>
 <summary>Azure</summary>

@@ -569,33 +601,35 @@ API Key 创建：在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn
 </details>

 <details>
-<summary>百度文心</summary>
-方式一：官方 SDK 接入，配置如下：
+<summary>百度千帆 / ERNIE</summary>
+
+方式一：官方接入（推荐），配置如下：

 ```json
 {
-    "model": "wenxin-4", 
-    "baidu_wenxin_api_key": "IajztZ0bDxgnP9bEykU7lBer",
-    "baidu_wenxin_secret_key": "EDPZn6L24uAS9d8RWFfotK47dPvkjD6G"
+  "model": "ernie-5.1",
+  "qianfan_api_key": "",
+  "qianfan_api_base": "https://qianfan.baidubce.com/v2"
 }
 ```
- - `model`: 可填 `wenxin`和`wenxin-4`，对应模型为 文心-3.5 和 文心-4.0
- - `baidu_wenxin_api_key`：参考 [千帆平台-access_token鉴权](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/dlv4pct3s) 文档获取 API Key
- - `baidu_wenxin_secret_key`：参考 [千帆平台-access_token鉴权](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/dlv4pct3s) 文档获取 Secret Key
+
+ - `model`: 默认推荐填写 `ernie-5.1`（多模态，可直接识图），也可填写 `ernie-5.0`、`ernie-x1.1`、`ernie-4.5-turbo-128k`、`ernie-4.5-turbo-32k`；当主模型为纯文本 ERNIE 时，Vision 工具会自动 fallback 到 `ernie-4.5-turbo-vl`
+ - `qianfan_api_key`: 百度千帆 API Key，通常以 `bce-v3/` 开头，可在百度智能云控制台创建
+ - `qianfan_api_base`: 可选，默认为 `https://qianfan.baidubce.com/v2`

 方式二：OpenAI 兼容方式接入，配置如下：
 ```json
 {
  "bot_type": "openai",
-  "model": "ERNIE-4.0-Turbo-8K",
+  "model": "ernie-5.1",
  "open_ai_api_base": "https://qianfan.baidubce.com/v2",
-  "open_ai_api_key": "bce-v3/ALTxxxxxxd2b"
+  "open_ai_api_key": ""
 }
 ```
 - `bot_type`: OpenAI 兼容方式
- `model`: 支持官方所有模型，参考[模型列表](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Wm9cvy6rl)
- `open_ai_api_base`: 百度文心 API 的 BASE URL
- `open_ai_api_key`: 百度文心的 API-KEY，参考 [官方文档](https://cloud.baidu.com/doc/qianfan-api/s/ym9chdsy5) ，在 [控制台](https://console.bce.baidu.com/iam/#/iam/apikey/list) 创建 API Key
+- `model`: 支持千帆平台上的 ERNIE 模型
+- `open_ai_api_base`: 百度千帆 OpenAI 兼容 API 的 BASE URL
+- `open_ai_api_key`: 百度千帆 API Key

 </details>

@@ -634,26 +668,6 @@ API Key 创建：在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn
 - `open_ai_api_key`: 讯飞星火平台的[APIPassword](https://console.xfyun.cn/services/bm3) ，因模型而已
 </details>

-<details>
-<summary>ModelScope</summary>
-
-```json
-{
-  "bot_type": "modelscope",
-  "model": "Qwen/QwQ-32B",
-  "modelscope_api_key": "your_api_key",
-  "modelscope_base_url": "https://api-inference.modelscope.cn/v1/chat/completions",
-  "text_to_image": "MusePublic/489_ckpt_FLUX_1"
-}
-```
-
- `bot_type`: modelscope 接口格式
- `model`: 参考[模型列表](https://www.modelscope.cn/models?filter=inference_type&page=1)
- `modelscope_api_key`: 参考 [官方文档-访问令牌](https://modelscope.cn/docs/accounts/token) ，在 [控制台](https://modelscope.cn/my/myaccesstoken) 
- `modelscope_base_url`: modelscope 平台的 BASE URL
- `text_to_image`: 图像生成模型，参考[模型列表](https://www.modelscope.cn/models?filter=inference_type&page=1)
-</details>
-
 <details>
 <summary>Coding Plan</summary>

@@ -703,48 +717,42 @@ Coding Plan 是各厂商推出的编程包月套餐，所有厂商均可通过 O
 ```json
 {
    "channel_type": "web",
+    "web_host": "0.0.0.0",
+    "web_password": "YOUR PASSWORD",
    "web_port": 9899
 }
 ```

+- `web_host`: 监听地址，默认 `127.0.0.1`（仅本机），如需公网访问请改为 `0.0.0.0` 并设置密码
 - `web_port`: 默认为 9899，可按需更改，需要服务器防火墙和安全组放行该端口
- 如本地运行，启动后请访问 `http://localhost:9899/chat` ；如服务器运行，请访问 `http://ip:9899/chat` 
+- `web_password`: 访问密码，留空则不启用密码保护。部署在公网环境时请务必设置
+- 如本地运行，启动后请访问 `http://localhost:9899` ；如服务器运行，请访问 `http://YOUR_IP:9899`
 > 注：请将上述 url 中的 ip 或者 port 替换为实际的值
 </details>

 <details>
 <summary>3. Feishu - 飞书</summary>

-飞书支持两种事件接收模式：WebSocket 长连接（推荐）和 Webhook。
+飞书使用 WebSocket 长连接模式，无需公网 IP。详细步骤参考 [飞书接入](https://docs.cowagent.ai/channels/feishu)。

-**方式一：WebSocket 模式（推荐，无需公网 IP）**
+**方式一：扫码一键创建（推荐）**
+
+启动 Cow 后打开 Web 控制台，**通道** → **接入通道** → 选择 **飞书** → 扫码创建。也支持 CLI 启动时在终端打印二维码。
+
+**方式二：手动配置**
+
+在飞书开放平台创建自建应用并配置权限后，将凭据填入 `config.json`：

 ```json
 {
    "channel_type": "feishu",
    "feishu_app_id": "APP_ID",
    "feishu_app_secret": "APP_SECRET",
-    "feishu_event_mode": "websocket"
+    "feishu_stream_reply": true
 }
 ```

-**方式二：Webhook 模式（需要公网 IP）**
-
-```json
-{
-    "channel_type": "feishu",
-    "feishu_app_id": "APP_ID",
-    "feishu_app_secret": "APP_SECRET",
-    "feishu_token": "VERIFICATION_TOKEN",
-    "feishu_event_mode": "webhook",
-    "feishu_port": 9891
-}
-```
-
- `feishu_event_mode`: 事件接收模式，`websocket`（推荐）或 `webhook`
- WebSocket 模式需安装依赖：`pip3 install lark-oapi`
-
-详细步骤和参数说明参考 [飞书接入](https://docs.cowagent.ai/channels/feishu)
+- `feishu_stream_reply`：是否开启流式打字机回复，默认开启（需 `cardkit:card:write` 权限 + 飞书客户端 ≥ 7.20）

 </details>

@@ -766,7 +774,15 @@ Coding Plan 是各厂商推出的编程包月套餐，所有厂商均可通过 O
 <details>
 <summary>5. WeCom Bot - 企微智能机器人</summary>

-企微智能机器人使用 WebSocket 长连接模式，无需公网 IP 和域名，配置简单：
+企微智能机器人使用 WebSocket 长连接模式，无需公网 IP 和域名。详细步骤参考 [企微智能机器人接入](https://docs.cowagent.ai/channels/wecom-bot)。
+
+**方式一：扫码一键创建（推荐）**
+
+启动 Cow 后打开 Web 控制台，**通道** → **接入通道** → 选择 **企微智能机器人** → 使用企业微信扫码创建。
+
+**方式二：手动配置**
+
+在企业微信中创建智能机器人并选择**长连接模式**，记录 Bot ID 和 Secret 后填入 `config.json`：

 ```json
 {
@@ -775,7 +791,6 @@ Coding Plan 是各厂商推出的编程包月套餐，所有厂商均可通过 O
    "wecom_bot_secret": "YOUR_SECRET"
 }
 ```
-详细步骤和参数说明参考 [企微智能机器人接入](https://docs.cowagent.ai/channels/wecom-bot)

 </details>

@@ -878,18 +893,28 @@ QQ 机器人使用 WebSocket 长连接模式，无需公网 IP 和域名，支

 # 🔎 常见问题

-FAQs： <https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs>
+FAQs： <https://github.com/zhayujie/CowAgent/wiki/FAQs>

 或直接在线咨询 [项目小助手](https://link-ai.tech/app/Kv2fXJcH)  (知识库持续完善中，回复供参考)

 # 🛠️ 开发

-欢迎接入更多应用通道，参考 [飞书通道](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py) 新增自定义通道，实现接收和发送消息逻辑即可完成接入。同时欢迎贡献新的 Skills，向 [Skill Hub](https://skills.cowagent.ai/submit) 提交技能。
+欢迎接入更多应用通道，参考 [飞书通道](https://github.com/zhayujie/CowAgent/blob/master/channel/feishu/feishu_channel.py) 新增自定义通道，实现接收和发送消息逻辑即可完成接入。同时欢迎贡献新的 Skills，向 [Skill Hub](https://skills.cowagent.ai/submit) 提交技能。

 # ✉ 联系

-欢迎提交PR、Issues进行反馈，以及通过 🌟Star 支持并关注项目更新。项目运行遇到问题可以查看 [常见问题列表](https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs) ，以及前往 [Issues](https://github.com/zhayujie/chatgpt-on-wechat/issues) 中搜索。个人开发者可加入开源交流群参与更多讨论，企业用户可联系[产品客服](https://cdn.link-ai.tech/portal/linkai-customer-service.png)咨询。
+欢迎提交PR、Issues进行反馈，以及通过 🌟Star 支持并关注项目更新。项目运行遇到问题可以查看 [常见问题列表](https://github.com/zhayujie/CowAgent/wiki/FAQs) ，以及前往 [Issues](https://github.com/zhayujie/CowAgent/issues) 中搜索。个人开发者可加入开源交流群参与更多讨论，企业用户可联系[产品客服](https://cdn.link-ai.tech/portal/linkai-customer-service.png)咨询。

 # 🌟 贡献者

-![cow contributors](https://contrib.rocks/image?repo=zhayujie/chatgpt-on-wechat&max=1000)
+![cow contributors](https://contrib.rocks/image?repo=zhayujie/CowAgent&max=1000)
+
+# 📌 项目更名说明
+
+本项目原名 `chatgpt-on-wechat`（GitHub 原地址：https://github.com/zhayujie/chatgpt-on-wechat ），
+于 2026.04.13 正式更名为 **CowAgent**。GitHub 已自动设置重定向，原有链接仍可正常访问。
+
+如需更新本地仓库的远程地址（可选）：
+```bash
+git remote set-url origin https://github.com/zhayujie/CowAgent.git
+```
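The `web_host` / `web_password` guidance added to the README in this diff can be checked mechanically before startup. The following is a minimal sketch, assuming the configuration has already been loaded into a plain dict. `check_web_exposure` and `LOOPBACK_HOSTS` are hypothetical names for illustration and are not part of CowAgent; only the key names and the defaults (`127.0.0.1`, `9899`) come from the README text.

```python
from typing import Dict, List

# Hypothetical helper for illustration; not a CowAgent API.
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def check_web_exposure(config: Dict[str, object]) -> List[str]:
    """Return warnings for a web-channel config that listens publicly without a password."""
    warnings: List[str] = []
    host = str(config.get("web_host") or "127.0.0.1")  # README default: local machine only
    password = str(config.get("web_password") or "")   # empty means no password protection
    port = config.get("web_port", 9899)                 # README default port
    if host not in LOOPBACK_HOSTS and not password:
        warnings.append(
            f"web console listens on {host}:{port} without web_password; "
            "set one before exposing it"
        )
    return warnings
```

Calling `check_web_exposure({"web_host": "0.0.0.0", "web_password": ""})` yields one warning, while the default local-only configuration yields none.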
--- a/agent/chat/service.py
+++ b/agent/chat/service.py
@@ -57,7 +57,16 @@ class ChatService:
            event_type = event.get("type")
            data = event.get("data", {})

-            if event_type == "message_update":
+            if event_type == "reasoning_update":
+                delta = data.get("delta", "")
+                if delta:
+                    send_chunk_fn({
+                        "chunk_type": "reasoning",
+                        "delta": delta,
+                        "segment_id": state.segment_id,
+                    })
+
+            elif event_type == "message_update":
                # Incremental text delta
                delta = data.get("delta", "")
                if delta:
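The new `reasoning_update` branch forwards model reasoning deltas to the client as their own chunk type, checked before the regular `message_update` text. A minimal standalone sketch of that dispatch follows, assuming events are dicts with `type` and `data` keys as in the hunk above. `dispatch_event`, the `"content"` chunk type used for `message_update`, and the list-based sink are hypothetical stand-ins, since the rest of `ChatService` is not shown in this diff.

```python
from typing import Callable, Dict, List


def dispatch_event(event: Dict, segment_id: int, send_chunk_fn: Callable[[Dict], None]) -> None:
    """Forward one agent event to the client as a typed chunk (illustrative only)."""
    event_type = event.get("type")
    data = event.get("data", {})
    delta = data.get("delta", "")
    if not delta:
        return  # empty deltas are dropped, as in the hunk above
    if event_type == "reasoning_update":
        # Chunk shape taken from the diff: reasoning text gets its own chunk_type.
        send_chunk_fn({"chunk_type": "reasoning", "delta": delta, "segment_id": segment_id})
    elif event_type == "message_update":
        # Assumed shape: the real message_update payload is not visible in this hunk.
        send_chunk_fn({"chunk_type": "content", "delta": delta, "segment_id": segment_id})


sent: List[Dict] = []
dispatch_event({"type": "reasoning_update", "data": {"delta": "thinking..."}}, 0, sent.append)
dispatch_event({"type": "message_update", "data": {"delta": "Hello"}}, 0, sent.append)
dispatch_event({"type": "reasoning_update", "data": {}}, 0, sent.append)  # no delta, nothing sent
```

Keeping reasoning in a separate chunk type lets a client render it apart from the answer text (for example in a collapsible panel) without parsing the text stream.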
--- a/agent/chat/session_service.py
+++ b/agent/chat/session_service.py
@@ -0,0 +1,241 @@
+"""
+SessionService - Manages multi-session lifecycle for both web channel and cloud client.
+
+Provides a unified interface for listing, deleting, renaming, clearing context,
+and generating AI titles for conversation sessions. Backed by ConversationStore
+(SQLite) and AgentBridge (in-memory agent instances).
+"""
+
+import re
+from typing import Optional
+
+from common.log import logger
+
+
+def _truncate_fallback_title(user_message: str, max_len: int = 30) -> str:
+    """Pick the first non-empty line of the user message and truncate it."""
+    if not user_message:
+        return "New Chat"
+    first_line = ""
+    for line in user_message.splitlines():
+        line = line.strip()
+        if line:
+            first_line = line
+            break
+    if not first_line:
+        return "New Chat"
+    if len(first_line) > max_len:
+        first_line = first_line[:max_len].rstrip() + "..."
+    return first_line
+
+
+def generate_session_title(user_message: str, assistant_reply: str = "") -> str:
+    """
+    Generate a short session title by calling the current bot's reply_text.
+    Falls back to the first line of the user message if the LLM call fails
+    or returns an obvious error sentinel.
+    """
+    fallback = _truncate_fallback_title(user_message)
+    try:
+        from bridge.bridge import Bridge
+        from models.session_manager import Session
+        bot = Bridge().get_bot("chat")
+
+        prompt_parts = [f"User: {user_message[:300]}"]
+        if assistant_reply:
+            prompt_parts.append(f"Assistant: {assistant_reply[:300]}")
+
+        session = Session("__title_gen__", system_prompt="")
+        session.messages = [
+            {"role": "user", "content": (
+                "Generate a very short title (max 15 characters for Chinese, max 6 words for English) "
+                "summarizing this conversation. Return ONLY the title text, nothing else.\n\n"
+                + "\n".join(prompt_parts)
+            )}
+        ]
+
+        result = bot.reply_text(session) or {}
+        # When bots fail (network error, auth error, rate limit, etc.) they
+        # typically return completion_tokens=0 with a sentinel content like
+        # "请再问我一次吧" / "我现在有点累了". Treat that as failure.
+        completion_tokens = result.get("completion_tokens", 0) or 0
+        raw = (result.get("content") or "").strip()
+        if completion_tokens <= 0:
+            logger.warning(
+                f"[SessionService] Title generation got empty completion "
+                f"(completion_tokens={completion_tokens}, content='{raw[:50]}'), "
+                f"using fallback")
+            return fallback
+
+        title = re.sub(r'<think>.*?</think>', '', raw, flags=re.DOTALL).strip().strip('"\'')
+        logger.info(f"[SessionService] Title generation result: '{title}' (len={len(title)})")
+        if title and len(title) <= 50:
+            return title
+    except Exception as e:
+        logger.warning(f"[SessionService] Title generation failed: {e}")
+    return fallback
+
+
+class SessionService:
+    """
+    High-level service for session lifecycle management.
+
+    Usage:
+        svc = SessionService()
+        result = svc.dispatch("list_sessions", {"channel_type": "web", "page": 1})
+    """
+
+    def _get_store(self):
+        from agent.memory import get_conversation_store
+        return get_conversation_store()
+
+    def _remove_agent(self, session_id: str):
+        """Remove the in-memory Agent instance for a session if it exists."""
+        try:
+            from bridge.bridge import Bridge
+            ab = Bridge().get_agent_bridge()
+            if session_id in ab.agents:
+                del ab.agents[session_id]
+                logger.info(f"[SessionService] Removed agent instance: {session_id}")
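+        except Exception as e:
+            # Best-effort cleanup: record the failure instead of swallowing it silently
+            logger.debug(f"[SessionService] Failed to remove agent {session_id}: {e}")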
+
+    @staticmethod
+    def _normalize_sid(session_id: str) -> str:
+        if session_id and not session_id.startswith("session_"):
+            return f"session_{session_id}"
+        return session_id
+
+    # ------------------------------------------------------------------
+    # actions
+    # ------------------------------------------------------------------
+    def list_sessions(self, channel_type: Optional[str] = None,
+                      page: int = 1, page_size: int = 50) -> dict:
+        store = self._get_store()
+        return store.list_sessions(
+            channel_type=channel_type,
+            page=page,
+            page_size=page_size,
+        )
+
+    def delete_session(self, session_id: str) -> None:
+        if not session_id:
+            raise ValueError("session_id required")
+        session_id = self._normalize_sid(session_id)
+
+        store = self._get_store()
+        store.clear_session(session_id)
+        self._remove_agent(session_id)
+        logger.info(f"[SessionService] Session deleted: {session_id}")
+
+    def rename_session(self, session_id: str, title: str) -> None:
+        if not session_id:
+            raise ValueError("session_id required")
+        if not title:
+            raise ValueError("title required")
+        session_id = self._normalize_sid(session_id)
+
+        store = self._get_store()
+        found = store.rename_session(session_id, title)
+        if not found:
+            raise ValueError("session not found")
+
+    def clear_context(self, session_id: str) -> int:
+        """
+        Set context boundary. Returns the new context_start_seq value.
+        """
+        if not session_id:
+            raise ValueError("session_id required")
+        session_id = self._normalize_sid(session_id)
+
+        store = self._get_store()
+        new_seq = store.clear_context(session_id)
+        self._remove_agent(session_id)
+        return new_seq
+
+    def gen_title(self, session_id: str, user_message: str,
+                  assistant_reply: str = "") -> str:
+        """
+        Generate an AI title and persist it. Returns the generated title.
+        """
+        if not session_id:
+            raise ValueError("session_id required")
+        if not user_message:
+            raise ValueError("user_message required")
+        session_id = self._normalize_sid(session_id)
+
+        title = generate_session_title(user_message, assistant_reply)
+
+        store = self._get_store()
+        updated = store.rename_session(session_id, title)
+        logger.info(f"[SessionService] Title set: sid={session_id}, "
+                     f"title='{title}', db_updated={updated}")
+        return title
+
+    # ------------------------------------------------------------------
+    # dispatch — single entry point for protocol messages
+    # ------------------------------------------------------------------
+    def dispatch(self, action: str, payload: Optional[dict] = None) -> dict:
+        """
+        Dispatch a session management action and return a protocol-compatible
+        response dict.
+
+        Action names use a ``*_session`` / session-prefixed convention so they
+        can coexist with history actions (e.g. ``query``) on the same HISTORY
+        message channel without ambiguity.
+
+        Supported actions:
+          - list_sessions: list sessions with pagination
+          - delete_session: delete a session
+          - rename_session: rename a session title
+          - clear_context: set context boundary
+          - generate_title: AI-generate a session title
+
+        :param action: one of the above action names
+        :param payload: action-specific payload
+        :return: dict with action, code, message, payload
+        """
+        payload = payload or {}
+        try:
+            if action == "list_sessions":
+                result = self.list_sessions(
+                    channel_type=payload.get("channel_type"),
+                    page=int(payload.get("page", 1)),
+                    page_size=int(payload.get("page_size", 50)),
+                )
+                return {"action": action, "code": 200, "message": "success", "payload": result}
+
+            elif action == "delete_session":
+                self.delete_session(payload.get("session_id", ""))
+                return {"action": action, "code": 200, "message": "success", "payload": None}
+
+            elif action == "rename_session":
+                self.rename_session(
+                    payload.get("session_id", ""),
+                    (payload.get("title") or "").strip(),
+                )
+                return {"action": action, "code": 200, "message": "success", "payload": None}
+
+            elif action == "clear_context":
+                new_seq = self.clear_context(payload.get("session_id", ""))
+                return {"action": action, "code": 200, "message": "success",
+                        "payload": {"context_start_seq": new_seq}}
+
+            elif action == "generate_title":
+                title = self.gen_title(
+                    payload.get("session_id", ""),
+                    payload.get("user_message", ""),
+                    payload.get("assistant_reply", ""),
+                )
+                return {"action": action, "code": 200, "message": "success",
+                        "payload": {"title": title}}
+
+            else:
+                return {"action": action, "code": 400,
+                        "message": f"unknown action: {action}", "payload": None}
+
+        except ValueError as e:
+            return {"action": action, "code": 400, "message": str(e), "payload": None}
+        except Exception as e:
+            logger.error(f"[SessionService] dispatch error: action={action}, error={e}")
+            return {"action": action, "code": 500, "message": str(e), "payload": None}
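
`generate_session_title` post-processes the model output before accepting it: it strips `<think>...</think>` blocks, trims whitespace and surrounding quotes, and falls back when the result is empty or longer than 50 characters. A minimal sketch of that step in isolation, assuming a standalone helper (`clean_title` is a hypothetical name, not part of the diff):

```python
import re


def clean_title(raw: str, fallback: str, max_len: int = 50) -> str:
    """Strip reasoning tags and quotes; fall back if the title is unusable."""
    # DOTALL lets the pattern span multi-line <think> blocks.
    title = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    title = title.strip().strip("\"'")
    if title and len(title) <= max_len:
        return title
    return fallback


quoted = clean_title('<think>short\nplan</think>\n"Trip plan"', "New Chat")
too_long = clean_title("x" * 51, "New Chat")
only_thinking = clean_title("<think>nothing else</think>", "New Chat")
```

Keeping the fallback as an explicit argument mirrors the diff, where the fallback title is computed before the LLM call is attempted.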
--- a/agent/knowledge/init.py
+++ b/agent/knowledge/init.py
--- a/agent/knowledge/service.py
+++ b/agent/knowledge/service.py
@@ -0,0 +1,240 @@
+"""
+Knowledge service for handling knowledge base operations.
+
+Provides a unified interface for listing, reading, and graphing knowledge files,
+callable from the web console, API, or CLI.
+
+Knowledge file layout (under workspace_root):
+    knowledge/index.md
+    knowledge/log.md
+    knowledge/<category>/<slug>.md
+"""
+
+import os
+import re
+from pathlib import Path
+from typing import Optional
+
+from common.log import logger
+from config import conf
+
+
+class KnowledgeService:
+    """
+    High-level service for knowledge base queries.
+    Operates directly on the filesystem.
+    """
+
+    def __init__(self, workspace_root: str):
+        self.workspace_root = workspace_root
+        self.knowledge_dir = os.path.join(workspace_root, "knowledge")
+
+    # ------------------------------------------------------------------
+    # list — directory tree with stats
+    # ------------------------------------------------------------------
+    def list_tree(self) -> dict:
+        """
+        Return the knowledge directory tree grouped by category,
+        supporting arbitrarily nested sub-directories.
+
+        Returns::
+
+            {
+                "root_files": [
+                    {"name": "index.md", "title": "Index", "size": 512},
+                ],
+                "tree": [
+                    {
+                        "dir": "concepts",
+                        "files": [
+                            {"name": "moe.md", "title": "MoE", "size": 1234},
+                        ],
+                        "children": []
+                    },
+                    {
+                        "dir": "platform",
+                        "files": [],
+                        "children": [
+                            {
+                                "dir": "analysis",
+                                "files": [{"name": "perf.md", ...}],
+                                "children": []
+                            }
+                        ]
+                    },
+                ],
+                "stats": {"pages": 15, "size": 32768},
+                "enabled": True
+            }
+        """
+        if not os.path.isdir(self.knowledge_dir):
+            return {"root_files": [], "tree": [], "stats": {"pages": 0, "size": 0}, "enabled": conf().get("knowledge", True)}
+
+        stats = {"pages": 0, "size": 0}
+        root_files, tree = self._scan_dir(self.knowledge_dir, stats, is_root=True)
+
+        return {
+            "root_files": root_files,
+            "tree": tree,
+            "stats": stats,
+            "enabled": conf().get("knowledge", True),
+        }
+
+    def _scan_dir(self, dir_path: str, stats: dict, is_root: bool = False) -> tuple:
+        """
+        Recursively scan a directory.
+
+        :return: (files, children) where files is a list of .md file dicts
+                 in this directory and children is a list of sub-directory nodes.
+        """
+        files = []
+        children = []
+        for name in sorted(os.listdir(dir_path)):
+            if name.startswith("."):
+                continue
+            full = os.path.join(dir_path, name)
+            if os.path.isdir(full):
+                sub_files, sub_children = self._scan_dir(full, stats)
+                children.append({"dir": name, "files": sub_files, "children": sub_children})
+            elif name.endswith(".md"):
+                size = os.path.getsize(full)
+                if not is_root:
+                    stats["pages"] += 1
+                    stats["size"] += size
+                # Drop only the trailing extension; str.replace would also remove ".md" mid-name
+                title = name[:-len(".md")]
+                try:
+                    with open(full, "r", encoding="utf-8") as f:
+                        first_line = f.readline().strip()
+                    if first_line.startswith("# "):
+                        title = first_line[2:].strip()
+                except Exception:
+                    pass
+                files.append({"name": name, "title": title, "size": size})
+        return files, children
+
+    # ------------------------------------------------------------------
+    # read — single file content
+    # ------------------------------------------------------------------
+    def read_file(self, rel_path: str) -> dict:
+        """
+        Read a single knowledge markdown file.
+
+        :param rel_path: Relative path within knowledge/, e.g. ``concepts/moe.md``
+        :return: dict with ``content`` and ``path``
+        :raises ValueError: if path is invalid or escapes knowledge dir
+        :raises FileNotFoundError: if file does not exist
+        """
+        if not rel_path or ".." in rel_path:
+            raise ValueError("invalid path")
+
+        full_path = os.path.normpath(os.path.join(self.knowledge_dir, rel_path))
+        allowed = os.path.normpath(self.knowledge_dir)
+        if not full_path.startswith(allowed + os.sep) and full_path != allowed:
+            raise ValueError("path outside knowledge dir")
+
+        if not os.path.isfile(full_path):
+            raise FileNotFoundError(f"file not found: {rel_path}")
+
+        with open(full_path, "r", encoding="utf-8") as f:
+            content = f.read()
+        return {"content": content, "path": rel_path}
+
+    # ------------------------------------------------------------------
+    # graph — nodes and links for visualization
+    # ------------------------------------------------------------------
+    def build_graph(self) -> dict:
+        """
+        Parse all knowledge pages and extract cross-reference links.
+
+        Returns::
+
+            {
+                "nodes": [
+                    {"id": "concepts/moe.md", "label": "MoE", "category": "concepts"},
+                    ...
+                ],
+                "links": [
+                    {"source": "concepts/moe.md", "target": "entities/deepseek.md"},
+                    ...
+                ]
+            }
+        """
+        # Resolve up front so relative_to() also works for resolved link targets below
+        knowledge_path = Path(self.knowledge_dir).resolve()
+        if not knowledge_path.is_dir():
+            return {"nodes": [], "links": []}
+
+        nodes = {}
+        links = []
+        link_re = re.compile(r'\[([^\]]*)\]\(([^)]+\.md)\)')
+
+        for md_file in knowledge_path.rglob("*.md"):
+            # Use POSIX-style separators so ids and categories are consistent across platforms
+            rel = md_file.relative_to(knowledge_path).as_posix()
+            if rel in ("index.md", "log.md"):
+                continue
+            parts = rel.split("/")
+            category = parts[0] if len(parts) > 1 else "root"
+            title = md_file.stem.replace("-", " ").title()
+            try:
+                content = md_file.read_text(encoding="utf-8")
+                first_line = content.strip().split("\n")[0]
+                if first_line.startswith("# "):
+                    title = first_line[2:].strip()
+                for _, link_target in link_re.findall(content):
+                    resolved = (md_file.parent / link_target).resolve()
+                    try:
+                        target_rel = resolved.relative_to(knowledge_path).as_posix()
+                    except ValueError:
+                        continue
+                    if target_rel != rel:
+                        links.append({"source": rel, "target": target_rel})
+            except Exception:
+                pass
+            nodes[rel] = {"id": rel, "label": title, "category": category}
+
+        valid_ids = set(nodes.keys())
+        links = [l for l in links if l["source"] in valid_ids and l["target"] in valid_ids]
+        seen = set()
+        deduped = []
+        for l in links:
+            key = tuple(sorted([l["source"], l["target"]]))
+            if key not in seen:
+                seen.add(key)
+                deduped.append(l)
+
+        return {"nodes": list(nodes.values()), "links": deduped}
+
+    # ------------------------------------------------------------------
+    # dispatch — single entry point for protocol messages
+    # ------------------------------------------------------------------
+    def dispatch(self, action: str, payload: Optional[dict] = None) -> dict:
+        """
+        Dispatch a knowledge management action.
+
+        :param action: ``list``, ``read``, or ``graph``
+        :param payload: action-specific payload
+        :return: protocol-compatible response dict
+        """
+        payload = payload or {}
+        try:
+            if action == "list":
+                result = self.list_tree()
+                return {"action": action, "code": 200, "message": "success", "payload": result}
+
+            elif action == "read":
+                path = payload.get("path")
+                if not path:
+                    return {"action": action, "code": 400, "message": "path is required", "payload": None}
+                result = self.read_file(path)
+                return {"action": action, "code": 200, "message": "success", "payload": result}
+
+            elif action == "graph":
+                result = self.build_graph()
+                return {"action": action, "code": 200, "message": "success", "payload": result}
+
+            else:
+                return {"action": action, "code": 400, "message": f"unknown action: {action}", "payload": None}
+
+        except ValueError as e:
+            return {"action": action, "code": 403, "message": str(e), "payload": None}
+        except FileNotFoundError as e:
+            return {"action": action, "code": 404, "message": str(e), "payload": None}
+        except Exception as e:
+            logger.error(f"[KnowledgeService] dispatch error: action={action}, error={e}")
+            return {"action": action, "code": 500, "message": str(e), "payload": None}
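
`KnowledgeService.read_file` rejects paths that would escape the knowledge directory, and `dispatch` maps the resulting `ValueError` to a 403 response. A minimal sketch of the same containment check outside the class, assuming POSIX-style paths (`resolve_inside` and `is_allowed` are hypothetical helper names):

```python
import os


def resolve_inside(base_dir: str, rel_path: str) -> str:
    """Return the normalized full path, or raise ValueError if it escapes base_dir."""
    if not rel_path or ".." in rel_path:
        raise ValueError("invalid path")
    base = os.path.normpath(base_dir)
    full = os.path.normpath(os.path.join(base, rel_path))
    # An absolute rel_path makes os.path.join discard base, so this check catches it.
    if full != base and not full.startswith(base + os.sep):
        raise ValueError("path outside knowledge dir")
    return full


def is_allowed(base_dir: str, rel_path: str) -> bool:
    """Convenience wrapper that turns the ValueError into a boolean."""
    try:
        resolve_inside(base_dir, rel_path)
        return True
    except ValueError:
        return False
```

Appending `os.sep` before the prefix comparison matters: without it, a sibling directory such as `knowledge_backup` would pass a plain `startswith` check.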
--- a/agent/memory/conversation_store.py
+++ b/agent/memory/conversation_store.py
@@ -28,11 +28,13 @@ from common.log import logger

 _DDL = """
 CREATE TABLE IF NOT EXISTS sessions (
-    session_id   TEXT    PRIMARY KEY,
-    channel_type TEXT    NOT NULL DEFAULT '',
-    created_at   INTEGER NOT NULL,
-    last_active  INTEGER NOT NULL,
-    msg_count    INTEGER NOT NULL DEFAULT 0
+    session_id        TEXT    PRIMARY KEY,
+    channel_type      TEXT    NOT NULL DEFAULT '',
+    title             TEXT    NOT NULL DEFAULT '',
+    context_start_seq INTEGER NOT NULL DEFAULT 0,
+    created_at        INTEGER NOT NULL,
+    last_active       INTEGER NOT NULL,
+    msg_count         INTEGER NOT NULL DEFAULT 0
 );

 CREATE TABLE IF NOT EXISTS messages (
@@ -42,6 +44,7 @@ CREATE TABLE IF NOT EXISTS messages (
    role         TEXT    NOT NULL,
    content      TEXT    NOT NULL,
    created_at   INTEGER NOT NULL,
+    extras       TEXT    NOT NULL DEFAULT '',
    UNIQUE (session_id, seq)
 );

@@ -57,6 +60,20 @@ _MIGRATION_ADD_CHANNEL_TYPE = """
 ALTER TABLE sessions ADD COLUMN channel_type TEXT NOT NULL DEFAULT '';
 """

+_MIGRATION_ADD_TITLE = """
+ALTER TABLE sessions ADD COLUMN title TEXT NOT NULL DEFAULT '';
+"""
+
+_MIGRATION_ADD_CONTEXT_START_SEQ = """
+ALTER TABLE sessions ADD COLUMN context_start_seq INTEGER NOT NULL DEFAULT 0;
+"""
+
+# Generic JSON sidecar for per-message attachments (TTS audio URL, future use).
+# Always optional — readers must tolerate missing column / empty / invalid JSON.
+_MIGRATION_ADD_MSG_EXTRAS = """
+ALTER TABLE messages ADD COLUMN extras TEXT NOT NULL DEFAULT '';
+"""
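
The `ALTER TABLE ... ADD COLUMN` migrations above fail if the column already exists, so they need a guard when run against a database of unknown age. The diff does not show how the project applies them; the following is only a sketch of one common approach, checking `PRAGMA table_info` first (`add_column_if_missing` is a hypothetical helper):

```python
import sqlite3


def add_column_if_missing(conn: sqlite3.Connection, table: str,
                          column: str, ddl: str) -> bool:
    """Run ddl only when `column` is absent from `table`; return True if applied."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if column in existing:
        return False
    conn.execute(ddl)
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (session_id TEXT, seq INTEGER, role TEXT, "
    "content TEXT, created_at INTEGER)"
)
ddl = "ALTER TABLE messages ADD COLUMN extras TEXT NOT NULL DEFAULT ''"
first = add_column_if_missing(conn, "messages", "extras", ddl)
second = add_column_if_missing(conn, "messages", "extras", ddl)
columns = [row[1] for row in conn.execute("PRAGMA table_info(messages)")]
```

The `NOT NULL DEFAULT ''` clause is what allows SQLite to add the column to a table that already contains rows.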
+
 DEFAULT_MAX_AGE_DAYS: int = 30


@@ -106,9 +123,10 @@ def _extract_tool_calls(content: Any) -> List[Dict[str, Any]]:
    ]


-def _extract_tool_results(content: Any) -> Dict[str, str]:
+def _extract_tool_results(content: Any) -> Dict[str, dict]:
    """
    Extract tool_result blocks from a user message, keyed by tool_use_id.
+    Values are {"result": str, "is_error": bool}.
    """
    if not isinstance(content, list):
        return {}
@@ -123,12 +141,13 @@ def _extract_tool_results(content: Any) -> Dict[str, str]:
                rb.get("text", "") for rb in result_content
                if isinstance(rb, dict) and rb.get("type") == "text"
            )
-        results[tool_id] = str(result_content)
+        results[tool_id] = {"result": str(result_content), "is_error": bool(b.get("is_error", False))}
    return results


 def _group_into_display_turns(
    rows: List[tuple],
+    include_thinking: bool = True,
 ) -> List[Dict[str, Any]]:
    """
    Convert raw (role, content_json, created_at) DB rows into display turns.
@@ -157,20 +176,26 @@ def _group_into_display_turns(
    cur_rest: List[tuple] = []
    started = False

-    for role, raw_content, created_at in rows:
+    for role, raw_content, created_at, raw_extras in rows:
        try:
            content = json.loads(raw_content)
        except Exception:
            content = raw_content
+        try:
+            extras = json.loads(raw_extras) if raw_extras else {}
+            if not isinstance(extras, dict):
+                extras = {}
+        except Exception:
+            extras = {}

        if role == "user" and _is_visible_user_message(content):
            if started:
                groups.append((cur_user, cur_rest))
-            cur_user = (content, created_at)
+            cur_user = (content, created_at, extras)
            cur_rest = []
            started = True
        else:
-            cur_rest.append((role, content, created_at))
+            cur_rest.append((role, content, created_at, extras))

    if started:
        groups.append((cur_user, cur_rest))
@@ -183,39 +208,73 @@ def _group_into_display_turns(
    for user_row, rest in groups:
        # User turn
        if user_row:
-            content, created_at = user_row
+            content, created_at, _u_extras = user_row
            text = _extract_display_text(content)
            if text:
                turns.append({"role": "user", "content": text, "created_at": created_at})

-        # Collect all tool_calls and tool_results from the rest of the group
-        all_tool_calls: List[Dict[str, Any]] = []
+        # Build an ordered list of steps preserving the original sequence:
+        #   thinking → content → tool_call → content → ...
+        steps: List[Dict[str, Any]] = []
        tool_results: Dict[str, str] = {}
        final_text = ""
        final_ts: Optional[int] = None
+        merged_extras: Dict[str, Any] = {}

-        for role, content, created_at in rest:
+        for role, content, created_at, extras in rest:
+            if role == "assistant" and isinstance(extras, dict):
+                merged_extras.update(extras)
            if role == "user":
                tool_results.update(_extract_tool_results(content))
            elif role == "assistant":
-                tcs = _extract_tool_calls(content)
-                all_tool_calls.extend(tcs)
-                t = _extract_display_text(content)
-                if t:
-                    final_text = t
+                # Walk content blocks in order to preserve interleaving
+                if isinstance(content, list):
+                    for block in content:
+                        if not isinstance(block, dict):
+                            continue
+                        btype = block.get("type")
+                        if btype == "thinking":
+                            if not include_thinking:
+                                continue
+                            txt = block.get("thinking", "").strip()
+                            if txt:
+                                steps.append({"type": "thinking", "content": txt})
+                        elif btype == "text":
+                            txt = block.get("text", "").strip()
+                            if txt:
+                                steps.append({"type": "content", "content": txt})
+                                final_text = txt
+                        elif btype == "tool_use":
+                            steps.append({
+                                "type": "tool",
+                                "id": block.get("id", ""),
+                                "name": block.get("name", ""),
+                                "arguments": block.get("input", {}),
+                            })
+                elif isinstance(content, str) and content.strip():
+                    steps.append({"type": "content", "content": content.strip()})
+                    final_text = content.strip()
                final_ts = created_at

-        # Attach tool results to their matching tool_call entries
-        for tc in all_tool_calls:
-            tc["result"] = tool_results.get(tc.get("id", ""), "")
+        # Attach tool results to tool steps
+        for step in steps:
+            if step["type"] == "tool":
+                tr = tool_results.get(step.get("id", ""), {})
+                if not isinstance(tr, dict):
+                    tr = {"result": tr}
+                step["result"] = tr.get("result", "")
+                step["is_error"] = tr.get("is_error", False)

-        if final_text or all_tool_calls:
-            turns.append({
+        if steps or final_text:
+            turn = {
                "role": "assistant",
                "content": final_text,
-                "tool_calls": all_tool_calls,
+                "steps": steps,
                "created_at": final_ts or (user_row[1] if user_row else 0),
-            })
+            }
+            if merged_extras:
+                turn["extras"] = merged_extras
+            turns.append(turn)

    return turns
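
The rewritten grouping logic keeps thinking, text, and tool steps in their original order and then joins each tool step with its result by id. A minimal sketch of that join on plain dicts, assuming the step and result shapes shown in the hunk (`attach_tool_results` is a hypothetical standalone helper, and the sample data is illustrative):

```python
from typing import Any, Dict, List


def attach_tool_results(steps: List[Dict[str, Any]],
                        tool_results: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Copy result/is_error onto each tool step, matching by tool id."""
    for step in steps:
        if step["type"] != "tool":
            continue
        tr = tool_results.get(step.get("id", ""), {})
        if not isinstance(tr, dict):
            tr = {"result": tr}  # tolerate legacy plain-string results
        step["result"] = tr.get("result", "")
        step["is_error"] = tr.get("is_error", False)
    return steps


steps = [
    {"type": "thinking", "content": "Need the weather"},
    {"type": "tool", "id": "t1", "name": "get_weather", "arguments": {"city": "Paris"}},
    {"type": "tool", "id": "t2", "name": "get_time", "arguments": {}},
    {"type": "tool", "id": "t3", "name": "no_result", "arguments": {}},
    {"type": "content", "content": "It is sunny."},
]
tool_results = {"t1": {"result": "boom", "is_error": True}, "t2": "12:00"}
attach_tool_results(steps, tool_results)
```

A tool step without a matching result gets an empty string and `is_error=False`, so consumers can rely on both keys being present.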

@@ -264,14 +323,21 @@ class ConversationStore:
        with self._lock:
            conn = self._connect()
            try:
+                # Respect context_start_seq: only load messages at or after the boundary
+                ctx_row = conn.execute(
+                    "SELECT context_start_seq FROM sessions WHERE session_id = ?",
+                    (session_id,),
+                ).fetchone()
+                ctx_start = ctx_row[0] if ctx_row else 0
+
                rows = conn.execute(
                    """
                    SELECT seq, role, content
                    FROM messages
-                    WHERE session_id = ?
+                    WHERE session_id = ? AND seq >= ?
                    ORDER BY seq DESC
                    """,
-                    (session_id,),
+                    (session_id, ctx_start),
                ).fetchall()
            finally:
                conn.close()
@@ -279,10 +345,7 @@ class ConversationStore:
        if not rows:
            return []

-        # Walk newest-to-oldest counting *visible* user turns (actual user text,
-        # not tool_result injections).  Record the seq of every visible user
-        # message so we can find a clean cut point later.
-        visible_turn_seqs: List[int] = []  # newest first
+        visible_turn_seqs: List[int] = []
        for seq, role, raw_content in rows:
            if role != "user":
                continue
@@ -293,17 +356,11 @@ class ConversationStore:
            if _is_visible_user_message(content):
                visible_turn_seqs.append(seq)

-        # Determine the seq of the oldest visible user message we want to keep.
-        # If the total turns fit within max_turns, keep everything.
        if len(visible_turn_seqs) <= max_turns:
-            cutoff_seq = None  # keep all
+            cutoff_seq = None
        else:
-            # The Nth visible user message (0-indexed) is the oldest we keep.
            cutoff_seq = visible_turn_seqs[max_turns - 1]

-        # Build result in chronological order, starting from cutoff.
-        # IMPORTANT: we start exactly at cutoff_seq (the visible user message),
-        # never mid-group, so tool_use / tool_result pairs are always complete.
        result = []
        for seq, role, raw_content in reversed(rows):
            if cutoff_seq is not None and seq < cutoff_seq:
@@ -312,6 +369,9 @@ class ConversationStore:
                content = json.loads(raw_content)
            except Exception:
                content = raw_content
+            # Strip thinking blocks — they are stored for UI display only
+            if role == "assistant" and isinstance(content, list):
+                content = [b for b in content if not (isinstance(b, dict) and b.get("type") == "thinking")]
            result.append({"role": role, "content": content})
        return result

@@ -369,13 +429,15 @@ class ConversationStore:
                        content = json.dumps(
                            msg.get("content", ""), ensure_ascii=False
                        )
+                        extras_obj = msg.get("extras") or {}
+                        extras = json.dumps(extras_obj, ensure_ascii=False) if extras_obj else ""
                        conn.execute(
                            """
                            INSERT OR IGNORE INTO messages
-                                (session_id, seq, role, content, created_at)
-                            VALUES (?, ?, ?, ?, ?)
+                                (session_id, seq, role, content, created_at, extras)
+                            VALUES (?, ?, ?, ?, ?, ?)
                            """,
-                            (session_id, next_seq, role, content, now),
+                            (session_id, next_seq, role, content, now, extras),
                        )
                        next_seq += 1

@@ -389,6 +451,61 @@ class ConversationStore:
                        """,
                        (session_id, session_id),
                    )
+
+                    # Auto-generate title from the first visible user message
+                    cur_title = conn.execute(
+                        "SELECT title FROM sessions WHERE session_id = ?",
+                        (session_id,),
+                    ).fetchone()
+                    if cur_title and not cur_title[0]:
+                        for msg in messages:
+                            if msg.get("role") == "user":
+                                content = msg.get("content", "")
+                                text = _extract_display_text(content)
+                                if text:
+                                    title = text.strip().split("\n")[0][:50]
+                                    conn.execute(
+                                        "UPDATE sessions SET title = ? WHERE session_id = ?",
+                                        (title, session_id),
+                                    )
+                                    break
+            finally:
+                conn.close()
+
+    def clear_context(self, session_id: str) -> int:
+        """
+        Set the context boundary to after the current last message.
+        Messages before this boundary are still stored but excluded from LLM context.
+
+        Returns the new context_start_seq value.
+        """
+        with self._lock:
+            conn = self._connect()
+            try:
+                with conn:
+                    row = conn.execute(
+                        "SELECT COALESCE(MAX(seq), -1) FROM messages WHERE session_id = ?",
+                        (session_id,),
+                    ).fetchone()
+                    new_start = row[0] + 1
+                    conn.execute(
+                        "UPDATE sessions SET context_start_seq = ? WHERE session_id = ?",
+                        (new_start, session_id),
+                    )
+                    return new_start
+            finally:
+                conn.close()
+
+    def get_context_start_seq(self, session_id: str) -> int:
+        """Return the context_start_seq for a session (0 if not set)."""
+        with self._lock:
+            conn = self._connect()
+            try:
+                row = conn.execute(
+                    "SELECT context_start_seq FROM sessions WHERE session_id = ?",
+                    (session_id,),
+                ).fetchone()
+                return row[0] if row else 0
            finally:
                conn.close()

@@ -407,9 +524,111 @@ class ConversationStore:
            finally:
                conn.close()

+    def prune_scheduled_messages(
+        self,
+        session_id: str,
+        keep_last_n: int,
+        markers: Optional[List[str]] = None,
+    ) -> int:
+        """
+        Keep at most ``keep_last_n`` scheduler-injected user/assistant pairs in
+        the session, deleting the older ones.
+
+        A scheduler-injected pair is identified by a user message whose first
+        text block starts with one of ``markers``; the immediately following
+        assistant message (next seq) is treated as its paired output.
+
+        Only scheduler-tagged messages are touched; regular user turns are
+        never deleted. Safe to call repeatedly; no-op if nothing to prune.
+
+        Args:
+            session_id: Session to prune.
+            keep_last_n: Maximum scheduler pairs to retain (must be >= 0).
+            markers: Text prefixes that identify scheduler user messages.
+                Defaults to ``["[SCHEDULED]", "Scheduled task"]`` so that
+                pairs written by older versions are also recognised.
+
+        Returns:
+            Number of message rows deleted.
+        """
+        if keep_last_n < 0:
+            keep_last_n = 0
+        if markers is None:
+            markers = ["[SCHEDULED]", "Scheduled task"]
+
+        def _matches_marker(raw_content: str) -> bool:
+            try:
+                parsed = json.loads(raw_content)
+            except Exception:
+                parsed = raw_content
+            text = _extract_display_text(parsed) if not isinstance(parsed, str) else parsed
+            if not text:
+                return False
+            return any(text.startswith(m) for m in markers)
+
+        with self._lock:
+            conn = self._connect()
+            try:
+                rows = conn.execute(
+                    """
+                    SELECT seq, role, content
+                    FROM messages
+                    WHERE session_id = ?
+                    ORDER BY seq ASC
+                    """,
+                    (session_id,),
+                ).fetchall()
+
+                # Find scheduler pairs: each is (user_seq, assistant_seq?)
+                pairs: List[tuple] = []  # list of (user_seq, assistant_seq_or_None)
+                for idx, (seq, role, raw_content) in enumerate(rows):
+                    if role != "user" or not _matches_marker(raw_content):
+                        continue
+                    assistant_seq = None
+                    # Pair with the very next message if it's an assistant turn.
+                    if idx + 1 < len(rows):
+                        next_seq, next_role, _ = rows[idx + 1]
+                        if next_role == "assistant":
+                            assistant_seq = next_seq
+                    pairs.append((seq, assistant_seq))
+
+                if len(pairs) <= keep_last_n:
+                    return 0
+
+                to_delete_pairs = pairs[: len(pairs) - keep_last_n]
+                seqs_to_delete: List[int] = []
+                for user_seq, assistant_seq in to_delete_pairs:
+                    seqs_to_delete.append(user_seq)
+                    if assistant_seq is not None:
+                        seqs_to_delete.append(assistant_seq)
+
+                if not seqs_to_delete:
+                    return 0
+
+                placeholders = ",".join("?" * len(seqs_to_delete))
+                with conn:
+                    conn.execute(
+                        f"DELETE FROM messages WHERE session_id = ? AND seq IN ({placeholders})",
+                        (session_id, *seqs_to_delete),
+                    )
+                    conn.execute(
+                        """
+                        UPDATE sessions
+                        SET msg_count = (
+                            SELECT COUNT(*) FROM messages WHERE session_id = ?
+                        )
+                        WHERE session_id = ?
+                        """,
+                        (session_id, session_id),
+                    )
+                return len(seqs_to_delete)
+            finally:
+                conn.close()
+
    def cleanup_old_sessions(self, max_age_days: Optional[int] = None) -> int:
        """
        Delete sessions that have not been active within max_age_days.
+        Web channel sessions are excluded — they are meant to be permanent.

        Args:
            max_age_days: Override the default retention period.
@@ -433,7 +652,8 @@ class ConversationStore:
            try:
                with conn:
                    stale = conn.execute(
-                        "SELECT session_id FROM sessions WHERE last_active < ?",
+                        "SELECT session_id FROM sessions "
+                        "WHERE last_active < ? AND channel_type != 'web'",
                        (cutoff,),
                    ).fetchall()
                    for (sid,) in stale:
@@ -451,6 +671,55 @@ class ConversationStore:
            logger.info(f"[ConversationStore] Pruned {deleted} expired sessions")
        return deleted

+    def attach_extras_to_last_assistant(
+        self,
+        session_id: str,
+        extras: Dict[str, Any],
+    ) -> Optional[int]:
+        """
+        Merge ``extras`` into the latest assistant message of a session.
+
+        Used by post-processing (e.g. TTS) that needs to annotate an already
+        persisted bot reply with attachments such as audio URLs.
+
+        Returns the message seq that was updated, or ``None`` if no assistant
+        message exists or the update could not be applied.
+        """
+        if not extras:
+            return None
+        with self._lock:
+            conn = self._connect()
+            try:
+                row = conn.execute(
+                    """
+                    SELECT seq, extras FROM messages
+                    WHERE session_id = ? AND role = 'assistant'
+                    ORDER BY seq DESC LIMIT 1
+                    """,
+                    (session_id,),
+                ).fetchone()
+                if not row:
+                    return None
+                seq, raw = row
+                try:
+                    cur = json.loads(raw) if raw else {}
+                    if not isinstance(cur, dict):
+                        cur = {}
+                except Exception:
+                    cur = {}
+                cur.update(extras)
+                conn.execute(
+                    "UPDATE messages SET extras = ? WHERE session_id = ? AND seq = ?",
+                    (json.dumps(cur, ensure_ascii=False), session_id, seq),
+                )
+                conn.commit()
+                return seq
+            except Exception as e:
+                logger.warning(f"[ConversationStore] attach_extras failed: {e}")
+                return None
+            finally:
+                conn.close()
+
    def load_history_page(
        self,
        session_id: str,
@@ -492,19 +761,75 @@ class ConversationStore:
        with self._lock:
            conn = self._connect()
            try:
-                rows = conn.execute(
-                    """
-                    SELECT role, content, created_at
-                    FROM messages
-                    WHERE session_id = ?
-                    ORDER BY seq ASC
-                    """,
+                ctx_row = conn.execute(
+                    "SELECT context_start_seq FROM sessions WHERE session_id = ?",
                    (session_id,),
-                ).fetchall()
+                ).fetchone()
+                ctx_start = ctx_row[0] if ctx_row else 0
+
+                # extras column is added by migration; tolerate older DBs that
+                # lack it by falling back to an empty extras value.
+                try:
+                    rows = conn.execute(
+                        """
+                        SELECT seq, role, content, created_at, extras
+                        FROM messages
+                        WHERE session_id = ?
+                        ORDER BY seq ASC
+                        """,
+                        (session_id,),
+                    ).fetchall()
+                except sqlite3.OperationalError:
+                    rows = [
+                        (seq, role, content, created_at, "")
+                        for (seq, role, content, created_at) in conn.execute(
+                            """
+                            SELECT seq, role, content, created_at
+                            FROM messages
+                            WHERE session_id = ?
+                            ORDER BY seq ASC
+                            """,
+                            (session_id,),
+                        ).fetchall()
+                    ]
            finally:
                conn.close()

-        visible = _group_into_display_turns(rows)
+        # Honour the current enable_thinking switch when building display turns
+        # so that toggling it off hides previously-saved thinking blocks too.
+        try:
+            from config import conf
+            include_thinking = bool(conf().get("enable_thinking", False))
+        except Exception:
+            include_thinking = False
+
+        # Strip seq for display grouping; seqs of visible user messages are collected below.
+        plain_rows = [
+            (role, content, created_at, extras_raw)
+            for _seq, role, content, created_at, extras_raw in rows
+        ]
+        visible = _group_into_display_turns(plain_rows, include_thinking=include_thinking)
+
+        # Build a mapping: find the seq of each visible user message to annotate context boundary.
+        # Walk through rows to find visible user message seqs in order.
+        visible_user_seqs: List[int] = []
+        for seq, role, raw_content, _ts, _extras in rows:
+            if role != "user":
+                continue
+            try:
+                content = json.loads(raw_content)
+            except Exception:
+                content = raw_content
+            if _is_visible_user_message(content):
+                visible_user_seqs.append(seq)
+
+        # Each pair of display turns (user+assistant) corresponds to a visible user seq.
+        # Attach that seq to the user turn so it can be compared with context_start_seq.
+        user_turn_idx = 0
+        for turn in visible:
+            if turn["role"] == "user" and user_turn_idx < len(visible_user_seqs):
+                turn["_seq"] = visible_user_seqs[user_turn_idx]
+                user_turn_idx += 1

        total = len(visible)
        offset = (page - 1) * page_size
@@ -513,12 +838,98 @@ class ConversationStore:

        return {
            "messages": page_items,
+            "context_start_seq": ctx_start,
            "total": total,
            "page": page,
            "page_size": page_size,
            "has_more": offset + page_size < total,
        }

+    def list_sessions(
+        self,
+        channel_type: Optional[str] = None,
+        page: int = 1,
+        page_size: int = 50,
+    ) -> Dict[str, Any]:
+        """
+        List sessions ordered by last_active DESC, with optional channel_type filter.
+
+        Returns:
+            {
+                "sessions": [{session_id, title, created_at, last_active, msg_count}, ...],
+                "total": int,
+                "page": int,
+                "page_size": int,
+                "has_more": bool,
+            }
+        """
+        page = max(1, page)
+        with self._lock:
+            conn = self._connect()
+            try:
+                if channel_type:
+                    total = conn.execute(
+                        "SELECT COUNT(*) FROM sessions WHERE channel_type = ?",
+                        (channel_type,),
+                    ).fetchone()[0]
+                    rows = conn.execute(
+                        """
+                        SELECT session_id, title, created_at, last_active, msg_count
+                        FROM sessions
+                        WHERE channel_type = ?
+                        ORDER BY last_active DESC
+                        LIMIT ? OFFSET ?
+                        """,
+                        (channel_type, page_size, (page - 1) * page_size),
+                    ).fetchall()
+                else:
+                    total = conn.execute(
+                        "SELECT COUNT(*) FROM sessions",
+                    ).fetchone()[0]
+                    rows = conn.execute(
+                        """
+                        SELECT session_id, title, created_at, last_active, msg_count
+                        FROM sessions
+                        ORDER BY last_active DESC
+                        LIMIT ? OFFSET ?
+                        """,
+                        (page_size, (page - 1) * page_size),
+                    ).fetchall()
+            finally:
+                conn.close()
+
+        sessions = [
+            {
+                "session_id": r[0],
+                "title": r[1],
+                "created_at": r[2],
+                "last_active": r[3],
+                "msg_count": r[4],
+            }
+            for r in rows
+        ]
+        return {
+            "sessions": sessions,
+            "total": total,
+            "page": page,
+            "page_size": page_size,
+            "has_more": (page - 1) * page_size + page_size < total,
+        }
+
+    def rename_session(self, session_id: str, title: str) -> bool:
+        """Update the title of a session. Returns True if the session existed."""
+        with self._lock:
+            conn = self._connect()
+            try:
+                with conn:
+                    cur = conn.execute(
+                        "UPDATE sessions SET title = ? WHERE session_id = ?",
+                        (title, session_id),
+                    )
+                    return cur.rowcount > 0
+            finally:
+                conn.close()
+
    def get_stats(self) -> Dict[str, Any]:
        """Return basic stats keyed by channel_type, for monitoring."""
        with self._lock:
@@ -573,6 +984,32 @@ class ConversationStore:
                logger.info("[ConversationStore] Migrated: added channel_type column")
            except Exception as e:
                logger.warning(f"[ConversationStore] Migration failed: {e}")
+        if "title" not in cols:
+            try:
+                conn.execute(_MIGRATION_ADD_TITLE)
+                conn.commit()
+                logger.info("[ConversationStore] Migrated: added title column")
+            except Exception as e:
+                logger.warning(f"[ConversationStore] Migration (title) failed: {e}")
+        if "context_start_seq" not in cols:
+            try:
+                conn.execute(_MIGRATION_ADD_CONTEXT_START_SEQ)
+                conn.commit()
+                logger.info("[ConversationStore] Migrated: added context_start_seq column")
+            except Exception as e:
+                logger.warning(f"[ConversationStore] Migration (context_start_seq) failed: {e}")
+
+        msg_cols = {
+            row[1]
+            for row in conn.execute("PRAGMA table_info(messages)").fetchall()
+        }
+        if "extras" not in msg_cols:
+            try:
+                conn.execute(_MIGRATION_ADD_MSG_EXTRAS)
+                conn.commit()
+                logger.info("[ConversationStore] Migrated: added messages.extras column")
+            except Exception as e:
+                logger.warning(f"[ConversationStore] Migration (extras) failed: {e}")

    def _connect(self) -> sqlite3.Connection:
        conn = sqlite3.connect(str(self._db_path), timeout=10)
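Reviewer note on the context boundary added above: clear_context() stores MAX(seq) + 1 in sessions.context_start_seq, so no rows are deleted and only the starting point for LLM context moves. The sketch below reproduces that arithmetic against an in-memory database. The two-table schema is a cut-down, hypothetical stand-in for the real one, and load_context() is an assumed reader (filtering on seq >= context_start_seq) that does not appear in this diff.

```python
import sqlite3

# Hypothetical minimal schema; the real tables carry more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions (session_id TEXT PRIMARY KEY, "
    "context_start_seq INTEGER DEFAULT 0)"
)
conn.execute(
    "CREATE TABLE messages (session_id TEXT, seq INTEGER, role TEXT, content TEXT)"
)
conn.execute("INSERT INTO sessions (session_id) VALUES ('s1')")
conn.executemany(
    "INSERT INTO messages VALUES ('s1', ?, ?, ?)",
    [(0, "user", "hi"), (1, "assistant", "hello"), (2, "user", "bye")],
)


def clear_context(conn, session_id):
    """Move the boundary to just past the last stored message."""
    row = conn.execute(
        "SELECT COALESCE(MAX(seq), -1) FROM messages WHERE session_id = ?",
        (session_id,),
    ).fetchone()
    new_start = row[0] + 1  # an empty session yields -1 + 1 = 0
    conn.execute(
        "UPDATE sessions SET context_start_seq = ? WHERE session_id = ?",
        (new_start, session_id),
    )
    return new_start


def load_context(conn, session_id):
    """Assumed reader: return only messages at or after the boundary."""
    start = conn.execute(
        "SELECT context_start_seq FROM sessions WHERE session_id = ?",
        (session_id,),
    ).fetchone()[0]
    return conn.execute(
        "SELECT seq, role, content FROM messages "
        "WHERE session_id = ? AND seq >= ? ORDER BY seq",
        (session_id, start),
    ).fetchall()


boundary = clear_context(conn, "s1")
```

Calling clear_context() twice in a row without new messages returns the same value, so the operation is safe to repeat.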
--- a/agent/memory/embedding.py
+++ b/agent/memory/embedding.py
@@ -1,167 +0,0 @@
-"""
-Embedding providers for memory
-
-Supports OpenAI and local embedding models
-"""
-
-import hashlib
-from abc import ABC, abstractmethod
-from typing import List, Optional
-
-
-class EmbeddingProvider(ABC):
-    """Base class for embedding providers"""
-
-    @abstractmethod
-    def embed(self, text: str) -> List[float]:
-        """Generate embedding for text"""
-        pass
-
-    @abstractmethod
-    def embed_batch(self, texts: List[str]) -> List[List[float]]:
-        """Generate embeddings for multiple texts"""
-        pass
-    
-    @property
-    @abstractmethod
-    def dimensions(self) -> int:
-        """Get embedding dimensions"""
-        pass
-
-
-class OpenAIEmbeddingProvider(EmbeddingProvider):
-    """OpenAI embedding provider using REST API"""
-    
-    def __init__(self, model: str = "text-embedding-3-small", api_key: Optional[str] = None,
-                 api_base: Optional[str] = None, extra_headers: Optional[dict] = None):
-        """
-        Initialize OpenAI embedding provider
-
-        Args:
-            model: Model name (text-embedding-3-small or text-embedding-3-large)
-            api_key: OpenAI API key
-            api_base: Optional API base URL
-            extra_headers: Optional extra headers to include in API requests
-        """
-        self.model = model
-        self.api_key = api_key
-        self.api_base = api_base or "https://api.openai.com/v1"
-        self.extra_headers = extra_headers or {}
-
-        # Validate API key
-        if not self.api_key or self.api_key in ["", "YOUR API KEY", "YOUR_API_KEY"]:
-            raise ValueError("OpenAI API key is not configured. Please set 'open_ai_api_key' in config.json")
-
-        # Set dimensions based on model
-        self._dimensions = 1536 if "small" in model else 3072
-
-    def _call_api(self, input_data):
-        """Call OpenAI embedding API using requests"""
-        import requests
-
-        url = f"{self.api_base}/embeddings"
-        headers = {
-            "Content-Type": "application/json",
-            "Authorization": f"Bearer {self.api_key}",
-            **self.extra_headers,
-        }
-        data = {
-            "input": input_data,
-            "model": self.model
-        }
-
-        try:
-            response = requests.post(url, headers=headers, json=data, timeout=5)
-            response.raise_for_status()
-            return response.json()
-        except requests.exceptions.ConnectionError as e:
-            raise ConnectionError(f"Failed to connect to OpenAI API at {url}. Please check your network connection and api_base configuration. Error: {str(e)}")
-        except requests.exceptions.Timeout as e:
-            raise TimeoutError(f"OpenAI API request timed out after 10s. Please check your network connection. Error: {str(e)}")
-        except requests.exceptions.HTTPError as e:
-            if e.response.status_code == 401:
-                raise ValueError(f"Invalid OpenAI API key. Please check your 'open_ai_api_key' in config.json")
-            elif e.response.status_code == 429:
-                raise ValueError(f"OpenAI API rate limit exceeded. Please try again later.")
-            else:
-                raise ValueError(f"OpenAI API request failed: {e.response.status_code} - {e.response.text}")
-
-    def embed(self, text: str) -> List[float]:
-        """Generate embedding for text"""
-        result = self._call_api(text)
-        return result["data"][0]["embedding"]
-
-    def embed_batch(self, texts: List[str]) -> List[List[float]]:
-        """Generate embeddings for multiple texts"""
-        if not texts:
-            return []
-
-        result = self._call_api(texts)
-        return [item["embedding"] for item in result["data"]]
-
-    @property
-    def dimensions(self) -> int:
-        return self._dimensions
-
-
-# LocalEmbeddingProvider removed - only use OpenAI embedding or keyword search
-
-
-class EmbeddingCache:
-    """Cache for embeddings to avoid recomputation"""
-
-    def __init__(self):
-        self.cache = {}
-
-    def get(self, text: str, provider: str, model: str) -> Optional[List[float]]:
-        """Get cached embedding"""
-        key = self._compute_key(text, provider, model)
-        return self.cache.get(key)
-    
-    def put(self, text: str, provider: str, model: str, embedding: List[float]):
-        """Cache embedding"""
-        key = self._compute_key(text, provider, model)
-        self.cache[key] = embedding
-    
-    @staticmethod
-    def _compute_key(text: str, provider: str, model: str) -> str:
-        """Compute cache key"""
-        content = f"{provider}:{model}:{text}"
-        return hashlib.md5(content.encode('utf-8')).hexdigest()
-    
-    def clear(self):
-        """Clear cache"""
-        self.cache.clear()
-
-
-def create_embedding_provider(
-    provider: str = "openai",
-    model: Optional[str] = None,
-    api_key: Optional[str] = None,
-    api_base: Optional[str] = None,
-    extra_headers: Optional[dict] = None
-) -> EmbeddingProvider:
-    """
-    Factory function to create embedding provider
-
-    Supports "openai" and "linkai" providers (both use OpenAI-compatible REST API).
-    If initialization fails, caller should fall back to keyword-only search.
-
-    Args:
-        provider: Provider name ("openai" or "linkai")
-        model: Model name (default: text-embedding-3-small)
-        api_key: API key (required)
-        api_base: API base URL
-        extra_headers: Optional extra headers to include in API requests
-
-    Returns:
-        EmbeddingProvider instance
-
-    Raises:
-        ValueError: If provider is unsupported or api_key is missing
-    """
-    if provider not in ("openai", "linkai"):
-        raise ValueError(f"Unsupported embedding provider: {provider}. Use 'openai' or 'linkai'.")
-
-    model = model or "text-embedding-3-small"
-    return OpenAIEmbeddingProvider(model=model, api_key=api_key, api_base=api_base, extra_headers=extra_headers)
--- a/agent/memory/embedding/__init__.py
+++ b/agent/memory/embedding/__init__.py
@@ -0,0 +1,41 @@
+"""
+Embedding subsystem for memory.
+
+Public API:
+  create_embedding_provider, EmbeddingProvider, OpenAIEmbeddingProvider,
+  EMBEDDING_VENDORS, EmbeddingCache
+  RebuildResult, clear_index, rebuild_in_process
+  detect_index_dim, cleanup_legacy_state_file
+"""
+
+from agent.memory.embedding.provider import (
+    EMBEDDING_VENDORS,
+    DoubaoEmbeddingProvider,
+    EmbeddingCache,
+    EmbeddingProvider,
+    OpenAIEmbeddingProvider,
+    create_embedding_provider,
+)
+from agent.memory.embedding.rebuild import (
+    RebuildResult,
+    clear_index,
+    rebuild_in_process,
+)
+from agent.memory.embedding.state import (
+    cleanup_legacy_state_file,
+    detect_index_dim,
+)
+
+__all__ = [
+    "EMBEDDING_VENDORS",
+    "DoubaoEmbeddingProvider",
+    "EmbeddingCache",
+    "EmbeddingProvider",
+    "OpenAIEmbeddingProvider",
+    "create_embedding_provider",
+    "RebuildResult",
+    "clear_index",
+    "rebuild_in_process",
+    "cleanup_legacy_state_file",
+    "detect_index_dim",
+]
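Reviewer note: the provider module in this change (agent/memory/embedding/provider.py) documents max_batch_size as the cap on texts per /embeddings request, with embed_batch paginating above it, and it L2-normalizes vectors on the client when a vendor needs that. The following is a minimal sketch of both ideas, not the module's actual code. The names embed_in_batches, call_api and fake_api are hypothetical; fake_api stands in for a real HTTP call.

```python
import math
from typing import Callable, List


def l2_normalize(vec: List[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0:
        return vec
    return [v / norm for v in vec]


def embed_in_batches(
    texts: List[str],
    call_api: Callable[[List[str]], List[List[float]]],
    max_batch_size: int,
) -> List[List[float]]:
    """Send texts in slices of at most max_batch_size, preserving order."""
    size = max(1, int(max_batch_size or 1))  # guard against 0 or None
    vectors: List[List[float]] = []
    for start in range(0, len(texts), size):
        vectors.extend(call_api(texts[start:start + size]))
    return vectors


batch_sizes: List[int] = []


def fake_api(batch: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for the HTTP request; records each batch size."""
    batch_sizes.append(len(batch))
    return [[float(len(t)), 1.0] for t in batch]


vectors = embed_in_batches(["a", "bb", "ccc", "dddd", "eeeee"], fake_api, 2)
```

With a cap of 1, as in the doubao entry, the same loop degenerates into one request per text.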
--- a/agent/memory/embedding/provider.py
+++ b/agent/memory/embedding/provider.py
@@ -0,0 +1,486 @@
+"""
+Embedding providers for memory
+
+Supports multiple OpenAI-compatible embedding vendors:
+  - openai     (text-embedding-3-small / large)
+  - linkai     (OpenAI-compatible passthrough)
+  - dashscope  (Aliyun Tongyi text-embedding-v4)
+  - doubao     (ByteDance Doubao Seed1.5 / large-text on Volcengine Ark)
+  - zhipu      (ZhipuAI embedding-3)
+
+Vendor keys here intentionally match the project's bot_type constants in
+common.const (OPENAI, LINKAI, QWEN_DASHSCOPE, DOUBAO, ZHIPU_AI).
+
+Most providers share a single OpenAI-compatible REST client; doubao uses a
+dedicated adapter. Vendor-specific behaviors (truncation, query instruction
+prefix) are configured via metadata.
+"""
+
+import hashlib
+import math
+from abc import ABC, abstractmethod
+from typing import List, Optional
+
+# HTTP read timeout for a single embeddings request (seconds). A batch of
+# 64+ chunks can take 30-50s end-to-end from China-side networks, so 30s is
+# routinely too tight; 90s gives meaningful headroom without letting bad
+# endpoints hang forever.
+EMBEDDING_HTTP_TIMEOUT = 90
+
+
+class EmbeddingProvider(ABC):
+    """Base class for embedding providers"""
+
+    @abstractmethod
+    def embed(self, text: str) -> List[float]:
+        """Generate embedding for a single text (treated as a query by default)"""
+        pass
+
+    @abstractmethod
+    def embed_batch(self, texts: List[str]) -> List[List[float]]:
+        """Generate embeddings for multiple texts (treated as documents)"""
+        pass
+
+    def embed_query(self, text: str) -> List[float]:
+        """Generate embedding for a query string (may apply vendor instruction prefix)"""
+        return self.embed(text)
+
+    @property
+    @abstractmethod
+    def dimensions(self) -> int:
+        """Effective embedding dimensions"""
+        pass
+
+
+# ---------------------------------------------------------------------------
+# Vendor metadata table
+# ---------------------------------------------------------------------------
+#
+# Each entry describes how to reach a vendor's embedding endpoint. Most
+# vendors expose an OpenAI-compatible /embeddings API; the few that don't
+# (currently: doubao) set `provider_class` to pick a dedicated adapter.
+# Fields:
+#   provider_class          : optional adapter key ("doubao"); defaults to OpenAI-compat
+#   default_base_url        : default API base when not overridden by user
+#   default_model           : default embedding model name
+#   default_dimensions      : recommended unified dim when explicit path is enabled
+#   supports_dim_param      : whether the API accepts a `dimensions` request param
+#   needs_client_truncate   : whether to slice + L2-normalize on the client side
+#   needs_client_normalize  : whether to L2-normalize on the client (always safe)
+#   query_instruction       : optional prefix for asymmetric retrieval (Doubao Seed)
+#   max_batch_size          : max texts per /embeddings request; embed_batch
+#                             auto-paginates above this. Conservative defaults.
+#
+EMBEDDING_VENDORS = {
+    "openai": {
+        "default_base_url": "https://api.openai.com/v1",
+        "default_model": "text-embedding-3-small",
+        # Match the legacy default so users adding `embedding_provider: openai`
+        # to an existing index don't need to rebuild. Override via
+        # embedding_dimensions if you want 1024 / 1536 / 3072.
+        "default_dimensions": 1536,
+        "supports_dim_param": True,
+        "needs_client_truncate": False,
+        "needs_client_normalize": False,
+        "query_instruction": "",
+        # OpenAI permits up to 2048 items per request, but a single call
+        # carrying hundreds of long chunks routinely exceeds the 30s read
+        # timeout from China-side networks. 64 keeps each call well under
+        # both the token-per-request budget and a reasonable wall clock.
+        "max_batch_size": 64,
+    },
+    "linkai": {
+        "default_base_url": "https://api.link-ai.tech/v1",
+        "default_model": "text-embedding-3-small",
+        "default_dimensions": 1536,
+        "supports_dim_param": True,
+        "needs_client_truncate": False,
+        "needs_client_normalize": False,
+        "query_instruction": "",
+        "max_batch_size": 64,
+    },
+    "dashscope": {
+        "default_base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
+        "default_model": "text-embedding-v4",
+        "default_dimensions": 1024,
+        "supports_dim_param": True,
+        "needs_client_truncate": False,
+        "needs_client_normalize": False,
+        "query_instruction": "",
+        "max_batch_size": 10,  # DashScope hard cap (text-embedding-v4)
+    },
+    "doubao": {
+        # Doubao no longer offers an OpenAI-compatible /v1/embeddings endpoint.
+        # Current models are unified under /api/v3/embeddings/multimodal
+        # which uses a structured `input` payload — see DoubaoEmbeddingProvider.
+        "provider_class": "doubao",
+        "default_base_url": "https://ark.cn-beijing.volces.com/api/v3",
+        "default_model": "doubao-embedding-vision-251215",
+        # Native options: 1024 or 2048. We default to 1024 to align with the
+        # other Chinese vendors (dashscope/zhipu) and keep storage footprint
+        # consistent across providers; users can still override via
+        # `embedding_dimensions: 2048` in config.
+        "default_dimensions": 1024,
+        "supports_dim_param": True,
+        "needs_client_truncate": False,
+        "needs_client_normalize": False,
+        "query_instruction": "",
+        # Multimodal endpoint produces ONE embedding per call (input list is
+        # a single document's parts, not a batch). embed_batch loops.
+        "max_batch_size": 1,
+    },
+    "zhipu": {
+        "default_base_url": "https://open.bigmodel.cn/api/paas/v4",
+        "default_model": "embedding-3",
+        "default_dimensions": 1024,
+        "supports_dim_param": True,
+        "needs_client_truncate": False,
+        "needs_client_normalize": False,
+        "query_instruction": "",
+        "max_batch_size": 64,
+    },
+}
+
+
+def _l2_normalize(vec: List[float]) -> List[float]:
+    """Normalize a vector to unit length (L2 norm). Returns input on zero vector."""
+    norm = math.sqrt(sum(v * v for v in vec))
+    if norm == 0:
+        return vec
+    return [v / norm for v in vec]
+
+
+class OpenAIEmbeddingProvider(EmbeddingProvider):
+    """
+    OpenAI-compatible embedding provider.
+
+    Used for openai/linkai/dashscope/zhipu by configuring the metadata
+    fields (doubao uses its own adapter). The legacy three-arg constructor
+    (model, api_key, api_base) keeps working, so the original OpenAI/LinkAI
+    fallback code path is unchanged.
+    """
+
+    def __init__(
+        self,
+        model: str = "text-embedding-3-small",
+        api_key: Optional[str] = None,
+        api_base: Optional[str] = None,
+        extra_headers: Optional[dict] = None,
+        dimensions: Optional[int] = None,
+        supports_dim_param: bool = True,
+        needs_client_truncate: bool = False,
+        needs_client_normalize: bool = False,
+        query_instruction: str = "",
+        max_batch_size: int = 256,
+    ):
+        """
+        Args:
+            model: Model name (e.g. text-embedding-3-small, text-embedding-v4, embedding-3)
+            api_key: API key (required)
+            api_base: API base URL (defaults to OpenAI)
+            extra_headers: Optional extra HTTP headers
+            dimensions: Target output dimension. Required when supports_dim_param
+                is False and needs_client_truncate is True (used to slice).
+            supports_dim_param: Whether the vendor accepts a `dimensions` body param
+            needs_client_truncate: Slice the returned vector to `dimensions`
+            needs_client_normalize: L2-normalize on the client after slicing
+            query_instruction: Optional prefix prepended to query texts only
+            max_batch_size: Max items per /embeddings request; embed_batch
+                auto-paginates above this.
+        """
+        self.model = model
+        self.api_key = api_key
+        self.api_base = api_base or "https://api.openai.com/v1"
+        self.extra_headers = extra_headers or {}
+        self.supports_dim_param = supports_dim_param
+        self.needs_client_truncate = needs_client_truncate
+        self.needs_client_normalize = needs_client_normalize
+        self.query_instruction = query_instruction or ""
+        self.max_batch_size = max(1, int(max_batch_size or 1))
+
+        if not self.api_key or self.api_key in ["", "YOUR API KEY", "YOUR_API_KEY"]:
+            raise ValueError("Embedding API key is not configured")
+
+        if dimensions is not None and dimensions > 0:
+            self._dimensions = dimensions
+        else:
+            # Legacy heuristic (OpenAI text-embedding-3-*): 1536 for *-small, else 3072
+            self._dimensions = 1536 if "small" in model else 3072
+
+    def _call_api(self, input_data):
+        """Call OpenAI-compatible /embeddings endpoint"""
+        import requests
+
+        url = f"{self.api_base}/embeddings"
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {self.api_key}",
+            **self.extra_headers,
+        }
+        data = {
+            "input": input_data,
+            "model": self.model,
+        }
+        if self.supports_dim_param and self._dimensions:
+            data["dimensions"] = self._dimensions
+
+        try:
+            response = requests.post(url, headers=headers, json=data, timeout=EMBEDDING_HTTP_TIMEOUT)
+            response.raise_for_status()
+            return response.json()
+        except requests.exceptions.ConnectionError as e:
+            raise ConnectionError(
+                f"Failed to connect to embedding API at {url}. "
+                f"Please check network and api_base. Error: {e}"
+            ) from e
+        except requests.exceptions.Timeout as e:
+            raise TimeoutError(f"Embedding API request timed out. Error: {e}") from e
+        except requests.exceptions.HTTPError as e:
+            if e.response.status_code == 401:
+                raise ValueError("Invalid embedding API key") from e
+            elif e.response.status_code == 429:
+                raise ValueError("Embedding API rate limit exceeded") from e
+            else:
+                raise ValueError(
+                    f"Embedding API request failed: "
+                    f"{e.response.status_code} - {e.response.text}"
+                ) from e
+
+    def _post_process(self, raw: List[float]) -> List[float]:
+        """Apply optional client-side truncation + normalization"""
+        vec = raw
+        if self.needs_client_truncate and self._dimensions and len(vec) > self._dimensions:
+            vec = vec[: self._dimensions]
+        if self.needs_client_normalize:
+            vec = _l2_normalize(vec)
+        return vec
+
+    def embed(self, text: str) -> List[float]:
+        """Generate embedding (treated as document by default)"""
+        result = self._call_api(text)
+        return self._post_process(result["data"][0]["embedding"])
+
+    def embed_query(self, text: str) -> List[float]:
+        """Generate embedding for a query (applies vendor instruction prefix if any)"""
+        if self.query_instruction:
+            text = f"{self.query_instruction}{text}"
+        return self.embed(text)
+
+    def embed_batch(self, texts: List[str]) -> List[List[float]]:
+        """Generate embeddings for multiple documents.
+
+        Automatically paginates by self.max_batch_size so callers can pass any
+        number of texts. Order of returned vectors matches the input order.
+        """
+        if not texts:
+            return []
+        out: List[List[float]] = []
+        step = self.max_batch_size
+        for i in range(0, len(texts), step):
+            chunk = texts[i:i + step]
+            result = self._call_api(chunk)
+            out.extend(self._post_process(item["embedding"]) for item in result["data"])
+        return out
+
+    @property
+    def dimensions(self) -> int:
+        return self._dimensions
+
+
+class DoubaoEmbeddingProvider(EmbeddingProvider):
+    """
+    Doubao (Volcengine Ark) multimodal embedding provider.
+
+    Doubao deprecated their OpenAI-compatible /v1/embeddings endpoint and
+    unified everything under /api/v3/embeddings/multimodal, which uses a
+    structured `input: [{type, text|image_url|video_url}, ...]` payload.
+
+    Notes:
+      * The endpoint produces ONE embedding per call (input list is multiple
+        modality parts of a single document, not a batch). embed_batch
+        therefore loops per-text — no native batch support.
+      * Native dimensions: 1024 or 2048 (default 1024 to align with other
+        Chinese vendors). No client-side truncation needed.
+      * Auth: Bearer ARK API key.
+    """
+
+    def __init__(
+        self,
+        model: str,
+        api_key: Optional[str] = None,
+        api_base: Optional[str] = None,
+        extra_headers: Optional[dict] = None,
+        dimensions: Optional[int] = None,
+    ):
+        self.model = model
+        self.api_key = api_key
+        self.api_base = api_base or "https://ark.cn-beijing.volces.com/api/v3"
+        self.extra_headers = extra_headers or {}
+        if not self.api_key or self.api_key in ["", "YOUR API KEY", "YOUR_API_KEY"]:
+            raise ValueError("Doubao embedding API key (ark_api_key) is not configured")
+
+        if dimensions in (1024, 2048):
+            self._dimensions = dimensions
+        elif dimensions is None:
+            self._dimensions = 1024
+        else:
+            raise ValueError(
+                f"Doubao embedding dimensions must be 1024 or 2048, got {dimensions}"
+            )
+
+    def _call_api(self, text: str) -> List[float]:
+        """One call → one embedding. The multimodal endpoint takes a single
+        document represented as a list of typed parts; we send a single
+        text part."""
+        import requests
+
+        url = f"{self.api_base}/embeddings/multimodal"
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {self.api_key}",
+            **self.extra_headers,
+        }
+        payload = {
+            "model": self.model,
+            "input": [{"type": "text", "text": text}],
+            "dimensions": self._dimensions,
+            "encoding_format": "float",
+        }
+
+        try:
+            response = requests.post(url, headers=headers, json=payload, timeout=EMBEDDING_HTTP_TIMEOUT)
+            response.raise_for_status()
+            body = response.json()
+        except requests.exceptions.ConnectionError as e:
+            raise ConnectionError(
+                f"Failed to connect to Doubao embedding API at {url}. "
+                f"Please check network and api_base. Error: {e}"
+            ) from e
+        except requests.exceptions.Timeout as e:
+            raise TimeoutError(f"Doubao embedding API request timed out. Error: {e}") from e
+        except requests.exceptions.HTTPError as e:
+            if e.response.status_code == 401:
+                raise ValueError("Invalid Doubao (ark) embedding API key") from e
+            elif e.response.status_code == 429:
+                raise ValueError("Doubao embedding API rate limit exceeded") from e
+            else:
+                raise ValueError(
+                    f"Doubao embedding API request failed: "
+                    f"{e.response.status_code} - {e.response.text}"
+                ) from e
+
+        # Response shape per docs: {"data": {"embedding": [...]}}
+        data = body.get("data")
+        if isinstance(data, dict) and "embedding" in data:
+            return data["embedding"]
+        # Some providers wrap as a list of one — be defensive
+        if isinstance(data, list) and data and "embedding" in data[0]:
+            return data[0]["embedding"]
+        raise ValueError(f"Unexpected Doubao embedding response shape: {body}")
+
+    def embed(self, text: str) -> List[float]:
+        return self._call_api(text)
+
+    def embed_batch(self, texts: List[str]) -> List[List[float]]:
+        # Endpoint produces one embedding per call; loop. Order preserved.
+        return [self._call_api(t) for t in texts]
+
+    @property
+    def dimensions(self) -> int:
+        return self._dimensions
+
+
+class EmbeddingCache:
+    """In-memory cache for embeddings to avoid recomputation"""
+
+    def __init__(self):
+        self.cache = {}
+
+    def get(self, text: str, provider: str, model: str) -> Optional[List[float]]:
+        key = self._compute_key(text, provider, model)
+        return self.cache.get(key)
+
+    def put(self, text: str, provider: str, model: str, embedding: List[float]):
+        key = self._compute_key(text, provider, model)
+        self.cache[key] = embedding
+
+    @staticmethod
+    def _compute_key(text: str, provider: str, model: str) -> str:
+        content = f"{provider}:{model}:{text}"
+        return hashlib.md5(content.encode("utf-8")).hexdigest()
+
+    def clear(self):
+        self.cache.clear()
+
+
+def create_embedding_provider(
+    provider: str = "openai",
+    model: Optional[str] = None,
+    api_key: Optional[str] = None,
+    api_base: Optional[str] = None,
+    extra_headers: Optional[dict] = None,
+    dimensions: Optional[int] = None,
+) -> EmbeddingProvider:
+    """
+    Factory function to create an embedding provider.
+
+    Backward compatible: when called with provider in {"openai", "linkai"}
+    and no `dimensions` arg, behaves exactly as before (1536-dim OpenAI).
+
+    New providers ("dashscope", "doubao", "zhipu") require explicit configuration
+    and use the unified 1024-dim defaults from EMBEDDING_VENDORS.
+
+    Args:
+        provider: Vendor key (one of EMBEDDING_VENDORS)
+        model: Model name (uses vendor default if None)
+        api_key: API key (required)
+        api_base: API base URL (uses vendor default if None)
+        extra_headers: Optional extra HTTP headers
+        dimensions: Target output dimension (uses vendor default if None)
+
+    Returns:
+        EmbeddingProvider instance
+    """
+    meta = EMBEDDING_VENDORS.get(provider)
+    if meta is None:
+        raise ValueError(
+            f"Unsupported embedding provider: {provider}. "
+            f"Supported: {sorted(EMBEDDING_VENDORS.keys())}"
+        )
+
+    # Doubao uses a non-OpenAI-compatible multimodal endpoint.
+    if meta.get("provider_class") == "doubao":
+        final_dim = dimensions if (dimensions and dimensions > 0) else meta["default_dimensions"]
+        return DoubaoEmbeddingProvider(
+            model=model or meta["default_model"],
+            api_key=api_key,
+            api_base=api_base or meta["default_base_url"],
+            extra_headers=extra_headers,
+            dimensions=final_dim,
+        )
+
+    # Legacy call (openai/linkai with no `dimensions`) keeps the 1536-dim
+    # default behavior so existing data isn't invalidated.
+    is_legacy_call = (
+        provider in ("openai", "linkai")
+        and dimensions is None
+    )
+    if is_legacy_call:
+        return OpenAIEmbeddingProvider(
+            model=model or "text-embedding-3-small",
+            api_key=api_key,
+            api_base=api_base,
+            extra_headers=extra_headers,
+        )
+
+    final_dim = dimensions if (dimensions and dimensions > 0) else meta["default_dimensions"]
+    return OpenAIEmbeddingProvider(
+        model=model or meta["default_model"],
+        api_key=api_key,
+        api_base=api_base or meta["default_base_url"],
+        extra_headers=extra_headers,
+        dimensions=final_dim,
+        supports_dim_param=meta["supports_dim_param"],
+        needs_client_truncate=meta["needs_client_truncate"],
+        needs_client_normalize=meta["needs_client_normalize"],
+        query_instruction=meta["query_instruction"],
+        max_batch_size=meta.get("max_batch_size", 256),
+    )
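Review note: the client-side post-processing and the batch pagination in OpenAIEmbeddingProvider can be checked in isolation. The sketch below is a standalone approximation, not the code in this diff: `l2_normalize` is a hypothetical stand-in for the `_l2_normalize` helper (defined outside this excerpt), and `post_process` / `paginate` are illustrative names that mirror `_post_process` and the chunking loop in `embed_batch`.

```python
import math
from typing import List


def l2_normalize(vec: List[float]) -> List[float]:
    """Hypothetical stand-in for _l2_normalize: scale to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vec))
    # A zero vector cannot be normalized; return it unchanged.
    return vec if norm == 0 else [x / norm for x in vec]


def post_process(raw: List[float], dimensions: int,
                 truncate: bool, normalize: bool) -> List[float]:
    """Slice first, then normalize, in the same order as _post_process."""
    vec = raw
    if truncate and dimensions and len(vec) > dimensions:
        vec = vec[:dimensions]
    if normalize:
        vec = l2_normalize(vec)
    return vec


def paginate(texts: List[str], max_batch_size: int) -> List[List[str]]:
    """Split texts into request-sized chunks, preserving input order."""
    step = max(1, int(max_batch_size or 1))
    return [texts[i:i + step] for i in range(0, len(texts), step)]
```

Normalizing after slicing matters: a truncated vector generally no longer has unit length, so cosine-style scoring would otherwise be skewed.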
--- a/agent/memory/embedding/rebuild.py
+++ b/agent/memory/embedding/rebuild.py
@@ -0,0 +1,191 @@
+"""
+Rebuild memory vector index.
+
+Recommended entry point (in-chat, while agent is running):
+    /memory rebuild-index
+
+Backward-compatible CLI entry (must run from project root):
+    python -m agent.memory.rebuild_index
+
+What it does:
+  1. Probes the embedding endpoint with a tiny call to fail fast on
+     bad provider/model/key — before touching the index.
+  2. Clears the SQLite chunks/files tables (workspace markdown stays intact).
+  3. Runs a fresh sync, regenerating embeddings with the currently configured
+     provider/model/dimensions.
+
+This is the only safe way to switch embedding_provider after the existing
+index has been populated by a different-dim model.
+"""
+
+from __future__ import annotations
+import asyncio
+import sys
+from dataclasses import dataclass
+from typing import Optional
+
+from common.log import logger
+from common.utils import expand_path
+
+
+@dataclass
+class RebuildResult:
+    """Outcome of a rebuild_in_process() call"""
+    ok: bool
+    removed: int = 0
+    chunks: int = 0
+    files: int = 0
+    error: Optional[str] = None
+
+
+def clear_index(db_path, storage=None) -> int:
+    """Wipe chunks/files, reset FTS5, and clean up any legacy state file.
+
+    Args:
+        db_path: Path of the index DB (also used to locate the legacy state
+            file for migration cleanup, and — when *storage* is None — to
+            open a fresh connection).
+        storage: Optional pre-opened MemoryStorage. When provided we reuse it
+            so the live connection's triggers stay in sync — opening a second
+            connection would leave the original one's triggers pointing at a
+            DROP'd chunks_fts table.
+
+    We reset (DROP+recreate) chunks_fts because its shadow tables can become
+    inconsistent across rebuild cycles, causing bm25() / ORDER BY rank to
+    raise "database disk image is malformed" even when raw MATCH still works.
+
+    Returns number of chunks removed.
+    """
+    from agent.memory.embedding.state import cleanup_legacy_state_file
+    from agent.memory.storage import MemoryStorage
+
+    owns_storage = storage is None
+    if owns_storage:
+        storage = MemoryStorage(db_path)
+    try:
+        before = storage.conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
+        storage.conn.execute("DELETE FROM chunks")
+        storage.conn.execute("DELETE FROM files")
+        storage.conn.commit()
+        storage.reset_fts5()
+    finally:
+        if owns_storage:
+            storage.close()
+
+    cleanup_legacy_state_file(db_path)
+    return int(before)
+
+
+def rebuild_in_process(memory_manager) -> RebuildResult:
+    """
+    Rebuild the index using an existing, fully-initialized MemoryManager.
+
+    Used by the in-chat /memory rebuild-index command. The caller already has
+    config loaded, embedding_provider built, and (optionally) the agent
+    running, so we only need to probe the embedding endpoint and then:
+      1. Clear chunks/files + state on the manager's storage.
+      2. Re-sync (force=True).
+
+    NOTE: memory_manager.embedding_provider must be set; if it is None this
+    returns a failed RebuildResult rather than running a keyword-only sync.
+    """
+    if memory_manager is None:
+        return RebuildResult(ok=False, error="memory_manager is None")
+    if memory_manager.embedding_provider is None:
+        return RebuildResult(ok=False, error="embedding_provider is not initialized")
+
+    # Probe the embedding endpoint BEFORE clearing the index. A bad
+    # provider/model/key would otherwise leave the user with an empty index
+    # that not even keyword search can serve.
+    try:
+        memory_manager.embedding_provider.embed_query("ping")
+    except Exception as e:
+        logger.error(f"[RebuildIndex] embedding probe failed, aborting rebuild: {e}")
+        return RebuildResult(ok=False, error=f"embedding endpoint not reachable: {e}")
+
+    db_path = memory_manager.config.get_db_path()
+    try:
+        removed = clear_index(db_path, storage=memory_manager.storage)
+    except Exception as e:
+        logger.exception("[RebuildIndex] clear_index failed")
+        return RebuildResult(ok=False, error=f"clear failed: {e}")
+
+    try:
+        asyncio.run(memory_manager.sync(force=True))
+    except RuntimeError:
+        # Already inside a running event loop (rare in chat handler thread).
+        loop = asyncio.new_event_loop()
+        try:
+            loop.run_until_complete(memory_manager.sync(force=True))
+        finally:
+            loop.close()
+    except Exception as e:
+        logger.exception("[RebuildIndex] sync failed")
+        return RebuildResult(ok=False, removed=removed, error=f"re-embed failed: {e}")
+
+    stats = memory_manager.storage.get_stats()
+    chunks = int(stats.get("chunks", 0))
+    embedded = int(stats.get("embedded", 0))
+
+    # sync() degrades to "no embeddings" on batch failure so keyword search
+    # still works at startup, but in a /memory rebuild-index request the
+    # user explicitly asked for vectors. Surface that as a failure.
+    if chunks > 0 and embedded == 0:
+        return RebuildResult(
+            ok=False,
+            removed=removed,
+            chunks=chunks,
+            files=int(stats.get("files", 0)),
+            error=(
+                "embedding API failed during sync; index now has chunks but no "
+                "vectors. Check embedding provider/model/key and retry."
+            ),
+        )
+
+    return RebuildResult(
+        ok=True,
+        removed=removed,
+        chunks=chunks,
+        files=int(stats.get("files", 0)),
+    )
+
+
+def main() -> int:
+    """Standalone CLI entry. Must be run from project root (relative config path)."""
+    from config import conf, load_config
+    from agent.memory import MemoryConfig, MemoryManager
+
+    load_config()
+
+    workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
+    memory_config = MemoryConfig(workspace_root=workspace_root)
+
+    logger.info(f"[RebuildIndex] Workspace: {workspace_root}")
+    logger.info(f"[RebuildIndex] Index db:  {memory_config.get_db_path()}")
+
+    from bridge.agent_initializer import AgentInitializer
+
+    initializer = AgentInitializer(bridge=None, agent_bridge=None)
+    embedding_provider = initializer._init_embedding_provider(memory_config, session_id=None)
+    if embedding_provider is None:
+        logger.error(
+            "[RebuildIndex] No embedding provider could be initialized. "
+            "Check your config.json. Aborting rebuild."
+        )
+        return 1
+
+    manager = MemoryManager(memory_config, embedding_provider=embedding_provider)
+    result = rebuild_in_process(manager)
+    if not result.ok:
+        logger.error(f"[RebuildIndex] {result.error}")
+        return 1
+
+    logger.info(
+        f"[RebuildIndex] Done. removed={result.removed}, "
+        f"chunks={result.chunks}, files={result.files}"
+    )
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
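Review note: the key ordering in rebuild_in_process is "probe first, clear second". A minimal sketch of that pattern follows, under the assumption that the index can be modeled as a plain list; `Outcome`, `rebuild`, `probe`, and `resync` are hypothetical names for illustration and do not exist in the repository.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Outcome:
    """Illustrative analogue of RebuildResult."""
    ok: bool
    removed: int = 0
    error: Optional[str] = None


def rebuild(index: List[str],
            probe: Callable[[], object],
            resync: Callable[[], List[str]]) -> Outcome:
    # Fail fast: a bad provider/model/key must not cost us the old index.
    try:
        probe()
    except Exception as e:
        return Outcome(ok=False, error=f"embedding endpoint not reachable: {e}")

    # Only now is it safe to perform the destructive step.
    removed = len(index)
    index.clear()
    index.extend(resync())
    return Outcome(ok=True, removed=removed)
```

With a probe that raises, the index is left untouched and the caller gets a failed outcome; with a healthy probe, the old entries are counted, dropped, and replaced.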
--- a/agent/memory/embedding/state.py
+++ b/agent/memory/embedding/state.py
@@ -0,0 +1,47 @@
+"""
+Embedding-related index utilities.
+
+We don't keep a sidecar state file — the SQLite index is the source of truth
+and config.json is the source of intent. The two functions below are the
+only things needing on-disk awareness:
+
+  detect_index_dim         : read the dim of stored vectors (display-only)
+  cleanup_legacy_state_file: remove old embedding_state.json from earlier
+                             versions; safe no-op when absent.
+"""
+
+from __future__ import annotations
+import json
+import os
+from pathlib import Path
+from typing import Optional, Union
+
+PathLike = Union[str, os.PathLike]
+
+
+def detect_index_dim(storage) -> Optional[int]:
+    """Return the dim of the first stored embedding, or None if the index
+    has no embeddings. Used by /memory status."""
+    try:
+        row = storage.conn.execute(
+            "SELECT embedding FROM chunks WHERE embedding IS NOT NULL LIMIT 1"
+        ).fetchone()
+    except Exception:
+        return None
+    if not row or not row["embedding"]:
+        return None
+    try:
+        emb = json.loads(row["embedding"])
+        return len(emb) if isinstance(emb, list) else None
+    except (json.JSONDecodeError, TypeError):
+        return None
+
+
+def cleanup_legacy_state_file(db_path: PathLike) -> None:
+    """Remove old embedding_state.json files from earlier versions.
+    Safe to call repeatedly; no-op if the file is absent."""
+    legacy = Path(db_path).parent / "embedding_state.json"
+    try:
+        legacy.unlink(missing_ok=True)
+    except Exception:
+        pass
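Review note: detect_index_dim reads `row["embedding"]` by column name, which only works when the connection yields mapping-style rows (for example `sqlite3.Row`). The sketch below reproduces the lookup against an in-memory database. The one-column-of-JSON-text layout follows the `json.loads` call in the diff; the rest of the table schema and the `make_conn` / `detect_dim` names are assumptions made for this example.

```python
import json
import sqlite3
from typing import Optional


def make_conn() -> sqlite3.Connection:
    """Hypothetical minimal schema: embeddings stored as JSON text."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # enables row["embedding"] access
    conn.execute("CREATE TABLE chunks (id TEXT PRIMARY KEY, embedding TEXT)")
    return conn


def detect_dim(conn: sqlite3.Connection) -> Optional[int]:
    """Return the length of the first stored embedding, or None."""
    row = conn.execute(
        "SELECT embedding FROM chunks WHERE embedding IS NOT NULL LIMIT 1"
    ).fetchone()
    if not row or not row["embedding"]:
        return None
    try:
        emb = json.loads(row["embedding"])
    except (json.JSONDecodeError, TypeError):
        return None
    return len(emb) if isinstance(emb, list) else None
```

Because only the first non-NULL row is inspected, the result is a display hint, not a guarantee that every stored vector shares that dimension.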
--- a/agent/memory/manager.py
+++ b/agent/memory/manager.py
@@ -13,7 +13,7 @@ from datetime import datetime, timedelta
 from agent.memory.config import MemoryConfig, get_default_memory_config
 from agent.memory.storage import MemoryStorage, MemoryChunk, SearchResult
 from agent.memory.chunker import TextChunker
-from agent.memory.embedding import create_embedding_provider, EmbeddingProvider
+from agent.memory.embedding import EmbeddingProvider
 from agent.memory.summarizer import MemoryFlushManager, create_memory_files_if_needed


@@ -50,49 +50,17 @@ class MemoryManager:
            overlap_tokens=self.config.chunk_overlap_tokens
        )
        
-        # Initialize embedding provider (optional, prefer OpenAI, fallback to LinkAI)
-        self.embedding_provider = None
-        if embedding_provider:
-            self.embedding_provider = embedding_provider
-        else:
-            # Try OpenAI first
-            try:
-                api_key = os.environ.get('OPENAI_API_KEY')
-                api_base = os.environ.get('OPENAI_API_BASE')
-                if api_key:
-                    self.embedding_provider = create_embedding_provider(
-                        provider="openai",
-                        model=self.config.embedding_model,
-                        api_key=api_key,
-                        api_base=api_base
-                    )
-            except Exception as e:
-                from common.log import logger
-                logger.warning(f"[MemoryManager] OpenAI embedding failed: {e}")
-
-            # Fallback to LinkAI
-            if self.embedding_provider is None:
-                try:
-                    linkai_key = os.environ.get('LINKAI_API_KEY')
-                    linkai_base = os.environ.get('LINKAI_API_BASE', 'https://api.link-ai.tech')
-                    if linkai_key:
-                        from common.utils import get_cloud_headers
-                        cloud_headers = get_cloud_headers(linkai_key)
-                        cloud_headers.pop("Authorization", None)
-                        self.embedding_provider = create_embedding_provider(
-                            provider="linkai",
-                            model=self.config.embedding_model,
-                            api_key=linkai_key,
-                            api_base=f"{linkai_base}/v1",
-                            extra_headers=cloud_headers,
-                        )
-                except Exception as e:
-                    from common.log import logger
-                    logger.warning(f"[MemoryManager] LinkAI embedding failed: {e}")
-
-            if self.embedding_provider is None:
-                from common.log import logger
-                logger.info(f"[MemoryManager] Memory will work with keyword search only (no vector search)")
+        # Embedding provider is owned by the caller (agent_initializer is the
+        # canonical entry point and handles legacy/explicit + state validation).
+        # When None is passed, memory degrades to keyword-only search instead
+        # of silently re-initializing a vendor here, which would bypass the
+        # caller's state checks and risk corrupting the index.
+        self.embedding_provider = embedding_provider
+        if self.embedding_provider is None:
+            from common.log import logger
+            logger.info(
+                "[MemoryManager] No embedding provider; memory will use keyword search only"
+            )
        
        # Initialize memory flush manager
        workspace_dir = self.config.get_workspace()
@@ -153,12 +121,14 @@ class MemoryManager:
        if self.config.sync_on_search and self._dirty:
            await self.sync()
        
-        # Perform vector search (if embedding provider available)
+        from common.log import logger
+
+        # Perform vector search (if embedding provider available).
+        # Failures are logged and degrade to keyword-only; no exception is raised.
        vector_results = []
        if self.embedding_provider:
            try:
-                from common.log import logger
-                query_embedding = self.embedding_provider.embed(query)
+                query_embedding = self.embedding_provider.embed_query(query)
                vector_results = self.storage.search_vector(
                    query_embedding=query_embedding,
                    user_id=user_id,
@@ -167,19 +137,19 @@ class MemoryManager:
                )
                logger.info(f"[MemoryManager] Vector search found {len(vector_results)} results for query: {query}")
            except Exception as e:
-                from common.log import logger
-                logger.warning(f"[MemoryManager] Vector search failed: {e}")
-        
-        # Perform keyword search
+                logger.error(
+                    f"[MemoryManager] Vector search failed, falling back to keyword-only: {e}"
+                )
+
+        # Perform keyword search (also runs as fallback when vector failed)
        keyword_results = self.storage.search_keyword(
            query=query,
            user_id=user_id,
            scopes=scopes,
            limit=max_results * 2
        )
-        from common.log import logger
        logger.info(f"[MemoryManager] Keyword search found {len(keyword_results)} results for query: {query}")
-        
+
        # Merge results
        merged = self._merge_results(
            vector_results,
@@ -187,7 +157,7 @@ class MemoryManager:
            self.config.vector_weight,
            self.config.keyword_weight
        )
-        
+
        # Filter by min score and limit
        filtered = [r for r in merged if r.score >= min_score]
        return filtered[:max_results]
@@ -269,144 +239,191 @@ class MemoryManager:
    
    async def sync(self, force: bool = False):
        """
-        Synchronize memory from files
-        
+        Synchronize memory from files.
+
+        Two-pass design to amortize embedding HTTP cost:
+          1. Walk all files, chunk those whose hash changed, collect pending
+             chunks across files. No embedding calls yet.
+          2. Run a single embed_batch over the union of pending chunks (the
+             provider auto-paginates by vendor cap), then persist per-file.
+
+        For workspaces with many small files (e.g. 101 files, ~1 chunk each),
+        this cuts ~101 HTTP calls down to ceil(total_chunks / vendor_cap).
+
        Args:
            force: Force full reindex
        """
        memory_dir = self.config.get_memory_dir()
        workspace_dir = self.config.get_workspace()
-        
-        # Scan MEMORY.md (workspace root)
+
+        files_to_scan: List[tuple] = []  # (file_path, source, scope, user_id)
+
        memory_file = Path(workspace_dir) / "MEMORY.md"
        if memory_file.exists():
-            await self._sync_file(memory_file, "memory", "shared", None)
-        
-        # Scan memory directory (including daily summaries)
+            files_to_scan.append((memory_file, "memory", "shared", None))
+
        if memory_dir.exists():
            for file_path in memory_dir.rglob("*.md"):
-                # Determine scope and user_id from path
-                rel_path = file_path.relative_to(workspace_dir)
-                parts = rel_path.parts
-                
-                # Check if it's in daily summary directory
-                if "daily" in parts:
-                    # Daily summary files
-                    if "users" in parts or len(parts) > 3:
-                        # User-scoped daily summary: memory/daily/{user_id}/2024-01-29.md
-                        user_idx = parts.index("daily") + 1
-                        user_id = parts[user_idx] if user_idx < len(parts) else None
+                rel_parts = file_path.relative_to(workspace_dir).parts
+                if any(part.startswith('.') for part in rel_parts):
+                    continue
+                # Dream diaries are narrative reflections produced by Deep
+                # Dream; their factual content has already been distilled
+                # into MEMORY.md. Indexing them adds noisy near-duplicates
+                # that crowd out the authoritative entry in retrieval.
+                if "dreams" in rel_parts:
+                    continue
+                if "daily" in rel_parts:
+                    if "users" in rel_parts or len(rel_parts) > 3:
+                        user_idx = rel_parts.index("daily") + 1
+                        user_id = rel_parts[user_idx] if user_idx < len(rel_parts) else None
                        scope = "user"
                    else:
-                        # Shared daily summary: memory/daily/2024-01-29.md
                        user_id = None
                        scope = "shared"
-                elif "users" in parts:
-                    # User-scoped memory
-                    user_idx = parts.index("users") + 1
-                    user_id = parts[user_idx] if user_idx < len(parts) else None
+                elif "users" in rel_parts:
+                    user_idx = rel_parts.index("users") + 1
+                    user_id = rel_parts[user_idx] if user_idx < len(rel_parts) else None
                    scope = "user"
                else:
-                    # Shared memory
                    user_id = None
                    scope = "shared"
-                
-                await self._sync_file(file_path, "memory", scope, user_id)
-        
-        self._dirty = False
-    
-    async def _sync_file(
-        self,
-        file_path: Path,
-        source: str,
-        scope: str,
-        user_id: Optional[str]
-    ):
-        """Sync a single file"""
-        # Compute file hash
-        content = file_path.read_text(encoding='utf-8')
-        file_hash = MemoryStorage.compute_hash(content)
-        
-        # Get relative path
-        workspace_dir = self.config.get_workspace()
-        rel_path = str(file_path.relative_to(workspace_dir))
-        
-        # Check if file changed
-        stored_hash = self.storage.get_file_hash(rel_path)
-        if stored_hash == file_hash:
-            return  # No changes
-        
-        # Delete old chunks
-        self.storage.delete_by_path(rel_path)
-        
-        # Chunk and embed
-        chunks = self.chunker.chunk_text(content)
-        if not chunks:
+                files_to_scan.append((file_path, "memory", scope, user_id))
+
+        from config import conf
+        if conf().get("knowledge", True):
+            knowledge_dir = Path(workspace_dir) / "knowledge"
+            if knowledge_dir.exists():
+                for file_path in knowledge_dir.rglob("*.md"):
+                    files_to_scan.append((file_path, "knowledge", "shared", None))
+
+        # Pass 1: inline chunking + change detection. Inlined (instead of
+        # calling self._prepare_file_for_sync) so this method does not depend
+        # on any sibling helpers — keeps it robust against partial reloads
+        # where the class object is older than the method's source.
+        pending: List[Dict[str, Any]] = []
+        workspace_dir_path = self.config.get_workspace()
+        for file_path, source, scope, user_id in files_to_scan:
+            try:
+                content = file_path.read_text(encoding='utf-8')
+            except Exception:
+                continue
+            file_hash = MemoryStorage.compute_hash(content)
+            rel_path = str(file_path.relative_to(workspace_dir_path))
+            if self.storage.get_file_hash(rel_path) == file_hash:
+                continue
+            chunks = self.chunker.chunk_text(content)
+            if not chunks:
+                continue
+            pending.append({
+                "file_path": file_path,
+                "rel_path": rel_path,
+                "source": source,
+                "scope": scope,
+                "user_id": user_id,
+                "file_hash": file_hash,
+                "chunks": chunks,
+                "texts": [c.text for c in chunks],
+            })
+
+        if not pending:
+            self._dirty = False
            return
-        
-        texts = [chunk.text for chunk in chunks]
-        if self.embedding_provider:
-            embeddings = self.embedding_provider.embed_batch(texts)
+
+        # Pass 2: single batched embed across all pending chunks.
+        # CRITICAL: never touch the index until we hold valid embeddings.
+        # If embed_batch fails, leave the existing index intact (chunks +
+        # file_hash) so the next sync will retry the same files. Writing
+        # NULL embeddings + updating file_hash here would mark the file as
+        # "successfully synced" and silently strand it without vectors.
+        all_texts: List[str] = []
+        for entry in pending:
+            all_texts.extend(entry["texts"])
+
+        if not self.embedding_provider:
+            # No provider configured at all (legacy keyword-only). Persist
+            # chunks without embeddings — this is the user's intent.
+            all_embeddings: List[Optional[List[float]]] = [None] * len(all_texts)
        else:
-            embeddings = [None] * len(texts)
-        
-        # Create memory chunks
-        memory_chunks = []
-        for chunk, embedding in zip(chunks, embeddings):
-            chunk_id = self._generate_chunk_id(rel_path, chunk.start_line, chunk.end_line)
-            chunk_hash = MemoryStorage.compute_hash(chunk.text)
-            
-            memory_chunks.append(MemoryChunk(
-                id=chunk_id,
-                user_id=user_id,
-                scope=scope,
-                source=source,
+            try:
+                all_embeddings = self.embedding_provider.embed_batch(all_texts)
+            except Exception as e:
+                from common.log import logger
+                logger.error(
+                    f"[MemoryManager] Batch embedding failed for {len(all_texts)} "
+                    f"chunks across {len(pending)} files: {e}. "
+                    f"Index left untouched; will retry on next sync."
+                )
+                # Bail before touching storage. self._dirty stays True so
+                # callers know there is pending work.
+                return
+
+        # Pass 3: inline persist — same self-contained reasoning as Pass 1.
+        cursor = 0
+        for entry in pending:
+            n = len(entry["texts"])
+            entry_embeddings = all_embeddings[cursor:cursor + n]
+            cursor += n
+
+            rel_path = entry["rel_path"]
+            self.storage.delete_by_path(rel_path)
+            memory_chunks = []
+            for chunk, embedding in zip(entry["chunks"], entry_embeddings):
+                chunk_id = self._generate_chunk_id(rel_path, chunk.start_line, chunk.end_line)
+                chunk_hash = MemoryStorage.compute_hash(chunk.text)
+                memory_chunks.append(MemoryChunk(
+                    id=chunk_id,
+                    user_id=entry["user_id"],
+                    scope=entry["scope"],
+                    source=entry["source"],
+                    path=rel_path,
+                    start_line=chunk.start_line,
+                    end_line=chunk.end_line,
+                    text=chunk.text,
+                    embedding=embedding,
+                    hash=chunk_hash,
+                    metadata=None,
+                ))
+            self.storage.save_chunks_batch(memory_chunks)
+            stat = entry["file_path"].stat()
+            self.storage.update_file_metadata(
                path=rel_path,
-                start_line=chunk.start_line,
-                end_line=chunk.end_line,
-                text=chunk.text,
-                embedding=embedding,
-                hash=chunk_hash,
-                metadata=None
-            ))
-        
-        # Save
-        self.storage.save_chunks_batch(memory_chunks)
-        
-        # Update file metadata
-        stat = file_path.stat()
-        self.storage.update_file_metadata(
-            path=rel_path,
-            source=source,
-            file_hash=file_hash,
-            mtime=int(stat.st_mtime),
-            size=stat.st_size
-        )
-    
+                source=entry["source"],
+                file_hash=entry["file_hash"],
+                mtime=int(stat.st_mtime),
+                size=stat.st_size,
+            )
+
+        self._dirty = False
+
    def flush_memory(
        self,
        messages: list,
        user_id: Optional[str] = None,
        reason: str = "threshold",
        max_messages: int = 10,
+        context_summary_callback=None,
    ) -> bool:
        """
        Flush conversation summary to daily memory file.
-        
+
        Args:
            messages: Conversation message list
            user_id: Optional user ID
            reason: "threshold" | "overflow" | "daily_summary"
            max_messages: Max recent messages to include (0 = all)
-        
+            context_summary_callback: Optional callback(str) invoked with the
+                daily summary text for in-context injection
+
        Returns:
-            True if content was written
+            True if flush was dispatched
        """
        success = self.flush_manager.flush_from_messages(
            messages=messages,
            user_id=user_id,
            reason=reason,
            max_messages=max_messages,
+            context_summary_callback=context_summary_callback,
        )
        if success:
            self._dirty = True
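
Review note on the three-pass sync above: the key property is that storage is never touched until embeddings for every pending chunk are in hand. The following is a minimal sketch of that control flow, not the project's code. `sync_pending`, the dict-based `index`, and the `embed_batch` callable are hypothetical stand-ins for `MemoryStorage` and the embedding provider, and the length check on the returned embeddings is an added assumption (the diff itself does not perform one).

```python
from typing import Callable, Dict, List, Optional, Tuple

Vector = Optional[List[float]]
Index = Dict[str, List[Tuple[str, Vector]]]


def sync_pending(
    pending: List[dict],
    index: Index,
    embed_batch: Optional[Callable[[List[str]], List[List[float]]]] = None,
) -> bool:
    """Embed all pending texts in one batch, then persist per file.

    Returns False (and leaves `index` untouched) if embedding fails.
    """
    # Pass 2: flatten every file's texts into a single batch.
    all_texts = [text for entry in pending for text in entry["texts"]]

    if embed_batch is None:
        # Keyword-only mode: persist chunks without vectors.
        all_embeddings: List[Vector] = [None] * len(all_texts)
    else:
        try:
            all_embeddings = list(embed_batch(all_texts))
        except Exception:
            return False  # bail before any write; caller retries later
        # Assumption: guard against a provider returning a short list,
        # which zip() would otherwise truncate silently.
        if len(all_embeddings) != len(all_texts):
            return False

    # Pass 3: slice the flat embedding list back into per-file runs.
    cursor = 0
    for entry in pending:
        n = len(entry["texts"])
        vectors = all_embeddings[cursor:cursor + n]
        cursor += n
        index[entry["rel_path"]] = list(zip(entry["texts"], vectors))
    return True
```

Returning a boolean lets the caller keep its dirty flag set on failure, which mirrors the early `return` in the diff that leaves `self._dirty` as True.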
--- a/agent/memory/rebuild_index.py
+++ b/agent/memory/rebuild_index.py
@@ -0,0 +1,14 @@
+"""
+Backward-compatible shim for the legacy entry point:
+    python -m agent.memory.rebuild_index
+
+The implementation now lives in agent.memory.embedding.rebuild.
+Prefer using `/memory rebuild-index` in chat going forward.
+"""
+
+from agent.memory.embedding.rebuild import main
+
+if __name__ == "__main__":
+    import sys
+
+    sys.exit(main())
--- a/agent/memory/service.py
+++ b/agent/memory/service.py
@@ -32,68 +32,80 @@ class MemoryService:
    # ------------------------------------------------------------------
    # list — paginated file metadata
    # ------------------------------------------------------------------
-    def list_files(self, page: int = 1, page_size: int = 20) -> dict:
+    def list_files(self, page: int = 1, page_size: int = 20, category: str = "memory") -> dict:
        """
-        List all memory files with metadata (without content).
+        List memory or dream files with metadata (without content).

-        Returns::
-
-            {
-                "page": 1,
-                "page_size": 20,
-                "total": 15,
-                "list": [
-                    {"filename": "MEMORY.md", "type": "global", "size": 2048, "updated_at": "2026-02-20 10:00:00"},
-                    {"filename": "2026-02-20.md", "type": "daily", "size": 512, "updated_at": "2026-02-20 09:30:00"},
-                    ...
-                ]
-            }
+        Args:
+            category: ``"memory"`` (default) — MEMORY.md + daily files;
+                      ``"dream"``  — dream diary files from memory/dreams/
        """
+        if category == "dream":
+            files = self._list_dream_files()
+        else:
+            files = self._list_memory_files()
+
+        total = len(files)
+        start = (page - 1) * page_size
+        end = start + page_size
+
+        return {
+            "page": page,
+            "page_size": page_size,
+            "total": total,
+            "list": files[start:end],
+        }
+
+    def _list_memory_files(self) -> List[dict]:
+        """MEMORY.md + memory/*.md (newest first)."""
        files: List[dict] = []

-        # 1. Global memory — MEMORY.md in workspace root
        global_path = os.path.join(self.workspace_root, "MEMORY.md")
        if os.path.isfile(global_path):
            files.append(self._file_info(global_path, "MEMORY.md", "global"))

-        # 2. Daily memory files — memory/*.md (sorted newest first)
        if os.path.isdir(self.memory_dir):
            daily_files = []
            for name in os.listdir(self.memory_dir):
                full = os.path.join(self.memory_dir, name)
                if os.path.isfile(full) and name.endswith(".md"):
                    daily_files.append((name, full))
-            # Sort by filename descending (newest date first)
            daily_files.sort(key=lambda x: x[0], reverse=True)
            for name, full in daily_files:
                files.append(self._file_info(full, name, "daily"))

-        total = len(files)
+        return files

-        # Paginate
-        start = (page - 1) * page_size
-        end = start + page_size
-        page_items = files[start:end]
+    def _list_dream_files(self) -> List[dict]:
+        """memory/dreams/*.md (newest first)."""
+        files: List[dict] = []
+        dreams_dir = os.path.join(self.memory_dir, "dreams")

-        return {
-            "page": page,
-            "page_size": page_size,
-            "total": total,
-            "list": page_items,
-        }
+        if os.path.isdir(dreams_dir):
+            entries = []
+            for name in os.listdir(dreams_dir):
+                full = os.path.join(dreams_dir, name)
+                if os.path.isfile(full) and name.endswith(".md"):
+                    entries.append((name, full))
+            entries.sort(key=lambda x: x[0], reverse=True)
+            for name, full in entries:
+                files.append(self._file_info(full, name, "dream"))
+
+        return files

    # ------------------------------------------------------------------
    # content — read a single file
    # ------------------------------------------------------------------
-    def get_content(self, filename: str) -> dict:
+    def get_content(self, filename: str, category: str = "memory") -> dict:
        """
-        Read the full content of a memory file.
+        Read the full content of a memory or dream file.

-        :param filename: File name, e.g. ``MEMORY.md`` or ``2026-02-20.md``
+        :param filename: File name, e.g. ``MEMORY.md``, ``2026-02-20.md``
+        :param category: ``"memory"`` or ``"dream"``
        :return: dict with ``filename`` and ``content``
        :raises FileNotFoundError: if the file does not exist
        """
-        path = self._resolve_path(filename)
+        path = self._resolve_path(filename, category)
        if not os.path.isfile(path):
            raise FileNotFoundError(f"Memory file not found: {filename}")

@@ -113,7 +125,7 @@ class MemoryService:
        Dispatch a memory management action.

        :param action: ``list`` or ``content``
-        :param payload: action-specific payload
+        :param payload: action-specific payload (supports ``category``: ``"memory"`` | ``"dream"``)
        :return: protocol-compatible response dict
        """
        payload = payload or {}
@@ -121,14 +133,16 @@ class MemoryService:
            if action == "list":
                page = payload.get("page", 1)
                page_size = payload.get("page_size", 20)
-                result_payload = self.list_files(page=page, page_size=page_size)
+                category = payload.get("category", "memory")
+                result_payload = self.list_files(page=page, page_size=page_size, category=category)
                return {"action": action, "code": 200, "message": "success", "payload": result_payload}

            elif action == "content":
                filename = payload.get("filename")
                if not filename:
                    return {"action": action, "code": 400, "message": "filename is required", "payload": None}
-                result_payload = self.get_content(filename)
+                category = payload.get("category", "memory")
+                result_payload = self.get_content(filename, category=category)
                return {"action": action, "code": 200, "message": "success", "payload": result_payload}

            else:
@@ -145,18 +159,20 @@ class MemoryService:
    # ------------------------------------------------------------------
    # internal helpers
    # ------------------------------------------------------------------
-    def _resolve_path(self, filename: str) -> str:
+    def _resolve_path(self, filename: str, category: str = "memory") -> str:
        """
        Safely resolve a filename to its absolute path within the allowed directory.

        - ``MEMORY.md`` → ``{workspace_root}/MEMORY.md``
-        - ``2026-02-20.md`` → ``{workspace_root}/memory/2026-02-20.md``
+        - ``2026-02-20.md`` (memory) → ``{workspace_root}/memory/2026-02-20.md``
+        - ``2026-02-20.md`` (dream) → ``{workspace_root}/memory/dreams/2026-02-20.md``

-        Raises ValueError if the resolved path escapes the allowed directory
-        (path traversal protection).
+        Raises ValueError if the resolved path escapes the allowed directory.
        """
        if filename == "MEMORY.md":
            base_dir = self.workspace_root
+        elif category == "dream":
+            base_dir = os.path.join(self.memory_dir, "dreams")
        else:
            base_dir = self.memory_dir
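
Review note on `_resolve_path`: the remainder of the method is outside this hunk, so the containment check below is an assumption about how "raises ValueError if the resolved path escapes the allowed directory" could be enforced, not a copy of the project's code. `resolve_path` is a hypothetical free function that takes the workspace root explicitly.

```python
import os


def resolve_path(workspace_root: str, filename: str, category: str = "memory") -> str:
    """Map a file name to an absolute path inside its allowed directory."""
    memory_dir = os.path.join(workspace_root, "memory")
    if filename == "MEMORY.md":
        base_dir = workspace_root
    elif category == "dream":
        base_dir = os.path.join(memory_dir, "dreams")
    else:
        base_dir = memory_dir

    # Resolve symlinks and ".." segments before comparing.
    base_real = os.path.realpath(base_dir)
    full = os.path.realpath(os.path.join(base_real, filename))

    # The resolved path must stay under base_real; absolute file names and
    # "../" sequences both fail this check.
    if os.path.commonpath([base_real, full]) != base_real:
        raise ValueError(f"Path escapes allowed directory: {filename}")
    return full
```

Comparing with `os.path.commonpath` rather than a string prefix avoids treating a sibling such as `memory_backup/` as being inside `memory/`.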

--- a/agent/memory/storage.py
+++ b/agent/memory/storage.py
@@ -144,45 +144,37 @@ class MemoryStorage:
            ON chunks(path, hash)
        """)
        
-        # Create FTS5 virtual table for keyword search (only if supported)
+        # Create FTS5 virtual table + triggers (only if supported).
+        # Self-heal: if the previous process crashed mid-rebuild and left
+        # triggers pointing at a missing chunks_fts (or vice versa), wipe
+        # both sides and recreate cleanly. Otherwise next chunks INSERT
+        # will fail with "no such table: chunks_fts".
        if self.fts5_available:
-            # Use default unicode61 tokenizer (stable and compatible)
-            # For CJK support, we'll use LIKE queries as fallback
-            self.conn.execute("""
-                CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(
-                    text,
-                    id UNINDEXED,
-                    user_id UNINDEXED,
-                    path UNINDEXED,
-                    source UNINDEXED,
-                    scope UNINDEXED,
-                    content='chunks',
-                    content_rowid='rowid'
+            if self._fts5_state_inconsistent():
+                from common.log import logger
+                logger.warning(
+                    "[MemoryStorage] FTS5 state inconsistent (triggers/table mismatch). "
+                    "Resetting chunks_fts to recover."
                )
-            """)
-            
-            # Create triggers to keep FTS in sync
-            self.conn.execute("""
-                CREATE TRIGGER IF NOT EXISTS chunks_ai AFTER INSERT ON chunks BEGIN
-                    INSERT INTO chunks_fts(rowid, text, id, user_id, path, source, scope)
-                    VALUES (new.rowid, new.text, new.id, new.user_id, new.path, new.source, new.scope);
-                END
-            """)
-            
-            self.conn.execute("""
-                CREATE TRIGGER IF NOT EXISTS chunks_ad AFTER DELETE ON chunks BEGIN
-                    DELETE FROM chunks_fts WHERE rowid = old.rowid;
-                END
-            """)
-            
-            self.conn.execute("""
-                CREATE TRIGGER IF NOT EXISTS chunks_au AFTER UPDATE ON chunks BEGIN
-                    UPDATE chunks_fts SET text = new.text, id = new.id,
-                                         user_id = new.user_id, path = new.path, source = new.source, scope = new.scope
-                    WHERE rowid = new.rowid;
-                END
-            """)
-        
+                self.conn.execute("DROP TRIGGER IF EXISTS chunks_ai")
+                self.conn.execute("DROP TRIGGER IF EXISTS chunks_ad")
+                self.conn.execute("DROP TRIGGER IF EXISTS chunks_au")
+                self.conn.execute("DROP TABLE IF EXISTS chunks_fts")
+                self.conn.commit()
+            self._create_fts5_objects()
+
+            # Probe FTS5 shadow tables. The schema may be intact but the
+            # internal _data/_idx/_docsize blob can still be corrupt — that
+            # surfaces as "database disk image is malformed" on bm25 / MATCH.
+            # We rebuild from the chunks table when that happens; data isn't
+            # lost because chunks (the content table) is the source of truth.
+            if self._fts5_shadow_corrupt():
+                from common.log import logger
+                logger.warning(
+                    "[MemoryStorage] FTS5 shadow tables corrupt; rebuilding from chunks."
+                )
+                self._rebuild_fts5_from_chunks()
+
        # Create files metadata table
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS files (
@@ -196,7 +188,116 @@ class MemoryStorage:
        """)
        
        self.conn.commit()
-    
+
+    def _fts5_state_inconsistent(self) -> bool:
+        """Detect a half-broken FTS5 setup (e.g. trigger exists but table doesn't)."""
+        try:
+            row = self.conn.execute(
+                "SELECT name FROM sqlite_master WHERE type='table' AND name='chunks_fts'"
+            ).fetchone()
+            table_exists = row is not None
+            row = self.conn.execute(
+                "SELECT COUNT(*) FROM sqlite_master WHERE type='trigger' "
+                "AND name IN ('chunks_ai','chunks_ad','chunks_au')"
+            ).fetchone()
+            trigger_count = int(row[0]) if row else 0
+        except Exception:
+            return False
+        # Healthy = table and >=1 trigger both present, or both absent (a partial trigger set is not flagged).
+        return table_exists != (trigger_count > 0)
+
+    def _create_fts5_objects(self):
+        """Create chunks_fts virtual table and the 3 sync triggers.
+
+        Idempotent: uses IF NOT EXISTS. Caller must hold self.conn.
+        """
+        self.conn.execute("""
+            CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts USING fts5(
+                text,
+                id UNINDEXED,
+                user_id UNINDEXED,
+                path UNINDEXED,
+                source UNINDEXED,
+                scope UNINDEXED,
+                content='chunks',
+                content_rowid='rowid'
+            )
+        """)
+        self.conn.execute("""
+            CREATE TRIGGER IF NOT EXISTS chunks_ai AFTER INSERT ON chunks BEGIN
+                INSERT INTO chunks_fts(rowid, text, id, user_id, path, source, scope)
+                VALUES (new.rowid, new.text, new.id, new.user_id, new.path, new.source, new.scope);
+            END
+        """)
+        self.conn.execute("""
+            CREATE TRIGGER IF NOT EXISTS chunks_ad AFTER DELETE ON chunks BEGIN
+                DELETE FROM chunks_fts WHERE rowid = old.rowid;
+            END
+        """)
+        self.conn.execute("""
+            CREATE TRIGGER IF NOT EXISTS chunks_au AFTER UPDATE ON chunks BEGIN
+                UPDATE chunks_fts SET text = new.text, id = new.id,
+                                     user_id = new.user_id, path = new.path,
+                                     source = new.source, scope = new.scope
+                WHERE rowid = new.rowid;
+            END
+        """)
+
+    def reset_fts5(self):
+        """Drop and recreate chunks_fts + triggers in one transaction.
+
+        Used by rebuild_index to recover from FTS5 shadow-table corruption
+        (bm25/ORDER BY rank may raise "database disk image is malformed"
+        even when raw MATCH still works).
+
+        Triggers must be dropped first; otherwise the next chunks INSERT/DELETE
+        on the existing connection will hit "no such table: chunks_fts".
+        """
+        if not self.fts5_available:
+            return
+        self.conn.execute("DROP TRIGGER IF EXISTS chunks_ai")
+        self.conn.execute("DROP TRIGGER IF EXISTS chunks_ad")
+        self.conn.execute("DROP TRIGGER IF EXISTS chunks_au")
+        self.conn.execute("DROP TABLE IF EXISTS chunks_fts")
+        self._create_fts5_objects()
+        self.conn.commit()
+
+    def _fts5_shadow_corrupt(self) -> bool:
+        """Probe whether bm25 over chunks_fts errors out at startup.
+
+        Schema (table + triggers) can be intact while the underlying
+        FTS5 shadow blobs are malformed — typically because the previous
+        process crashed mid-write or wrote with a different SQLite build.
+        A cheap MATCH probe surfaces it immediately."""
+        try:
+            self.conn.execute(
+                "SELECT bm25(chunks_fts) FROM chunks_fts WHERE chunks_fts MATCH 'a' LIMIT 1"
+            ).fetchone()
+            return False
+        except sqlite3.DatabaseError as e:
+            msg = str(e).lower()
+            return "malformed" in msg or "corrupt" in msg
+        except Exception:
+            # Any other error (e.g. table missing) is handled by the
+            # state-inconsistent path; treat as healthy here.
+            return False
+
+    def _rebuild_fts5_from_chunks(self):
+        """Drop FTS5, recreate it, then INSERT every row from chunks.
+
+        Safe data-wise: chunks (the content table) is the source of truth.
+        Note: reset_fts5() commits on its own, so a crash before the re-feed
+        below can leave an empty (but valid) index until the next rebuild.
+        """
+        # Reset schema first; this clears any malformed shadow blobs.
+        self.reset_fts5()
+        # Re-feed content. Triggers handle future writes automatically.
+        self.conn.execute("""
+            INSERT INTO chunks_fts(rowid, text, id, user_id, path, source, scope)
+            SELECT rowid, text, id, user_id, path, source, scope FROM chunks
+        """)
+        self.conn.commit()
+
    def save_chunk(self, chunk: MemoryChunk):
        """Save a memory chunk"""
        self.conn.execute("""
@@ -283,13 +384,26 @@ class MemoryStorage:
            """
        
        rows = self.conn.execute(query, params).fetchall()
-        
-        # Calculate cosine similarity
+
+        # Calculate cosine similarity. We probe the first row's dim to fail
+        # loudly on a query/index dim mismatch — otherwise every doc would
+        # score 0 silently, leaving the user wondering why search broke.
        results = []
+        query_dim = len(query_embedding)
+        if rows:
+            first = json.loads(rows[0]['embedding'])
+            if isinstance(first, list) and len(first) != query_dim:
+                raise ValueError(
+                    f"Embedding dim mismatch: query is {query_dim}-dim but "
+                    f"index stores {len(first)}-dim vectors. The configured "
+                    f"embedding model differs from the one that built the "
+                    f"index — run /memory rebuild-index to re-embed."
+                )
+
        for row in rows:
            embedding = json.loads(row['embedding'])
            similarity = self._cosine_similarity(query_embedding, embedding)
-            
+
            if similarity > 0:
                results.append((similarity, row))
        
@@ -319,27 +433,24 @@ class MemoryStorage:
    ) -> List[SearchResult]:
        """
        Keyword search using FTS5 + LIKE fallback
-        
+
        Strategy:
-        1. If FTS5 available: Try FTS5 search first (good for English and word-based languages)
-        2. If no FTS5 or no results and query contains CJK: Use LIKE search
+        1. If FTS5 is available, try it first and return any hits
+        2. If FTS5 is unavailable, fails, or returns nothing, fall back to
+           LIKE for every query (CJK and non-CJK alike), so a broken FTS5
+           shadow table doesn't silently kill keyword search.
        """
        if scopes is None:
            scopes = ["shared"]
            if user_id:
                scopes.append("user")
-        
-        # Try FTS5 search first (if available)
+
        if self.fts5_available:
            fts_results = self._search_fts5(query, user_id, scopes, limit)
            if fts_results:
                return fts_results
-        
-        # Fallback to LIKE search (always for CJK, or if FTS5 not available)
-        if not self.fts5_available or MemoryStorage._contains_cjk(query):
-            return self._search_like(query, user_id, scopes, limit)
-        
-        return []
+
+        return self._search_like(query, user_id, scopes, limit)
    
    def _search_fts5(
        self,
@@ -394,7 +505,11 @@ class MemoryStorage:
                )
                for row in rows
            ]
-        except Exception:
+        except Exception as e:
+            from common.log import logger
+            logger.error(
+                f"[MemoryStorage] FTS5 search failed (caller will fall back to LIKE): {e}"
+            )
            return []
    
    def _search_like(
@@ -404,21 +519,28 @@ class MemoryStorage:
        scopes: List[str],
        limit: int
    ) -> List[SearchResult]:
-        """LIKE-based search for CJK characters"""
+        """LIKE-based search.
+
+        Used as the keyword-search fallback when FTS5 is unavailable, fails,
+        or returns empty. Supports both CJK runs and ASCII word tokens so it
+        can serve as a true safety net for any query.
+        """
        import re
-        # Extract CJK words (2+ characters)
+        # CJK runs (2+ chars) + ASCII word tokens (3+ chars to avoid noise)
        cjk_words = re.findall(r'[\u4e00-\u9fff]{2,}', query)
-        if not cjk_words:
+        ascii_words = [t for t in re.findall(r'[A-Za-z0-9_]+', query) if len(t) >= 3]
+        words = cjk_words + ascii_words
+        if not words:
            return []
-        
+
        scope_placeholders = ','.join('?' * len(scopes))
-        
-        # Build LIKE conditions for each word
+
+        # Build LIKE conditions for each word (case-insensitive for ASCII)
        like_conditions = []
        params = []
-        for word in cjk_words:
-            like_conditions.append("text LIKE ?")
-            params.append(f'%{word}%')
+        for word in words:
+            like_conditions.append("LOWER(text) LIKE ?")
+            params.append(f'%{word.lower()}%')
        
        where_clause = ' OR '.join(like_conditions)
        params.extend(scopes)
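
Review note on the widened LIKE fallback: the token extraction and query building in this hunk can be exercised on their own. This is a reduced sketch with hypothetical helper names (`like_terms`, `search_like`) and a one-column `chunks` table; the scope and user filters of the real query are left out.

```python
import re
import sqlite3
from typing import List


def like_terms(query: str) -> List[str]:
    """CJK runs of 2+ characters plus ASCII word tokens of 3+ characters."""
    cjk_words = re.findall(r'[\u4e00-\u9fff]{2,}', query)
    ascii_words = [t for t in re.findall(r'[A-Za-z0-9_]+', query) if len(t) >= 3]
    return cjk_words + ascii_words


def search_like(conn: sqlite3.Connection, query: str, limit: int = 10) -> List[str]:
    """OR together one case-insensitive LIKE condition per extracted term."""
    words = like_terms(query)
    if not words:
        return []
    where_clause = " OR ".join("LOWER(text) LIKE ?" for _ in words)
    params = [f"%{w.lower()}%" for w in words] + [limit]
    rows = conn.execute(
        f"SELECT text FROM chunks WHERE {where_clause} LIMIT ?", params
    ).fetchall()
    return [row[0] for row in rows]
```

One thing to keep in mind: `_` is itself a single-character wildcard in SQL LIKE, so a token such as `max_messages` also matches `max-messages` unless it is escaped with an `ESCAPE` clause.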
@@ -455,7 +577,9 @@ class MemoryStorage:
                )
                for row in rows
            ]
-        except Exception:
+        except Exception as e:
+            from common.log import logger
+            logger.error(f"[MemoryStorage] LIKE search failed: {e}")
            return []
    
    def delete_by_path(self, path: str):
@@ -485,14 +609,19 @@ class MemoryStorage:
        chunks_count = self.conn.execute("""
            SELECT COUNT(*) as cnt FROM chunks
        """).fetchone()['cnt']
-        
+
        files_count = self.conn.execute("""
            SELECT COUNT(*) as cnt FROM files
        """).fetchone()['cnt']
-        
+
+        embedded_count = self.conn.execute("""
+            SELECT COUNT(*) as cnt FROM chunks WHERE embedding IS NOT NULL
+        """).fetchone()['cnt']
+
        return {
            'chunks': chunks_count,
-            'files': files_count
+            'files': files_count,
+            'embedded': embedded_count,
        }
    
    def close(self):
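
Review note on the FTS5 self-heal path: a reduced sketch of detect, reset, and rebuild for an external-content table follows. It is not the project's code and swaps in two techniques that differ from the diff: the delete and update triggers use the special `'delete'` command, and the re-feed uses the `'rebuild'` command, both as described in the SQLite FTS5 documentation for external-content tables. The helper names, the two-column `chunks` schema, and the stricter "exactly three triggers" health rule are assumptions made for illustration. It also assumes the local SQLite build includes FTS5.

```python
import sqlite3
from typing import List

TRIGGERS = ("chunks_ai", "chunks_ad", "chunks_au")


def fts_state_inconsistent(conn: sqlite3.Connection) -> bool:
    """True when chunks_fts and its triggers are only partly present."""
    table_exists = bool(conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type='table' AND name='chunks_fts'"
    ).fetchall())
    trigger_count = conn.execute(
        "SELECT COUNT(*) FROM sqlite_master WHERE type='trigger' "
        "AND name IN ('chunks_ai','chunks_ad','chunks_au')"
    ).fetchall()[0][0]
    # Stricter than the diff: a partial trigger set also counts as broken.
    healthy = (table_exists and trigger_count == 3) or (
        not table_exists and trigger_count == 0)
    return not healthy


def create_fts_objects(conn: sqlite3.Connection) -> None:
    """Create the external-content FTS5 table and its three sync triggers."""
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS chunks_fts "
        "USING fts5(text, content='chunks', content_rowid='rowid')")
    conn.execute("""
        CREATE TRIGGER IF NOT EXISTS chunks_ai AFTER INSERT ON chunks BEGIN
            INSERT INTO chunks_fts(rowid, text) VALUES (new.rowid, new.text);
        END""")
    conn.execute("""
        CREATE TRIGGER IF NOT EXISTS chunks_ad AFTER DELETE ON chunks BEGIN
            INSERT INTO chunks_fts(chunks_fts, rowid, text)
            VALUES ('delete', old.rowid, old.text);
        END""")
    conn.execute("""
        CREATE TRIGGER IF NOT EXISTS chunks_au AFTER UPDATE ON chunks BEGIN
            INSERT INTO chunks_fts(chunks_fts, rowid, text)
            VALUES ('delete', old.rowid, old.text);
            INSERT INTO chunks_fts(rowid, text) VALUES (new.rowid, new.text);
        END""")


def heal_fts(conn: sqlite3.Connection) -> None:
    """Drop triggers first, then the table, recreate, and re-index chunks."""
    for name in TRIGGERS:
        conn.execute(f"DROP TRIGGER IF EXISTS {name}")
    conn.execute("DROP TABLE IF EXISTS chunks_fts")
    create_fts_objects(conn)
    # Re-read every row of the content table into the index.
    conn.execute("INSERT INTO chunks_fts(chunks_fts) VALUES ('rebuild')")
    conn.commit()


def search(conn: sqlite3.Connection, term: str) -> List[str]:
    rows = conn.execute(
        "SELECT text FROM chunks_fts WHERE chunks_fts MATCH ? ORDER BY rank",
        (term,),
    ).fetchall()
    return [row[0] for row in rows]
```

The `'delete'` command receives the old column values explicitly, so the index does not need to read a row that an AFTER DELETE trigger has already removed from the content table.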
--- a/agent/memory/summarizer.py
+++ b/agent/memory/summarizer.py
@@ -1,12 +1,12 @@
 """
-Memory flush manager
+Memory flush manager with Deep Dream distillation

 Handles memory persistence when conversation context is trimmed or overflows:
- Uses LLM to summarize discarded messages into concise key-information entries
+- Uses LLM to summarize discarded messages into concise daily records
 - Writes to daily memory files (lazy creation)
 - Deduplicates trim flushes to avoid repeated writes
 - Runs summarization asynchronously to avoid blocking normal replies
- Provides daily summary interface for scheduler
+- Deep Dream: periodically distills daily memories → refined MEMORY.md + dream diary
 """

 import threading
@@ -16,29 +16,79 @@ from datetime import datetime
 from common.log import logger


-SUMMARIZE_SYSTEM_PROMPT = """你是一个记忆提取助手。你的任务是从对话记录中提炼出值得长期记住的关键事件和核心信息。
+SUMMARIZE_SYSTEM_PROMPT = """你是一个对话记录助手。请将对话内容归纳为当天的日常记录。

-核心原则：
- 按「事件」维度归纳，而不是按对话轮次逐条记录
- 多轮对话如果围绕同一件事，合并为一条摘要
- 只记录有长期价值的信息，忽略闲聊、问候、无意义的短消息
+## 要求

-输出要求：
-1. 每条一行，用 "- " 开头，格式为：事件/主题 + 关键结论或结果
-2. 值得记录的信息类型：用户提出的需求及最终解决方案、重要的事实信息、用户的偏好或决策、关键技术方案或配置变更
-3. 不值得记录的信息：简单问候、闲聊、无实质内容的短消息、重复的中间过程
-4. 每条摘要应当简明扼要，一句话概括事件的核心内容和结果
-5. 直接输出摘要内容，不要加任何前缀说明
-6. 当对话没有任何记录价值（仅含问候或无意义内容），回复"无"
+按「事件」维度归纳发生的事，不要按对话轮次逐条记录：
+- 每条一行，用 "- " 开头
+- 合并同一件事的多轮对话
+- 只记录有意义的事件，忽略闲聊和问候
+- 保留关键的决策、结论和待办事项

-示例（仅供参考格式）：
- 用户配置了 XX 功能，设置参数为 YY，已生效
- 用户反馈了 XX 问题，原因是 YY，通过 ZZ 方式解决"""
+当对话没有任何记录价值（仅含问候或无意义内容），直接回复"无"。"""

-SUMMARIZE_USER_PROMPT = """请从以下对话记录中，按关键事件维度提炼记忆摘要（合并同一事件的多轮对话，不要逐条列出）：
+SUMMARIZE_USER_PROMPT = """请归纳以下对话的日常记录：

 {conversation}"""

+# ---------------------------------------------------------------------------
+# Deep Dream prompts — distill daily memories → MEMORY.md + dream diary
+# ---------------------------------------------------------------------------
+
+DREAM_SYSTEM_PROMPT = """你是一个记忆整理助手，负责定期整理用户的长期记忆。
+
+你将收到两份材料：
+1. **当前长期记忆** — MEMORY.md 的全部现有内容
+2. **今日日记** — 当天的日常记录
+
+MEMORY.md 会注入每次对话的系统提示词中，因此必须保持精炼，只存放有价值和值得记忆的内容。
+
+**重要：只能基于提供的材料进行整理，严禁编造、推测或添加材料中不存在的信息。**
+
+## 任务
+
+### Part 1: 更新后的长期记忆（[MEMORY]）
+
+在现有记忆基础上进行整理和提炼，输出完整的更新后内容：
+- **合并提炼**：将含义相近的多条合并为一条高密度表述，而非简单罗列
+- **新增萃取**：从今日日记中提取值得永久记住的新信息（偏好、决策、人物、规则、经验）
+- **冲突更新**：当新信息与旧条目矛盾时，以新信息为准，替换旧条目
+- **清理无效**：删除临时性记录、空白条目、格式残留、无意义、重复内容等
+- **删除冗余**：已被更精炼表述涵盖的旧条目应删除，避免信息重复
+- 每条一行，用 "- " 开头，不带日期前缀
+- 可用 "## 标题" 对相关条目分组，使结构更清晰
+- 目标：控制在 50 条以内，每条尽量一句话概括
+
+### Part 2: 梦境日记（[DREAM]）
+
+用简洁的叙事风格写一篇短日记，记录这次整理的发现，保持格式美观易读：
+- 发现了哪些重复或矛盾
+- 从日记中提取了什么新洞察
+- 做了哪些清理和优化
+- 整体感受和观察
+
+## 输出格式（严格遵守）
+
+```
+[MEMORY]
+- 记忆条目1
+- 记忆条目2
+...
+
+[DREAM]
+梦境日记内容...
+```"""
+
+DREAM_USER_PROMPT = """## 当前长期记忆（MEMORY.md）
+
+{memory_content}
+
+## 近期日记（最近 {days} 天）
+
+{daily_content}"""
+
+

 class MemoryFlushManager:
    """
@@ -65,6 +115,8 @@ class MemoryFlushManager:
        self.last_flush_timestamp: Optional[datetime] = None
        self._trim_flushed_hashes: set = set()  # Content hashes of already-flushed messages
        self._last_flushed_content_hash: str = ""  # Content hash at last flush, for daily dedup
+        self._last_dream_input_hash: str = ""  # "{date}:{daily_hash}" of last dream, for dedup
+        self._last_flush_thread: Optional[threading.Thread] = None
    
    def get_today_memory_file(self, user_id: Optional[str] = None, ensure_exists: bool = False) -> Path:
        """Get today's memory file path: memory/YYYY-MM-DD.md"""
@@ -108,23 +160,30 @@ class MemoryFlushManager:
        user_id: Optional[str] = None,
        reason: str = "trim",
        max_messages: int = 0,
+        context_summary_callback: Optional[Callable[[str], None]] = None,
    ) -> bool:
        """
        Asynchronously summarize and flush messages to daily memory.
-        
+
        Deduplication runs synchronously, then LLM summarization + file write
        run in a background thread so the main reply flow is never blocked.
-        
-        Args:
-            messages: Conversation message list (OpenAI/Claude format)
-            user_id: Optional user ID for user-scoped memory
-            reason: Why flush was triggered ("trim" | "overflow" | "daily_summary")
-            max_messages: Max recent messages to summarize (0 = all)
-        
-        Returns:
-            True if flush was dispatched
+
+        If *context_summary_callback* is provided, it is called with the
+        [DAILY] portion of the LLM summary once available. The caller can use
+        this to inject the summary into the live message list for context
+        continuity — one LLM call serves both disk persistence and in-context
+        injection.
        """
        try:
+            # Strip scheduler-injected pairs before any further processing.
+            # These messages already serve as short-term context inside the
+            # receiver session; promoting them into long-term daily memory
+            # produces low-value flat logs (e.g. "11:28 price=1013, normal /
+            # 11:58 price=1013, normal / ...") and wastes summarization tokens.
+            messages = self._strip_scheduler_pairs(messages)
+            if not messages:
+                return False
+
            import hashlib
            deduped = []
            for m in messages:
@@ -137,18 +196,19 @@ class MemoryFlushManager:
                    deduped.append(m)
            if not deduped:
                return False
-            
+
            import copy
            snapshot = copy.deepcopy(deduped)
            thread = threading.Thread(
                target=self._flush_worker,
-                args=(snapshot, user_id, reason, max_messages),
+                args=(snapshot, user_id, reason, max_messages, context_summary_callback),
                daemon=True,
            )
            thread.start()
            logger.info(f"[MemoryFlush] Async flush dispatched (reason={reason}, msgs={len(snapshot)})")
+            self._last_flush_thread = thread
            return True
-            
+
        except Exception as e:
            logger.warning(f"[MemoryFlush] Failed to dispatch flush (reason={reason}): {e}")
            return False
@@ -159,41 +219,69 @@ class MemoryFlushManager:
        user_id: Optional[str],
        reason: str,
        max_messages: int,
+        context_summary_callback: Optional[Callable[[str], None]] = None,
    ):
-        """Background worker: summarize with LLM and write to daily file."""
+        """Background worker: summarize with LLM, write daily memory file."""
        try:
-            summary = self._summarize_messages(messages, max_messages)
-            if not summary or not summary.strip() or summary.strip() == "无":
+            raw_summary = self._summarize_messages(messages, max_messages)
+            if not raw_summary or not raw_summary.strip() or raw_summary.strip() == "无":
                logger.info(f"[MemoryFlush] No valuable content to flush (reason={reason})")
                return
-            
+
+            # Strip legacy [DAILY]/[MEMORY] markers if the model still emits them
+            daily_part = self._clean_summary_output(raw_summary)
+            if not daily_part:
+                return
+
+            # --- Write daily memory ---
            daily_file = ensure_daily_memory_file(self.workspace_dir, user_id)
-            
-            if reason == "overflow":
-                header = f"## Context Overflow Recovery ({datetime.now().strftime('%H:%M')})"
-                note = "The following conversation was trimmed due to context overflow:\n"
-            elif reason == "trim":
-                header = f"## Trimmed Context ({datetime.now().strftime('%H:%M')})"
-                note = ""
-            elif reason == "daily_summary":
-                header = f"## Daily Summary ({datetime.now().strftime('%H:%M')})"
-                note = ""
-            else:
-                header = f"## Session Notes ({datetime.now().strftime('%H:%M')})"
-                note = ""
-            
-            flush_entry = f"\n{header}\n\n{note}{summary}\n"
-            
+
+            headers = {
+                "overflow": f"## Context Overflow Recovery ({datetime.now().strftime('%H:%M')})",
+                "trim": f"## Trimmed Context ({datetime.now().strftime('%H:%M')})",
+                "daily_summary": f"## Daily Summary ({datetime.now().strftime('%H:%M')})",
+            }
+            header = headers.get(reason, f"## Session Notes ({datetime.now().strftime('%H:%M')})")
+
            with open(daily_file, "a", encoding="utf-8") as f:
-                f.write(flush_entry)
-            
+                f.write(f"\n{header}\n\n{daily_part}\n")
+
+            logger.info(f"[MemoryFlush] Wrote daily memory to {daily_file.name} (reason={reason}, chars={len(daily_part)})")
+
+            # --- Inject context summary into live messages (if callback provided) ---
+            if context_summary_callback:
+                try:
+                    context_summary_callback(daily_part)
+                except Exception as e:
+                    logger.warning(f"[MemoryFlush] Context summary callback failed: {e}")
+
            self.last_flush_timestamp = datetime.now()
-            
-            logger.info(f"[MemoryFlush] Wrote to {daily_file.name} (reason={reason}, chars={len(summary)})")
-            
+
        except Exception as e:
            logger.warning(f"[MemoryFlush] Async flush failed (reason={reason}): {e}")
-    
+
+    @staticmethod
+    def _clean_summary_output(raw: str) -> str:
+        """Strip legacy [DAILY]/[MEMORY] markers if present, return clean daily text."""
+        raw = raw.strip()
+        if not raw or raw == "无":
+            return ""
+
+        # Strip [DAILY] marker
+        if "[DAILY]" in raw:
+            start = raw.index("[DAILY]") + len("[DAILY]")
+            end = raw.index("[MEMORY]") if "[MEMORY]" in raw else len(raw)
+            raw = raw[start:end].strip()
+
+        # Remove stray [MEMORY] section entirely
+        if "[MEMORY]" in raw:
+            raw = raw[:raw.index("[MEMORY]")].strip()
+
+        # Remove markdown code fences
+        raw = raw.replace("```", "").strip()
+
+        return raw
+
    def create_daily_summary(
        self,
        messages: List[Dict],
@@ -219,12 +307,192 @@ class MemoryFlushManager:
            reason="daily_summary",
            max_messages=0,
        )
-    
+
+    # ---- Deep Dream (memory distillation) ----
+
+    def deep_dream(self, user_id: Optional[str] = None, lookback_days: int = 1, force: bool = False) -> bool:
+        """
+        Distill recent daily memories into MEMORY.md and generate a dream diary.
+
+        Args:
+            lookback_days: How many days of daily files to read (default 1 for scheduled, 3 for manual)
+            force: Skip input-hash dedup check (used by manual /memory dream trigger)
+        """
+        if not self.llm_model:
+            logger.warning("[DeepDream] No LLM model available, skipping")
+            return False
+
+        logger.info(f"[DeepDream] Starting memory distillation (lookback={lookback_days} days)")
+
+        # Collect materials
+        memory_content = self._read_main_memory(user_id)
+        daily_content, has_content = self._read_recent_dailies(user_id, lookback_days)
+
+        if not has_content:
+            logger.info("[DeepDream] No recent daily records, skipping to preserve existing MEMORY.md")
+            return False
+
+        # Dedup: skip if same daily content already dreamed today.
+        # Note: only hash daily_content (not memory_content), because deep_dream
+        # itself rewrites MEMORY.md as a side effect, which would otherwise
+        # invalidate the hash on every subsequent call within the same window.
+        import hashlib
+        daily_hash = hashlib.md5(daily_content.encode("utf-8")).hexdigest()
+        today_str = datetime.now().strftime("%Y-%m-%d")
+        dedup_key = f"{today_str}:{daily_hash}"
+        if not force and dedup_key == self._last_dream_input_hash:
+            logger.info("[DeepDream] Already dreamed today with same daily content, skipping")
+            return False
+        self._last_dream_input_hash = dedup_key
+
+        logger.info(
+            f"[DeepDream] Materials collected: "
+            f"MEMORY.md={len(memory_content)} chars, "
+            f"daily={len(daily_content)} chars"
+        )
+
+        # Call LLM for distillation
+        import time as _time
+        t0 = _time.monotonic()
+        try:
+            user_msg = DREAM_USER_PROMPT.format(
+                memory_content=memory_content or "(empty)",
+                days=lookback_days,
+                daily_content=daily_content or "(no recent daily records)",
+            )
+            from agent.protocol.models import LLMRequest
+            # Scale max_tokens based on input size to avoid truncating large MEMORY.md
+            input_chars = len(memory_content) + len(daily_content)
+            dream_max_tokens = max(2000, min(input_chars, 8000))
+            request = LLMRequest(
+                messages=[{"role": "user", "content": user_msg}],
+                temperature=0.3,
+                max_tokens=dream_max_tokens,
+                stream=False,
+                system=DREAM_SYSTEM_PROMPT,
+            )
+            response = self.llm_model.call(request)
+            raw = self._extract_response_text(response)
+            elapsed = _time.monotonic() - t0
+            if not raw or not raw.strip():
+                logger.warning(f"[DeepDream] LLM returned empty response ({elapsed:.1f}s)")
+                return False
+            logger.info(f"[DeepDream] LLM distillation completed ({elapsed:.1f}s, {len(raw)} chars)")
+        except Exception as e:
+            elapsed = _time.monotonic() - t0
+            logger.warning(f"[DeepDream] LLM call failed ({elapsed:.1f}s): {e}")
+            return False
+
+        # Parse [MEMORY] and [DREAM] sections
+        new_memory, dream_diary = self._parse_dream_output(raw)
+
+        if not new_memory:
+            logger.warning("[DeepDream] No [MEMORY] section in LLM output, skipping overwrite")
+            return False
+
+        # Overwrite MEMORY.md
+        try:
+            main_file = self.get_main_memory_file(user_id)
+            old_size = len(memory_content)
+            main_file.write_text(new_memory + "\n", encoding="utf-8")
+            logger.info(
+                f"[DeepDream] Updated MEMORY.md "
+                f"({old_size} → {len(new_memory)} chars)"
+            )
+        except Exception as e:
+            logger.warning(f"[DeepDream] Failed to write MEMORY.md: {e}")
+            return False
+
+        # Write dream diary
+        if dream_diary:
+            try:
+                self._write_dream_diary(dream_diary, user_id)
+            except Exception as e:
+                logger.warning(f"[DeepDream] Failed to write dream diary: {e}")
+
+        logger.info("[DeepDream] ✅ Deep Dream completed successfully")
+        return True
+
+    def _read_main_memory(self, user_id: Optional[str] = None) -> str:
+        """Read current MEMORY.md content."""
+        main_file = self.get_main_memory_file(user_id)
+        if main_file.exists():
+            return main_file.read_text(encoding="utf-8").strip()
+        return ""
+
+    def _read_recent_dailies(
+        self, user_id: Optional[str] = None, lookback_days: int = 1
+    ) -> tuple:
+        """
+        Read recent daily memory files.
+
+        Returns:
+            (combined_text, has_content) tuple
+        """
+        from datetime import timedelta
+
+        parts = []
+        has_content = False
+        today = datetime.now().date()
+
+        for offset in range(lookback_days):
+            day = today - timedelta(days=offset)
+            date_str = day.strftime("%Y-%m-%d")
+            if user_id:
+                daily_file = self.memory_dir / "users" / user_id / f"{date_str}.md"
+            else:
+                daily_file = self.memory_dir / f"{date_str}.md"
+
+            if daily_file.exists():
+                content = daily_file.read_text(encoding="utf-8").strip()
+                if content:
+                    parts.append(f"### {date_str}\n\n{content}")
+                    has_content = True
+            else:
+                parts.append(f"### {date_str}\n\n(no records)")
+
+        return "\n\n".join(parts), has_content
+
+    @staticmethod
+    def _parse_dream_output(raw: str) -> tuple:
+        """Parse LLM output into (new_memory, dream_diary)."""
+        raw = raw.strip().replace("```", "")
+        new_memory = ""
+        dream_diary = ""
+
+        if "[MEMORY]" in raw:
+            start = raw.index("[MEMORY]") + len("[MEMORY]")
+            end = raw.index("[DREAM]") if "[DREAM]" in raw else len(raw)
+            new_memory = raw[start:end].strip()
+
+        if "[DREAM]" in raw:
+            start = raw.index("[DREAM]") + len("[DREAM]")
+            dream_diary = raw[start:].strip()
+
+        return new_memory, dream_diary
+
+    def _write_dream_diary(self, content: str, user_id: Optional[str] = None):
+        """Write dream diary to memory/dreams/YYYY-MM-DD.md."""
+        dreams_dir = self.memory_dir / "dreams"
+        if user_id:
+            dreams_dir = self.memory_dir / "users" / user_id / "dreams"
+        dreams_dir.mkdir(parents=True, exist_ok=True)
+
+        today = datetime.now().strftime("%Y-%m-%d")
+        diary_file = dreams_dir / f"{today}.md"
+        diary_file.write_text(
+            f"# Dream Diary: {today}\n\n{content}\n",
+            encoding="utf-8",
+        )
+        logger.info(f"[DeepDream] Wrote dream diary to {diary_file}")
+
    # ---- Internal helpers ----
    
    def _summarize_messages(self, messages: List[Dict], max_messages: int = 0) -> str:
        """
-        Summarize conversation messages using LLM, with rule-based fallback.
+        Summarize conversation messages using LLM.
+        Returns empty string if LLM deems content not worth recording.
+        Rule-based fallback only used when LLM call raises an exception.
        """
        conversation_text = self._format_conversation_for_summary(messages, max_messages)
        if not conversation_text.strip():
@@ -235,13 +503,14 @@ class MemoryFlushManager:
                summary = self._call_llm_for_summary(conversation_text)
                if summary and summary.strip() and summary.strip() != "无":
                    return summary.strip()
-                logger.info(f"[MemoryFlush] LLM returned empty or '无', using fallback")
+                logger.info("[MemoryFlush] LLM returned empty or '无', skipping write")
+                return ""
            except Exception as e:
                logger.warning(f"[MemoryFlush] LLM summarization failed, using fallback: {e}")
+                return self._extract_summary_fallback(messages, max_messages)
        else:
            logger.info("[MemoryFlush] No LLM model available, using rule-based fallback")
-        
-        return self._extract_summary_fallback(messages, max_messages)
+            return self._extract_summary_fallback(messages, max_messages)

    def _format_conversation_for_summary(self, messages: List[Dict], max_messages: int = 0) -> str:
        """Format messages into readable conversation text for LLM summarization."""
@@ -259,6 +528,52 @@ class MemoryFlushManager:
                lines.append(f"助手: {text[:500]}")
        return "\n".join(lines)

+    @staticmethod
+    def _extract_response_text(response) -> str:
+        """
+        Extract text from LLM response regardless of format.
+
+        Handles:
+        - Generator (MiniMax _handle_sync_response yields Claude-format dicts)
+        - Claude format: {"role":"assistant","content":[{"type":"text","text":"..."}]}
+        - OpenAI format: {"choices":[{"message":{"content":"..."}}]}
+        - OpenAI SDK response object with .choices attribute
+        """
+        import types
+
+        # Unwrap generator — consume first yielded item
+        if isinstance(response, types.GeneratorType):
+            try:
+                response = next(response)
+            except StopIteration:
+                return ""
+
+        if not response:
+            return ""
+
+        if isinstance(response, dict):
+            # Check for error
+            if response.get("error"):
+                raise RuntimeError(response.get("message", "LLM call failed"))
+
+            # Claude format: content is a list of blocks
+            content = response.get("content")
+            if isinstance(content, list):
+                for block in content:
+                    if isinstance(block, dict) and block.get("type") == "text":
+                        return block.get("text", "")
+
+            # OpenAI format
+            choices = response.get("choices", [])
+            if choices:
+                return choices[0].get("message", {}).get("content", "")
+
+        # OpenAI SDK response object
+        if hasattr(response, "choices") and response.choices:
+            return response.choices[0].message.content or ""
+
+        return ""
+
    def _call_llm_for_summary(self, conversation_text: str) -> str:
        """Call LLM to generate a concise summary of the conversation."""
        from agent.protocol.models import LLMRequest
@@ -272,27 +587,31 @@ class MemoryFlushManager:
        )
        
        response = self.llm_model.call(request)
-        
-        if isinstance(response, dict):
-            if response.get("error"):
-                raise RuntimeError(response.get("message", "LLM call failed"))
-            # OpenAI format
-            choices = response.get("choices", [])
-            if choices:
-                return choices[0].get("message", {}).get("content", "")
-        
-        # Handle response object with attribute access (e.g. OpenAI SDK response)
-        if hasattr(response, "choices") and response.choices:
-            return response.choices[0].message.content or ""
-        
-        return ""
+        return self._extract_response_text(response)
+
+    @staticmethod
+    def _extract_first_meaningful_line(text: str, max_len: int = 120) -> str:
+        """Extract the first meaningful line from assistant reply, skipping markdown noise."""
+        import re
+        for line in text.split("\n"):
+            line = line.strip()
+            if not line:
+                continue
+            # Skip markdown headings, horizontal rules, code fences, pure emoji/symbols
+            if re.match(r'^(#{1,4}\s|```|---|\*\*\*|[-*]\s*$|[^\w\u4e00-\u9fff]{1,5}$)', line):
+                continue
+            # Strip leading markdown bold/emoji decorations
+            cleaned = re.sub(r'^[\*#>\-\s]+', '', line).strip()
+            cleaned = re.sub(r'^[\U0001f300-\U0001f9ff\u2600-\u27bf\s]+', '', cleaned).strip()
+            if len(cleaned) >= 5:
+                return cleaned[:max_len]
+        return text.split("\n")[0].strip()[:max_len]

    @staticmethod
    def _extract_summary_fallback(messages: List[Dict], max_messages: int = 0) -> str:
        """
-        Rule-based fallback when LLM is unavailable.
-        Groups consecutive user+assistant messages into events instead of
-        listing each message individually.
+        Rule-based summary of discarded messages.
+        Format: "- 用户: X → 回复: Y" per event, compact and readable (max 10 events).
        """
        msgs = messages if max_messages == 0 else messages[-max_messages * 2:]

@@ -306,19 +625,19 @@ class MemoryFlushManager:
            text = text.strip()

            if role == "user":
-                if len(text) <= 5:
+                if len(text) <= 3:
                    continue
-                current_user_text = text[:150]
+                current_user_text = text[:120]
            elif role == "assistant" and current_user_text:
-                first_line = text.split("\n")[0].strip()
-                if len(first_line) > 10:
-                    events.append(f"- {current_user_text} → {first_line[:150]}")
+                reply_summary = MemoryFlushManager._extract_first_meaningful_line(text)
+                if reply_summary:
+                    events.append(f"- 用户: {current_user_text} → 回复: {reply_summary}")
                else:
-                    events.append(f"- {current_user_text}")
+                    events.append(f"- 用户: {current_user_text}")
                current_user_text = ""

        if current_user_text:
-            events.append(f"- {current_user_text}")
+            events.append(f"- 用户: {current_user_text}")

        return "\n".join(events[:10])
    
@@ -337,6 +656,40 @@ class MemoryFlushManager:
            return "\n".join(parts)
        return ""

+    @classmethod
+    def _strip_scheduler_pairs(cls, messages: List[Dict]) -> List[Dict]:
+        """Drop scheduler-injected user/assistant pairs from a flush batch.
+
+        A scheduler user message starts with the ``[SCHEDULED]`` marker
+        (written by ``AgentBridge.remember_scheduled_output``); the message
+        immediately following it (if it is an assistant turn) is its paired
+        output and is dropped together. Regular user/assistant turns and
+        any tool_use / tool_result blocks are preserved as-is.
+        """
+        if not messages:
+            return messages
+
+        SCHEDULED_PREFIX = "[SCHEDULED]"
+        result = []
+        skip_next_assistant = False
+        for msg in messages:
+            if not isinstance(msg, dict):
+                result.append(msg)
+                skip_next_assistant = False
+                continue
+            role = msg.get("role")
+            if skip_next_assistant and role == "assistant":
+                skip_next_assistant = False
+                continue
+            skip_next_assistant = False
+            if role == "user":
+                text = cls._extract_text_from_content(msg.get("content", ""))
+                if text.lstrip().startswith(SCHEDULED_PREFIX):
+                    skip_next_assistant = True
+                    continue
+            result.append(msg)
+        return result
+

 def create_memory_files_if_needed(workspace_dir: Path, user_id: Optional[str] = None):
    """
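
Review note on `_strip_scheduler_pairs`: the pairing rule can be checked in isolation. The following is a minimal standalone sketch, not the class method itself. It assumes message content is already a plain string (the real method normalizes content through `_extract_text_from_content` first), and the function name `strip_scheduler_pairs` is a stand-in chosen for this example.

```python
from typing import Dict, List

SCHEDULED_PREFIX = "[SCHEDULED]"


def strip_scheduler_pairs(messages: List[Dict]) -> List[Dict]:
    """Drop each [SCHEDULED] user message and the assistant turn right after it."""
    result = []
    skip_next_assistant = False
    for msg in messages:
        role = msg.get("role")
        # The assistant turn directly after a scheduled prompt is its paired output.
        if skip_next_assistant and role == "assistant":
            skip_next_assistant = False
            continue
        skip_next_assistant = False
        # Assumption: content is a plain string in this sketch.
        text = msg.get("content", "")
        if role == "user" and isinstance(text, str) and text.lstrip().startswith(SCHEDULED_PREFIX):
            skip_next_assistant = True
            continue
        result.append(msg)
    return result


history = [
    {"role": "user", "content": "What's the weather?"},
    {"role": "assistant", "content": "Sunny."},
    {"role": "user", "content": "[SCHEDULED] check price"},
    {"role": "assistant", "content": "11:28 price=1013, normal"},
    {"role": "user", "content": "Thanks"},
]
kept = strip_scheduler_pairs(history)
print([m["content"] for m in kept])
```

Only the assistant turn immediately following a scheduled prompt is dropped. If a regular user message comes next instead, the skip flag is cleared and later turns are kept.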
--- a/agent/prompt/builder.py
+++ b/agent/prompt/builder.py
@@ -10,6 +10,7 @@ from typing import List, Dict, Optional, Any
 from dataclasses import dataclass

 from common.log import logger
+from config import conf


@dataclass
@@ -92,10 +93,11 @@ def build_agent_system_prompt(
    顺序说明（按重要性和逻辑关系排列）:
    1. 工具系统 - 核心能力，最先介绍
    2. 技能系统 - 紧跟工具，因为技能需要用 read 工具读取
-    3. 记忆系统 - 独立的记忆能力
+    3. 记忆系统 - 记忆检索与写入引导
+    3.5 知识系统 - 结构化知识库（knowledge/index.md 注入）
    4. 工作空间 - 工作环境说明
    5. 用户身份 - 用户信息（可选）
-    6. 项目上下文 - AGENT.md, USER.md, RULE.md, BOOTSTRAP.md（定义人格、身份、规则、初始化引导）
+    6. 项目上下文 - AGENT.md, USER.md, RULE.md, MEMORY.md, BOOTSTRAP.md
    7. 运行时信息 - 元信息（时间、模型等）
    
    Args:
@@ -126,6 +128,10 @@ def build_agent_system_prompt(
    # 3. 记忆系统（独立的记忆能力）
    if memory_manager:
        sections.extend(_build_memory_section(memory_manager, tools, language))
+
+    # 3.5 知识系统（结构化知识库）
+    if conf().get("knowledge", True):
+        sections.extend(_build_knowledge_section(workspace_dir, language))
    
    # 4. 工作空间（工作环境说明）
    sections.extend(_build_workspace_section(workspace_dir, language))
@@ -268,55 +274,105 @@ def _build_memory_section(memory_manager: Any, tools: Optional[List[Any]], langu
    """构建记忆系统section"""
    if not memory_manager:
        return []
-    
-    # 检查是否有memory工具
+
    has_memory_tools = False
    if tools:
        tool_names = [tool.name if hasattr(tool, 'name') else str(tool) for tool in tools]
        has_memory_tools = any(name in ['memory_search', 'memory_get'] for name in tool_names)
-    
+
    if not has_memory_tools:
        return []
-    
+
    from datetime import datetime
    today_file = datetime.now().strftime("%Y-%m-%d") + ".md"
-    
+
    lines = [
        "## 🧠 记忆系统",
        "",
-        "### 检索记忆",
+        "### Memory Recall（mandatory）",
        "",
-        "在回答关于以前的工作、决定、日期、人物、偏好或待办事项的任何问题之前：",
+        "当用户询问过往事件、引用之前的决定、提到人物关系、偏好、待办、或你对某事不确定时，**必须先检索记忆再回答**。",
+        "如果 MEMORY.md 中已有相关信息则无需重复检索。完整内容和每日记忆需要通过工具检索。",
        "",
-        "1. 不确定记忆文件位置 → 先用 `memory_search` 通过关键词和语义检索相关内容",
-        "2. 已知文件位置 → 直接用 `memory_get` 读取相应的行 (例如：MEMORY.md, memory/YYYY-MM-DD.md)",
-        "3. search 无结果 → 尝试用 `memory_get` 读取MEMORY.md及最近两天记忆文件",
+        "1. 不确定位置 → `memory_search` 关键词/语义检索",
+        "2. 已知位置 → `memory_get` 直接读取对应行",
+        "3. search 无结果 → `memory_get` 读最近两天记忆",
        "",
        "**记忆文件结构**:",
-        f"- `MEMORY.md`: 长期记忆（核心信息、偏好、决策等）",
+        "- `MEMORY.md`: 长期记忆索引（已自动加载到上下文，核心信息、偏好、决策等）",
        f"- `memory/YYYY-MM-DD.md`: 每日记忆，今天是 `memory/{today_file}`",
+        "- `knowledge/`: 结构化知识库（见下方知识系统）",
        "",
        "### 写入记忆",
        "",
-        "**主动存储**：遇到以下情况时，应主动将信息写入记忆文件（无需告知用户）：",
+        "遇到以下情况时，**主动**将信息写入记忆文件（无需告知用户）：",
        "",
-        "- 用户明确要求你记住某些信息",
+        "- 用户要求记住某些信息，或使用了「记住」「以后」「总是」「不要」「偏好」等表达",
        "- 用户分享了重要的个人偏好、习惯、决策",
        "- 对话中产生了重要的结论、方案、约定",
        "- 完成了复杂任务，值得记录关键步骤和结果",
-        "- 发现了用户经常遇到的问题或解决方案",
        "",
        "**存储规则**:",
-        f"- 长期有效的核心信息 → `MEMORY.md`（文件保持精简，< 2000 tokens）",
-        f"- 当天的事件、进展、笔记 → `memory/{today_file}`",
-        "- 追加内容 → `edit` 工具，oldText 留空",
-        "- 修改内容 → `edit` 工具，oldText 填写要替换的文本",
-        "- **禁止写入敏感信息**：API密钥、令牌等敏感信息严禁写入记忆文件",
+        f"- 长期核心信息 → `MEMORY.md`",
+        f"- 当天事件/进展 → `memory/{today_file}`",
+        "- 结构化知识 → `knowledge/`（见知识系统）",
+        "- 追加 → `edit` 工具，oldText 留空",
+        "- 修改 → `edit` 工具，oldText 填写要替换的文本",
+        "- **禁止写入敏感信息**（API密钥、令牌等）",
        "",
        "**使用原则**: 自然使用记忆，就像你本来就知道；不用刻意提起，除非用户问起。",
        "",
    ]
-    
+
+    return lines
+
+
+def _build_knowledge_section(workspace_dir: str, language: str) -> List[str]:
+    """Build knowledge wiki section. Injects knowledge/index.md when present."""
+    index_path = os.path.join(workspace_dir, "knowledge", "index.md")
+    if not os.path.exists(index_path):
+        return []
+
+    try:
+        with open(index_path, 'r', encoding='utf-8') as f:
+            index_content = f.read().strip()
+    except Exception:
+        return []
+
+    lines = [
+        "## 📚 知识系统",
+        "",
+        "你拥有一个持续积累的个人知识库 `knowledge/`，这是你的长期结构化知识存储。",
+        "",
+        "### 自动写入规则（mandatory）",
+        "",
+        "以下场景**必须**在回复的同时写入知识库，**直接写入，不要询问用户是否需要**：",
+        "",
+        "1. **用户分享了文章/链接/文档** → 阅读理解后，在同一轮回复中将要点写入 `knowledge/sources/<slug>.md`",
+        "2. **深度讨论产生了结论/方案** → 整理为 `knowledge/analysis/<slug>.md`",
+        "3. **对话涉及重要实体**（人物/公司/项目）→ 创建或更新 `knowledge/entities/<name>.md`",
+        "4. **讨论了技术概念/方法论** → 整理为 `knowledge/concepts/<topic>.md`",
+        "",
+        "每次写入知识页面后，**必须同步更新** `knowledge/index.md` 添加一行索引。",
+        "详细的页面格式和操作规范，请读取技能 `knowledge-wiki` 的 SKILL.md。",
+        "",
+        "⚠️ 不要问「要不要存到知识库」——符合上述场景就直接写入，这是你的本能行为。",
+        "",
+    ]
+
+    if index_content:
+        lines.extend([
+            "### 当前知识索引",
+            "",
+            index_content,
+            "",
+        ])
+
+    lines.extend([
+        "**查询方式**：用 `read` 读取知识页面，或用 `memory_search` 检索（知识已纳入向量索引）。",
+        "",
+    ])
+
    return lines


@@ -375,11 +431,12 @@ def _build_workspace_section(workspace_dir: str, language: str) -> List[str]:
        "",
        "**重要说明 - 文件已自动加载**:",
        "",
-        "以下文件在会话启动时**已经自动加载**到系统提示词的「项目上下文」section 中，你**无需再用 read 工具读取它们**：",
+        "以下文件在会话启动时**已经自动加载**到系统提示词中，你**无需再用 read 工具读取**：",
        "",
        "- ✅ `AGENT.md`: 已加载 - 你的人格和灵魂设定，请严格遵循。当你的名字、性格或交流风格发生变化时，主动用 `edit` 更新此文件",
        "- ✅ `USER.md`: 已加载 - 用户的身份信息。当用户修改称呼、姓名等身份信息时，用 `edit` 更新此文件",
        "- ✅ `RULE.md`: 已加载 - 工作空间使用指南和规则，请严格遵循",
+        "- ✅ `MEMORY.md`: 已加载 - 长期记忆索引",
        "",
        "**💬 交流规范**:",
        "",
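
Review note: with `MEMORY.md` now loaded into the system prompt, its size has to be bounded before injection. The sketch below shows one way to keep only the tail of a file under a line cap and a byte cap. The function name `truncate_tail`, its parameters, and the hint text are assumptions made for this example. The default caps mirror the 200-line / 25000-byte limits that appear later in this PR.

```python
def truncate_tail(content: str, max_lines: int = 200, max_bytes: int = 25000) -> str:
    """Keep the newest (bottom) lines within both a line cap and a UTF-8 byte cap."""
    lines = content.split("\n")
    truncated = False

    # Line cap: keep only the last max_lines lines.
    if len(lines) > max_lines:
        lines = lines[-max_lines:]
        truncated = True

    result = "\n".join(lines)
    # Byte cap: drop the oldest line and re-measure until the text fits.
    while lines and len(result.encode("utf-8")) > max_bytes:
        lines.pop(0)
        truncated = True
        result = "\n".join(lines)

    if truncated:
        # Hypothetical hint text; tells the model that older entries exist.
        result = "...(older entries truncated)\n\n" + result
    return result


many_lines = "\n".join(f"- entry {i}" for i in range(300))
long_lines = "\n".join("x" * 1000 for _ in range(50))
print(truncate_tail(many_lines).split("\n")[2])
print(len(truncate_tail(long_lines).split("\n\n", 1)[1].encode("utf-8")))
```

The sketch recomputes the joined text after every dropped line, so the loop stops as soon as the byte cap is met instead of draining the whole list.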
--- a/agent/prompt/workspace.py
+++ b/agent/prompt/workspace.py
@@ -67,6 +67,12 @@ def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> Works
    # 创建websites子目录 (for web pages / sites generated by agent)
    websites_dir = os.path.join(workspace_dir, "websites")
    os.makedirs(websites_dir, exist_ok=True)
+
+    from config import conf
+    knowledge_enabled = conf().get("knowledge", True)
+    if knowledge_enabled:
+        knowledge_dir = os.path.join(workspace_dir, "knowledge")
+        os.makedirs(knowledge_dir, exist_ok=True)
    
    # 如果需要，创建模板文件
    if create_templates:
@@ -74,6 +80,15 @@ def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> Works
        _create_template_if_missing(user_path, _get_user_template())
        _create_template_if_missing(rule_path, _get_rule_template())
        _create_template_if_missing(memory_path, _get_memory_template())
+        if knowledge_enabled:
+            _create_template_if_missing(
+                os.path.join(knowledge_dir, "index.md"),
+                _get_knowledge_index_template()
+            )
+            _create_template_if_missing(
+                os.path.join(knowledge_dir, "log.md"),
+                _get_knowledge_log_template()
+            )
        
        # Only create BOOTSTRAP.md for brand new workspaces;
        # agent deletes it after completing onboarding
@@ -109,6 +124,7 @@ def load_context_files(workspace_dir: str, files_to_load: Optional[List[str]] =
            DEFAULT_AGENT_FILENAME,
            DEFAULT_USER_FILENAME,
            DEFAULT_RULE_FILENAME,
+            DEFAULT_MEMORY_FILENAME,     # Long-term memory (frozen snapshot)
            DEFAULT_BOOTSTRAP_FILENAME,  # Only exists when onboarding is incomplete
        ]
    
@@ -138,6 +154,10 @@ def load_context_files(workspace_dir: str, files_to_load: Optional[List[str]] =
            # 跳过空文件或只包含模板占位符的文件
            if not content or _is_template_placeholder(content):
                continue
+
+            # Truncate MEMORY.md to protect context window (frozen snapshot)
+            if filename == DEFAULT_MEMORY_FILENAME:
+                content = _truncate_memory_content(content)
            
            context_files.append(ContextFile(
                path=filename,
@@ -163,6 +183,36 @@ def _create_template_if_missing(filepath: str, template_content: str):
            logger.error(f"[Workspace] Failed to create template {filepath}: {e}")


+_MEMORY_MAX_LINES = 200
+_MEMORY_MAX_BYTES = 25000
+
+
+def _truncate_memory_content(content: str) -> str:
+    """Truncate MEMORY.md to keep system prompt manageable.
+
+    Keeps the **last** lines (newest entries are appended at the bottom):
+    at most 200 lines, then at most 25,000 UTF-8 bytes; both caps apply.
+    Prepends a hint when truncated so the model knows older content exists.
+    """
+    lines = content.split('\n')
+    truncated = False
+
+    if len(lines) > _MEMORY_MAX_LINES:
+        lines = lines[-_MEMORY_MAX_LINES:]
+        truncated = True
+
+    result = '\n'.join(lines)
+    if len(result.encode('utf-8')) > _MEMORY_MAX_BYTES:
+        while len(result.encode('utf-8')) > _MEMORY_MAX_BYTES and lines:
+            lines.pop(0)
+            truncated = True
+            result = '\n'.join(lines)  # re-join inside the loop so the size check sees the shrunken text
+
+    if truncated:
+        result = "...(older entries truncated, use `memory_search` or `memory_get` for full content)\n\n" + result
+    return result
+
+
 def _is_template_placeholder(content: str) -> bool:
    """检查内容是否为模板占位符"""
    # 常见的占位符模式
@@ -287,39 +337,88 @@ def _get_rule_template() -> str:

 这个文件夹是你的家。好好对待它。

+## 工作空间目录结构
+
+```
+~/cow/
+├── AGENT.md          # 你的身份和灵魂设定
+├── USER.md           # 用户基本信息（静态）
+├── RULE.md           # 工作空间规则（本文件）
+├── MEMORY.md         # 长期记忆索引（会话启动时自动加载）
+│
+├── memory/           # 每日对话记忆
+│   └── YYYY-MM-DD.md # 当天事件、进展、笔记
+│
+├── knowledge/        # 结构化知识库（持续积累的知识）
+│   ├── index.md      # 知识目录索引（必须维护）
+│   ├── log.md        # 知识操作日志
+│   └── <子目录>/     # 按需创建，参考 index.md 已有分类
+│
+├── skills/           # 技能
+├── websites/         # 网页产物
+└── tmp/              # 系统临时文件（自动管理，勿手动存放重要文件）
+```
+
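 ## Memory System

 Every session starts fresh; memory files give you continuity:

-### 📝 Daily memory: `memory/YYYY-MM-DD.md`
-- Raw conversation logs
-- Record what happened that day
-- If the `memory/` directory does not exist, create it
-
 ### 🧠 Long-term memory: `MEMORY.md`
-- Your curated memories, like a human's long-term memory
-- **Load only in the main session** (direct chats with the user)
-- **Do not load in shared contexts** (group chats, sessions with other people)
-- This is for **security**: it contains personal context that must not leak to strangers
-- Record important events, ideas, decisions, opinions, and lessons learned
-- These are your curated memories: the essence, not raw logs
-- Use the `edit` tool to append new memory content
+- Your curated memory index, **loaded automatically** into context at the start of every session
+- Record core facts, preferences, decisions, important people, and lessons
+- Keep it concise (< 200 lines); it is an index of essentials, not a raw log
+- Use the `edit` tool to append or modify
+
+### 📝 Daily memory: `memory/YYYY-MM-DD.md`
+- The day's events, progress, and notes
+- A distillation of the raw conversation log

 ### 📝 Write it down - don't "keep it in mind"!
-- **Memory is limited** - if you want to remember something, write it to a file
+- **Memory is limited** - write down anything you want to remember
 - "Keeping it in mind" does not survive a session restart; files do
 - When someone says "remember this" → update `MEMORY.md` or `memory/YYYY-MM-DD.md`
 - When you learn a lesson → update RULE.md or the relevant skill
-- When you make a mistake → record it so your future self does not repeat it, **text > brain** 📝
+- When you make a mistake → record it, **text > brain** 📝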

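 ### Storage Rules

 When the user shares information, choose where to store it by type:

-1. **Your identity settings → AGENT.md** (your name, role, personality, communication style; when the user changes these you must update them with `edit`)
-2. **User's static identity → USER.md** (name, form of address, occupation, time zone, contact details, birthday; when the user changes these you must update them with `edit`)
-3. **Dynamic memory → MEMORY.md** (hobbies, preferences, decisions, goals, projects, lessons, to-dos)
+1. **Your identity settings → AGENT.md** (name, role, personality, style)
+2. **User's static identity → USER.md** (name, form of address, occupation, contact details, birthday)
+3. **Dynamic memory → MEMORY.md** (preferences, decisions, goals, lessons, to-dos)
 4. **Today's conversation → memory/YYYY-MM-DD.md** (what was discussed today)
+5. **Structured knowledge → knowledge/** (see the Knowledge System section below)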
+
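+## Knowledge System
+
+The knowledge base `knowledge/` holds the structured knowledge you accumulate over time. Unlike memory, knowledge is organized and compiled, with clear topics and cross-references.
+
+### Automatic Writes (do not ask, just write)
+
+When a conversation produces knowledge worth keeping (material the user shares, conclusions from a discussion, concepts you learn, or important decisions), you **must** write it to the knowledge base as you reply, **without asking the user "should I save this to the knowledge base?"**.
+
+**Key principle**: recording what you learn is instinct; do not ask for confirmation. You may mention "saved to the knowledge base" in your reply.
+
+### Directory Organization
+
+The subdirectory structure is **not fixed**; you decide it based on the actual content:
+- **On the first write**: read `knowledge/index.md` first. If categories already exist, continue using them; if it is empty, choose suitable directory names based on the content
+- **Default suggestion**: organize by information type (for example sources/, concepts/, entities/, analysis/). If the user has a clear preference (for example by domain: work/, life/, tech/), follow the user's request
+- **Stay consistent**: one user's knowledge base should keep a uniform organizational style
+
+### Cross-References
+
+The core value of knowledge lies in its **connections**. Every page should reference related pages through markdown links to build a knowledge network:
+- When you mention a concept that already has a page, add a `[concept name](../category/page.md)` link
+- When you create a page, check whether existing pages should link back to it
+- **Link only to pages that exist**; do not reference pages that have not been created. If a concept deserves its own page, create that page first and then add the link
+
+### Index Maintenance
+
+After creating or updating a knowledge page, you **must also update** `knowledge/index.md`.
+Index format: one `[title](path) — one-sentence summary` entry per line, grouped by category; do not use tables.
+For detailed operating rules, see the skill `knowledge-wiki`.

 ## Security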

@@ -381,4 +480,12 @@ _You have just started up; this is your first conversation._ ✨
 """


+def _get_knowledge_index_template() -> str:
+    """Knowledge wiki index template — empty file, agent fills it."""
+    return ""
+
+
+def _get_knowledge_log_template() -> str:
+    """Knowledge wiki operation log template — empty file, agent fills it."""
+    return ""

--- a/agent/protocol/agent_stream.py
+++ b/agent/protocol/agent_stream.py
@@ -13,6 +13,37 @@ from agent.tools.base_tool import BaseTool, ToolResult
 from common.log import logger


+# Maximum number of characters of model "reasoning / thinking" content to persist
+# in conversation history. The full reasoning is still streamed to the UI in real
+# time (subject to its own SSE / rendering limits); this bound only controls what
+# is stored in DB and replayed in history. Long reasoning is not useful for later
+# context (the LLM never sees thinking blocks anyway) and bloats DB.
+# Keep aligned with the frontend REASONING_RENDER_CAP and the SSE
+# MAX_REASONING_STREAM_CHARS so that storage / stream / display all match.
+MAX_STORED_REASONING_CHARS = 4 * 1024  # 4 KB
+
+# Marker inserted between head and tail when reasoning is truncated.
+_REASONING_TRUNCATE_MARKER = "\n\n... [reasoning truncated, {omitted} chars omitted] ...\n\n"
+
+
+def _truncate_reasoning_for_storage(text: str) -> str:
+    """Trim long reasoning to head + tail with an omission marker.
+
+    Keeps the first and last halves of MAX_STORED_REASONING_CHARS so both the
+    initial chain-of-thought and the final conclusions are preserved for UI
+    replay, without storing the entire (often very large) middle.
+    """
+    if not text:
+        return text
+    if len(text) <= MAX_STORED_REASONING_CHARS:
+        return text
+    half = MAX_STORED_REASONING_CHARS // 2
+    head = text[:half]
+    tail = text[-half:]
+    omitted = len(text) - len(head) - len(tail)
+    return head + _REASONING_TRUNCATE_MARKER.format(omitted=omitted) + tail
+
+
 class AgentStreamExecutor:
    """
    Agent Stream Executor
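The head-plus-tail arithmetic in `_truncate_reasoning_for_storage` is easy to check by hand. The sketch below mirrors the logic of the hunk above under a hypothetical name (`head_tail`) and with made-up sample text. Note that the stored string is slightly longer than the cap, because the marker is added on top of the two halves.

```python
MAX_CHARS = 4 * 1024  # assumed to mirror MAX_STORED_REASONING_CHARS
MARKER = "\n\n... [reasoning truncated, {omitted} chars omitted] ...\n\n"


def head_tail(text: str, limit: int = MAX_CHARS) -> str:
    """Keep the first and last limit//2 characters and mark the gap."""
    if not text or len(text) <= limit:
        return text
    half = limit // 2
    omitted = len(text) - 2 * half
    return text[:half] + MARKER.format(omitted=omitted) + text[-half:]


# 10,000 characters: 2,048 kept at each end, 5,904 dropped from the middle.
sample = "a" * 3000 + "b" * 4000 + "c" * 3000
stored = head_tail(sample)
```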
@@ -78,18 +109,48 @@ class AgentStreamExecutor:
            except Exception as e:
                logger.error(f"Event callback error: {e}")
    
+    def _is_thinking_enabled(self) -> bool:
+        """Whether deep-thinking mode is on at the model layer.
+
+        Mirrors the global toggle used by ``bridge.agent_bridge`` when deciding
+        whether to send ``thinking={"type": "enabled"}`` to the model. Used for
+        logging and reasoning-update event emission across all channels.
+        """
+        from config import conf
+        return bool(conf().get("enable_thinking", False))
+
+    def _should_render_thinking_inline(self) -> bool:
+        """Whether ``<think>...</think>`` blocks embedded directly in ``content``
+        (MiniMax, some third-party proxies) should be surfaced to the channel.
+
+        Only the Web console can render them in a collapsible panel. IM channels
+        (WeChat/WeCom/DingTalk/Feishu) must strip them, otherwise users see raw
+        XML tags in their chat.
+        """
+        from config import conf
+        channel_type = getattr(self.model, 'channel_type', '') or ''
+        return bool(conf().get("enable_thinking", False)) and channel_type == 'web'
+
    def _filter_think_tags(self, text: str) -> str:
        """
-        Remove <think> and </think> tags but keep the content inside.
-        Some LLM providers (e.g., MiniMax) may return thinking process wrapped in <think> tags.
-        We only remove the tags themselves, keeping the actual thinking content.
+        Handle <think>...</think> blocks in content returned by some LLM providers
+        (e.g., MiniMax).
+
+        - When inline thinking rendering is allowed (Web + thinking enabled):
+          remove only the tags, keep the content inside.
+        - Otherwise (IM channels, or thinking disabled globally): remove both
+          the tags and the content entirely.
        """
        if not text:
            return text
        import re
-        # Remove only the <think> and </think> tags, keep the content
-        text = re.sub(r'<think>', '', text)
-        text = re.sub(r'</think>', '', text)
+        if self._should_render_thinking_inline():
+            text = re.sub(r'<think>', '', text)
+            text = re.sub(r'</think>', '', text)
+        else:
+            text = re.sub(r'<think>[\s\S]*?</think>', '', text)
+            # Also strip unclosed <think> tag at the end (streaming partial)
+            text = re.sub(r'<think>[\s\S]*$', '', text)
        return text

    def _hash_args(self, args: dict) -> str:
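The two branches of `_filter_think_tags` behave quite differently on the same input, which a short standalone sketch makes visible. The function name `filter_think` and the `keep_content` flag are hypothetical stand-ins for the method and for `_should_render_thinking_inline()`; the regular expressions are the ones from the hunk.

```python
import re


def filter_think(text: str, keep_content: bool) -> str:
    """Strip <think> markup; optionally strip the enclosed reasoning too."""
    if not text:
        return text
    if keep_content:
        # Web console case: drop only the tags themselves.
        return re.sub(r"</?think>", "", text)
    # IM channel case: drop complete blocks, then any unclosed trailing block.
    text = re.sub(r"<think>[\s\S]*?</think>", "", text)
    return re.sub(r"<think>[\s\S]*$", "", text)


reply = "<think>check the date</think>Today is Monday."
partial = "Answer so far<think>still reasoning"

kept = filter_think(reply, keep_content=True)
stripped = filter_think(reply, keep_content=False)
stripped_partial = filter_think(partial, keep_content=False)
```

The second substitution matters during streaming, where a chunk can end before the closing tag has arrived.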
@@ -178,7 +239,10 @@ class AgentStreamExecutor:
            Final response text
        """
        # Log user message with model info
-        logger.info(f"🤖 {self.model.model} | 👤 {user_message}")
+
+        thinking_enabled = self._is_thinking_enabled()
+        thinking_label = " | 💭 thinking" if thinking_enabled else ""
+        logger.info(f"🤖 {self.model.model}{thinking_label} | 👤 {user_message}")
        
        # Add user message (Claude format - use content blocks for consistency)
        self.messages.append({
@@ -227,6 +291,9 @@ class AgentStreamExecutor:
                        if turn > 1:
                            logger.info(f"[Agent] Requesting explicit response from LLM...")
                            
+                            # Remember position so we can remove the injected prompt later
+                            prompt_insert_idx = len(self.messages)
+                            
                             # Add a message that explicitly asks the model to reply to the user
                            self.messages.append({
                                "role": "user",
@@ -240,8 +307,24 @@ class AgentStreamExecutor:
                            assistant_msg, tool_calls = self._call_llm_stream(retry_on_empty=False)
                            final_response = assistant_msg
                            
-                            # Use the fallback only if the response is still empty
-                            if not assistant_msg and not tool_calls:
+                            # Remove the injected prompt from history so it doesn't
+                            # appear as a user message in persisted conversations.
+                            # _call_llm_stream may have appended an assistant message
+                            # after the prompt, so we locate and remove only the prompt.
+                            if (prompt_insert_idx < len(self.messages)
+                                    and self.messages[prompt_insert_idx].get("role") == "user"):
+                                self.messages.pop(prompt_insert_idx)
+                                logger.debug("[Agent] Removed injected explicit-response prompt from message history")
+                            
+                            # If LLM responded with tool_calls instead of text, fall through
+                            # to the tool execution path below (don't break the loop).
+                            if tool_calls:
+                                logger.info(
+                                    f"[Agent] LLM returned tool_calls in explicit-response retry, "
+                                    f"continuing to execute tools instead of breaking"
+                                )
+                            elif not assistant_msg:
+                                # Still empty (no text and no tool_calls): use fallback
                                logger.warning(f"[Agent] Still empty after explicit request")
                                 final_response = (
                                     "Sorry, I can't generate a reply right now. Please try rephrasing your request, or try again later."
@@ -256,20 +339,28 @@ class AgentStreamExecutor:
                    else:
                        logger.info(f"💭 {assistant_msg[:150]}{'...' if len(assistant_msg) > 150 else ''}")
                    
-                    logger.debug(f"✅ Done (no tool calls)")
-                    self._emit_event("turn_end", {
-                        "turn": turn,
-                        "has_tool_calls": False
-                    })
-                    break
+                    # If the explicit-response retry produced tool_calls, skip the break
+                    # and continue down to the tool execution branch in this same iteration.
+                    if not tool_calls:
+                        logger.debug(f"✅ Done (no tool calls)")
+                        self._emit_event("turn_end", {
+                            "turn": turn,
+                            "has_tool_calls": False
+                        })
+                        break

-                # Log tool calls with arguments
+                # Log tool calls with arguments (truncate long values like base64)
                tool_calls_str = []
                for tc in tool_calls:
-                    # Safely handle None or missing arguments
                    args = tc.get('arguments') or {}
                    if isinstance(args, dict):
-                        args_str = ', '.join([f"{k}={v}" for k, v in args.items()])
+                        parts = []
+                        for k, v in args.items():
+                            v_str = str(v)
+                            if len(v_str) > 200:
+                                v_str = v_str[:200] + f"...({len(v_str)} chars)"
+                            parts.append(f"{k}={v_str}")
+                        args_str = ', '.join(parts)
                        if args_str:
                            tool_calls_str.append(f"{tc['name']}({args_str})")
                        else:
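The log-formatting change above caps each argument value at 200 characters so that base64 payloads do not flood the log. A minimal sketch of the same idea follows. The helper name `format_tool_call` is hypothetical, and returning the bare tool name when there are no arguments is an assumption, since that branch is cut off in the hunk.

```python
def format_tool_call(name: str, args, max_len: int = 200) -> str:
    """Render a tool call for logging, shortening long argument values."""
    parts = []
    for key, value in (args or {}).items():
        value_str = str(value)
        if len(value_str) > max_len:
            # Keep a prefix and report the full length instead of the full value.
            value_str = value_str[:max_len] + f"...({len(value_str)} chars)"
        parts.append(f"{key}={value_str}")
    joined = ", ".join(parts)
    # Assumption: a call without arguments is logged as the bare tool name.
    return f"{name}({joined})" if joined else name


line = format_tool_call("vision", {"image": "A" * 1000, "q": "hi"})
```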
@@ -503,15 +594,33 @@ class AgentStreamExecutor:
        turns = self._identify_complete_turns()
        logger.info(f"Sending {len(messages)} messages ({len(turns)} turns) to LLM")

-        # Prepare tool definitions (OpenAI/Claude format)
+        # Pull in any MCP tools that finished loading since this turn started.
+        # Cheap dict reconciliation (microseconds) — lets the agent pick up
+        # newly available MCP tools mid-conversation without a session restart.
+        try:
+            from agent.tools import ToolManager
+            ToolManager().sync_mcp_into_agent(self)
+        except Exception as e:
+            logger.debug(f"[Agent] MCP sync skipped: {e}")
+
+        # Prepare tool definitions. Prefer get_json_schema() when it yields
+        # real properties (lets tools augment schema at runtime), otherwise
+        # fall back to the static `tool.params` (MCP tools rely on this).
        tools_schema = None
        if self.tools:
            tools_schema = []
            for tool in self.tools.values():
+                input_schema = tool.params
+                try:
+                    dynamic = (tool.get_json_schema() or {}).get("parameters") or {}
+                    if dynamic.get("properties"):
+                        input_schema = dynamic
+                except Exception:
+                    pass
                tools_schema.append({
                    "name": tool.name,
                    "description": tool.description,
-                    "input_schema": tool.params  # Claude uses input_schema
+                    "input_schema": input_schema,
                })

        # Create request
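The schema selection above prefers a tool's runtime schema only when it actually declares properties, and otherwise falls back to the static `params`. The sketch below reproduces that decision with three hypothetical tool classes; only the `params` attribute and the `get_json_schema()` method name come from the diff.

```python
class StaticTool:
    """Hypothetical tool whose runtime schema has no properties."""
    params = {"type": "object", "properties": {"q": {"type": "string"}}}

    def get_json_schema(self):
        return {"parameters": {"type": "object", "properties": {}}}


class DynamicTool(StaticTool):
    """Hypothetical tool that augments its schema at runtime."""

    def get_json_schema(self):
        return {"parameters": {"type": "object",
                               "properties": {"path": {"type": "string"}}}}


class BrokenTool(StaticTool):
    """Hypothetical tool whose schema hook raises."""

    def get_json_schema(self):
        raise RuntimeError("schema unavailable")


def pick_input_schema(tool) -> dict:
    schema = tool.params
    try:
        dynamic = (tool.get_json_schema() or {}).get("parameters") or {}
        if dynamic.get("properties"):
            schema = dynamic
    except Exception:
        # Any failure in the hook leaves the static schema in place.
        pass
    return schema


static_schema = pick_input_schema(StaticTool())
dynamic_schema = pick_input_schema(DynamicTool())
broken_schema = pick_input_schema(BrokenTool())
```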
@@ -527,6 +636,7 @@ class AgentStreamExecutor:

        # Streaming response
        full_content = ""
+        full_reasoning = ""
        tool_calls_buffer = {}  # {index: {id, name, arguments}}
        gemini_raw_parts = None  # Preserve Gemini thoughtSignature for round-trip
        stop_reason = None  # Track why the stream stopped
@@ -584,10 +694,11 @@ class AgentStreamExecutor:
                    if finish_reason:
                        stop_reason = finish_reason

-                    # Skip reasoning_content (internal thinking from models like GLM-5)
                    reasoning_delta = delta.get("reasoning_content") or ""
-                    # if reasoning_delta:
-                    #     logger.debug(f"🧠 [thinking] {reasoning_delta[:100]}...")
+                    if reasoning_delta:
+                        full_reasoning += reasoning_delta
+                        if self._is_thinking_enabled():
+                            self._emit_event("reasoning_update", {"delta": reasoning_delta})

                    # Handle text content
                    content_delta = delta.get("content") or ""
@@ -621,8 +732,11 @@ class AgentStreamExecutor:
                                    tool_calls_buffer[index]["arguments"] += func["arguments"]

                    # Preserve _gemini_raw_parts for Gemini thoughtSignature round-trip
+                    # (direct Gemini: list of parts; LinkAI proxy: base64 string of JSON parts)
                    if "_gemini_raw_parts" in delta:
                        gemini_raw_parts = delta["_gemini_raw_parts"]
+                    elif isinstance(choice, dict) and choice.get("_gemini_raw_parts"):
+                        gemini_raw_parts = choice["_gemini_raw_parts"]

        except Exception as e:
            error_str = str(e)
@@ -788,7 +902,18 @@ class AgentStreamExecutor:
        # Add assistant message to history (Claude format uses content blocks)
        assistant_msg = {"role": "assistant", "content": []}

-        # Add text content block if present
+        if full_reasoning:
+            stored_reasoning = _truncate_reasoning_for_storage(full_reasoning)
+            if len(stored_reasoning) < len(full_reasoning):
+                logger.info(
+                    f"[reasoning] truncated for storage: "
+                    f"{len(full_reasoning)} -> {len(stored_reasoning)} chars"
+                )
+            assistant_msg["content"].append({
+                "type": "thinking",
+                "thinking": stored_reasoning
+            })
+
        if full_content:
            assistant_msg["content"].append({
                "type": "text",
@@ -1192,6 +1317,56 @@ class AgentStreamExecutor:
        logger.warning("🔧 Aggressive trim: nothing to trim, will clear history")
        return False

+    def _build_context_summary_callback(self, discarded_turns: list, kept_turns: list):
+        """
+        Build a callback that injects an LLM summary into the first user
+        message of *kept_turns*. Returns None if no valid injection target.
+
+        The callback is passed to flush_from_messages so that the same LLM
+        call that writes daily memory also provides the in-context summary.
+        """
+        if not kept_turns:
+            return None
+
+        # Find the first user text block in kept_turns as injection target
+        target_block = None
+        for turn in kept_turns:
+            for msg in turn["messages"]:
+                if msg.get("role") == "user":
+                    content = msg.get("content", [])
+                    if isinstance(content, list):
+                        for block in content:
+                            if isinstance(block, dict) and block.get("type") == "text":
+                                target_block = block
+                                break
+                    if target_block:
+                        break
+            if target_block:
+                break
+
+        if not target_block:
+            return None
+
+        turn_count = len(discarded_turns)
+        original_text = target_block["text"]
+
+        def _on_summary_ready(summary: str):
+            if not summary or not summary.strip():
+                return
+            target_block["text"] = (
+                f"[System: Previous conversation summary — "
+                f"{turn_count} turns were compacted]\n\n"
+                f"{summary.strip()}\n\n"
+                f"The recent conversation continues below.\n\n---\n\n"
+                f"{original_text}"
+            )
+            logger.info(
+                f"📝 Context summary injected "
+                f"({len(summary)} chars, {turn_count} turns)"
+            )
+
+        return _on_summary_ready
+
    def _trim_messages(self):
        """
         Trim message history intelligently while keeping the conversation intact
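`_build_context_summary_callback` relies on a closure that mutates a message block in place once the summary arrives. A simplified sketch of that pattern is shown here. The function name, the shortened summary header, and the sample turns are hypothetical; the turn and message shapes follow the hunk above.

```python
def build_summary_callback(kept_turns: list, discarded_count: int):
    """Return a callback that prepends a summary to the first kept user text block."""
    target = None
    for turn in kept_turns:
        for msg in turn["messages"]:
            content = msg.get("content")
            if msg.get("role") == "user" and isinstance(content, list):
                target = next(
                    (b for b in content
                     if isinstance(b, dict) and b.get("type") == "text"),
                    None,
                )
            if target:
                break
        if target:
            break
    if target is None:
        return None

    original_text = target["text"]

    def on_summary_ready(summary: str) -> None:
        if not summary or not summary.strip():
            return  # Nothing to inject; leave the message untouched.
        target["text"] = (
            f"[Summary of {discarded_count} earlier turns]\n\n"
            f"{summary.strip()}\n\n---\n\n{original_text}"
        )

    return on_summary_ready


kept = [{"messages": [
    {"role": "user", "content": [{"type": "text", "text": "What next?"}]},
]}]
callback = build_summary_callback(kept, discarded_count=3)
callback("  User asked about X.  ")
injected = kept[0]["messages"][0]["content"][0]["text"]
```

Because the callback captures the original text at build time, calling it twice replaces the summary instead of stacking two of them.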
@@ -1218,25 +1393,28 @@ class AgentStreamExecutor:
            removed_count = len(turns) // 2
            keep_count = len(turns) - removed_count
            
-            # Flush discarded turns to daily memory
-            if self.agent.memory_manager:
-                discarded_messages = []
-                for turn in turns[:removed_count]:
-                    discarded_messages.extend(turn["messages"])
-                if discarded_messages:
-                    user_id = getattr(self.agent, '_current_user_id', None)
-                    self.agent.memory_manager.flush_memory(
-                        messages=discarded_messages, user_id=user_id,
-                        reason="trim", max_messages=0
-                    )
-            
+            discarded_turns = turns[:removed_count]
            turns = turns[-keep_count:]
-            
+
             logger.info(
                 f"💾 Context turn limit exceeded: {keep_count + removed_count} > {self.max_context_turns}, "
                 f"trimmed to {keep_count} turns ({removed_count} removed)"
            )

+            # Flush to daily memory + inject context summary (single async LLM call)
+            if self.agent.memory_manager:
+                discarded_messages = []
+                for turn in discarded_turns:
+                    discarded_messages.extend(turn["messages"])
+                if discarded_messages:
+                    user_id = getattr(self.agent, '_current_user_id', None)
+                    cb = self._build_context_summary_callback(discarded_turns, turns)
+                    self.agent.memory_manager.flush_memory(
+                        messages=discarded_messages, user_id=user_id,
+                        reason="trim", max_messages=0,
+                        context_summary_callback=cb,
+                    )
+
         # Step 3: Token limit - keep complete turns
        # Get context window from agent (based on model)
        context_window = self.agent._get_model_context_window()
@@ -1312,6 +1490,7 @@ class AgentStreamExecutor:
        # --- Many turns (>=5): discard the older half, keep the newer half ---
        removed_count = len(turns) // 2
        keep_count = len(turns) - removed_count
+        discarded_turns = turns[:removed_count]
        kept_turns = turns[-keep_count:]
        kept_tokens = sum(self._estimate_turn_tokens(t) for t in kept_turns)

@@ -1322,13 +1501,15 @@ class AgentStreamExecutor:

        if self.agent.memory_manager:
            discarded_messages = []
-            for turn in turns[:removed_count]:
+            for turn in discarded_turns:
                discarded_messages.extend(turn["messages"])
            if discarded_messages:
                user_id = getattr(self.agent, '_current_user_id', None)
+                cb = self._build_context_summary_callback(discarded_turns, kept_turns)
                self.agent.memory_manager.flush_memory(
                    messages=discarded_messages, user_id=user_id,
-                    reason="trim", max_messages=0
+                    reason="trim", max_messages=0,
+                    context_summary_callback=cb,
                )

        new_messages = []
--- a/agent/skills/manager.py
+++ b/agent/skills/manager.py
@@ -210,6 +210,10 @@ class SkillManager:
        if not include_disabled:
            entries = [e for e in entries if self.is_skill_enabled(e.skill.name)]

+        from config import conf
+        if not conf().get("knowledge", True):
+            entries = [e for e in entries if e.skill.name != "knowledge-wiki"]
+
        return entries

    def filter_unavailable_skills(
--- a/agent/tools/__init__.py
+++ b/agent/tools/__init__.py
@@ -107,6 +107,22 @@ def _import_browser_tool():

 BrowserTool = _import_browser_tool()

+# MCP Tools (no extra dependencies, loaded on demand)
+def _import_mcp_tools():
+    """Import the MCP tool modules (no extra dependencies, loaded on demand)."""
+    from common.log import logger
+    try:
+        from agent.tools.mcp.mcp_tool import McpTool
+        from agent.tools.mcp.mcp_client import McpClientRegistry
+        return {'McpTool': McpTool, 'McpClientRegistry': McpClientRegistry}
+    except Exception as e:
+        logger.warning(f"[Tools] MCP tools not loaded: {e}")
+        return {}
+
+_mcp_tools = _import_mcp_tools()
+McpTool = _mcp_tools.get('McpTool')
+McpClientRegistry = _mcp_tools.get('McpClientRegistry')
+
 # Export all tools (including optional ones that might be None)
 __all__ = [
    'BaseTool',
@@ -125,6 +141,7 @@ __all__ = [
    'WebFetch',
    'Vision',
    'BrowserTool',
+    'McpTool',
 ]

 """
--- a/agent/tools/bash/bash.py
+++ b/agent/tools/bash/bash.py
@@ -29,7 +29,7 @@ ENVIRONMENT: All API keys from env_config are auto-injected. Use $VAR_NAME direc

 SAFETY:
 - Freely create/modify/delete files within the workspace
-- For destructive and out-of-workspace commands, explain and confirm first"""
+- For destructive commands outside the workspace, explain and confirm first"""

    params: dict = {
        "type": "object",
@@ -169,10 +169,16 @@ SAFETY:
                except Exception as retry_err:
                    logger.warning(f"[Bash] Retry failed: {retry_err}")

-            # Combine stdout and stderr
-            output = result.stdout
-            if result.stderr:
-                output += "\n" + result.stderr
+            # When command succeeds with stdout, keep output clean (stderr goes to server log only).
+            # When command fails or stdout is empty, include stderr so the agent can diagnose.
+            if result.returncode == 0 and result.stdout.strip():
+                output = result.stdout
+                if result.stderr:
+                    logger.info(f"[Bash] stderr (not forwarded): {result.stderr[:500]}")
+            else:
+                output = result.stdout
+                if result.stderr:
+                    output += "\n" + result.stderr

            # Check if we need to save full output to temp file
            temp_file_path = None
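The new stderr rule can be stated as a pure function of the process result, which makes the four cases easy to enumerate. The helper name `select_output` is hypothetical; the branching follows the hunk above.

```python
def select_output(returncode: int, stdout: str, stderr: str) -> str:
    """Decide what the agent sees from a finished command."""
    if returncode == 0 and stdout.strip():
        # Success with real output: stderr is kept out of the agent's view.
        return stdout
    # Failure or empty stdout: append stderr so the agent can diagnose.
    return stdout + ("\n" + stderr if stderr else "")


clean = select_output(0, "ok\n", "warning: deprecated flag")
failed = select_output(1, "", "boom")
silent_success = select_output(0, "", "note on stderr")
```

In the actual change the suppressed stderr is still written to the server log, so it stays available to the operator.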
@@ -232,48 +238,43 @@ SAFETY:

    def _get_safety_warning(self, command: str) -> str:
        """
-        Get safety warning for potentially dangerous commands
-        Only warns about extremely dangerous system-level operations
-        
+        Get safety warning for absolutely catastrophic commands only.
+        Keep the blocklist minimal so the agent retains maximum freedom.
+
        :param command: Command to check
        :return: Warning message if dangerous, empty string if safe
        """
-        cmd_lower = command.lower().strip()
+        # Tokenize to avoid substring false positives (e.g. `rm -rf /tmp/x`
+        # must not match `rm -rf /`).
+        tokens = command.lower().split()

-        # Only block extremely dangerous system operations
-        dangerous_patterns = [
-            # System shutdown/reboot
-            ("shutdown", "This command will shut down the system"),
-            ("reboot", "This command will reboot the system"),
-            ("halt", "This command will halt the system"),
-            ("poweroff", "This command will power off the system"),
+        # `rm -rf /` or `rm -rf /*` targeting the real root.
+        for i, tok in enumerate(tokens):
+            if tok != "rm":
+                continue
+            # Track -r and -f separately so split flags (`rm -r -f /`) are caught.
+            has_r = has_f = False
+            for t in tokens[i + 1:]:
+                if t.startswith("-") and not t.startswith("--"):
+                    has_r, has_f = has_r or "r" in t, has_f or "f" in t
+                elif t == "--recursive":
+                    has_r = True
+                elif t == "--force":
+                    has_f = True
+                elif t in ("/", "/*") and has_r and has_f:
+                    return "This command will delete the entire filesystem"
+                else:
+                    break

-            # Critical system modifications
-            ("rm -rf /", "This command will delete the entire filesystem"),
-            ("rm -rf /*", "This command will delete the entire filesystem"),
-            ("dd if=/dev/zero", "This command can destroy disk data"),
-            ("mkfs", "This command will format a filesystem, destroying all data"),
-            ("fdisk", "This command modifies disk partitions"),
+        # Disk wiping
+        if "if=/dev/zero" in command.lower() and "dd " in command.lower():
+            return "This command can destroy disk data"

-            # User/system management (only if targeting system users)
-            ("userdel root", "This command will delete the root user"),
-            ("passwd root", "This command will change the root password"),
-        ]
+        # Power control - match only as a standalone word (\b enforces word boundary)
+        if re.search(r'\b(shutdown|reboot|halt|poweroff)\b', command.lower()):
+            return "This command will shut down or restart the system"

-        for pattern, warning in dangerous_patterns:
-            if pattern in cmd_lower:
-                return warning
-
-        # Check for recursive deletion outside workspace
-        if "rm" in cmd_lower and "-rf" in cmd_lower:
-            # Allow deletion within current workspace
-            if not any(path in cmd_lower for path in ["./", self.cwd.lower()]):
-                # Check if targeting system directories
-                system_dirs = ["/bin", "/usr", "/etc", "/var", "/home", "/root", "/sys", "/proc"]
-                if any(sysdir in cmd_lower for sysdir in system_dirs):
-                    return "This command will recursively delete system directories"
-
-        return ""  # No warning needed
+        return ""

    @staticmethod
    def _convert_env_vars_for_windows(command: str, dotenv_vars: dict) -> str:
--- a/agent/tools/browser/browser_service.py
+++ b/agent/tools/browser/browser_service.py
@@ -15,6 +15,10 @@ import threading
 from typing import Optional, Dict, Any, List, Callable

 from common.log import logger
+from common.utils import expand_path
+
+
+_DEFAULT_USER_DATA_DIR = "~/.cow/browser_profile"

 try:
    from playwright.sync_api import sync_playwright, Browser, BrowserContext, Page, Playwright
@@ -212,6 +216,21 @@ _SNAPSHOT_JS = """
 )


+_BROWSER_DEAD_HINTS = (
+    "has been closed",
+    "browser has disconnected",
+    "target closed",
+    "browser closed",
+    "context or browser has been closed",
+)
+
+
+def _is_browser_dead_error(err: Exception) -> bool:
+    """Return True if *err* indicates the browser / page died out from under us."""
+    msg = str(err).lower()
+    return any(h in msg for h in _BROWSER_DEAD_HINTS)
+
+
 def _should_use_headless() -> bool:
    """Decide headless mode: headless on Linux servers without display, headed elsewhere."""
    if sys.platform in ("win32", "darwin"):
@@ -302,11 +321,38 @@ class BrowserService:
        self._context = None
        self._page = None

+        # Launch mode: one of "fresh" | "persistent" | "cdp".
+        # - cdp: connect to an externally launched Chrome via CDP endpoint.
+        # - persistent: launch with launch_persistent_context using a user_data_dir
+        #   so cookies / login state survive across runs (default).
+        # - fresh: classic launch + new_context, clean state every run.
+        cdp_endpoint = self._config.get("cdp_endpoint") or ""
+        persistent_flag = self._config.get("persistent", True)
+        user_data_dir_cfg = self._config.get("user_data_dir")
+        if user_data_dir_cfg is None:
+            user_data_dir_cfg = _DEFAULT_USER_DATA_DIR
+
+        self._cdp_endpoint: str = cdp_endpoint.strip() if isinstance(cdp_endpoint, str) else ""
+        if self._cdp_endpoint:
+            self._launch_mode = "cdp"
+            self._user_data_dir: str = ""
+        elif persistent_flag and user_data_dir_cfg:
+            self._launch_mode = "persistent"
+            self._user_data_dir = expand_path(str(user_data_dir_cfg))
+        else:
+            self._launch_mode = "fresh"
+            self._user_data_dir = ""
+
        # Idle auto-release
        idle_cfg = self._config.get("idle_timeout")
        self._idle_timeout: float = float(idle_cfg) if idle_cfg is not None else self._IDLE_TIMEOUT_DEFAULT
        self._idle_timer: Optional[threading.Timer] = None

+        # Set when the browser / page is detected to have died externally
+        # (e.g. user manually closed the window). The next _submit() will then
+        # tear down the stale thread and relaunch.
+        self._needs_restart = False
+
    # ------------------------------------------------------------------
    # Background-thread lifecycle
    # ------------------------------------------------------------------
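The launch-mode precedence in `BrowserService.__init__` (CDP first, then persistent, then fresh) can be summarized as a small function over the config dict. In this sketch `resolve_launch_mode` is a hypothetical name and `os.path.expanduser` stands in for the repository's `expand_path` helper, whose exact behavior is not shown in the diff.

```python
import os

DEFAULT_USER_DATA_DIR = "~/.cow/browser_profile"  # mirrors _DEFAULT_USER_DATA_DIR


def resolve_launch_mode(config: dict):
    """Return (mode, user_data_dir) using the precedence from the hunk above."""
    cdp = config.get("cdp_endpoint") or ""
    cdp = cdp.strip() if isinstance(cdp, str) else ""

    user_data_dir = config.get("user_data_dir")
    if user_data_dir is None:
        user_data_dir = DEFAULT_USER_DATA_DIR

    if cdp:
        return "cdp", ""
    if config.get("persistent", True) and user_data_dir:
        # Assumption: expanduser approximates the project's expand_path.
        return "persistent", os.path.expanduser(str(user_data_dir))
    return "fresh", ""


default_mode = resolve_launch_mode({})
cdp_mode = resolve_launch_mode({"cdp_endpoint": " http://127.0.0.1:9222 "})
fresh_mode = resolve_launch_mode({"persistent": False})
empty_dir_mode = resolve_launch_mode({"user_data_dir": ""})
```

An empty `user_data_dir` string disables persistence even when `persistent` is left at its default, because only `None` is replaced by the default profile path.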
@@ -354,6 +400,12 @@ class BrowserService:
                result_slot["value"] = fn(*args, **kwargs)
            except Exception as e:
                result_slot["error"] = e
+                if _is_browser_dead_error(e):
+                    self._needs_restart = True
+                    logger.warning(
+                        f"[Browser] Detected closed page/context ({e}); "
+                        "will relaunch on next request."
+                    )
            finally:
                result_slot["event"].set()

@@ -375,7 +427,7 @@ class BrowserService:
            result_slot["event"].set()

     def _launch_browser(self):
-        """Launch Chromium on the background thread."""
+        """Launch or connect to Chromium on the background thread."""
        if self._headless is None:
            headless_cfg = self._config.get("headless")
            self._headless = headless_cfg if headless_cfg is not None else _should_use_headless()
@@ -390,36 +442,142 @@ class BrowserService:

        viewport_w = self._config.get("viewport_width", 1280)
        viewport_h = self._config.get("viewport_height", 720)
+        viewport = {"width": viewport_w, "height": viewport_h}
+        user_agent = (
+            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
+            "AppleWebKit/537.36 (KHTML, like Gecko) "
+            "Chrome/131.0.0.0 Safari/537.36"
+        )

        self._playwright = sync_playwright().start()
-        logger.info(f"[Browser] Launching Chromium (headless={self._headless})")
+
+        if self._launch_mode == "cdp":
+            self._connect_cdp(viewport)
+        elif self._launch_mode == "persistent":
+            self._launch_persistent(launch_args, viewport, user_agent)
+        else:
+            self._launch_fresh(launch_args, viewport, user_agent)
+
+        logger.info("[Browser] Browser ready")
+
+    def _launch_fresh(self, launch_args: List[str], viewport: Dict[str, int], user_agent: str):
+        """Classic launch: brand new Chromium with an empty context."""
+        logger.info(f"[Browser] Launching Chromium (fresh, headless={self._headless})")
        self._browser = self._playwright.chromium.launch(
            headless=self._headless,
            args=launch_args,
        )
        self._context = self._browser.new_context(
-            viewport={"width": viewport_w, "height": viewport_h},
-            user_agent=(
-                "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
-                "AppleWebKit/537.36 (KHTML, like Gecko) "
-                "Chrome/131.0.0.0 Safari/537.36"
-            ),
+            viewport=viewport,
+            user_agent=user_agent,
        )
        self._page = self._context.new_page()
-        logger.info("[Browser] Browser ready")
+        self._wire_close_listeners()
+
+    def _launch_persistent(self, launch_args: List[str], viewport: Dict[str, int], user_agent: str):
+        """Launch Chromium with a persistent user_data_dir so login state survives."""
+        os.makedirs(self._user_data_dir, exist_ok=True)
+        logger.info(
+            f"[Browser] Launching Chromium (persistent, headless={self._headless}, "
+            f"profile={self._user_data_dir})"
+        )
+        try:
+            self._context = self._playwright.chromium.launch_persistent_context(
+                user_data_dir=self._user_data_dir,
+                headless=self._headless,
+                args=launch_args,
+                viewport=viewport,
+                user_agent=user_agent,
+            )
+        except Exception as e:
+            # Profile is locked when another Chromium instance already holds it.
+            msg = str(e).lower()
+            if "singletonlock" in msg or "profile" in msg or "lock" in msg:
+                raise RuntimeError(
+                    f"Browser profile '{self._user_data_dir}' is in use by another process. "
+                    "Close the other Chromium / cow instance, or set a different "
+                    "tools.browser.user_data_dir."
+                ) from e
+            raise
+
+        # Persistent context has no parent Browser handle; reuse the auto-created page.
+        self._browser = None
+        pages = self._context.pages
+        self._page = pages[0] if pages else self._context.new_page()
+        self._wire_close_listeners()
+
+    def _connect_cdp(self, viewport: Dict[str, int]):
+        """Attach to an existing Chrome started with --remote-debugging-port."""
+        endpoint = self._cdp_endpoint
+        logger.info(f"[Browser] Connecting to existing Chrome via CDP: {endpoint}")
+        try:
+            self._browser = self._playwright.chromium.connect_over_cdp(endpoint)
+        except Exception as e:
+            msg = str(e).lower()
+            if "econnrefused" in msg or "connect" in msg or "refused" in msg:
+                raise RuntimeError(
+                    f"Cannot reach Chrome at {endpoint}. The CDP browser is not "
+                    "running. Ask the user to launch Chrome with "
+                    "--remote-debugging-port and --user-data-dir, then retry. "
+                    "Do not retry this tool until the user confirms."
+                ) from e
+            raise
+
+        contexts = self._browser.contexts
+        if contexts:
+            self._context = contexts[0]
+        else:
+            self._context = self._browser.new_context(viewport=viewport)
+
+        pages = self._context.pages
+        self._page = pages[0] if pages else self._context.new_page()
+        self._wire_close_listeners()
+
+    def _wire_close_listeners(self):
+        """Mark needs_restart whenever the browser / context / page dies externally."""
+        def _on_dead(_obj=None):
+            self._needs_restart = True
+
+        try:
+            if self._browser:
+                self._browser.on("disconnected", _on_dead)
+            if self._context:
+                self._context.on("close", _on_dead)
+            if self._page:
+                self._page.on("close", _on_dead)
+        except Exception as e:
+            logger.debug(f"[Browser] Failed to wire close listeners: {e}")

    def _shutdown_browser(self):
-        """Shut down all Playwright resources on the background thread."""
+        """Shut down Playwright resources on the background thread.
+
+        Mode-specific behavior:
+        - cdp: only disconnect the Playwright client; leave the user's Chrome
+          and its tabs untouched (do NOT close the context).
+        - persistent: close the persistent context (no separate browser handle).
+        - fresh: close context, then browser.
+        """
        self._cancel_idle_timer()
-        for obj, label in [
-            (self._context, "context"),
-            (self._browser, "browser"),
-        ]:
+
+        if self._launch_mode == "cdp":
+            # For CDP, browser.close() only detaches the Playwright client;
+            # the user's Chrome process and its tabs stay alive.
            try:
-                if obj:
-                    obj.close()
+                if self._browser:
+                    self._browser.close()
            except Exception as e:
-                logger.debug(f"[Browser] {label} close error: {e}")
+                logger.debug(f"[Browser] cdp disconnect error: {e}")
+        else:
+            for obj, label in [
+                (self._context, "context"),
+                (self._browser, "browser"),
+            ]:
+                try:
+                    if obj:
+                        obj.close()
+                except Exception as e:
+                    logger.debug(f"[Browser] {label} close error: {e}")
+
        try:
            if self._playwright:
                self._playwright.stop()
@@ -433,6 +591,13 @@ class BrowserService:

    def _submit(self, fn: Callable, *args, **kwargs):
        """Submit *fn* to the background thread and block until it completes."""
+        # If the browser died externally (e.g. user closed the window), tear
+        # down the stale thread first so _start_thread() will relaunch fresh.
+        if self._needs_restart:
+            logger.info("[Browser] Restarting after detecting closed browser")
+            self.close()
+            self._needs_restart = False
+
        self._start_thread()

        if not self._alive:
@@ -481,6 +646,7 @@ class BrowserService:
        self._cancel_idle_timer()
        with self._lock:
            if not self._alive:
+                self._needs_restart = False
                return
            self._alive = False
            t = self._thread
@@ -490,6 +656,7 @@ class BrowserService:
            t.join(timeout=10)
        with self._lock:
            self._thread = None
+            self._needs_restart = False

    # ------------------------------------------------------------------
    # Actions  (each method is dispatched to the background thread)
--- a/agent/tools/browser/browser_tool.py
+++ b/agent/tools/browser/browser_tool.py
@@ -4,6 +4,15 @@ Browser tool - Control a Chromium browser for web navigation and interaction.
 Uses Playwright under the hood. Browser instance is lazily started on first
 use, reused across tool calls within the same session, and cleaned up via
 close().
+
+Launch modes (configured under `tools.browser` in config.json):
+  - persistent (default): Chromium runs with a persistent user_data_dir
+    (default `~/.cow/browser_profile`), so cookies and login state survive
+    across runs. The user only needs to log in once.
+  - cdp: When `cdp_endpoint` is set, attach to an externally launched Chrome
+    via the Chrome DevTools Protocol. Lets the agent reuse the user's real
+    browser (with all logins / extensions / true fingerprints).
+  - fresh: Set `persistent` to false to fall back to a clean context every run.
 """

 import json
@@ -25,7 +34,10 @@ class BrowserTool(BaseTool):
        "get_text, press, evaluate.\n\n"
        "Workflow: navigate (auto-includes snapshot with element refs) → click/fill/select by ref → snapshot to verify.\n\n"
        "Use snapshot as the primary way to read pages. Use screenshot + send to show key results to the user. "
-        "For login/CAPTCHA/authorization etc., screenshot and ask the user for help."
+        "For login/CAPTCHA/authorization etc., screenshot and ask the user for help. "
+        "Login state is persisted across sessions (cookies / localStorage are kept in a "
+        "user profile directory), so once the user logs in to a site, the agent can keep "
+        "using it without logging in again."
    )

    params: dict = {
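
Note on the launch modes described in the module docstring above: the mode selection can be sketched as a small pure function. This is a minimal illustration, not the code in this patch. The helper name `resolve_launch_mode` is hypothetical, and the precedence (a non-empty `cdp_endpoint` wins over `persistent`) is an assumption inferred from the docstring wording.

```python
from typing import Any, Dict


def resolve_launch_mode(browser_cfg: Dict[str, Any]) -> str:
    """Hypothetical helper: map a `tools.browser` config dict to a launch mode.

    Assumed precedence: cdp > persistent (default) > fresh.
    """
    # A non-empty cdp_endpoint means "attach to an externally launched Chrome".
    if browser_cfg.get("cdp_endpoint"):
        return "cdp"
    # persistent defaults to True, so an empty config keeps login state.
    if browser_cfg.get("persistent", True):
        return "persistent"
    # Explicit persistent=false falls back to a clean context on every run.
    return "fresh"
```

Keeping the decision in one side-effect-free function would make the three branches of `_launch_browser` easy to unit test without starting Playwright.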
--- a/agent/tools/mcp/__init__.py
+++ b/agent/tools/mcp/__init__.py
@@ -0,0 +1,4 @@
+from agent.tools.mcp.mcp_client import McpClient, McpClientRegistry
+from agent.tools.mcp.mcp_tool import McpTool
+
+__all__ = ["McpClient", "McpClientRegistry", "McpTool"]
--- a/agent/tools/mcp/mcp_client.py
+++ b/agent/tools/mcp/mcp_client.py
@@ -0,0 +1,374 @@
+"""
+MCP (Model Context Protocol) client module.
+
+Implements JSON-RPC 2.0 over stdio and SSE transports without any external
+MCP SDK dependency.
+"""
+
+import json
+import os
+import select
+import subprocess
+import threading
+import urllib.request
+import urllib.error
+from typing import Optional
+
+from common.log import logger
+
+
+class McpClient:
+    """Single MCP Server client supporting stdio and SSE transports."""
+
+    def __init__(self, config: dict):
+        """
+        config examples:
+          stdio: {"name": "filesystem", "type": "stdio", "command": "npx", "args": [...]}
+          SSE:   {"name": "my-api",    "type": "sse",   "url": "http://localhost:8000/sse"}
+        """
+        self.config = config
+        self.name: str = config.get("name", "unknown")
+        self.transport: str = config.get("type", "stdio")
+
+        # stdio state
+        self._proc: Optional[subprocess.Popen] = None
+
+        # SSE state
+        self._sse_url: Optional[str] = None
+        self._post_url: Optional[str] = None  # endpoint for sending messages (resolved from SSE)
+
+        # Shared state
+        self._next_id = 1
+        self._id_lock = threading.Lock()
+        self._call_lock = threading.Lock()
+        self._initialized = False
+
+    # ------------------------------------------------------------------
+    # Public interface
+    # ------------------------------------------------------------------
+
+    def initialize(self) -> bool:
+        """Connect and perform the MCP handshake. Returns True on success."""
+        try:
+            if self.transport == "stdio":
+                return self._init_stdio()
+            elif self.transport == "sse":
+                return self._init_sse()
+            else:
+                logger.warning(f"[MCP:{self.name}] Unknown transport type: {self.transport!r}")
+                return False
+        except Exception as e:
+            logger.warning(f"[MCP:{self.name}] Initialization failed: {e}")
+            return False
+
+    def list_tools(self) -> list:
+        """Return the tool list from this server.
+
+        Each item is a dict: {"name": str, "description": str, "inputSchema": dict}
+        """
+        try:
+            resp = self._send_request("tools/list", {})
+            tools = resp.get("result", {}).get("tools", [])
+            return [
+                {
+                    "name": t.get("name", ""),
+                    "description": t.get("description", ""),
+                    "inputSchema": t.get("inputSchema", {}),
+                }
+                for t in tools
+            ]
+        except Exception as e:
+            logger.warning(f"[MCP:{self.name}] list_tools failed: {e}")
+            return []
+
+    def call_tool(self, name: str, arguments: dict) -> str:
+        """Call a tool and return the result as a string."""
+        try:
+            resp = self._send_request("tools/call", {"name": name, "arguments": arguments})
+            content = resp.get("result", {}).get("content", [])
+            parts = [item.get("text", "") for item in content if item.get("type") == "text"]
+            return "\n".join(parts)
+        except Exception as e:
+            logger.warning(f"[MCP:{self.name}] call_tool({name}) failed: {e}")
+            return f"Error: {e}"
+
+    def shutdown(self):
+        """Close the connection / terminate the child process."""
+        if self._proc is not None:
+            try:
+                self._proc.stdin.close()
+            except Exception:
+                pass
+            try:
+                self._proc.terminate()
+                self._proc.wait(timeout=5)
+            except Exception:
+                try:
+                    self._proc.kill()
+                except Exception:
+                    pass
+            self._proc = None
+            logger.debug(f"[MCP:{self.name}] stdio process terminated")
+        self._initialized = False
+
+    # ------------------------------------------------------------------
+    # stdio transport
+    # ------------------------------------------------------------------
+
+    def _init_stdio(self) -> bool:
+        command = self.config.get("command")
+        if not command:
+            logger.warning(f"[MCP:{self.name}] stdio config missing 'command'")
+            return False
+
+        args = self.config.get("args", [])
+        extra_env = self.config.get("env", None)
+        env = {**os.environ, **extra_env} if extra_env else None
+
+        self._proc = subprocess.Popen(
+            [command] + list(args),
+            stdin=subprocess.PIPE,
+            stdout=subprocess.PIPE,
+            stderr=subprocess.PIPE,
+            text=True,
+            encoding="utf-8",
+            env=env,
+        )
+        logger.debug(f"[MCP:{self.name}] stdio process started (pid={self._proc.pid})")
+
+        threading.Thread(
+            target=self._drain_stderr, daemon=True, name=f"mcp-stderr-{self.name}"
+        ).start()
+
+        return self._handshake()
+
+    def _drain_stderr(self):
+        for line in self._proc.stderr:
+            line = line.strip()
+            if line:
+                logger.debug(f"[MCP:{self.name}] stderr: {line}")
+
+    def _readline_with_timeout(self, timeout: int = 30) -> str:
+        """Read one line from stdio stdout with a hard timeout."""
+        ready, _, _ = select.select([self._proc.stdout], [], [], timeout)
+        if not ready:
+            raise TimeoutError(f"[MCP:{self.name}] stdio read timed out after {timeout}s")
+        return self._proc.stdout.readline()
+
+    def _stdio_send(self, message: dict) -> dict:
+        """Send a JSON-RPC message over stdio and read the response."""
+        raw = json.dumps(message) + "\n"
+        self._proc.stdin.write(raw)
+        self._proc.stdin.flush()
+
+        while True:
+            line = self._readline_with_timeout()
+            if not line:
+                raise IOError(f"[MCP:{self.name}] stdio process closed unexpectedly")
+            line = line.strip()
+            if not line:
+                continue
+            try:
+                data = json.loads(line)
+            except json.JSONDecodeError:
+                continue
+            if "id" not in data:
+                logger.debug(f"[MCP:{self.name}] notification skipped: {data.get('method', '?')}")
+                continue
+            return data
+
+    # ------------------------------------------------------------------
+    # SSE transport
+    # ------------------------------------------------------------------
+
+    def _init_sse(self) -> bool:
+        url = self.config.get("url")
+        if not url:
+            logger.warning(f"[MCP:{self.name}] SSE config missing 'url'")
+            return False
+
+        self._sse_url = url
+
+        # Read the first SSE event to discover the POST endpoint
+        try:
+            self._post_url = self._sse_discover_endpoint()
+        except Exception as e:
+            logger.warning(f"[MCP:{self.name}] SSE endpoint discovery failed: {e}")
+            return False
+
+        return self._handshake()
+
+    def _sse_discover_endpoint(self) -> str:
+        """Open SSE stream and read the 'endpoint' event to learn the POST URL."""
+        req = urllib.request.Request(
+            self._sse_url,
+            headers={"Accept": "text/event-stream"},
+        )
+        with urllib.request.urlopen(req, timeout=10) as resp:
+            for raw_line in resp:
+                line = raw_line.decode("utf-8").rstrip("\n\r")
+                if line.startswith("data:"):
+                    data = line[len("data:"):].strip()
+                    # Some servers send JSON with a "uri" or plain path
+                    if data.startswith("{"):
+                        parsed = json.loads(data)
+                        return parsed.get("uri") or parsed.get("url") or parsed.get("endpoint")
+                    # Absolute URL: use as-is
+                    if data.startswith("http"):
+                        return data
+                    # Relative path: resolve against SSE base
+                    from urllib.parse import urljoin
+                    return urljoin(self._sse_url, data)
+        raise ValueError(f"[MCP:{self.name}] No endpoint event received from SSE stream")
+
+    def _sse_send(self, message: dict) -> dict:
+        """POST a JSON-RPC message to the server and return the response."""
+        body = json.dumps(message).encode("utf-8")
+        req = urllib.request.Request(
+            self._post_url,
+            data=body,
+            method="POST",
+            headers={"Content-Type": "application/json"},
+        )
+        with urllib.request.urlopen(req, timeout=30) as resp:
+            raw = resp.read().decode("utf-8")
+            return json.loads(raw)
+
+    # ------------------------------------------------------------------
+    # Common JSON-RPC helpers
+    # ------------------------------------------------------------------
+
+    def _next_request_id(self) -> int:
+        with self._id_lock:
+            rid = self._next_id
+            self._next_id += 1
+        return rid
+
+    def _build_request(self, method: str, params: dict) -> dict:
+        return {
+            "jsonrpc": "2.0",
+            "id": self._next_request_id(),
+            "method": method,
+            "params": params,
+        }
+
+    def _build_notification(self, method: str, params: dict) -> dict:
+        return {"jsonrpc": "2.0", "method": method, "params": params}
+
+    def _send_request(self, method: str, params: dict) -> dict:
+        """Send a request and return the full response dict."""
+        if not self._initialized and method != "initialize":
+            raise RuntimeError(f"[MCP:{self.name}] Client not initialized")
+
+        message = self._build_request(method, params)
+
+        with self._call_lock:
+            if self.transport == "stdio":
+                return self._stdio_send(message)
+            elif self.transport == "sse":
+                return self._sse_send(message)
+            else:
+                raise ValueError(f"[MCP:{self.name}] Unsupported transport: {self.transport}")
+
+    def _send_notification(self, method: str, params: dict):
+        """Fire-and-forget notification (no response expected)."""
+        notification = self._build_notification(method, params)
+        raw = json.dumps(notification) + "\n"
+
+        if self.transport == "stdio":
+            self._proc.stdin.write(raw)
+            self._proc.stdin.flush()
+        elif self.transport == "sse":
+            body = raw.encode("utf-8")
+            req = urllib.request.Request(
+                self._post_url,
+                data=body,
+                method="POST",
+                headers={"Content-Type": "application/json"},
+            )
+            try:
+                with urllib.request.urlopen(req, timeout=10):
+                    pass
+            except Exception:
+                pass  # notifications are fire-and-forget
+
+    def _handshake(self) -> bool:
+        """Perform the MCP initialize / notifications/initialized handshake."""
+        init_params = {
+            "protocolVersion": "2024-11-05",
+            "capabilities": {},
+            "clientInfo": {"name": "CowAgent", "version": "1.0"},
+        }
+        # Mark as initialized up front; reset to False below if the handshake fails
+        self._initialized = True
+        try:
+            resp = self._send_request("initialize", init_params)
+        except Exception as e:
+            self._initialized = False
+            logger.warning(f"[MCP:{self.name}] Handshake initialize failed: {e}")
+            return False
+
+        if "error" in resp:
+            self._initialized = False
+            logger.warning(f"[MCP:{self.name}] Handshake error: {resp['error']}")
+            return False
+
+        self._send_notification("notifications/initialized", {})
+        logger.debug(f"[MCP:{self.name}] Handshake complete")
+        return True
+
+
+class McpClientRegistry:
+    """Global singleton managing the lifecycle of all MCP Server clients."""
+
+    _instance = None
+    _instance_lock = threading.Lock()
+
+    def __new__(cls):
+        with cls._instance_lock:
+            if cls._instance is None:
+                obj = super().__new__(cls)
+                obj._clients: dict[str, McpClient] = {}
+                obj._registry_lock = threading.Lock()
+                cls._instance = obj
+        return cls._instance
+
+    def start_all(self, configs: list) -> None:
+        """Initialize McpClient for each config entry; skip failures with a warning."""
+        if not configs:
+            return
+
+        for cfg in configs:
+            name = cfg.get("name", "<unnamed>")
+            client = McpClient(cfg)
+            ok = client.initialize()
+            if ok:
+                with self._registry_lock:
+                    self._clients[name] = client
+                logger.info(f"[MCP] Server '{name}' initialized successfully")
+            else:
+                logger.warning(f"[MCP] Server '{name}' failed to initialize — skipping")
+
+    def get(self, server_name: str) -> Optional[McpClient]:
+        """Return the initialized client for server_name, or None."""
+        with self._registry_lock:
+            return self._clients.get(server_name)
+
+    def all_clients(self) -> dict:
+        """Return a copy of the {name: McpClient} mapping."""
+        with self._registry_lock:
+            return dict(self._clients)
+
+    def shutdown_all(self) -> None:
+        """Shut down all managed clients."""
+        with self._registry_lock:
+            clients = list(self._clients.values())
+            self._clients.clear()
+
+        for client in clients:
+            try:
+                client.shutdown()
+            except Exception as e:
+                logger.warning(f"[MCP] Error shutting down '{client.name}': {e}")
+
+        logger.info("[MCP] All servers shut down")
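
Note on the stdio framing used by `McpClient._stdio_send` above: each JSON-RPC 2.0 message is one newline-terminated JSON line, and the reader skips blank lines, non-JSON output, and notifications (messages without an `id`). The sketch below reproduces that loop against an in-memory stream. The function names `build_request` and `read_response` are hypothetical, and matching on the request `id` is a stricter variant than the patch, which returns the first message that carries any `id`.

```python
import io
import json
from typing import IO


def build_request(request_id: int, method: str, params: dict) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return json.dumps(message) + "\n"


def read_response(stream: IO[str], request_id: int) -> dict:
    """Return the first message whose id equals request_id.

    Blank lines, lines that are not valid JSON, notifications (no "id"),
    and responses to other requests are skipped.
    """
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            data = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a server that prints a banner on stdout
        if not isinstance(data, dict) or data.get("id") != request_id:
            continue
        return data
    raise IOError("stream closed before a matching response arrived")


if __name__ == "__main__":
    # Simulated server output: noise, a notification, then the real response.
    fake_stdout = io.StringIO(
        "starting server...\n"
        '{"jsonrpc": "2.0", "method": "notifications/progress", "params": {}}\n'
        '{"jsonrpc": "2.0", "id": 1, "result": {"tools": []}}\n'
    )
    print(build_request(1, "tools/list", {}), end="")
    print(read_response(fake_stdout, 1))
```

Matching on `id` matters mainly if requests are ever pipelined; with the `_call_lock` serialization in the patch, only one request is in flight at a time, so the simpler check is usually sufficient.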
--- a/agent/tools/mcp/mcp_tool.py
+++ b/agent/tools/mcp/mcp_tool.py
@@ -0,0 +1,31 @@
+from agent.tools.base_tool import BaseTool, ToolResult
+from common.log import logger
+
+
+class McpTool(BaseTool):
+    """
+    Wraps a single MCP tool as a BaseTool.
+    One MCP server may expose multiple tools; each tool maps to one McpTool instance.
+    """
+
+    def __init__(self, client, tool_schema: dict, server_name: str):
+        """
+        :param client: the McpClient instance that owns this tool
+        :param tool_schema: tool description returned by MCP, in the form:
+            {"name": str, "description": str, "inputSchema": dict}
+        :param server_name: server name, used for logging
+        """
+        self.client = client
+        self.server_name = server_name
+        self.name = tool_schema["name"]
+        self.description = tool_schema.get("description", "")
+        self.params = tool_schema.get("inputSchema", {})
+
+    def execute(self, params: dict) -> ToolResult:
+        logger.info(f"[McpTool] server={self.server_name} tool={self.name} params={params}")
+        try:
+            result = self.client.call_tool(self.name, params)
+            return ToolResult.success(result)
+        except Exception as e:
+            logger.error(f"[McpTool] server={self.server_name} tool={self.name} error: {e}")
+            return ToolResult.fail(str(e))
--- a/agent/tools/memory/memory_get.py
+++ b/agent/tools/memory/memory_get.py
@@ -44,6 +44,19 @@ class MemoryGetTool(BaseTool):
        """
        super().__init__()
        self.memory_manager = memory_manager
+
+        from config import conf
+        if conf().get("knowledge", True):
+            self.description = (
+                "Read specific content from memory or knowledge files. "
+                "Use this to get full context from a memory file, knowledge page, or specific line range."
+            )
+            self.params = {**self.params}
+            self.params["properties"] = {**self.params["properties"]}
+            self.params["properties"]["path"] = {
+                "type": "string",
+                "description": "Relative path to the memory or knowledge file (e.g. 'MEMORY.md', 'memory/2026-01-01.md', 'knowledge/concepts/moe.md')"
+            }
    
    def execute(self, args: dict):
        """
@@ -68,11 +81,15 @@ class MemoryGetTool(BaseTool):
            workspace_dir = self.memory_manager.config.get_workspace()
            
            # Auto-prepend memory/ if not present and not absolute path
-            # Exception: MEMORY.md is in the root directory
-            if not path.startswith('memory/') and not path.startswith('/') and path != 'MEMORY.md':
+            # Exceptions: MEMORY.md in root, knowledge/ files at workspace root
+            if not path.startswith('memory/') and not path.startswith('knowledge/') and not path.startswith('/') and path != 'MEMORY.md':
                path = f'memory/{path}'
            
-            file_path = workspace_dir / path
+            file_path = (workspace_dir / path).resolve()
+            workspace_resolved = workspace_dir.resolve()
+            
+            if not str(file_path).startswith(str(workspace_resolved) + '/') and file_path != workspace_resolved:
+                return ToolResult.fail("Error: Access denied: path outside workspace")
            
            if not file_path.exists():
                return ToolResult.fail(f"Error: File not found: {path}")
--- a/agent/tools/memory/memory_search.py
+++ b/agent/tools/memory/memory_search.py
@@ -48,6 +48,13 @@ class MemorySearchTool(BaseTool):
        super().__init__()
        self.memory_manager = memory_manager
        self.user_id = user_id
+
+        from config import conf
+        if conf().get("knowledge", True):
+            self.description = (
+                "Search agent's long-term memory and knowledge base using semantic and keyword search. "
+                "Use this to recall past conversations, preferences, and knowledge pages."
+            )
    
    def execute(self, args: dict):
        """
--- a/agent/tools/read/read.py
+++ b/agent/tools/read/read.py
@@ -245,16 +245,11 @@ class Read(BaseTool):
                })
            
            # Read file (utf-8-sig strips BOM automatically on Windows)
+            # Note: Truncation is unified via truncate_head (DEFAULT_MAX_LINES / DEFAULT_MAX_BYTES)
+            # so that offset/limit can paginate the entire file correctly.
            with open(absolute_path, 'r', encoding='utf-8-sig') as f:
                content = f.read()
-            
-            # Truncate content if too long (20K characters max for model context)
-            MAX_CONTENT_CHARS = 20 * 1024  # 20K characters
-            content_truncated = False
-            if len(content) > MAX_CONTENT_CHARS:
-                content = content[:MAX_CONTENT_CHARS]
-                content_truncated = True
-            
+
            all_lines = content.split('\n')
            total_file_lines = len(all_lines)
            
@@ -290,11 +285,7 @@ class Read(BaseTool):
            
            output_text = ""
            details = {}
-            
-            # Add truncation warning if content was truncated
-            if content_truncated:
-                output_text = f"[文件内容已截断到前 {format_size(MAX_CONTENT_CHARS)}，完整文件大小: {format_size(file_size)}]\n\n"
-            
+
            if truncation.first_line_exceeds_limit:
                # First line exceeds 30KB limit
                first_line_size = format_size(len(all_lines[start_line].encode('utf-8')))
--- a/agent/tools/scheduler/integration.py
+++ b/agent/tools/scheduler/integration.py
@@ -3,6 +3,7 @@ Integration module for scheduler with AgentBridge
 """

 import os
+import threading
 from typing import Optional
 from config import conf
 from common.log import logger
@@ -13,65 +14,82 @@ from bridge.reply import Reply, ReplyType
 # Global scheduler service instance
 _scheduler_service = None
 _task_store = None
+# Module-level lock to guard idempotent initialization across threads
+_init_lock = threading.Lock()


 def init_scheduler(agent_bridge) -> bool:
    """
-    Initialize scheduler service
-    
+    Initialize scheduler service (idempotent).
+
+    Safe to call multiple times and from multiple threads: only the first
+    successful call creates the singleton ``SchedulerService`` + background
+    scanning thread. Subsequent calls return immediately.
+
    Args:
        agent_bridge: AgentBridge instance
-        
+
    Returns:
-        True if initialized successfully
+        True if scheduler is initialized (newly created or already running)
    """
    global _scheduler_service, _task_store
-    
-    try:
-        from agent.tools.scheduler.task_store import TaskStore
-        from agent.tools.scheduler.scheduler_service import SchedulerService
-        
-        # Get workspace from config
-        workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
-        store_path = os.path.join(workspace_root, "scheduler", "tasks.json")
-        
-        # Create task store
-        _task_store = TaskStore(store_path)
-        logger.debug(f"[Scheduler] Task store initialized: {store_path}")
-        
-        # Create execute callback
-        def execute_task_callback(task: dict):
-            """Callback to execute a scheduled task"""
-            try:
-                action = task.get("action", {})
-                action_type = action.get("type")
-                
-                if action_type == "agent_task":
-                    _execute_agent_task(task, agent_bridge)
-                elif action_type == "send_message":
-                    # Legacy support for old tasks
-                    _execute_send_message(task, agent_bridge)
-                elif action_type == "tool_call":
-                    # Legacy support for old tasks
-                    _execute_tool_call(task, agent_bridge)
-                elif action_type == "skill_call":
-                    # Legacy support for old tasks
-                    _execute_skill_call(task, agent_bridge)
-                else:
-                    logger.warning(f"[Scheduler] Unknown action type: {action_type}")
-            except Exception as e:
-                logger.error(f"[Scheduler] Error executing task {task.get('id')}: {e}")
-        
-        # Create scheduler service
-        _scheduler_service = SchedulerService(_task_store, execute_task_callback)
-        _scheduler_service.start()
-        
-        logger.debug("[Scheduler] Scheduler service initialized and started")
+
+    # Fast path: already initialized and running
+    if _scheduler_service is not None and getattr(_scheduler_service, "running", False):
        return True
-        
-    except Exception as e:
-        logger.error(f"[Scheduler] Failed to initialize scheduler: {e}")
-        return False
+
+    with _init_lock:
+        # Re-check under the lock to avoid races where multiple threads
+        # passed the fast-path check before any of them acquired the lock.
+        if _scheduler_service is not None and getattr(_scheduler_service, "running", False):
+            return True
+
+        try:
+            from agent.tools.scheduler.task_store import TaskStore
+            from agent.tools.scheduler.scheduler_service import SchedulerService
+
+            # Get workspace from config
+            workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
+            store_path = os.path.join(workspace_root, "scheduler", "tasks.json")
+
+            # Create task store (reuse if already created)
+            if _task_store is None:
+                _task_store = TaskStore(store_path)
+                logger.debug(f"[Scheduler] Task store initialized: {store_path}")
+
+            # Create execute callback
+            def execute_task_callback(task: dict):
+                """Callback to execute a scheduled task"""
+                try:
+                    action = task.get("action", {})
+                    action_type = action.get("type")
+
+                    if action_type == "agent_task":
+                        _execute_agent_task(task, agent_bridge)
+                    elif action_type == "send_message":
+                        # Legacy support for old tasks
+                        _execute_send_message(task, agent_bridge)
+                    elif action_type == "tool_call":
+                        # Legacy support for old tasks
+                        _execute_tool_call(task, agent_bridge)
+                    elif action_type == "skill_call":
+                        # Legacy support for old tasks
+                        _execute_skill_call(task, agent_bridge)
+                    else:
+                        logger.warning(f"[Scheduler] Unknown action type: {action_type}")
+                except Exception as e:
+                    logger.error(f"[Scheduler] Error executing task {task.get('id')}: {e}")
+
+            # Create scheduler service
+            _scheduler_service = SchedulerService(_task_store, execute_task_callback)
+            _scheduler_service.start()
+
+            logger.debug("[Scheduler] Scheduler service initialized and started")
+            return True
+
+        except Exception as e:
+            logger.error(f"[Scheduler] Failed to initialize scheduler: {e}")
+            return False


 def get_task_store():
@@ -84,6 +102,49 @@ def get_scheduler_service():
    return _scheduler_service


+def _remember_delivered_output(
+    agent_bridge,
+    task: dict,
+    channel_type: str,
+    content: str,
+) -> None:
+    """Best-effort persistence of the message the scheduler sent to a user.
+
+    Uses notify_session_id (the real chat session_id stored at task creation time)
+    so that group chats correctly associate the output with the user's conversation.
+    Falls back to receiver for backward compatibility with old tasks.
+
+    Per-action-type behaviour:
+        - agent_task / tool_call / skill_call: gated by ``scheduler_inject_to_session``
+          (default True). These produce AI-generated content worth remembering.
+        - send_message: additionally gated by ``scheduler_inject_send_message``
+          (default False). Fixed reminder text rarely benefits follow-up Q&A and
+          would just consume context tokens.
+    """
+    if not content:
+        return
+    action = task.get("action", {})
+    action_type = action.get("type", "")
+
+    # send_message defaults to NOT being injected; explicit opt-in via config.
+    if action_type == "send_message":
+        if not conf().get("scheduler_inject_send_message", False):
+            return
+
+    session_id = action.get("notify_session_id") or action.get("receiver")
+    if not session_id:
+        return
+    try:
+        remember = getattr(agent_bridge, "remember_scheduled_output", None)
+        if remember:
+            task_desc = action.get("task_description") or action.get("content", "")
+            remember(session_id, str(content), channel_type=channel_type, task_description=task_desc)
+    except Exception as e:
+        logger.warning(
+            f"[Scheduler] Failed to remember delivered output for {session_id}: {e}"
+        )
+
+
 def _execute_agent_task(task: dict, agent_bridge):
    """
    Execute an agent_task action - let Agent handle the task
@@ -165,6 +226,7 @@ def _execute_agent_task(task: dict, agent_bridge):
                        
                        # Send the reply
                        channel.send(reply, context)
+                        _remember_delivered_output(agent_bridge, task, channel_type, reply.content)
                        logger.info(f"[Scheduler] Task {task['id']} executed successfully, result sent to {receiver}")
                    else:
                        logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
@@ -255,6 +317,7 @@ def _execute_send_message(task: dict, agent_bridge):
                    logger.debug(f"[Scheduler] Registered request_id {request_id} -> session {receiver}")
                
                channel.send(reply, context)
+                _remember_delivered_output(agent_bridge, task, channel_type, content)
                logger.info(f"[Scheduler] Task {task['id']} executed: sent message to {receiver}")
            else:
                logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
@@ -351,6 +414,7 @@ def _execute_tool_call(task: dict, agent_bridge):
                    logger.debug(f"[Scheduler] Registered request_id {request_id} -> session {receiver}")

                channel.send(reply, context)
+                _remember_delivered_output(agent_bridge, task, channel_type, content)
                logger.info(f"[Scheduler] Task {task['id']} executed: sent tool result to {receiver}")
            else:
                logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
@@ -429,6 +493,24 @@ def _execute_skill_call(task: dict, agent_bridge):
                if result_prefix:
                    content = f"{result_prefix}\n\n{content}"
                
+                # Send the result via channel
+                from channel.channel_factory import create_channel
+                
+                try:
+                    channel = create_channel(channel_type)
+                    if channel:
+                        # For web channel, register request_id
+                        if channel_type == "web" and hasattr(channel, 'request_to_session'):
+                            req_id = context.get("request_id")
+                            if req_id:
+                                channel.request_to_session[req_id] = receiver
+                                logger.debug(f"[Scheduler] Registered request_id {req_id} -> session {receiver}")
+                        
+                        channel.send(Reply(ReplyType.TEXT, content), context)
+                        _remember_delivered_output(agent_bridge, task, channel_type, content)
+                except Exception as e:
+                    logger.error(f"[Scheduler] Failed to send skill result: {e}")
+                
                logger.info(f"[Scheduler] Task {task['id']} executed: skill result sent to {receiver}")
            else:
                logger.error(f"[Scheduler] Task {task['id']}: No result from skill execution")
--- a/agent/tools/scheduler/scheduler_service.py
+++ b/agent/tools/scheduler/scheduler_service.py
@@ -10,6 +10,19 @@ from croniter import croniter
 from common.log import logger


+def _parse_naive_local(iso_str: str) -> datetime:
+    """Parse an ISO datetime and coerce it to tz-naive local time.
+
+    The scheduler uses ``datetime.now()`` (tz-naive) for all comparisons,
+    so any persisted timestamp must be normalized to the same flavor —
+    otherwise comparing naive vs aware raises TypeError.
+    """
+    dt = datetime.fromisoformat(iso_str)
+    if dt.tzinfo is not None:
+        dt = dt.astimezone().replace(tzinfo=None)
+    return dt
+
+
 class SchedulerService:
    """
    Background service that executes scheduled tasks
@@ -113,8 +126,8 @@ class SchedulerService:
            return False
        
        try:
-            next_run = datetime.fromisoformat(next_run_str)
-            
+            next_run = _parse_naive_local(next_run_str)
+
            # Check if task is overdue (e.g., service restart)
            if next_run < now:
                time_diff = (now - next_run).total_seconds()
@@ -140,7 +153,11 @@ class SchedulerService:
                    return False
            
            return now >= next_run
-        except Exception:
+        except Exception as e:
+            logger.error(
+                f"[Scheduler] Failed to evaluate due-state for task "
+                f"{task.get('id')} (next_run_at={next_run_str!r}): {e}"
+            )
            return False
    
    def _calculate_next_run(self, task: dict, from_time: datetime) -> Optional[datetime]:
@@ -184,12 +201,14 @@ class SchedulerService:
                return None
            
            try:
-                run_at = datetime.fromisoformat(run_at_str)
-                # Only return if in the future
+                run_at = _parse_naive_local(run_at_str)
                if run_at > from_time:
                    return run_at
-            except Exception:
-                pass
+            except Exception as e:
+                logger.error(
+                    f"[Scheduler] Failed to parse once-task run_at "
+                    f"{run_at_str!r}: {e}"
+                )
            return None
        
        return None
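Reviewer note: the `_parse_naive_local` helper introduced above exists because Python refuses to order a tz-naive datetime against a tz-aware one. The sketch below copies the helper's logic into a standalone function (named `parse_naive_local` here only for illustration) and shows both the failure it avoids and the normalized result.

```python
from datetime import datetime, timezone


def parse_naive_local(iso_str: str) -> datetime:
    """Parse an ISO datetime and coerce it to tz-naive local time."""
    dt = datetime.fromisoformat(iso_str)
    if dt.tzinfo is not None:
        # Convert to the machine's local zone, then drop the tzinfo.
        dt = dt.astimezone().replace(tzinfo=None)
    return dt


# Ordering naive against aware raises TypeError.
try:
    datetime.now() < datetime.fromisoformat("2026-05-22T10:00:00+08:00")
    mixed_comparison_ok = True
except TypeError:
    mixed_comparison_ok = False

naive = parse_naive_local("2026-05-22T10:00:00")        # already naive, unchanged
aware = parse_naive_local("2026-05-22T10:00:00+08:00")  # converted to local, tz dropped

# Both results are naive, so they compare cleanly with datetime.now().
is_due = datetime.now() >= aware
```

The converted value depends on the host's local time zone, so the sketch does not state a fixed expected wall-clock value for `aware`.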
--- a/agent/tools/scheduler/scheduler_tool.py
+++ b/agent/tools/scheduler/scheduler_tool.py
@@ -158,6 +158,11 @@ class SchedulerTool(BaseTool):
        # Create task
        task_id = str(uuid.uuid4())[:8]
        
+        # Capture the real chat session_id at task creation time so that scheduler
+        # can later inject the delivered output into the user's actual conversation
+        # (in group chats, session_id != receiver, e.g. "user_id:group_id" on feishu).
+        notify_session_id = context.get("session_id")
+
        # Build action based on message or ai_task
        if message:
            action = {
@@ -166,7 +171,8 @@ class SchedulerTool(BaseTool):
                "receiver": context.get("receiver"),
                "receiver_name": self._get_receiver_name(context),
                "is_group": context.get("isgroup", False),
-                "channel_type": self.config.get("channel_type", "unknown")
+                "channel_type": self.config.get("channel_type", "unknown"),
+                "notify_session_id": notify_session_id,
            }
        else:  # ai_task
            action = {
@@ -175,7 +181,8 @@ class SchedulerTool(BaseTool):
                "receiver": context.get("receiver"),
                "receiver_name": self._get_receiver_name(context),
                "is_group": context.get("isgroup", False),
-                "channel_type": self.config.get("channel_type", "unknown")
+                "channel_type": self.config.get("channel_type", "unknown"),
+                "notify_session_id": notify_session_id,
            }
        
        # For DingTalk one-on-one chats, additionally store sender_staff_id
@@ -357,9 +364,12 @@ class SchedulerTool(BaseTool):
                        logger.error(f"[SchedulerTool] Invalid relative time format: {schedule_value}")
                        return None
                else:
-                    # Absolute time in ISO format
-                    datetime.fromisoformat(schedule_value)
-                    return {"type": "once", "run_at": schedule_value}
+                    # Absolute ISO time. Normalize to tz-naive local so it
+                    # stays comparable with the scheduler's datetime.now().
+                    parsed = datetime.fromisoformat(schedule_value)
+                    if parsed.tzinfo is not None:
+                        parsed = parsed.astimezone().replace(tzinfo=None)
+                    return {"type": "once", "run_at": parsed.isoformat()}
            
        except Exception as e:
            logger.error(f"[SchedulerTool] Invalid schedule: {e}")
--- a/agent/tools/tool_manager.py
+++ b/agent/tools/tool_manager.py
@@ -1,5 +1,6 @@
 import importlib
 import importlib.util
+import threading
 from pathlib import Path
 from typing import Dict, Any, Type
 from agent.tools.base_tool import BaseTool
@@ -7,6 +8,26 @@ from common.log import logger
 from config import conf


+def _normalize_mcp_configs(raw) -> list:
+    """
+    Convert MCP server config to internal list format.
+    Supports:
+      - list format (mcp_servers):  [{"name": "x", "type": "stdio", ...}]
+      - dict format (mcpServers):   {"x": {"command": "npx", ...}}
+    """
+    if isinstance(raw, list):
+        return raw
+    if isinstance(raw, dict):
+        result = []
+        for name, cfg in raw.items():
+            entry = {"name": name, **cfg}
+            if "type" not in entry:
+                entry["type"] = "sse" if "url" in entry else "stdio"
+            result.append(entry)
+        return result
+    return []
+
+
 class ToolManager:
    """
    Tool manager for managing tools.
@@ -25,6 +46,31 @@ class ToolManager:
        # Initialize only once
        if not hasattr(self, 'tool_classes'):
            self.tool_classes = {}  # Dictionary to store tool classes
+        if not hasattr(self, '_mcp_registry'):
+            self._mcp_registry = None  # Lazy init: only created when MCP servers are configured
+        if not hasattr(self, '_mcp_tool_instances'):
+            self._mcp_tool_instances: dict = {}  # tool_name -> McpTool instance
+        if not hasattr(self, '_mcp_lock'):
+            # Guards _mcp_loaded check-then-set so concurrent callers
+            # don't trigger duplicate background loaders.
+            self._mcp_lock = threading.Lock()
+        if not hasattr(self, '_mcp_loaded'):
+            # Idempotency flag. Flipped to True the moment the first loader
+            # is dispatched (synchronously, inside _mcp_lock). Subsequent
+            # _load_mcp_tools() calls become no-ops, so per-session agent
+            # initialization never re-forks MCP subprocesses.
+            self._mcp_loaded = False
+        if not hasattr(self, '_mcp_status'):
+            # server_name -> "pending" / "ready" / "failed"
+            # Useful for UI / introspection while async loading is in progress.
+            self._mcp_status: dict = {}
+        if not hasattr(self, '_mcp_signature'):
+            # (mtime, sha256) of mcp.json the last time we loaded.
+            # Used by refresh_mcp_if_changed() to skip re-parsing when nothing changed.
+            self._mcp_signature: tuple = (None, None)
+        if not hasattr(self, '_mcp_active_configs'):
+            # server_name -> normalized config dict, for diff-based reload.
+            self._mcp_active_configs: dict = {}

    def load_tools(self, tools_dir: str = "", config_dict=None):
        """
@@ -39,6 +85,8 @@ class ToolManager:
            self._load_tools_from_init()
            self._configure_tools_from_config(config_dict)

+        self._load_mcp_tools()
+
    def _load_tools_from_init(self) -> bool:
        """
        Load tool classes from tools.__init__.__all__
@@ -70,10 +118,14 @@ class ToolManager:
                                    and cls != BaseTool
                            ):
                                try:
-                                    # Skip memory tools (they need special initialization with memory_manager)
+                                    # Skip tools that need special initialization
                                    if class_name in ["MemorySearchTool", "MemoryGetTool"]:
                                        logger.debug(f"Skipped tool {class_name} (requires memory_manager)")
                                        continue
+                                    # McpTool instances are registered dynamically via _load_mcp_tools()
+                                    if class_name == "McpTool":
+                                        logger.debug(f"Skipped tool {class_name} (registered dynamically via mcp_servers config)")
+                                        continue
                                    
                                    # Create a temporary instance to get the name
                                    temp_instance = cls()
@@ -212,6 +264,306 @@ class ToolManager:
        except Exception as e:
            logger.error(f"Error configuring tools from config: {e}")

+    def _mcp_json_path(self) -> str:
+        import os
+        workspace = os.path.expanduser(conf().get("agent_workspace", "~/cow"))
+        return os.path.join(workspace, "mcp.json")
+
+    def _read_mcp_json_signature(self):
+        """
+        Return (mtime, sha256_of_bytes) for ~/cow/mcp.json without parsing.
+        Returns (None, None) if the file doesn't exist or is unreadable.
+        Cheap enough (one stat + one small read) to call on every agent init.
+        """
+        import os
+        import hashlib
+        path = self._mcp_json_path()
+        try:
+            mtime = os.path.getmtime(path)
+        except OSError:
+            return (None, None)
+        try:
+            with open(path, "rb") as f:
+                digest = hashlib.sha256(f.read()).hexdigest()
+        except OSError:
+            return (mtime, None)
+        return (mtime, digest)
+
+    def _load_mcp_configs(self) -> list:
+        """
+        Load MCP server configs with priority:
+          1. ~/cow/mcp.json  (supports both mcpServers and mcp_servers keys)
+          2. config.json mcp_servers field (fallback)
+        """
+        import os
+        import json as _json
+
+        mcp_json_path = self._mcp_json_path()
+
+        if os.path.exists(mcp_json_path):
+            try:
+                with open(mcp_json_path, "r", encoding="utf-8") as f:
+                    data = _json.load(f)
+                raw = next((data[k] for k in ("mcpServers", "mcp_servers") if k in data), data)
+                logger.info(f"[ToolManager] Loading MCP config from {mcp_json_path}")
+                return _normalize_mcp_configs(raw)
+            except Exception as e:
+                logger.warning(f"[ToolManager] Failed to read {mcp_json_path}: {e}, falling back to config.json")
+
+        raw = conf().get("mcp_servers", [])
+        return _normalize_mcp_configs(raw)
+
+    def _load_mcp_tools(self):
+        """
+        Trigger MCP tool loading in a background thread (idempotent).
+
+        Returns immediately. Booting MCP servers (npx, uvx, etc.) takes
+        seconds to tens of seconds on first run, which would otherwise
+        block agent initialization and the user's first message.
+        Built-in tools work fine without MCP, so we let the agent serve
+        traffic right away and let MCP servers come online in the
+        background. Per-session agents read a snapshot of whatever is
+        ready at construction time and gracefully ignore the rest.
+        """
+        with self._mcp_lock:
+            if self._mcp_loaded:
+                return
+            mcp_servers_config = self._load_mcp_configs()
+            # Snapshot the signature now so future refresh_mcp_if_changed()
+            # calls can short-circuit when nothing has changed on disk.
+            self._mcp_signature = self._read_mcp_json_signature()
+            self._mcp_active_configs = {
+                cfg.get("name", "<unnamed>"): cfg for cfg in mcp_servers_config
+            }
+            if not mcp_servers_config:
+                # Mark as loaded even when there is nothing to load,
+                # so we don't re-read the config file on every call.
+                self._mcp_loaded = True
+                return
+
+            # Mark pending immediately so list_mcp_status() callers see
+            # the in-progress state instead of an empty dict.
+            for cfg in mcp_servers_config:
+                name = cfg.get("name", "<unnamed>")
+                self._mcp_status[name] = "pending"
+
+            self._mcp_loaded = True
+            threading.Thread(
+                target=self._load_mcp_tools_async,
+                args=(mcp_servers_config,),
+                daemon=True,
+                name="mcp-loader",
+            ).start()
+            logger.info(
+                f"[ToolManager] MCP loading started in background "
+                f"({len(mcp_servers_config)} server(s) configured)"
+            )
+
+    def refresh_mcp_if_changed(self):
+        """
+        Cheap check whether ~/cow/mcp.json has changed since last load.
+        If it has, do a diff-based reload: start newly added servers,
+        shut down removed ones, and restart any whose config was edited.
+        Untouched servers are left running.
+
+        Designed to be called on every agent creation. The fast path is
+        one stat plus one small read and hash, cheap when nothing changed.
+        """
+        with self._mcp_lock:
+            new_sig = self._read_mcp_json_signature()
+            if new_sig == self._mcp_signature:
+                return  # no-op fast path
+
+            try:
+                new_configs = self._load_mcp_configs()
+            except Exception as e:
+                logger.warning(f"[ToolManager] MCP reload — failed to parse config: {e}")
+                return
+
+            new_by_name = {
+                cfg.get("name", "<unnamed>"): cfg for cfg in new_configs
+            }
+            old_by_name = self._mcp_active_configs
+
+            added = [n for n in new_by_name if n not in old_by_name]
+            removed = [n for n in old_by_name if n not in new_by_name]
+            changed = [
+                n for n in new_by_name
+                if n in old_by_name and new_by_name[n] != old_by_name[n]
+            ]
+
+            if not (added or removed or changed):
+                # Signature drifted but content is logically identical
+                # (e.g. user re-saved the file without edits). Just sync.
+                self._mcp_signature = new_sig
+                return
+
+            logger.info(
+                f"[ToolManager] mcp.json changed — "
+                f"adding={added}, removing={removed}, restarting={changed}"
+            )
+
+            # Tear down removed + changed servers (changed ones get restarted below)
+            for name in removed + changed:
+                self._teardown_mcp_server(name)
+
+            # Spin up newly added + changed servers in the background
+            to_start = [new_by_name[n] for n in added + changed]
+            if to_start:
+                for cfg in to_start:
+                    self._mcp_status[cfg.get("name", "<unnamed>")] = "pending"
+                threading.Thread(
+                    target=self._load_mcp_tools_async,
+                    args=(to_start,),
+                    daemon=True,
+                    name="mcp-loader-reload",
+                ).start()
+
+            self._mcp_active_configs = new_by_name
+            self._mcp_signature = new_sig
+
+    def _teardown_mcp_server(self, server_name: str):
+        """Shut down one MCP server and drop its tools from the registry."""
+        if self._mcp_registry is None:
+            return
+        client = None
+        with self._mcp_registry._registry_lock:
+            client = self._mcp_registry._clients.pop(server_name, None)
+        if client is not None:
+            try:
+                client.shutdown()
+            except Exception as e:
+                logger.warning(f"[MCP] Error shutting down '{server_name}': {e}")
+        # Drop tools that belonged to this server.
+        for tool_name in list(self._mcp_tool_instances.keys()):
+            tool = self._mcp_tool_instances.get(tool_name)
+            if tool is not None and getattr(tool, "server_name", None) == server_name:
+                self._mcp_tool_instances.pop(tool_name, None)
+        self._mcp_status.pop(server_name, None)
+
+    def _load_mcp_tools_async(self, mcp_servers_config):
+        """
+        Background worker: bring up each MCP server one-by-one and
+        publish ready tools to _mcp_tool_instances as they come online.
+
+        Server failures are isolated — one bad server cannot block
+        the others, and never raises out of the worker thread.
+        """
+        try:
+            from agent.tools.mcp.mcp_client import McpClient, McpClientRegistry
+            from agent.tools.mcp.mcp_tool import McpTool
+
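+            registry = self._mcp_registry if self._mcp_registry is not None else McpClientRegistry()
+            self._mcp_registry = registry  # reuse on reload so earlier clients stay tracked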
+
+            for cfg in mcp_servers_config:
+                server_name = cfg.get("name", "<unnamed>")
+                try:
+                    client = McpClient(cfg)
+                    if not client.initialize():
+                        self._mcp_status[server_name] = "failed"
+                        logger.warning(
+                            f"[MCP] Server '{server_name}' failed to initialize — skipping"
+                        )
+                        continue
+
+                    tool_schemas = client.list_tools()
+                    added = []
+                    for schema in tool_schemas:
+                        tool_name = schema.get("name", "")
+                        if not tool_name:
+                            continue
+                        mcp_tool = McpTool(client, schema, server_name)
+                        # Atomic dict assignment is GIL-safe; readers iterate
+                        # over a list() snapshot to avoid concurrent mutation.
+                        self._mcp_tool_instances[tool_name] = mcp_tool
+                        added.append(tool_name)
+
+                    # Register client into the shared registry only after its
+                    # tools are visible, so callers never see a half-loaded server.
+                    with registry._registry_lock:
+                        registry._clients[server_name] = client
+                    self._mcp_status[server_name] = "ready"
+                    logger.info(
+                        f"[MCP] Server '{server_name}' ready — "
+                        f"{len(added)} tool(s): {added}"
+                    )
+                except Exception as e:
+                    self._mcp_status[server_name] = "failed"
+                    logger.warning(f"[MCP] Server '{server_name}' load failed: {e}")
+
+            ready = sum(1 for s in self._mcp_status.values() if s == "ready")
+            total = len(self._mcp_status)
+            logger.info(
+                f"[ToolManager] MCP loading complete: "
+                f"{ready}/{total} server(s) ready, "
+                f"{len(self._mcp_tool_instances)} tool(s) available"
+            )
+        except Exception as e:
+            logger.warning(f"[ToolManager] MCP background loader crashed: {e}")
+
+    def list_mcp_status(self) -> dict:
+        """Return {server_name: status} snapshot for UI / debugging."""
+        return dict(self._mcp_status)
+
+    def sync_mcp_into_agent(self, agent) -> tuple:
+        """
+        Reconcile a live agent's tool collection with the current MCP tool registry.
+
+        Adds tools that finished loading after the agent was created,
+        and removes tools whose MCP server was torn down. Built-in tools
+        on the agent are left untouched.
+
+        Handles both representations CowAgent uses:
+          - Agent.tools: list[BaseTool]               (default Agent class)
+          - AgentStream.tools: dict[str, BaseTool]    (streaming agent)
+
+        Returns (added_names, removed_names) for logging.
+        """
+        if agent is None or not hasattr(agent, "tools"):
+            return ([], [])
+
+        from agent.tools.mcp.mcp_tool import McpTool
+        current = dict(self._mcp_tool_instances)  # snapshot; the loader thread may mutate it
+        registry_names = set(current.keys())
+
+        agent_tools = agent.tools
+
+        if isinstance(agent_tools, dict):
+            agent_mcp_names = {
+                name for name, tool in agent_tools.items()
+                if isinstance(tool, McpTool)
+            }
+            added = registry_names - agent_mcp_names
+            removed = agent_mcp_names - registry_names
+            if not (added or removed):
+                return ([], [])
+            for name in added:
+                agent_tools[name] = current[name]
+            for name in removed:
+                agent_tools.pop(name, None)
+
+        elif isinstance(agent_tools, list):
+            agent_mcp_names = {
+                t.name for t in agent_tools if isinstance(t, McpTool)
+            }
+            added = registry_names - agent_mcp_names
+            removed = agent_mcp_names - registry_names
+            if not (added or removed):
+                return ([], [])
+            if removed:
+                agent.tools = [
+                    t for t in agent_tools
+                    if not (isinstance(t, McpTool) and t.name in removed)
+                ]
+            for name in added:
+                agent.tools.append(current[name])
+
+        else:
+            return ([], [])
+
+        return (sorted(added), sorted(removed))
+
    def create_tool(self, name: str) -> BaseTool:
        """
        Get a new instance of a tool by name.
@@ -229,6 +581,12 @@ class ToolManager:
                tool_instance.config = self.tool_configs[name]

            return tool_instance
+
+        # Fall back to MCP tool instances
+        mcp_tool = self._mcp_tool_instances.get(name)
+        if mcp_tool:
+            return mcp_tool
+
        return None

    def list_tools(self) -> dict:
@@ -245,4 +603,17 @@ class ToolManager:
                "description": temp_instance.description,
                "parameters": temp_instance.get_json_schema()
            }
+
+        # Include MCP tool instances (list() snapshot: the loader thread may add entries)
+        for name, mcp_tool in list(self._mcp_tool_instances.items()):
+            result[name] = {
+                "description": mcp_tool.description,
+                "parameters": mcp_tool.params,
+            }
+
        return result
+
+    def shutdown_mcp(self):
+        """Shut down all MCP server clients."""
+        if self._mcp_registry:
+            self._mcp_registry.shutdown_all()
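Reviewer note: `_normalize_mcp_configs` in this file accepts both the list-style `mcp_servers` shape and the dict-style `mcpServers` shape. The sketch below reproduces its logic in a standalone function to show what each shape turns into. The server names, the package name and the URL are made-up sample values, not real configuration.

```python
def normalize_mcp_configs(raw) -> list:
    """Standalone copy of the normalization logic, for illustration."""
    if isinstance(raw, list):
        return raw  # already in the internal list format
    if isinstance(raw, dict):
        result = []
        for name, cfg in raw.items():
            entry = {"name": name, **cfg}
            if "type" not in entry:
                # Infer the transport: a URL implies SSE, otherwise stdio.
                entry["type"] = "sse" if "url" in entry else "stdio"
            result.append(entry)
        return result
    return []


# Dict style: the key becomes the server name, the type is inferred.
dict_style = {
    "fs": {"command": "npx", "args": ["-y", "example-mcp-server"]},
    "search": {"url": "http://localhost:8000/sse"},
}
normalized = normalize_mcp_configs(dict_style)

# List style passes through untouched; anything else yields an empty list.
list_style = [{"name": "fs", "type": "stdio", "command": "npx"}]
passthrough = normalize_mcp_configs(list_style)
empty = normalize_mcp_configs(None)
```

An explicit `"type"` in a dict-style entry is kept as written, since inference only runs when the key is absent.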
--- a/agent/tools/utils/truncate.py
+++ b/agent/tools/utils/truncate.py
@@ -8,7 +8,10 @@ Truncation is based on two independent limits - whichever is hit first wins:
 Never returns partial lines (except bash tail truncation edge case).
 """

-from typing import Dict, Any, Optional, Literal, Tuple
+from __future__ import annotations
+from typing import Dict, Any, Optional, Tuple, TYPE_CHECKING
+if TYPE_CHECKING:
+    from typing import Literal


 DEFAULT_MAX_LINES = 2000
--- a/agent/tools/vision/vision.py
+++ b/agent/tools/vision/vision.py
@@ -2,12 +2,18 @@
 Vision tool - Analyze images using Vision API.
 Supports local files (auto base64-encoded) and HTTP URLs.

-Provider priority (default):
-  1. Main model via bot.call_vision — zero extra cost
-  2. Other models whose API key is configured — auto-discovered
-  3. OpenAI / LinkAI raw HTTP — reliable fallback
-  When use_linkai=true, LinkAI is promoted to #1.
-  When tool.vision.model is set, that model is used exclusively first.
+Provider resolution:
+  - tools.vision.model (if set) means "prefer this model first; fall back to
+    other configured providers if it fails". The model name is mapped to its
+    native provider (e.g. doubao-* → Doubao, kimi-* → Moonshot, gpt-* →
+    OpenAI/LinkAI). That provider is tried first, then the standard auto
+    chain runs as fallback (with the preferred provider de-duplicated).
+  - Auto chain priority:
+      1. Main model via bot.call_vision — only when the main bot is known
+         to actually support vision (not just expose a call_vision method).
+      2. Other models whose API key is configured.
+      3. OpenAI / LinkAI raw HTTP.
+    When use_linkai=true, LinkAI is promoted to #1.
 """

 import base64
@@ -24,7 +30,7 @@ from common import const
 from common.log import logger
 from config import conf

-DEFAULT_MODEL = const.GPT_41_MINI
+DEFAULT_MODEL = const.GPT_55
 DEFAULT_TIMEOUT = 60
 MAX_TOKENS = 1000
 COMPRESS_THRESHOLD = 1_048_576  # 1 MB
@@ -43,15 +49,35 @@ _MAIN_MODEL_PROVIDER_NAME = "MainModel"
 # Auto-discovered as fallback vision providers when their API key is configured.
 # OpenAI and LinkAI are handled separately (raw HTTP providers), so not listed here.
 _DISCOVERABLE_MODELS = [
-    ("moonshot_api_key", const.MOONSHOT, const.KIMI_K2_5, "Moonshot"),
+    ("moonshot_api_key", const.MOONSHOT, const.KIMI_K2_6, "Moonshot"),
    ("ark_api_key", const.DOUBAO, const.DOUBAO_SEED_2_PRO, "Doubao"),
    ("dashscope_api_key", const.QWEN_DASHSCOPE, const.QWEN36_PLUS, "DashScope"),
    ("claude_api_key", const.CLAUDEAPI, const.CLAUDE_4_6_SONNET, "Claude"),
-    ("gemini_api_key", const.GEMINI, const.GEMINI_31_FLASH_LITE_PRE, "Gemini"),
+    ("gemini_api_key", const.GEMINI, const.GEMINI_35_FLASH, "Gemini"),
+    ("qianfan_api_key", const.QIANFAN, const.ERNIE_45_TURBO_VL, "Qianfan"),
    ("zhipu_ai_api_key", const.ZHIPU_AI, const.GLM_4_7, "ZhipuAI"),
    ("minimax_api_key", const.MiniMax, const.MINIMAX_M2_7, "MiniMax"),
 ]

+# Model name prefix → discoverable provider display_name.
+# Used to auto-route tools.vision.model to its native provider.
+# Matched case-insensitively; longest prefix wins.
+_MODEL_PREFIX_TO_PROVIDER = [
+    ("doubao-", "Doubao"),
+    ("kimi-", "Moonshot"),
+    ("moonshot-", "Moonshot"),
+    ("qwen", "DashScope"),       # qwen-*, qwen3-*, qwen3.6-*, etc.
+    ("claude-", "Claude"),
+    ("ernie-", "Qianfan"),
+    ("gemini-", "Gemini"),
+    ("glm-", "ZhipuAI"),
+    ("minimax-", "MiniMax"),
+    ("abab", "MiniMax"),
+]
+
+# Model prefixes that natively belong to OpenAI / LinkAI (raw HTTP providers).
+_OPENAI_MODEL_PREFIXES = ("gpt-", "o1-", "o3-", "o4-", "chatgpt-")
+

@dataclass
 class VisionProvider:
@@ -116,7 +142,7 @@ class Vision(BaseTool):
                "Error: No model available for Vision.\n"
                "The main model does not support vision and no other API keys are configured.\n"
                "Options:\n"
-                "  1. Switch to a multimodal model (e.g. qwen3.6-plus, claude-sonnet-4-6, gemini-2.0-flash)\n"
+                "  1. Switch to a multimodal model (e.g. ernie-4.5-turbo-vl, qwen3.6-plus, claude-sonnet-4-6, gemini-2.0-flash)\n"
                "  2. Configure OPENAI_API_KEY: env_config(action=\"set\", key=\"OPENAI_API_KEY\", value=\"your-key\")\n"
                "  3. Configure LINKAI_API_KEY: env_config(action=\"set\", key=\"LINKAI_API_KEY\", value=\"your-key\")"
            )
@@ -126,6 +152,9 @@ class Vision(BaseTool):
        except Exception as e:
            return ToolResult.fail(f"Error: {e}")

+        # Default model is only used as a last-resort placeholder for providers
+        # whose VisionProvider.model_override is None (e.g. raw OpenAI provider
+        # when the user did not configure tools.vision.model).
        return self._call_with_fallback(providers, DEFAULT_MODEL, question, image_content)

    def _call_with_fallback(self, providers: List[VisionProvider], model: str,
@@ -162,29 +191,55 @@ class Vision(BaseTool):

    def _resolve_providers(self) -> List[VisionProvider]:
        """
-        Build an ordered list of available providers.
+        Build an ordered list of providers to try.

-        Priority:
-          - use_linkai=true  → [LinkAI, MainModel, OtherModels…, OpenAI]
-          - default          → [MainModel, OtherModels…, OpenAI, LinkAI]
+        Semantics of `tools.vision.model`:
+          "Prefer this model first; fall back to other configured providers
+           if it fails."

-        "OtherModels" are auto-discovered from configured API keys.
-        The main model's bot_type is excluded from OtherModels to avoid
-        duplicating the MainModel provider.
+        Order:
+          1. The provider that natively serves `tools.vision.model` (if any
+             and its API key is configured) — using the user-specified model
+             name verbatim.
+          2. Auto-discovery chain as fallback:
+               - use_linkai=true → [LinkAI, MainModel?, OtherModels…, OpenAI]
+               - default         → [MainModel?, OtherModels…, OpenAI, LinkAI]
+             MainModel is only included when the main bot is known to support
+             vision (see _main_bot_supports_vision).
+
+        Providers that share the same display name as the preferred provider
+        are de-duplicated to avoid retrying the same endpoint twice.
        """
-        use_linkai = conf().get("use_linkai", False) and conf().get("linkai_api_key")
+        user_model = self._resolve_user_vision_model()
        providers: List[VisionProvider] = []

+        # Step 1: preferred provider derived from tools.vision.model
+        if user_model:
+            preferred = self._route_by_model_name(user_model)
+            if preferred:
+                providers.extend(preferred)
+
+        # Step 2: auto-discovery chain as fallback
+        existing = {p.name for p in providers}
+        fallback: List[VisionProvider] = []
+        use_linkai = conf().get("use_linkai", False) and conf().get("linkai_api_key")
+
        if use_linkai:
-            self._append_provider(providers, self._build_linkai_provider)
-            self._append_provider(providers, self._build_main_model_provider)
-            self._append_other_model_providers(providers)
-            self._append_provider(providers, self._build_openai_provider)
+            self._append_provider(fallback, lambda: self._build_linkai_provider(user_model))
+            self._append_provider(fallback, self._build_main_model_provider)
+            self._append_other_model_providers(fallback, preferred_model=user_model)
+            self._append_provider(fallback, lambda: self._build_openai_provider(user_model))
        else:
-            self._append_provider(providers, self._build_main_model_provider)
-            self._append_other_model_providers(providers)
-            self._append_provider(providers, self._build_openai_provider)
-            self._append_provider(providers, self._build_linkai_provider)
+            self._append_provider(fallback, self._build_main_model_provider)
+            self._append_other_model_providers(fallback, preferred_model=user_model)
+            self._append_provider(fallback, lambda: self._build_openai_provider(user_model))
+            self._append_provider(fallback, lambda: self._build_linkai_provider(user_model))
+
+        for p in fallback:
+            if p.name in existing:
+                continue
+            providers.append(p)
+            existing.add(p.name)

        return providers

@@ -194,29 +249,135 @@ class Vision(BaseTool):
        if p:
            providers.append(p)

-    def _append_other_model_providers(self, providers: List[VisionProvider]) -> None:
+    @staticmethod
+    def _resolve_user_vision_model() -> Optional[str]:
+        """Read tools.vision.model (singular ``tool`` kept as runtime fallback)."""
+        tools_conf = conf().get("tools") or conf().get("tool") or {}
+        if not isinstance(tools_conf, dict):
+            return None
+        vision_conf = tools_conf.get("vision", {})
+        if not isinstance(vision_conf, dict):
+            return None
+        m = vision_conf.get("model")
+        if isinstance(m, str) and m.strip():
+            return m.strip()
+        return None
+
+    @staticmethod
+    def _infer_provider_from_model(model_name: str) -> Optional[str]:
+        """
+        Infer the provider display name from a model name's prefix.
+        Returns None when no rule matches (or for OpenAI-family names, which
+        are handled separately by the caller).
+        """
+        if not model_name:
+            return None
+        lower = model_name.lower()
+        # Sort by prefix length desc so e.g. "moonshot-" wins over hypothetical "moo-"
+        for prefix, display_name in sorted(_MODEL_PREFIX_TO_PROVIDER, key=lambda x: -len(x[0])):
+            if lower.startswith(prefix.lower()):
+                return display_name
+        return None
+
+    def _route_by_model_name(self, user_model: str) -> Optional[List[VisionProvider]]:
+        """
+        Try to build a provider list using the user-specified model name.
+        Returns:
+          - [provider, ...] : matched and the provider's key is configured
+          - None            : no rule matches, the matched provider's key is
+                              missing, or its bot cannot be built; the caller
+                              falls through to the auto chain
+        """
+        lower = user_model.lower()
+
+        # OpenAI / LinkAI family
+        if lower.startswith(_OPENAI_MODEL_PREFIXES):
+            providers: List[VisionProvider] = []
+            # Prefer LinkAI when explicitly enabled, else OpenAI first
+            use_linkai = conf().get("use_linkai", False) and conf().get("linkai_api_key")
+            if use_linkai:
+                self._append_provider(providers, lambda: self._build_linkai_provider(user_model))
+                self._append_provider(providers, lambda: self._build_openai_provider(user_model))
+            else:
+                self._append_provider(providers, lambda: self._build_openai_provider(user_model))
+                self._append_provider(providers, lambda: self._build_linkai_provider(user_model))
+            if providers:
+                return providers
+            logger.warning(f"[Vision] tools.vision.model='{user_model}' looks like an OpenAI "
+                           f"model but neither OPENAI_API_KEY nor LINKAI_API_KEY is configured.")
+            return None  # fall through to auto
+
+        # Discoverable native providers (Doubao, Moonshot, etc.)
+        target_display = self._infer_provider_from_model(user_model)
+        if not target_display:
+            return None  # unknown prefix → auto
+
+        for config_key, bot_type, _default_model, display_name in _DISCOVERABLE_MODELS:
+            if display_name != target_display:
+                continue
+            api_key = conf().get(config_key, "")
+            if not api_key or not api_key.strip():
+                logger.warning(f"[Vision] tools.vision.model='{user_model}' routes to "
+                               f"'{display_name}' but '{config_key}' is not configured. "
+                               f"Falling back to auto-discovery.")
+                return None  # fall through to auto
+            try:
+                from models.bot_factory import create_bot
+                bot = create_bot(bot_type)
+                if not hasattr(bot, 'call_vision'):
+                    logger.warning(f"[Vision] '{display_name}' bot does not implement call_vision.")
+                    return None
+            except Exception as e:
+                logger.warning(f"[Vision] Failed to create '{display_name}' bot: {e}")
+                return None
+
+            return [VisionProvider(
+                name=display_name,
+                api_key="",
+                api_base="",
+                model_override=user_model,
+                use_bot=True,
+                fallback_bot=bot,
+            )]
+
+        return None
+
+    def _append_other_model_providers(self, providers: List[VisionProvider],
+                                       preferred_model: Optional[str] = None) -> None:
        """
        Auto-discover other models whose API key is configured.
-        Skip the main model's own bot_type (already covered by MainModel provider).
-        Skip bot_types that already have a provider in the list (e.g. OpenAI).
+        Skip the main model's own bot_type (already covered by MainModel
+        provider), unless the main model itself does not support vision —
+        in that case we still want the vendor's dedicated vision model
+        as a fallback. Also skip bot_types that already appear in the
+        provider list.
+
+        If preferred_model matches a provider's family, use it instead
+        of that provider's hard-coded default model.
        """
-        # Determine main model's bot_type so we can skip it
        main_bot_type = None
+        main_bot_supports_vision = False
        if self.model and hasattr(self.model, '_resolve_bot_type'):
            main_bot_type = self.model._resolve_bot_type(conf().get("model", ""))
+            main_bot = getattr(self.model, "bot", None)
+            main_bot_supports_vision = self._main_bot_supports_vision(main_bot)

        existing_names = {p.name for p in providers}
+        preferred_provider = self._infer_provider_from_model(preferred_model) if preferred_model else None

        for config_key, bot_type, default_model, display_name in _DISCOVERABLE_MODELS:
            if display_name in existing_names:
                continue
-            if bot_type == main_bot_type:
+            # Same bot_type as the main model is normally handled by the
+            # MainModel provider; only skip it here if the main model
+            # actually supports vision. Otherwise fall through and add
+            # the vendor's dedicated vision model as a fallback.
+            if bot_type == main_bot_type and main_bot_supports_vision:
                continue
            api_key = conf().get(config_key, "")
            if not api_key or not api_key.strip():
                continue

-            # Create a bot instance and check if it supports call_vision
            try:
                from models.bot_factory import create_bot
                bot = create_bot(bot_type)
@@ -225,62 +386,105 @@ class Vision(BaseTool):
            except Exception:
                continue

-            providers.append(VisionProvider(
+            model_for_provider = (preferred_model
+                                  if preferred_provider == display_name and preferred_model
+                                  else default_model)
+
+            provider = VisionProvider(
                name=display_name,
                api_key="",
                api_base="",
-                model_override=default_model,
+                model_override=model_for_provider,
                use_bot=True,
                fallback_bot=bot,
-            ))
+            )

-    def _resolve_vision_model(self) -> Optional[str]:
-        """
-        Determine which model to use for vision.
+            # Same vendor as the main bot is the most natural fallback when
+            # the main model itself does not support vision — promote it to
+            # the front of the list instead of relying on declaration order.
+            if bot_type == main_bot_type:
+                providers.insert(0, provider)
+            else:
+                providers.append(provider)

-        1. User explicit config: tool.vision.model in config.json
-        2. Fallback to the main configured model name
+    def _main_bot_supports_vision(self, bot) -> bool:
        """
-        tool_conf = conf().get("tool", {})
-        user_vision_model = tool_conf.get("vision", {}).get("model") if isinstance(tool_conf, dict) else None
-        if user_vision_model:
-            return user_vision_model
-        model_name = conf().get("model", "")
-        return model_name or None
+        Whether the main bot is known to natively support vision.
+
+        Having a `call_vision` method is necessary but not sufficient —
+        some bots implement the method against an endpoint that does not
+        actually serve vision models, which causes silent failures when a
+        vendor-foreign model name is forwarded.
+
+        Resolution order:
+          1. If the bot explicitly declares `supports_vision`, trust it.
+             This lets bots opt in or out based on their own runtime
+             configuration (e.g. the currently selected model).
+          2. Otherwise, fall back to a model-name prefix heuristic: trust
+             call_vision when the main model looks like an OpenAI family
+             model or matches a known multimodal vendor prefix.
+        """
+        if bot is None:
+            return False
+        if hasattr(bot, "supports_vision"):
+            return bool(getattr(bot, "supports_vision"))
+        main_model = (conf().get("model") or "").lower()
+        if not main_model:
+            return False
+        if main_model.startswith(_OPENAI_MODEL_PREFIXES):
+            return True
+        return self._infer_provider_from_model(main_model) is not None

    def _build_main_model_provider(self) -> Optional[VisionProvider]:
        """
        Use the vendor's own model for vision via bot.call_vision.
-        Only available when the bot class has call_vision.
+        Gated by _main_bot_supports_vision so non-vision bots (DeepSeek, etc.)
+        do not get routed vendor-foreign model names.
        """
        if not (self.model and hasattr(self.model, 'bot')):
            return None
        try:
            bot = self.model.bot
-            if not hasattr(bot, 'call_vision'):
-                return None
        except Exception:
            return None
+        if not hasattr(bot, 'call_vision'):
+            return None
+        if not self._main_bot_supports_vision(bot):
+            return None

-        vision_model = self._resolve_vision_model()
+        # Use the configured main model name; do NOT inject tools.vision.model
+        # here, because by the time we reach this branch the tools.vision.model
+        # routing has already been attempted (and either matched the main bot
+        # or failed to find a provider).
+        main_model_name = conf().get("model") or None

        return VisionProvider(
            name=_MAIN_MODEL_PROVIDER_NAME,
            api_key="",
            api_base="",
-            model_override=vision_model,
+            model_override=main_model_name,
            use_bot=True,
        )

-    def _build_openai_provider(self) -> Optional[VisionProvider]:
+    def _build_openai_provider(self, preferred_model: Optional[str] = None) -> Optional[VisionProvider]:
        api_key = conf().get("open_ai_api_key") or os.environ.get("OPENAI_API_KEY")
        if not api_key:
            return None
        api_base = (conf().get("open_ai_api_base") or os.environ.get("OPENAI_API_BASE", "")).rstrip("/") \
            or "https://api.openai.com/v1"
-        return VisionProvider(name="OpenAI", api_key=api_key, api_base=self._ensure_v1(api_base))
+        # Only honor preferred_model when it looks like an OpenAI-family name;
+        # otherwise the OpenAI endpoint would 400 on a vendor-specific name.
+        model_override = preferred_model if (
+            preferred_model and preferred_model.lower().startswith(_OPENAI_MODEL_PREFIXES)
+        ) else None
+        return VisionProvider(
+            name="OpenAI",
+            api_key=api_key,
+            api_base=self._ensure_v1(api_base),
+            model_override=model_override,
+        )

-    def _build_linkai_provider(self) -> Optional[VisionProvider]:
+    def _build_linkai_provider(self, preferred_model: Optional[str] = None) -> Optional[VisionProvider]:
        api_key = conf().get("linkai_api_key") or os.environ.get("LINKAI_API_KEY")
        if not api_key:
            return None
@@ -290,8 +494,15 @@ class Vision(BaseTool):
        extra = get_cloud_headers(api_key)
        extra.pop("Authorization", None)
        extra.pop("Content-Type", None)
-        return VisionProvider(name="LinkAI", api_key=api_key, api_base=self._ensure_v1(api_base),
-                              extra_headers=extra)
+        # LinkAI is a multi-vendor proxy and accepts most model names, so we
+        # honor any user-configured model name here.
+        return VisionProvider(
+            name="LinkAI",
+            api_key=api_key,
+            api_base=self._ensure_v1(api_base),
+            extra_headers=extra,
+            model_override=preferred_model,
+        )

    def _call_via_bot(self, model: str, question: str, image_content: dict,
                      provider: Optional[VisionProvider] = None) -> ToolResult:
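Reviewer note on the vision.py changes above: the new resolution logic reduces to two steps, a longest-prefix lookup from model name to provider and a preferred-first chain with duplicates removed. The sketch below is a minimal standalone illustration, not the code from this PR. `PREFIX_TO_PROVIDER` is a trimmed copy of the table in the diff, while `infer_provider`, `build_chain`, and the `auto` list are hypothetical names introduced only for this example.

```python
from typing import List, Optional, Sequence, Tuple

# Trimmed, illustrative copy of the prefix table from the diff (assumption:
# only a few entries are needed to show the idea).
PREFIX_TO_PROVIDER: List[Tuple[str, str]] = [
    ("doubao-", "Doubao"),
    ("kimi-", "Moonshot"),
    ("moonshot-", "Moonshot"),
    ("qwen", "DashScope"),
    ("claude-", "Claude"),
]


def infer_provider(model_name: str) -> Optional[str]:
    """Return the provider whose prefix matches, trying longer prefixes first."""
    if not model_name:
        return None
    lower = model_name.lower()
    for prefix, provider in sorted(PREFIX_TO_PROVIDER, key=lambda x: -len(x[0])):
        if lower.startswith(prefix):
            return provider
    return None


def build_chain(preferred: Optional[str], auto_chain: Sequence[str]) -> List[str]:
    """Place the preferred provider first, then the auto chain without duplicates."""
    chain: List[str] = [preferred] if preferred else []
    seen = set(chain)
    for name in auto_chain:
        if name not in seen:
            chain.append(name)
            seen.add(name)
    return chain


# Hypothetical auto-discovery order, used only for this demonstration.
auto = ["MainModel", "Moonshot", "Doubao", "OpenAI", "LinkAI"]
print(build_chain(infer_provider("Doubao-Seed-2-Pro"), auto))
print(build_chain(infer_provider("unknown-model"), auto))
```

Matching is case-insensitive, and an unknown prefix leaves the auto chain untouched. This mirrors how an unrecognised `tools.vision.model` falls through to auto-discovery in the diff.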
--- a/agent/tools/web_search/web_search.py
+++ b/agent/tools/web_search/web_search.py
@@ -1,13 +1,27 @@
-"""
-Web Search tool - Search the web using Bocha or LinkAI search API.
-Supports two backends with unified response format:
-  1. Bocha Search (primary, requires BOCHA_API_KEY)
-  2. LinkAI Search (fallback, requires LINKAI_API_KEY)
+"""Web Search tool. Supports four backends with a unified response format:
+  - bocha   (https://open.bochaai.com)
+  - zhipu   (https://docs.bigmodel.cn/cn/guide/tools/web-search)
+  - qianfan (https://cloud.baidu.com/doc/qianfan/s/2mh4su4uy)
+  - linkai  (https://link-ai.tech, fallback)
+
+Provider selection
+  - strategy 'auto' (default): pick the first configured provider in the
+    canonical order [bocha, qianfan, zhipu, linkai]. When the caller passes
+    an explicit `provider` it overrides the pick; an invalid/unconfigured
+    one silently falls back to the auto order.
+  - strategy 'fixed': use the configured provider; if its credential is
+    missing at call time, silently fall back to auto order (no card hint).
+
+Credentials
+  - bocha   : tools.web_search.bocha_api_key  ->  env BOCHA_API_KEY
+  - zhipu   : conf.zhipu_ai_api_key            ->  env ZHIPUAI_API_KEY
+  - qianfan : conf.qianfan_api_key             ->  env QIANFAN_API_KEY
+  - linkai  : conf.linkai_api_key              ->  env LINKAI_API_KEY
 """

-import os
 import json
-from typing import Dict, Any, Optional
+import os
+from typing import Any, Dict, List, Optional

 import requests

@@ -16,12 +30,63 @@ from common.log import logger
 from config import conf


-# Default timeout for API requests (seconds)
 DEFAULT_TIMEOUT = 30

+# Canonical fallback order. Empirically ordered by Chinese real-time
+# quality + relevance: bocha (best overall), qianfan (best for hot news),
+# zhipu (strong on long-form articles), linkai (cloud aggregator, last
+# resort).
+PROVIDER_ORDER = ("bocha", "qianfan", "zhipu", "linkai")
+
+PROVIDER_LABELS = {
+    "bocha":   "Bocha",
+    "zhipu":   "Zhipu",
+    "qianfan": "Baidu Qianfan",
+    "linkai":  "LinkAI",
+}
+
+
+def _tools_web_search_conf() -> dict:
+    """Return the tools.web_search config block (dict-like)."""
+    tools_cfg = conf().get("tools") or {}
+    if not isinstance(tools_cfg, dict):
+        return {}
+    block = tools_cfg.get("web_search") or {}
+    return block if isinstance(block, dict) else {}
+
+
+def _get_api_key(provider: str) -> str:
+    """Resolve API key for a provider, with conf -> env fallback."""
+    if provider == "bocha":
+        key = (_tools_web_search_conf().get("bocha_api_key") or "").strip()
+        return key or os.environ.get("BOCHA_API_KEY", "").strip()
+    if provider == "zhipu":
+        key = (conf().get("zhipu_ai_api_key") or "").strip()
+        return key or os.environ.get("ZHIPUAI_API_KEY", "").strip()
+    if provider == "qianfan":
+        key = (conf().get("qianfan_api_key") or "").strip()
+        return key or os.environ.get("QIANFAN_API_KEY", "").strip()
+    if provider == "linkai":
+        key = (conf().get("linkai_api_key") or "").strip()
+        return key or os.environ.get("LINKAI_API_KEY", "").strip()
+    return ""
+
+
+def configured_providers() -> List[str]:
+    """Return configured providers in canonical order."""
+    return [p for p in PROVIDER_ORDER if _get_api_key(p)]
+
+
+def _configured_strategy() -> str:
+    return (_tools_web_search_conf().get("strategy") or "auto").strip().lower()
+
+
+def _configured_provider() -> str:
+    return (_tools_web_search_conf().get("provider") or "").strip().lower()
+

 class WebSearch(BaseTool):
-    """Tool for searching the web using Bocha or LinkAI search API"""
+    """Tool for searching the web across multiple providers."""

    name: str = "web_search"
    description: str = "Search the web for real-time information. Returns titles, URLs, and snippets."
@@ -55,264 +120,368 @@ class WebSearch(BaseTool):

    def __init__(self, config: dict = None):
        self.config = config or {}
-        self._backend = None  # Will be resolved on first execute

    @staticmethod
    def is_available() -> bool:
-        """Check if web search is available (at least one API key is configured)"""
-        return bool(os.environ.get("BOCHA_API_KEY") or os.environ.get("LINKAI_API_KEY"))
+        """Tool is offered to the agent when at least one provider has a key."""
+        return bool(configured_providers())

-    def _resolve_backend(self) -> Optional[str]:
-        """
-        Determine which search backend to use.
-        Priority: Bocha > LinkAI
+    @classmethod
+    def get_json_schema(cls) -> dict:
+        """Augment the static schema with a `provider` field — only when the
+        user has ≥2 providers configured AND strategy is 'auto'. Otherwise
+        the backend picks silently and exposing the field would only waste
+        the agent's tokens."""
+        schema = {
+            "name": cls.name,
+            "description": cls.description,
+            "parameters": json.loads(json.dumps(cls.params)),  # deep copy
+        }
+        if _configured_strategy() != "auto":
+            return schema
+        available = configured_providers()
+        if len(available) < 2:
+            return schema

-        :return: 'bocha', 'linkai', or None
+        schema["parameters"]["properties"]["provider"] = {
+            "type": "string",
+            "enum": available,
+            "description": "Optional. Specifies the search backend. You may switch between providers when the user wants results from a particular source or from multiple sources.",
+        }
+        return schema
+
+    # ------------------------------------------------------------------
+    # Provider resolution
+    # ------------------------------------------------------------------
+
+    def _resolve_provider(self, requested: Optional[str]) -> Optional[str]:
+        """Pick a provider for this call.
+
+        Priority: caller-supplied (if configured) > fixed strategy (if
+        configured) > first configured in PROVIDER_ORDER. Silent fallback
+        when the desired one has no key.
        """
-        if os.environ.get("BOCHA_API_KEY"):
-            return "bocha"
-        if os.environ.get("LINKAI_API_KEY"):
-            return "linkai"
-        return None
+        available = configured_providers()
+        if not available:
+            return None
+
+        if requested:
+            req = requested.strip().lower()
+            if req in available:
+                return req
+            logger.warning(f"[WebSearch] requested provider '{requested}' unavailable, falling back")
+
+        if _configured_strategy() == "fixed":
+            pinned = _configured_provider()
+            if pinned in available:
+                return pinned
+            if pinned:
+                logger.warning(f"[WebSearch] pinned provider '{pinned}' unavailable, falling back to auto")
+
+        return available[0]
+
+    @staticmethod
+    def _resolution_reason(requested: Optional[str], chosen: str) -> str:
+        """Human-readable explanation for why `chosen` won the resolver."""
+        if requested and requested.strip().lower() == chosen:
+            return "caller-requested"
+        strategy = _configured_strategy()
+        if strategy == "fixed" and _configured_provider() == chosen:
+            return "fixed-strategy"
+        return "auto-fallback"
+
+    # ------------------------------------------------------------------
+    # Entry point
+    # ------------------------------------------------------------------

    def execute(self, args: Dict[str, Any]) -> ToolResult:
-        """
-        Execute web search
-
-        :param args: Search parameters (query, count, freshness, summary)
-        :return: Search results
-        """
-        query = args.get("query", "").strip()
+        query = (args.get("query") or "").strip()
        if not query:
            return ToolResult.fail("Error: 'query' parameter is required")

        count = args.get("count", 10)
        freshness = args.get("freshness", "noLimit")
        summary = args.get("summary", False)
-
-        # Validate count
        if not isinstance(count, int) or count < 1 or count > 50:
            count = 10

-        # Resolve backend
-        backend = self._resolve_backend()
-        if not backend:
+        requested = args.get("provider")
+        provider = self._resolve_provider(requested)
+        if not provider:
            return ToolResult.fail(
-                "Error: No search API key configured. "
-                "Please set BOCHA_API_KEY or LINKAI_API_KEY using env_config tool.\n"
-                "  - Bocha Search: https://open.bocha.cn\n"
-                "  - LinkAI Search: https://link-ai.tech"
+                "Error: No search provider configured. "
+                "Configure one of BOCHA_API_KEY / zhipu_ai_api_key / qianfan_api_key / linkai_api_key."
            )

+        # Always log the routing decision so multi-provider deployments can
+        # tell at a glance which backend served any given query.
+        available = configured_providers()
+        reason = self._resolution_reason(requested, provider)
+        q_preview = query if len(query) <= 60 else (query[:57] + "...")
+        logger.info(
+            f"[WebSearch] provider={provider} reason={reason} "
+            f"available={list(available)} query={q_preview!r} count={count} freshness={freshness}"
+        )
+
        try:
-            if backend == "bocha":
+            if provider == "bocha":
                return self._search_bocha(query, count, freshness, summary)
-            else:
+            if provider == "zhipu":
+                return self._search_zhipu(query, count, freshness)
+            if provider == "qianfan":
+                return self._search_qianfan(query, count, freshness)
+            if provider == "linkai":
                return self._search_linkai(query, count, freshness)
+            return ToolResult.fail(f"Error: Unknown provider '{provider}'")
        except requests.Timeout:
            return ToolResult.fail(f"Error: Search request timed out after {DEFAULT_TIMEOUT}s")
        except requests.ConnectionError:
            return ToolResult.fail("Error: Failed to connect to search API")
        except Exception as e:
-            logger.error(f"[WebSearch] Unexpected error: {e}", exc_info=True)
+            logger.error(f"[WebSearch] Unexpected error ({provider}): {e}", exc_info=True)
            return ToolResult.fail(f"Error: Search failed - {str(e)}")

+    # ------------------------------------------------------------------
+    # Bocha
+    # ------------------------------------------------------------------
+
    def _search_bocha(self, query: str, count: int, freshness: str, summary: bool) -> ToolResult:
-        """
-        Search using Bocha API
-
-        :param query: Search query
-        :param count: Number of results
-        :param freshness: Time range filter
-        :param summary: Whether to include summary
-        :return: Formatted search results
-        """
-        api_key = os.environ.get("BOCHA_API_KEY", "")
-        url = "https://api.bocha.cn/v1/web-search"
-
+        api_key = _get_api_key("bocha")
+        url = "https://api.bochaai.com/v1/web-search"
        headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
-            "Accept": "application/json"
+            "Accept": "application/json",
        }
+        payload = {"query": query, "count": count, "freshness": freshness, "summary": summary}

-        payload = {
-            "query": query,
-            "count": count,
-            "freshness": freshness,
-            "summary": summary
-        }
+        logger.debug(f"[WebSearch] bocha: query='{query}', count={count}")
+        resp = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)

-        logger.debug(f"[WebSearch] Bocha search: query='{query}', count={count}")
+        if resp.status_code == 401:
+            return ToolResult.fail("Error: Invalid bocha API key.")
+        if resp.status_code == 403:
+            return ToolResult.fail("Error: bocha API — insufficient balance. Top up at https://open.bochaai.com")
+        if resp.status_code == 429:
+            return ToolResult.fail("Error: bocha API rate limit reached.")
+        if resp.status_code != 200:
+            return ToolResult.fail(f"Error: bocha API returned HTTP {resp.status_code}")

-        response = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
-
-        if response.status_code == 401:
-            return ToolResult.fail("Error: Invalid BOCHA_API_KEY. Please check your API key.")
-        if response.status_code == 403:
-            return ToolResult.fail("Error: Bocha API - insufficient balance. Please top up at https://open.bocha.cn")
-        if response.status_code == 429:
-            return ToolResult.fail("Error: Bocha API rate limit reached. Please try again later.")
-        if response.status_code != 200:
-            return ToolResult.fail(f"Error: Bocha API returned HTTP {response.status_code}")
-
-        data = response.json()
-
-        # Check API-level error code
+        data = resp.json()
        api_code = data.get("code")
        if api_code is not None and api_code != 200:
            msg = data.get("msg") or "Unknown error"
-            return ToolResult.fail(f"Error: Bocha API error (code={api_code}): {msg}")
-
-        # Extract and format results
-        return self._format_bocha_results(data, query)
-
-    def _format_bocha_results(self, data: dict, query: str) -> ToolResult:
-        """
-        Format Bocha API response into unified result structure
-
-        :param data: Raw API response
-        :param query: Original query
-        :return: Formatted ToolResult
-        """
-        search_data = data.get("data", {})
-        web_pages = search_data.get("webPages", {})
-        pages = web_pages.get("value", [])
-
-        if not pages:
-            return ToolResult.success({
-                "query": query,
-                "backend": "bocha",
-                "total": 0,
-                "results": [],
-                "message": "No results found"
-            })
+            return ToolResult.fail(f"Error: bocha API error (code={api_code}): {msg}")

+        pages = ((data.get("data") or {}).get("webPages") or {}).get("value", []) or []
        results = []
-        for page in pages:
-            result = {
-                "title": page.get("name", ""),
-                "url": page.get("url", ""),
-                "snippet": page.get("snippet", ""),
-                "siteName": page.get("siteName", ""),
-                "datePublished": page.get("datePublished") or page.get("dateLastCrawled", ""),
+        for p in pages:
+            item = {
+                "title": p.get("name", ""),
+                "url": p.get("url", ""),
+                "snippet": p.get("snippet", ""),
+                "siteName": p.get("siteName", ""),
+                "datePublished": p.get("datePublished") or p.get("dateLastCrawled", ""),
            }
-            # Include summary only if present
-            if page.get("summary"):
-                result["summary"] = page["summary"]
-            results.append(result)
-
-        total = web_pages.get("totalEstimatedMatches", len(results))
-
+            if p.get("summary"):
+                item["summary"] = p["summary"]
+            results.append(item)
+        total = ((data.get("data") or {}).get("webPages") or {}).get("totalEstimatedMatches", len(results))
        return ToolResult.success({
-            "query": query,
-            "backend": "bocha",
-            "total": total,
-            "count": len(results),
-            "results": results
+            "query": query, "backend": "bocha",
+            "total": total, "count": len(results), "results": results,
        })

-    def _search_linkai(self, query: str, count: int, freshness: str) -> ToolResult:
-        """
-        Search using LinkAI plugin API
+    # ------------------------------------------------------------------
+    # Zhipu
+    # ------------------------------------------------------------------

-        :param query: Search query
-        :param count: Number of results
-        :param freshness: Time range filter
-        :return: Formatted search results
-        """
-        api_key = os.environ.get("LINKAI_API_KEY", "")
-        api_base = conf().get("linkai_api_base", "https://api.link-ai.tech")
-        url = f"{api_base.rstrip('/')}/v1/plugin/execute"
+    def _search_zhipu(self, query: str, count: int, freshness: str) -> ToolResult:
+        api_key = _get_api_key("zhipu")
+        api_base = (conf().get("zhipu_ai_api_base") or "https://open.bigmodel.cn/api/paas/v4").rstrip("/")
+        url = f"{api_base}/web_search"
+        headers = {
+            "Authorization": f"Bearer {api_key}",
+            "Content-Type": "application/json",
+        }
+
+        # Zhipu Web Search expects `search_query` <= 70 chars; truncate
+        # gracefully so a long agent-supplied query doesn't get rejected.
+        trimmed_query = (query or "")[:70]
+        engine = (_tools_web_search_conf().get("zhipu_search_engine") or "search_pro").strip().lower()
+        if engine not in ("search_std", "search_pro", "search_pro_sogou", "search_pro_quark"):
+            engine = "search_pro"
+
+        payload: Dict[str, Any] = {
+            "search_engine": engine,
+            "search_query": trimmed_query,
+            "search_intent": False,
+            "count": max(1, min(int(count or 10), 50)),
+            "search_recency_filter": freshness if freshness in (
+                "oneDay", "oneWeek", "oneMonth", "oneYear", "noLimit"
+            ) else "noLimit",
+        }
+        content_size = (_tools_web_search_conf().get("zhipu_content_size") or "").strip().lower()
+        if content_size in ("medium", "high"):
+            payload["content_size"] = content_size
+
+        logger.debug(f"[WebSearch] zhipu: query='{trimmed_query}', count={payload['count']}, engine={engine}")
+        resp = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
+
+        if resp.status_code == 401:
+            return ToolResult.fail("Error: Invalid Zhipu API key.")
+        if resp.status_code != 200:
+            return ToolResult.fail(f"Error: Zhipu API returned HTTP {resp.status_code}: {resp.text[:200]}")
+
+        data = resp.json()
+        # Business-level errors (1701/1702/1703 etc.) come back as
+        # {"error": {"code","message"}} even on HTTP 200.
+        if isinstance(data, dict) and data.get("error"):
+            err = data["error"] or {}
+            return ToolResult.fail(f"Error: Zhipu returned {err.get('code')}: {err.get('message','')}")
+
+        items = data.get("search_result") or (data.get("data") or {}).get("search_result") or []
+        results = []
+        for it in items:
+            results.append({
+                "title": it.get("title", ""),
+                "url": it.get("link") or it.get("url", ""),
+                "snippet": it.get("content") or it.get("snippet", ""),
+                "siteName": it.get("media") or it.get("siteName", ""),
+                "datePublished": it.get("publish_date") or it.get("datePublished", ""),
+            })
+        return ToolResult.success({
+            "query": query, "backend": "zhipu",
+            "total": len(results), "count": len(results), "results": results,
+        })
+
+    # ------------------------------------------------------------------
+    # Qianfan (Baidu)
+    # ------------------------------------------------------------------
+
+    def _search_qianfan(self, query: str, count: int, freshness: str) -> ToolResult:
+        api_key = _get_api_key("qianfan")
+        api_base = (conf().get("qianfan_api_base") or "https://qianfan.baidubce.com/v2").rstrip("/")
+        url = f"{api_base}/ai_search/web_search"
+        headers = {
+            "Authorization": f"Bearer {api_key}",
+            "Content-Type": "application/json",
+            "X-Appbuilder-From": "cow",
+        }
+
+        count = max(1, min(int(count or 10), 50))
+        payload: Dict[str, Any] = {
+            "messages": [{"role": "user", "content": query}],
+            "search_source": "baidu_search_v2",
+            "resource_type_filter": [{"type": "web", "top_k": count}],
+        }
+
+        # Baidu AI Search expects freshness as a date-range filter, not a
+        # named recency token. Translate our shared vocabulary into the
+        # underlying page_time range expected by the API.
+        search_filter = self._qianfan_build_freshness_filter(freshness)
+        if search_filter:
+            payload["search_filter"] = search_filter
+
+        logger.debug(f"[WebSearch] qianfan: query='{query}', count={count}, freshness={freshness!r}")
+        resp = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
+
+        if resp.status_code == 401:
+            return ToolResult.fail("Error: Invalid Qianfan API key.")
+        if resp.status_code != 200:
+            return ToolResult.fail(f"Error: Qianfan API returned HTTP {resp.status_code}: {resp.text[:200]}")
+
+        data = resp.json()
+        # Even on HTTP 200 Baidu surfaces business errors as {"code","message"}.
+        if isinstance(data, dict) and data.get("code"):
+            return ToolResult.fail(f"Error: Qianfan returned {data.get('code')}: {data.get('message','')}")
+
+        refs = data.get("references") or []
+        results = []
+        for d in refs:
+            results.append({
+                "title": d.get("title", ""),
+                "url": d.get("url", ""),
+                "snippet": (d.get("content") or "")[:200],
+                "siteName": d.get("web_anchor") or d.get("website") or "",
+                "datePublished": d.get("date", ""),
+            })
+        return ToolResult.success({
+            "query": query, "backend": "qianfan",
+            "total": len(results), "count": len(results), "results": results,
+        })
+
+    @staticmethod
+    def _qianfan_build_freshness_filter(freshness: str) -> Optional[Dict[str, Any]]:
+        if not freshness or freshness == "noLimit":
+            return None
+        delta_days = {"oneDay": 1, "oneWeek": 7, "oneMonth": 30, "oneYear": 365}.get(freshness)
+        if not delta_days:
+            return None
+        from datetime import datetime, timedelta
+        now = datetime.now()
+        end_date = (now + timedelta(days=1)).strftime("%Y-%m-%d")
+        start_date = (now - timedelta(days=delta_days)).strftime("%Y-%m-%d")
+        return {"range": {"page_time": {"gte": start_date, "lt": end_date}}}
+
+    # ------------------------------------------------------------------
+    # LinkAI (plugin)
+    # ------------------------------------------------------------------
+
+    def _search_linkai(self, query: str, count: int, freshness: str) -> ToolResult:
+        api_key = _get_api_key("linkai")
+        api_base = (conf().get("linkai_api_base") or "https://api.link-ai.tech").rstrip("/")
+        url = f"{api_base}/v1/plugin/execute"

        from common.utils import get_cloud_headers
        headers = get_cloud_headers(api_key)

-        payload = {
-            "code": "web-search",
-            "args": {
-                "query": query,
-                "count": count,
-                "freshness": freshness
-            }
-        }
+        payload = {"code": "web-search", "args": {"query": query, "count": count, "freshness": freshness}}
+        logger.debug(f"[WebSearch] linkai: query='{query}', count={count}")
+        resp = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)

-        logger.debug(f"[WebSearch] LinkAI search: query='{query}', count={count}")
-
-        response = requests.post(url, headers=headers, json=payload, timeout=DEFAULT_TIMEOUT)
-
-        if response.status_code == 401:
-            return ToolResult.fail("Error: Invalid LINKAI_API_KEY. Please check your API key.")
-        if response.status_code != 200:
-            return ToolResult.fail(f"Error: LinkAI API returned HTTP {response.status_code}")
-
-        data = response.json()
+        if resp.status_code == 401:
+            return ToolResult.fail("Error: Invalid LinkAI API key.")
+        if resp.status_code != 200:
+            return ToolResult.fail(f"Error: LinkAI API returned HTTP {resp.status_code}")

+        data = resp.json()
        if not data.get("success"):
            msg = data.get("message") or "Unknown error"
            return ToolResult.fail(f"Error: LinkAI search failed: {msg}")

-        return self._format_linkai_results(data, query)
-
-    def _format_linkai_results(self, data: dict, query: str) -> ToolResult:
-        """
-        Format LinkAI API response into unified result structure.
-        LinkAI returns the search data in data.data field, which follows
-        the same Bing-compatible format as Bocha.
-
-        :param data: Raw API response
-        :param query: Original query
-        :return: Formatted ToolResult
-        """
-        raw_data = data.get("data", "")
-
-        # LinkAI may return data as a JSON string
-        if isinstance(raw_data, str):
+        raw = data.get("data", "")
+        if isinstance(raw, str):
            try:
-                raw_data = json.loads(raw_data)
+                raw = json.loads(raw)
            except (json.JSONDecodeError, TypeError):
-                # If data is plain text, return it as a single result
                return ToolResult.success({
-                    "query": query,
-                    "backend": "linkai",
-                    "total": 1,
-                    "count": 1,
-                    "results": [{"content": raw_data}]
+                    "query": query, "backend": "linkai",
+                    "total": 1, "count": 1, "results": [{"content": raw}],
                })

-        # If the response follows Bing-compatible structure
-        if isinstance(raw_data, dict):
-            web_pages = raw_data.get("webPages", {})
-            pages = web_pages.get("value", [])
-
+        if isinstance(raw, dict):
+            pages = (raw.get("webPages") or {}).get("value", []) or []
            if pages:
                results = []
-                for page in pages:
-                    result = {
-                        "title": page.get("name", ""),
-                        "url": page.get("url", ""),
-                        "snippet": page.get("snippet", ""),
-                        "siteName": page.get("siteName", ""),
-                        "datePublished": page.get("datePublished") or page.get("dateLastCrawled", ""),
+                for p in pages:
+                    item = {
+                        "title": p.get("name", ""),
+                        "url": p.get("url", ""),
+                        "snippet": p.get("snippet", ""),
+                        "siteName": p.get("siteName", ""),
+                        "datePublished": p.get("datePublished") or p.get("dateLastCrawled", ""),
                    }
-                    if page.get("summary"):
-                        result["summary"] = page["summary"]
-                    results.append(result)
-
-                total = web_pages.get("totalEstimatedMatches", len(results))
+                    if p.get("summary"):
+                        item["summary"] = p["summary"]
+                    results.append(item)
+                total = (raw.get("webPages") or {}).get("totalEstimatedMatches", len(results))
                return ToolResult.success({
-                    "query": query,
-                    "backend": "linkai",
-                    "total": total,
-                    "count": len(results),
-                    "results": results
+                    "query": query, "backend": "linkai",
+                    "total": total, "count": len(results), "results": results,
                })

-        # Fallback: return raw data
        return ToolResult.success({
-            "query": query,
-            "backend": "linkai",
-            "total": 1,
-            "count": 1,
-            "results": [{"content": str(raw_data)}]
+            "query": query, "backend": "linkai",
+            "total": 1, "count": 1, "results": [{"content": str(raw)}],
        })
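Review note on the Qianfan freshness handling above: the shared vocabulary (`oneDay`, `oneWeek`, `oneMonth`, `oneYear`, `noLimit`) is translated into a `page_time` date range. The sketch below restates that translation as a standalone function so it can be checked in isolation. The name `build_freshness_filter` and the injectable `now` parameter are assumptions made for testability and are not part of this PR.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, Optional

# Shared freshness vocabulary mapped to a look-back window in days.
_FRESHNESS_DAYS = {"oneDay": 1, "oneWeek": 7, "oneMonth": 30, "oneYear": 365}


def build_freshness_filter(
    freshness: str, now: Optional[datetime] = None
) -> Optional[Dict[str, Any]]:
    """Return a page_time range filter, or None when no limit applies.

    `now` is a hypothetical parameter added here so the result is
    deterministic in tests; the PR code always uses datetime.now().
    """
    delta_days = _FRESHNESS_DAYS.get(freshness)
    if delta_days is None:
        # Covers "noLimit", empty strings, and unknown tokens.
        return None
    now = now or datetime.now()
    start_date = (now - timedelta(days=delta_days)).strftime("%Y-%m-%d")
    # The upper bound is exclusive, so use tomorrow to include today.
    end_date = (now + timedelta(days=1)).strftime("%Y-%m-%d")
    return {"range": {"page_time": {"gte": start_date, "lt": end_date}}}
```

Unknown tokens fall through to `None`, which matches the PR's behavior of sending no `search_filter` rather than rejecting the request.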
--- a/app.py
+++ b/app.py
@@ -274,6 +274,53 @@ def sigterm_handler_wrap(_signo):
    signal.signal(_signo, func)


+def _warmup_mcp_tools():
+    """
+    Kick off MCP server loading at process startup so subprocesses
+    (npx / uvx etc.) finish initializing before the first user message
+    arrives. Returns immediately — the actual work happens on a daemon
+    thread inside ToolManager. Safe to call when MCP is not configured.
+    """
+    try:
+        from agent.tools import ToolManager
+        ToolManager()._load_mcp_tools()
+    except Exception as e:
+        logger.warning(f"[App] MCP warmup failed (non-fatal): {e}")
+
+
+def _sync_builtin_skills():
+    """Sync builtin skills from project skills/ to workspace skills/ on startup."""
+    import shutil
+    try:
+        workspace = conf().get("agent_workspace", "~/cow")
+        workspace = os.path.expanduser(workspace)
+        project_root = os.path.dirname(os.path.abspath(__file__))
+        builtin_dir = os.path.join(project_root, "skills")
+        custom_dir = os.path.join(workspace, "skills")
+
+        if not os.path.isdir(builtin_dir):
+            return
+
+        os.makedirs(custom_dir, exist_ok=True)
+        synced = 0
+        for name in os.listdir(builtin_dir):
+            src = os.path.join(builtin_dir, name)
+            if not os.path.isdir(src) or not os.path.isfile(os.path.join(src, "SKILL.md")):
+                continue
+            dst = os.path.join(custom_dir, name)
+            try:
+                if os.path.isdir(dst):
+                    shutil.rmtree(dst)
+                shutil.copytree(src, dst)
+                synced += 1
+            except Exception as e:
+                logger.warning(f"[App] Failed to sync builtin skill '{name}': {e}")
+        if synced:
+            logger.info(f"[App] Synced {synced} builtin skill(s) to workspace")
+    except Exception as e:
+        logger.warning(f"[App] Builtin skills sync failed: {e}")
+
+
 def run():
    global _channel_mgr
    try:
@@ -299,6 +346,13 @@ def run():
        if web_console_enabled and "web" not in channel_names:
            channel_names.append("web")

+        # Sync builtin skills to workspace before channels start
+        _sync_builtin_skills()
+
+        # Kick off MCP server loading in the background so first-message
+        # latency isn't dominated by npx package downloads.
+        _warmup_mcp_tools()
+
        logger.info(f"[App] Starting channels: {channel_names}")

        _channel_mgr = ChannelManager()
@@ -306,6 +360,8 @@ def run():

        while True:
            time.sleep(1)
+    except KeyboardInterrupt:
+        pass
    except Exception as e:
        logger.error("App startup failed!")
        logger.exception(e)
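Review note on `_sync_builtin_skills` above: the copy step replaces each workspace skill directory wholesale. The sketch below isolates that step, assuming plain directory paths are passed in; the function name `sync_skill_dirs` is hypothetical, and the config lookup, logging, and per-skill error handling from the PR are left out.

```python
import os
import shutil


def sync_skill_dirs(builtin_dir: str, workspace_dir: str) -> int:
    """Copy each subdirectory holding a SKILL.md into workspace_dir.

    Returns the number of skills copied. An existing destination
    directory is removed first, so the copy is a full replacement.
    """
    if not os.path.isdir(builtin_dir):
        return 0
    os.makedirs(workspace_dir, exist_ok=True)
    synced = 0
    for name in sorted(os.listdir(builtin_dir)):
        src = os.path.join(builtin_dir, name)
        # Only directories that contain a SKILL.md count as skills.
        if not os.path.isfile(os.path.join(src, "SKILL.md")):
            continue
        dst = os.path.join(workspace_dir, name)
        if os.path.isdir(dst):
            shutil.rmtree(dst)
        shutil.copytree(src, dst)
        synced += 1
    return synced
```

Because the destination is removed before copying, any local edits to a builtin skill inside the workspace are overwritten on every start. Skills whose names do not collide with a builtin one are left alone.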
--- a/bridge/agent_bridge.py
+++ b/bridge/agent_bridge.py
@@ -14,6 +14,7 @@ from bridge.reply import Reply, ReplyType
 from common import const
 from common.log import logger
 from common.utils import expand_path
+from config import conf
 from models.openai_compatible_bot import OpenAICompatibleBot


@@ -68,6 +69,7 @@ class AgentLLMModel(LLMModel):
    _MODEL_BOT_TYPE_MAP = {
        "wenxin": const.BAIDU, "wenxin-4": const.BAIDU,
        "xunfei": const.XUNFEI, const.QWEN: const.QWEN_DASHSCOPE,
+        const.QIANFAN: const.QIANFAN,
        const.MODELSCOPE: const.MODELSCOPE,
    }
    _MODEL_PREFIX_MAP = [
@@ -75,10 +77,10 @@ class AgentLLMModel(LLMModel):
        ("gemini", const.GEMINI), ("glm", const.ZHIPU_AI), ("claude", const.CLAUDEAPI),
        ("moonshot", const.MOONSHOT), ("kimi", const.MOONSHOT),
        ("doubao", const.DOUBAO), ("deepseek", const.DEEPSEEK),
+        ("ernie", const.QIANFAN),
    ]

    def __init__(self, bridge: Bridge, bot_type: str = "chat"):
-        from config import conf
        super().__init__(model=conf().get("model", const.GPT_41))
        self.bridge = bridge
        self.bot_type = bot_type
@@ -87,7 +89,6 @@ class AgentLLMModel(LLMModel):

    @property
    def model(self):
-        from config import conf
        return conf().get("model", const.GPT_41)

    @model.setter
@@ -96,8 +97,6 @@ class AgentLLMModel(LLMModel):

    def _resolve_bot_type(self, model_name: str) -> str:
        """Resolve bot type from model name, matching Bridge.__init__ logic."""
-        from config import conf
-
        if conf().get("use_linkai", False) and conf().get("linkai_api_key"):
            return const.LINKAI
        # Support custom bot type configuration
@@ -117,8 +116,9 @@ class AgentLLMModel(LLMModel):
            return const.MOONSHOT
        if conf().get("bot_type") == "modelscope":
            return const.MODELSCOPE
+        lowered_model = model_name.lower()
        for prefix, btype in self._MODEL_PREFIX_MAP:
-            if model_name.startswith(prefix):
+            if lowered_model.startswith(prefix):
                return btype
        return const.OPENAI

@@ -160,13 +160,30 @@ class AgentLLMModel(LLMModel):
                    kwargs['system'] = system_prompt

                # Pass context metadata to bot
-                channel_type = getattr(self, 'channel_type', None)
+                channel_type = getattr(self, 'channel_type', None) or ''
                if channel_type:
                    kwargs['channel_type'] = channel_type
                session_id = getattr(self, 'session_id', None)
                if session_id:
                    kwargs['session_id'] = session_id

+                # Thinking mode is a global toggle independent of the channel.
+                # IM channels (WeChat/WeCom/DingTalk/Feishu) won't render the
+                # reasoning trace, but still benefit from the higher answer
+                # quality the thinking pass produces.
+                from config import conf
+                thinking_enabled = bool(conf().get("enable_thinking", False))
+                kwargs['thinking'] = (
+                    {"type": "enabled"} if thinking_enabled
+                    else {"type": "disabled"}
+                )
+                # Reasoning effort is only meaningful when thinking is on.
+                # Bots that don't understand the kwarg drop it silently.
+                if thinking_enabled:
+                    effort = conf().get("reasoning_effort", "high")
+                    if effort in ("high", "max"):
+                        kwargs['reasoning_effort'] = effort
+
                response = self.bot.call_with_tools(**kwargs)
                return self._format_response(response)
            else:
@@ -205,13 +222,30 @@ class AgentLLMModel(LLMModel):
                    kwargs['system'] = system_prompt

                # Pass context metadata to bot
-                channel_type = getattr(self, 'channel_type', None)
+                channel_type = getattr(self, 'channel_type', None) or ''
                if channel_type:
                    kwargs['channel_type'] = channel_type
                session_id = getattr(self, 'session_id', None)
                if session_id:
                    kwargs['session_id'] = session_id

+                # Thinking mode is a global toggle independent of the channel.
+                # IM channels (WeChat/WeCom/DingTalk/Feishu) won't render the
+                # reasoning trace, but still benefit from the higher answer
+                # quality the thinking pass produces.
+                from config import conf
+                thinking_enabled = bool(conf().get("enable_thinking", False))
+                kwargs['thinking'] = (
+                    {"type": "enabled"} if thinking_enabled
+                    else {"type": "disabled"}
+                )
+                # Reasoning effort is only meaningful when thinking is on.
+                # Bots that don't understand the kwarg drop it silently.
+                if thinking_enabled:
+                    effort = conf().get("reasoning_effort", "high")
+                    if effort in ("high", "max"):
+                        kwargs['reasoning_effort'] = effort
+
                stream = self.bot.call_with_tools(**kwargs)
                
                # Convert stream format to our expected format
@@ -398,6 +432,18 @@ class AgentBridge:
            # Store session_id on agent so executor can clear DB on fatal errors
            agent._current_session_id = session_id

+            # Bound the in-memory context for scheduler sessions before each run.
+            # Scheduler sessions are stable per-task and append every trigger,
+            # so without trimming they would grow unbounded across runs and
+            # blow up prompt cost. Regular user chats are not touched here —
+            # the agent's own context manager handles that path.
+            if session_id and session_id.startswith("scheduler_"):
+                from config import conf
+                scheduler_keep_turns = max(
+                    1, int(conf().get("agent_max_context_turns", 20)) // 5
+                )
+                self._trim_in_memory_to_turns(agent, scheduler_keep_turns)
+
            try:
                # Use agent's run_stream method with event handler
                response = agent.run_stream(
@@ -430,7 +476,13 @@ class AgentBridge:
                        except Exception as e:
                            logger.warning(f"[AgentBridge] Failed to clear DB after recovery: {e}")
            
-            # Check if there are files to send (from read tool)
+            # Post-message hot-reload: detect edits to ~/cow/mcp.json and
+            # sync any new/removed MCP tools into the live agent in the
+            # background. Off the critical path so user latency is unaffected;
+            # changes take effect on the user's next message.
+            self._schedule_mcp_hot_reload(agent)
+
+            # Check if there are files to send (from send/read tool)
            if hasattr(agent, 'stream_executor') and hasattr(agent.stream_executor, 'files_to_send'):
                files_to_send = agent.stream_executor.files_to_send
                if files_to_send:
@@ -462,6 +514,31 @@ class AgentBridge:
                    logger.warning(f"[AgentBridge] Failed to clear DB after error: {db_err}")
            return Reply(ReplyType.ERROR, f"Agent error: {str(e)}")
    
+    def _schedule_mcp_hot_reload(self, agent):
+        """
+        Fire-and-forget: detect mcp.json edits and reconcile the agent's
+        tool dict in the background. Runs after the user's reply is sent,
+        so any cost (file stat, hash, server boot) never adds to user latency.
+        Failures are isolated and never raise into the message pipeline.
+        """
+        import threading
+        from agent.tools import ToolManager
+
+        def _run():
+            try:
+                tm = ToolManager()
+                tm.refresh_mcp_if_changed()
+                added, removed = tm.sync_mcp_into_agent(agent)
+                if added or removed:
+                    logger.info(
+                        f"[AgentBridge] Agent tools synced — "
+                        f"added={added}, removed={removed}"
+                    )
+            except Exception as e:
+                logger.warning(f"[AgentBridge] MCP hot-reload failed (non-fatal): {e}")
+
+        threading.Thread(target=_run, daemon=True, name="mcp-hot-reload").start()
+
    def _create_file_reply(self, file_info: dict, text_response: str, context: Context = None) -> Reply:
        """
        Create a reply for sending files
@@ -499,10 +576,14 @@ class AgentBridge:
                reply.text_content = text_response
            return reply
        
-        # For other unknown file types, return text with file info
-        message = text_response or file_info.get("message", "文件已准备")
-        message += f"\n\n[文件: {file_info.get('file_name', file_path)}]"
-        return Reply(ReplyType.TEXT, message)
+        # For all other file types (tar.gz, zip, etc.), also use FILE type
+        file_url = f"file://{file_path}"
+        logger.info(f"[AgentBridge] Sending generic file: {file_url}")
+        reply = Reply(ReplyType.FILE, file_url)
+        reply.file_name = file_info.get("file_name", os.path.basename(file_path))
+        if text_response:
+            reply.text_content = text_response
+        return reply
    
    def _migrate_config_to_env(self, workspace_root: str):
        """
@@ -588,18 +669,245 @@ class AgentBridge:
            from config import conf
            if not conf().get("conversation_persistence", True):
                return
+            # When deep-thinking display is disabled, strip "thinking" content
+            # blocks before persisting so they don't resurface on history reload.
+            # The in-memory message list keeps them intact for this run's
+            # multi-turn LLM context.
+            thinking_enabled = bool(conf().get("enable_thinking", False))
        except Exception:
-            pass
+            thinking_enabled = False
+
+        messages_to_store = new_messages
+        if not thinking_enabled:
+            messages_to_store = self._strip_thinking_blocks(new_messages)
+
        try:
            from agent.memory import get_conversation_store
            get_conversation_store().append_messages(
-                session_id, new_messages, channel_type=channel_type
+                session_id, messages_to_store, channel_type=channel_type
            )
        except Exception as e:
            logger.warning(
                f"[AgentBridge] Failed to persist messages for session={session_id}: {e}"
            )

+    # Marker used to identify scheduler-injected user messages so we can apply
+    # a sliding window without touching real user turns. The legacy prefix
+    # "Scheduled task" (written by the v2 PR) is also recognised when pruning,
+    # so old data can be aged out instead of leaking forever.
+    _SCHEDULED_MARKER = "[SCHEDULED]"
+    _SCHEDULED_LEGACY_MARKERS = ("Scheduled task",)
+
+    def remember_scheduled_output(
+        self,
+        session_id: str,
+        content: str,
+        channel_type: str = "",
+        task_description: str = "",
+    ) -> None:
+        """Add the visible output of a scheduled task to the receiver's session.
+
+        Scheduled task execution uses an isolated session so internal planning and
+        tool calls do not leak into the user's chat. The final message is still
+        part of the conversation from the user's point of view, so keep a small
+        visible turn in the receiver session for follow-up questions.
+
+        Configuration:
+            scheduler_inject_to_session (bool, default True):
+                Master switch. When False, this method is a no-op.
+            scheduler_inject_max_per_session (int, default 3):
+                Maximum scheduler-injected user/assistant pairs retained per
+                session. Older injections are pruned automatically.
+
+        Content is truncated to 2000 chars (plus a trailing "...") so that a
+        single high-volume task cannot bloat one entry.
+        """
+        from config import conf
+        if not conf().get("scheduler_inject_to_session", True):
+            return
+        if not session_id or not content:
+            return
+
+        max_len = 2000
+        if len(content) > max_len:
+            content = content[:max_len] + "..."
+
+        user_text = self._SCHEDULED_MARKER
+        if task_description:
+            user_text = f"{self._SCHEDULED_MARKER} {task_description}"
+
+        messages = [
+            {"role": "user", "content": [{"type": "text", "text": user_text}]},
+            {"role": "assistant", "content": [{"type": "text", "text": content}]},
+        ]
+
+        # Persist first so the new pair gets a stable seq, then prune old
+        # scheduler pairs in DB, then sync the in-memory agent.messages buffer.
+        self._persist_messages(session_id, messages, channel_type)
+
+        keep_last_n = max(int(conf().get("scheduler_inject_max_per_session", 3) or 0), 0)
+        try:
+            from agent.memory import get_conversation_store
+            deleted = get_conversation_store().prune_scheduled_messages(
+                session_id, keep_last_n=keep_last_n
+            )
+            if deleted:
+                logger.debug(
+                    f"[AgentBridge] Pruned {deleted} old scheduler messages "
+                    f"for session={session_id} (keep_last_n={keep_last_n})"
+                )
+        except Exception as e:
+            logger.warning(
+                f"[AgentBridge] Failed to prune scheduled messages "
+                f"for session={session_id}: {e}"
+            )
+
+        agent = self.agents.get(session_id)
+        if agent:
+            try:
+                with agent.messages_lock:
+                    agent.messages.extend(messages)
+                    self._prune_scheduled_in_memory(agent, keep_last_n)
+            except Exception as e:
+                logger.warning(
+                    f"[AgentBridge] Failed to update in-memory scheduled output "
+                    f"for session={session_id}: {e}"
+                )
+
+    @staticmethod
+    def _trim_in_memory_to_turns(agent, keep_turns: int) -> None:
+        """Bound ``agent.messages`` to the most recent ``keep_turns`` real
+        user/assistant turns, dropping older history together with any
+        intermediate tool_use/tool_result blocks that belonged to it.
+
+        A "real" user message is a user message with non-empty text and no
+        tool_result block, matching the heuristic used elsewhere when filtering
+        history (see ``AgentInitializer._filter_text_only_messages``).
+
+        No-op when the session is already within budget. Caller does not need
+        to hold the lock; this method acquires it itself.
+        """
+        if keep_turns <= 0:
+            return
+
+        def _is_real_user(msg) -> bool:
+            if not isinstance(msg, dict) or msg.get("role") != "user":
+                return False
+            content = msg.get("content")
+            if isinstance(content, list):
+                if any(
+                    isinstance(b, dict) and b.get("type") == "tool_result"
+                    for b in content
+                ):
+                    return False
+                return any(
+                    isinstance(b, dict) and b.get("type") == "text" and b.get("text")
+                    for b in content
+                )
+            if isinstance(content, str):
+                return bool(content.strip())
+            return False
+
+        with agent.messages_lock:
+            msgs = agent.messages
+            real_user_indices = [i for i, m in enumerate(msgs) if _is_real_user(m)]
+            if len(real_user_indices) <= keep_turns:
+                return
+
+            # Cut at the (k-th from the end) real user message; keep everything
+            # from there onwards so the surviving slice is still a valid
+            # user/assistant sequence.
+            cut_idx = real_user_indices[-keep_turns]
+            if cut_idx == 0:
+                return
+
+            kept = msgs[cut_idx:]
+            msgs.clear()
+            msgs.extend(kept)
+            logger.debug(
+                f"[AgentBridge] Trimmed in-memory messages to last "
+                f"{keep_turns} turns ({len(kept)} messages remain)"
+            )
+
+    @classmethod
+    def _prune_scheduled_in_memory(cls, agent, keep_last_n: int) -> None:
+        """Mirror conversation_store.prune_scheduled_messages on agent.messages.
+
+        Caller must hold ``agent.messages_lock``.
+        """
+        if keep_last_n < 0:
+            keep_last_n = 0
+
+        markers = (cls._SCHEDULED_MARKER,) + cls._SCHEDULED_LEGACY_MARKERS
+
+        def _is_marker_user(msg) -> bool:
+            if not isinstance(msg, dict) or msg.get("role") != "user":
+                return False
+            content = msg.get("content")
+            text = ""
+            if isinstance(content, str):
+                text = content
+            elif isinstance(content, list):
+                for block in content:
+                    if isinstance(block, dict) and block.get("type") == "text":
+                        text = block.get("text") or ""
+                        break
+            return any(text.startswith(m) for m in markers)
+
+        msgs = agent.messages
+        pair_indices = []  # list of (user_idx, assistant_idx_or_None)
+        for idx, msg in enumerate(msgs):
+            if not _is_marker_user(msg):
+                continue
+            assistant_idx = None
+            if idx + 1 < len(msgs):
+                nxt = msgs[idx + 1]
+                if isinstance(nxt, dict) and nxt.get("role") == "assistant":
+                    assistant_idx = idx + 1
+            pair_indices.append((idx, assistant_idx))
+
+        if len(pair_indices) <= keep_last_n:
+            return
+
+        to_drop = pair_indices[: len(pair_indices) - keep_last_n]
+        drop_set = set()
+        for u_idx, a_idx in to_drop:
+            drop_set.add(u_idx)
+            if a_idx is not None:
+                drop_set.add(a_idx)
+
+        # Rebuild the list in place to keep external references stable.
+        kept = [m for i, m in enumerate(msgs) if i not in drop_set]
+        msgs.clear()
+        msgs.extend(kept)
+
+    @staticmethod
+    def _strip_thinking_blocks(messages: list) -> list:
+        """Return a shallow copy of messages with assistant "thinking" blocks removed."""
+        cleaned = []
+        for msg in messages:
+            if not isinstance(msg, dict):
+                cleaned.append(msg)
+                continue
+            if msg.get("role") != "assistant":
+                cleaned.append(msg)
+                continue
+            content = msg.get("content")
+            if not isinstance(content, list):
+                cleaned.append(msg)
+                continue
+            filtered_blocks = [
+                b for b in content
+                if not (isinstance(b, dict) and b.get("type") == "thinking")
+            ]
+            if len(filtered_blocks) == len(content):
+                cleaned.append(msg)
+            else:
+                new_msg = dict(msg)
+                new_msg["content"] = filtered_blocks
+                cleaned.append(new_msg)
+        return cleaned
+
    def clear_session(self, session_id: str):
        """
        Clear a specific session's agent and conversation history
@@ -685,4 +993,4 @@ class AgentBridge:
                agent.tools = [t for t in agent.tools if t.name != "web_search"]
                logger.info("[AgentBridge] web_search tool removed (API key no longer available)")
        except Exception as e:
-            logger.debug(f"[AgentBridge] Failed to refresh conditional tools: {e}")
+            logger.debug(f"[AgentBridge] Failed to refresh conditional tools: {e}")
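Reviewer sketch (not part of the patch): the sliding window applied by `_prune_scheduled_in_memory` in the AgentBridge changes above can be illustrated in isolation roughly as follows. This is a minimal sketch under stated assumptions: `prune_scheduled` and `MARKERS` are hypothetical names, and message content is assumed to be a plain string rather than a list of blocks.

```python
from typing import Dict, List

# Hypothetical marker tuple mirroring _SCHEDULED_MARKER plus the legacy prefix.
MARKERS = ("[SCHEDULED]", "Scheduled task")


def prune_scheduled(msgs: List[Dict], keep_last_n: int) -> int:
    """Drop all but the newest keep_last_n scheduler pairs, in place.

    Returns the number of messages removed. Simplification: content is
    assumed to be a plain string, not a list of content blocks.
    """
    keep_last_n = max(keep_last_n, 0)
    pairs = []  # (user_idx, assistant_idx or None)
    for i, m in enumerate(msgs):
        text = str(m.get("content") or "")
        if m.get("role") != "user" or not text.startswith(MARKERS):
            continue
        nxt = msgs[i + 1] if i + 1 < len(msgs) else None
        has_reply = nxt is not None and nxt.get("role") == "assistant"
        pairs.append((i, i + 1 if has_reply else None))

    excess = len(pairs) - keep_last_n
    if excess <= 0:
        return 0
    drop = {idx for pair in pairs[:excess] for idx in pair if idx is not None}
    # Rebuild in place so other references to the list stay valid.
    msgs[:] = [m for i, m in enumerate(msgs) if i not in drop]
    return len(drop)


history = [
    {"role": "user", "content": "[SCHEDULED] weather"},
    {"role": "assistant", "content": "Sunny"},
    {"role": "user", "content": "thanks"},
    {"role": "assistant", "content": "You're welcome"},
    {"role": "user", "content": "Scheduled task: news"},
    {"role": "assistant", "content": "Headlines"},
]
removed = prune_scheduled(history, keep_last_n=1)
```

Only the oldest marker pair is dropped here; the real user turn between the two scheduler pairs is left untouched, which is the point of keying on the marker prefix rather than on position.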
--- a/bridge/agent_event_handler.py
+++ b/bridge/agent_event_handler.py
@@ -26,8 +26,7 @@ class AgentEventHandler:
        if context:
            self.channel = context.kwargs.get("channel") if hasattr(context, "kwargs") else None
        
-        # Track current thinking for channel output
-        self.current_thinking = ""
+        self.current_content = ""
        self.turn_number = 0
    
    def handle_event(self, event):
@@ -47,6 +46,8 @@ class AgentEventHandler:
            self._handle_message_update(data)
        elif event_type == "message_end":
            self._handle_message_end(data)
+        elif event_type == "reasoning_update":
+            pass
        elif event_type == "tool_execution_start":
            self._handle_tool_execution_start(data)
        elif event_type == "tool_execution_end":
@@ -59,30 +60,26 @@ class AgentEventHandler:
    def _handle_turn_start(self, data):
        """Handle turn start event"""
        self.turn_number = data.get("turn", 0)
-        self.has_tool_calls_in_turn = False
-        self.current_thinking = ""
+        self.current_content = ""
    
    def _handle_message_update(self, data):
-        """Handle message update event (streaming text)"""
+        """Handle message update event (streaming content text)"""
        delta = data.get("delta", "")
-        self.current_thinking += delta
+        self.current_content += delta
    
    def _handle_message_end(self, data):
        """Handle message end event"""
        tool_calls = data.get("tool_calls", [])
        
-        # Only send thinking process if followed by tool calls
        if tool_calls:
-            if self.current_thinking.strip():
-                logger.info(f"💭 {self.current_thinking.strip()[:200]}{'...' if len(self.current_thinking) > 200 else ''}")
-                # Send thinking process to channel
-                self._send_to_channel(f"{self.current_thinking.strip()}")
+            if self.current_content.strip():
+                logger.info(f"💭 {self.current_content.strip()[:200]}{'...' if len(self.current_content.strip()) > 200 else ''}")
+                self._send_to_channel(self.current_content.strip())
        else:
-            # No tool calls = final response (logged at agent_stream level)
-            if self.current_thinking.strip():
-                logger.debug(f"💬 {self.current_thinking.strip()[:200]}{'...' if len(self.current_thinking) > 200 else ''}")
+            if self.current_content.strip():
+                logger.debug(f"💬 {self.current_content.strip()[:200]}{'...' if len(self.current_content.strip()) > 200 else ''}")
        
-        self.current_thinking = ""
+        self.current_content = ""
    
    def _handle_tool_execution_start(self, data):
        """Handle tool execution start event - logged by agent_stream.py"""
--- a/bridge/agent_initializer.py
+++ b/bridge/agent_initializer.py
@@ -5,6 +5,7 @@ Agent Initializer - Handles agent initialization logic
 import os
 import asyncio
 import datetime
+import threading
 import time
 from typing import Optional, List

@@ -13,6 +14,13 @@ from agent.tools import ToolManager
 from common.log import logger
 from common.utils import expand_path

+# Module-level lock to serialize scheduler init across concurrent sessions
+_scheduler_init_lock = threading.Lock()
+
+# Track whether the embedding model log has been printed in this process,
+# so it is logged once per process rather than once per session.
+_embedding_logged: bool = False
+

 class AgentInitializer:
    """
@@ -144,7 +152,15 @@ class AgentInitializer:
            from agent.memory import get_conversation_store
            store = get_conversation_store()
            max_turns = conf().get("agent_max_context_turns", 20)
-            restore_turns = max(3, max_turns // 6)
+            # Scheduler tasks run on a stable isolated session per task and
+            # can fire many times a day; they restore the "last few" runs for
+            # trend / dedup style logic, with a floor of 1 turn instead of 3.
+            # This window is not always smaller than the regular one (4 vs 3
+            # turns at max_turns=20); regular chat keeps the original heuristic.
+            if session_id.startswith("scheduler_"):
+                restore_turns = max(1, max_turns // 5)
+            else:
+                restore_turns = max(3, max_turns // 6)
            saved = store.load_messages(session_id, max_turns=restore_turns)
            if saved:
                filtered = self._filter_text_only_messages(saved)
@@ -260,52 +276,19 @@ class AgentInitializer:
        memory_tools = []
        
        try:
-            from agent.memory import MemoryManager, MemoryConfig, create_embedding_provider
+            from agent.memory import MemoryManager, MemoryConfig
            from agent.tools import MemorySearchTool, MemoryGetTool
            from config import conf
-            
-            # Initialize embedding provider (prefer OpenAI, fallback to LinkAI)
-            embedding_provider = None

-            openai_api_key = conf().get("open_ai_api_key", "")
-            openai_api_base = conf().get("open_ai_api_base", "")
-            if openai_api_key and openai_api_key not in ["", "YOUR API KEY", "YOUR_API_KEY"]:
-                try:
-                    embedding_provider = create_embedding_provider(
-                        provider="openai",
-                        model="text-embedding-3-small",
-                        api_key=openai_api_key,
-                        api_base=openai_api_base or "https://api.openai.com/v1"
-                    )
-                    if session_id is None:
-                        logger.info("[AgentInitializer] OpenAI embedding initialized")
-                except Exception as e:
-                    logger.warning(f"[AgentInitializer] OpenAI embedding failed: {e}")
-
-            if embedding_provider is None:
-                linkai_api_key = conf().get("linkai_api_key", "") or os.environ.get("LINKAI_API_KEY", "")
-                linkai_api_base = conf().get("linkai_api_base", "https://api.link-ai.tech")
-                if linkai_api_key and linkai_api_key not in ["", "YOUR API KEY", "YOUR_API_KEY"]:
-                    try:
-                        embedding_provider = create_embedding_provider(
-                            provider="linkai",
-                            model="text-embedding-3-small",
-                            api_key=linkai_api_key,
-                            api_base=f"{linkai_api_base}/v1"
-                        )
-                        if session_id is None:
-                            logger.info("[AgentInitializer] LinkAI embedding initialized (fallback)")
-                    except Exception as e:
-                        logger.warning(f"[AgentInitializer] LinkAI embedding failed: {e}")
-            
-            # Create memory manager
            memory_config = MemoryConfig(workspace_root=workspace_root)
+
+            embedding_provider = self._init_embedding_provider(
+                memory_config, session_id=session_id
+            )
+
            memory_manager = MemoryManager(memory_config, embedding_provider=embedding_provider)
-            
-            # Sync memory
            self._sync_memory(memory_manager, session_id)
-            
-            # Create memory tools
+
            memory_tools = [
                MemorySearchTool(memory_manager),
                MemoryGetTool(memory_manager)
@@ -318,6 +301,190 @@ class AgentInitializer:
            logger.warning(f"[AgentInitializer] Memory system not available: {e}")
        
        return memory_manager, memory_tools
+
+    def _init_embedding_provider(self, memory_config, session_id: Optional[str] = None):
+        """
+        Initialize the embedding provider for memory.
+
+        Two paths:
+          A. Default (no `embedding_provider` in config.json):
+             Auto-init OpenAI -> LinkAI fallback. Existing 1536-dim indices
+             keep working.
+          B. Explicit (`embedding_provider` is set):
+             Initialize the requested vendor with unified dim (default 1024).
+             If the index was built with a different dim, vector search will
+             quietly return no results (cosine returns 0) and keyword search
+             takes over until the user runs /memory rebuild-index.
+        """
+        from agent.memory import create_embedding_provider
+        from config import conf
+
+        explicit_provider = (conf().get("embedding_provider") or "").strip().lower()
+
+        if not explicit_provider:
+            return self._init_embedding_provider_legacy(session_id=session_id)
+
+        return self._init_embedding_provider_explicit(
+            memory_config, explicit_provider, session_id=session_id,
+        )
+
+    def _init_embedding_provider_legacy(self, session_id: Optional[str] = None):
+        """Legacy auto-init path: OpenAI -> LinkAI. Preserved verbatim for compat."""
+        from agent.memory import create_embedding_provider
+        from config import conf
+
+        embedding_provider = None
+        embedding_model = None
+
+        openai_api_key = conf().get("open_ai_api_key", "")
+        openai_api_base = conf().get("open_ai_api_base", "")
+        if openai_api_key and openai_api_key not in ["", "YOUR API KEY", "YOUR_API_KEY"]:
+            try:
+                model = "text-embedding-3-small"
+                embedding_provider = create_embedding_provider(
+                    provider="openai",
+                    model=model,
+                    api_key=openai_api_key,
+                    api_base=openai_api_base or "https://api.openai.com/v1"
+                )
+                embedding_model = f"openai/{model}"
+            except Exception as e:
+                logger.warning(f"[AgentInitializer] OpenAI embedding failed: {e}")
+
+        if embedding_provider is None:
+            linkai_api_key = conf().get("linkai_api_key", "") or os.environ.get("LINKAI_API_KEY", "")
+            linkai_api_base = conf().get("linkai_api_base", "https://api.link-ai.tech")
+            if linkai_api_key and linkai_api_key not in ["", "YOUR API KEY", "YOUR_API_KEY"]:
+                try:
+                    model = "text-embedding-3-small"
+                    embedding_provider = create_embedding_provider(
+                        provider="linkai",
+                        model=model,
+                        api_key=linkai_api_key,
+                        api_base=f"{linkai_api_base}/v1"
+                    )
+                    embedding_model = f"linkai/{model}"
+                except Exception as e:
+                    logger.warning(f"[AgentInitializer] LinkAI embedding failed: {e}")
+
+        if embedding_provider is not None and embedding_model:
+            global _embedding_logged
+            if not _embedding_logged:
+                logger.info(
+                    f"[AgentInitializer] Embedding model in use: {embedding_model} "
+                    f"(dim={embedding_provider.dimensions})"
+                )
+                _embedding_logged = True
+
+        return embedding_provider
+
+    def _init_embedding_provider_explicit(
+        self,
+        memory_config,
+        provider_key: str,
+        session_id: Optional[str] = None,
+    ):
+        """Explicit-provider path: build the configured vendor.
+
+        If the index was built with a different dim, vector search will
+        silently return no results (cosine returns 0 for mismatched dims)
+        and keyword search takes over. Users switch vendors by running
+        /memory rebuild-index — see docs.
+        """
+        from agent.memory import create_embedding_provider
+        from agent.memory.embedding import EMBEDDING_VENDORS
+        from config import conf
+
+        meta = EMBEDDING_VENDORS.get(provider_key)
+        if meta is None:
+            logger.error(
+                f"[AgentInitializer] Unknown embedding_provider '{provider_key}'. "
+                f"Supported: {sorted(EMBEDDING_VENDORS.keys())}. "
+                f"Memory will run in keyword-only mode."
+            )
+            return None
+
+        api_key = self._resolve_embedding_api_key(provider_key)
+        api_base = self._resolve_embedding_api_base(provider_key, meta["default_base_url"])
+
+        if not api_key:
+            logger.error(
+                f"[AgentInitializer] embedding_provider='{provider_key}' is set but its "
+                f"API key is missing. Memory will run in keyword-only mode."
+            )
+            return None
+
+        model = (conf().get("embedding_model") or "").strip() or meta["default_model"]
+        try:
+            cfg_dim = int(conf().get("embedding_dimensions") or 0)
+        except (TypeError, ValueError):
+            cfg_dim = 0
+        dim = cfg_dim if cfg_dim > 0 else meta["default_dimensions"]
+
+        try:
+            provider = create_embedding_provider(
+                provider=provider_key,
+                model=model,
+                api_key=api_key,
+                api_base=api_base,
+                dimensions=dim,
+            )
+        except Exception as e:
+            logger.error(
+                f"[AgentInitializer] Failed to init embedding provider "
+                f"'{provider_key}/{model}': {e}"
+            )
+            return None
+
+        global _embedding_logged
+        if not _embedding_logged:
+            logger.info(
+                f"[AgentInitializer] Embedding model in use: "
+                f"{provider_key}/{model} (dim={provider.dimensions})"
+            )
+            _embedding_logged = True
+        return provider
+
+    @staticmethod
+    def _resolve_embedding_api_key(provider_key: str) -> str:
+        """Pick the API key for an explicit embedding provider from config."""
+        from config import conf
+
+        key_map = {
+            "openai":    "open_ai_api_key",
+            "linkai":    "linkai_api_key",
+            "dashscope": "dashscope_api_key",
+            "doubao":    "ark_api_key",
+            "zhipu":     "zhipu_ai_api_key",
+        }
+        field = key_map.get(provider_key)
+        if not field:
+            return ""
+        value = conf().get(field, "") or ""
+        if value in ["", "YOUR API KEY", "YOUR_API_KEY"]:
+            return ""
+        return value
+
+    @staticmethod
+    def _resolve_embedding_api_base(provider_key: str, default_base: str) -> str:
+        """Pick the API base for an explicit embedding provider from config."""
+        from config import conf
+
+        base_map = {
+            "openai":    "open_ai_api_base",
+            "linkai":    "linkai_api_base",
+            "doubao":    "ark_base_url",
+            "zhipu":     "zhipu_ai_api_base",
+        }
+        field = base_map.get(provider_key)
+        if not field:
+            return default_base
+        value = (conf().get(field) or "").strip()
+        if not value:
+            return default_base
+        if provider_key == "linkai" and not value.rstrip("/").endswith("/v1"):
+            return f"{value.rstrip('/')}/v1"
+        return value
    
    def _sync_memory(self, memory_manager, session_id: Optional[str] = None):
        """Sync memory database"""
@@ -354,7 +521,7 @@ class AgentInitializer:
                if tool_name == "web_search":
                    from agent.tools.web_search.web_search import WebSearch
                    if not WebSearch.is_available():
-                        logger.debug("[AgentInitializer] WebSearch skipped - no BOCHA_API_KEY or LINKAI_API_KEY")
+                        logger.debug("[AgentInitializer] WebSearch skipped - no search provider configured")
                        continue

                # Special handling for EnvConfig tool
@@ -365,16 +532,33 @@ class AgentInitializer:
                    tool = tool_manager.create_tool(tool_name)

                if tool:
-                    # Apply workspace config to file operation tools
+                    # Apply workspace config to file operation tools.
+                    # Merge into the existing tool.config (set by ToolManager from
+                    # config.json's `tools.<name>` section) instead of replacing
+                    # it, otherwise per-tool user configs (e.g. browser.cdp_endpoint)
+                    # would be silently dropped.
                    if tool_name in ['read', 'write', 'edit', 'bash', 'grep', 'find', 'ls', 'web_fetch', 'send', 'browser']:
-                        tool.config = file_config
-                        tool.cwd = file_config.get("cwd", getattr(tool, 'cwd', None))
-                        if 'memory_manager' in file_config:
-                            tool.memory_manager = file_config['memory_manager']
+                        merged_config = dict(getattr(tool, 'config', None) or {})
+                        merged_config.update(file_config)
+                        tool.config = merged_config
+                        tool.cwd = merged_config.get("cwd", getattr(tool, 'cwd', None))
+                        if 'memory_manager' in merged_config:
+                            tool.memory_manager = merged_config['memory_manager']
                    tools.append(tool)
            except Exception as e:
                logger.warning(f"[AgentInitializer] Failed to load tool {tool_name}: {e}")
-        
+
+        # Add MCP tools (snapshot to avoid races with the background loader)
+        mcp_tools_snapshot = list(tool_manager._mcp_tool_instances.items())
+        if mcp_tools_snapshot:
+            for _, mcp_tool in mcp_tools_snapshot:
+                tools.append(mcp_tool)
+            if session_id is None:
+                names = [name for name, _ in mcp_tools_snapshot]
+                logger.info(
+                    f"[AgentInitializer] Added {len(names)} MCP tool(s): {names}"
+                )
+
        # Add memory tools
        if memory_tools:
            tools.extend(memory_tools)
@@ -387,16 +571,23 @@ class AgentInitializer:
        return tools
    
    def _initialize_scheduler(self, tools: List, session_id: Optional[str] = None):
-        """Initialize scheduler service if needed"""
+        """Initialize scheduler service if needed.
+
+        Serialize the check-and-set under a module-level lock so concurrent
+        first-time session inits cannot each create a new SchedulerService
+        (which would leak background scanning threads).
+        """
        if not self.agent_bridge.scheduler_initialized:
-            try:
-                from agent.tools.scheduler.integration import init_scheduler
-                if init_scheduler(self.agent_bridge):
-                    self.agent_bridge.scheduler_initialized = True
-                    if session_id is None:
-                        logger.info("[AgentInitializer] Scheduler service initialized")
-            except Exception as e:
-                logger.warning(f"[AgentInitializer] Failed to initialize scheduler: {e}")
+            with _scheduler_init_lock:
+                if not self.agent_bridge.scheduler_initialized:
+                    try:
+                        from agent.tools.scheduler.integration import init_scheduler
+                        if init_scheduler(self.agent_bridge):
+                            self.agent_bridge.scheduler_initialized = True
+                            if session_id is None:
+                                logger.info("[AgentInitializer] Scheduler service initialized")
+                    except Exception as e:
+                        logger.warning(f"[AgentInitializer] Failed to initialize scheduler: {e}")
        
        # Inject scheduler dependencies
        if self.agent_bridge.scheduler_initialized:
@@ -548,17 +739,23 @@ class AgentInitializer:
        import threading

        def _daily_flush_loop():
+            import random
+            last_run_date = None  # Track last successful run date to prevent same-day re-trigger
            while True:
                try:
                    now = datetime.datetime.now()
-                    target = now.replace(hour=23, minute=55, second=0, microsecond=0)
-                    if target <= now:
+                    jitter_min = random.randint(50, 55)
+                    jitter_sec = random.randint(0, 59)
+                    target = now.replace(hour=23, minute=jitter_min, second=jitter_sec, microsecond=0)
+                    # Always schedule for tomorrow if we already ran today, or if target time has passed
+                    if target <= now or (last_run_date == now.date()):
                        target += datetime.timedelta(days=1)
                    wait_seconds = (target - now).total_seconds()
-                    logger.info(f"[DailyFlush] Next flush at {target.strftime('%Y-%m-%d %H:%M')} (in {wait_seconds/3600:.1f}h)")
+                    logger.info(f"[DailyFlush] Next flush at {target.strftime('%Y-%m-%d %H:%M:%S')} (in {wait_seconds/3600:.1f}h)")
                    time.sleep(wait_seconds)

                    self._flush_all_agents()
+                    last_run_date = datetime.datetime.now().date()
                except Exception as e:
                    logger.warning(f"[DailyFlush] Error in daily flush loop: {e}")
                    time.sleep(3600)
@@ -567,7 +764,7 @@ class AgentInitializer:
        t.start()

    def _flush_all_agents(self):
-        """Flush memory for all active agent sessions."""
+        """Flush memory for all active agent sessions, then run Deep Dream."""
        agents = []
        if self.agent_bridge.default_agent:
            agents.append(("default", self.agent_bridge.default_agent))
@@ -577,7 +774,10 @@ class AgentInitializer:
        if not agents:
            return

+        # Phase 1: flush daily summaries
        flushed = 0
+        flush_threads = []
+        dream_candidate = None
        for label, agent in agents:
            try:
                if not agent.memory_manager:
@@ -589,8 +789,26 @@ class AgentInitializer:
                result = agent.memory_manager.flush_manager.create_daily_summary(messages)
                if result:
                    flushed += 1
+                    t = agent.memory_manager.flush_manager._last_flush_thread
+                    if t:
+                        flush_threads.append(t)
+                if dream_candidate is None:
+                    dream_candidate = agent.memory_manager.flush_manager
            except Exception as e:
                logger.warning(f"[DailyFlush] Failed for session {label}: {e}")

        if flushed:
            logger.info(f"[DailyFlush] Flushed {flushed}/{len(agents)} agent session(s)")
+
+        # Wait for all flush threads to finish before dreaming
+        for t in flush_threads:
+            t.join(timeout=60)
+
+        # Phase 2: Deep Dream — distill daily memories → MEMORY.md + dream diary
+        if dream_candidate:
+            try:
+                result = dream_candidate.deep_dream()
+                if result:
+                    logger.info("[DeepDream] Memory distillation completed successfully")
+            except Exception as e:
+                logger.warning(f"[DeepDream] Failed: {e}")
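Reviewer sketch (not part of the patch): the target-time selection in `_daily_flush_loop` above combines a random 23:50-23:55 jitter with a same-day guard. Pulled out as a pure function it might look like the following; `next_flush_target` and its `rng` parameter are hypothetical, introduced only so the logic can be exercised without sleeping.

```python
import datetime
import random
from typing import Optional


def next_flush_target(
    now: datetime.datetime,
    last_run_date: Optional[datetime.date],
    rng: Optional[random.Random] = None,
) -> datetime.datetime:
    """Pick the next jittered flush time, skipping today if already run."""
    rng = rng or random.Random()
    target = now.replace(
        hour=23,
        minute=rng.randint(50, 55),  # same jitter range as the diff
        second=rng.randint(0, 59),
        microsecond=0,
    )
    # Roll over to tomorrow if the slot has passed or today's run is done.
    if target <= now or last_run_date == now.date():
        target += datetime.timedelta(days=1)
    return target


morning = datetime.datetime(2026, 5, 22, 9, 0, 0)
late = datetime.datetime(2026, 5, 22, 23, 56, 0)
first = next_flush_target(morning, last_run_date=None)
after_run = next_flush_target(morning, last_run_date=morning.date())
rolled = next_flush_target(late, last_run_date=None)
```

Without the `last_run_date` check, a flush that finishes at 23:50 could draw a later jitter (say 23:54) on the next loop iteration and fire a second time the same night.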
--- a/bridge/bridge.py
+++ b/bridge/bridge.py
@@ -14,7 +14,9 @@ class Bridge(object):
    def __init__(self):
        self.btype = {
            "chat": const.OPENAI,
-            "voice_to_text": conf().get("voice_to_text", "openai"),
+            # Empty `voice_to_text` (the default in new configs) triggers
+            # the auto-pick below — see _auto_pick_voice_to_text for order.
+            "voice_to_text": conf().get("voice_to_text") or self._auto_pick_voice_to_text(),
            "text_to_voice": conf().get("text_to_voice", "google"),
            "translate": conf().get("translate", "baidu"),
        }
@@ -61,6 +63,11 @@ class Bridge(object):
            if model_type and model_type.startswith("deepseek"):
                self.btype["chat"] = const.DEEPSEEK

+            if model_type and isinstance(model_type, str):
+                lowered_model_type = model_type.lower()
+                if lowered_model_type == const.QIANFAN or lowered_model_type.startswith("ernie"):
+                    self.btype["chat"] = const.QIANFAN
+
            if model_type in [const.MODELSCOPE]:
                self.btype["chat"] = const.MODELSCOPE
            
@@ -79,6 +86,46 @@ class Bridge(object):
        self.chat_bots = {}
        self._agent_bridge = None

+    def refresh_voice(self):
+        """Re-read voice_to_text / text_to_voice from config and drop the
+        cached voice bots so the next call picks up the new provider.
+        Used by the web console after the user edits voice settings.
+        Does NOT touch the agent_bridge / agent state.
+        """
+        new_v2t = conf().get("voice_to_text") or self._auto_pick_voice_to_text()
+        new_t2v = conf().get("text_to_voice", "google")
+        if conf().get("use_linkai") and conf().get("linkai_api_key"):
+            if not conf().get("voice_to_text") or conf().get("voice_to_text") in ["openai"]:
+                new_v2t = const.LINKAI
+            if not conf().get("text_to_voice") or conf().get("text_to_voice") in ["openai", const.TTS_1, const.TTS_1_HD]:
+                new_t2v = const.LINKAI
+        self.btype["voice_to_text"] = new_v2t
+        self.btype["text_to_voice"] = new_t2v
+        self.bots.pop("voice_to_text", None)
+        self.bots.pop("text_to_voice", None)
+        logger.info(f"[Bridge] voice refreshed: voice_to_text={new_v2t}, text_to_voice={new_t2v}")
+
+    @staticmethod
+    def _auto_pick_voice_to_text() -> str:
+        """Pick an ASR provider by configured api keys when voice_to_text is
+        unset. Order matches the web console: openai → dashscope → zhipu →
+        linkai. Falls back to 'openai' when nothing is configured so the
+        original "missing key" error is preserved.
+        """
+        def has(k: str) -> bool:
+            v = (conf().get(k) or "").strip()
+            return v != "" and v not in ("YOUR API KEY", "YOUR_API_KEY")
+
+        for key, provider in (
+            ("open_ai_api_key", "openai"),
+            ("dashscope_api_key", "dashscope"),
+            ("zhipu_ai_api_key", "zhipu"),
+            ("linkai_api_key", "linkai"),
+        ):
+            if has(key):
+                return provider
+        return "openai"
+
    # 模型对应的接口
    def get_bot(self, typename):
        if self.bots.get(typename) is None:
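Review note: the selection rule introduced in `_auto_pick_voice_to_text` can be exercised without the global `conf()` object. The sketch below restates the rule from the diff over a plain mapping; the function name `pick_voice_to_text` and the `cfg` parameter are assumptions made for illustration, while the key names, provider order and placeholder values are copied from the hunk above.

```python
from typing import Mapping

# Placeholder values treated as "not configured", as in the diff above.
PLACEHOLDERS = ("YOUR API KEY", "YOUR_API_KEY")

# (config key, provider) pairs in priority order, as in the diff above.
ASR_PRIORITY = (
    ("open_ai_api_key", "openai"),
    ("dashscope_api_key", "dashscope"),
    ("zhipu_ai_api_key", "zhipu"),
    ("linkai_api_key", "linkai"),
)


def pick_voice_to_text(cfg: Mapping[str, object]) -> str:
    """Return the explicit provider if set, else the first one with a real key."""
    explicit = cfg.get("voice_to_text")
    if explicit:
        return str(explicit)
    for key, provider in ASR_PRIORITY:
        value = str(cfg.get(key) or "").strip()
        if value and value not in PLACEHOLDERS:
            return provider
    # Nothing configured: keep "openai" so the usual missing-key error appears.
    return "openai"
```

Keeping the priority list as data makes it easy to compare against the order used by the web console.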
--- a/channel/channel.py
+++ b/channel/channel.py
@@ -73,7 +73,7 @@ class Channel(object):
        Build reply content, using agent if enabled in config
        """
        # Check if agent mode is enabled
-        use_agent = conf().get("agent", False)
+        use_agent = conf().get("agent", True)

        if use_agent:
            try:
--- a/channel/chat_channel.py
+++ b/channel/chat_channel.py
@@ -171,7 +171,13 @@ class ChatChannel(Channel):
            if "desire_rtype" not in context and conf().get("always_reply_voice") and ReplyType.VOICE not in self.NOT_SUPPORT_REPLYTYPE:
                context["desire_rtype"] = ReplyType.VOICE
        elif context.type == ContextType.VOICE:
-            if "desire_rtype" not in context and conf().get("voice_reply_voice") and ReplyType.VOICE not in self.NOT_SUPPORT_REPLYTYPE:
+            # Voice input replies with voice when either voice_reply_voice
+            # (mirror voice) or the global always_reply_voice toggle is on.
+            if (
+                "desire_rtype" not in context
+                and (conf().get("voice_reply_voice") or conf().get("always_reply_voice"))
+                and ReplyType.VOICE not in self.NOT_SUPPORT_REPLYTYPE
+            ):
                context["desire_rtype"] = ReplyType.VOICE
        return context

@@ -264,6 +270,8 @@ class ChatChannel(Channel):
                if reply.type == ReplyType.TEXT:
                    reply_text = reply.content
                    if desire_rtype == ReplyType.VOICE and ReplyType.VOICE not in self.NOT_SUPPORT_REPLYTYPE:
+                        # Preserve original text for the "text-then-voice" pattern in _send_reply.
+                        context["voice_reply_text"] = reply.content
                        reply = super().build_text_to_voice(reply.content)
                        return self._decorate_reply(context, reply)
                    if context.get("isgroup", False):
@@ -297,8 +305,12 @@ class ChatChannel(Channel):
                logger.debug("[chat_channel] sending reply: {}, context: {}".format(reply, context))
                
                # 如果是文本回复，尝试提取并发送图片
-                if reply.type == ReplyType.TEXT:
+                # Web channel renders images/videos inline via renderMarkdown,
+                # so skip the extract-and-send step to avoid duplicate media.
+                if reply.type == ReplyType.TEXT and context.get("channel_type") != "web":
                    self._extract_and_send_images(reply, context)
+                elif reply.type == ReplyType.TEXT:
+                    self._send(reply, context)
                # 如果是图片回复但带有文本内容，先发文本再发图片
                elif reply.type == ReplyType.IMAGE_URL and hasattr(reply, 'text_content') and reply.text_content:
                    # 先发送文本
@@ -307,6 +319,15 @@ class ChatChannel(Channel):
                    # 短暂延迟后发送图片
                    time.sleep(0.3)
                    self._send(reply, context)
+                # Send text bubble before voice, unless channel already streamed
+                # the text (feishu) or natively renders STT under the voice (wechatcom).
+                elif reply.type == ReplyType.VOICE and context.get("voice_reply_text") \
+                        and not context.get("feishu_streamed") \
+                        and context.get("channel_type") not in ("wechatcom_app",):
+                    text_reply = Reply(ReplyType.TEXT, context.get("voice_reply_text"))
+                    self._send(text_reply, context)
+                    time.sleep(0.3)
+                    self._send(reply, context)
                else:
                    self._send(reply, context)
    
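Review note: the new "text bubble before voice" branch depends on three context fields. A small sketch of that decision as a pure function follows, assuming a plain dict in place of the project's `Context` object; `plan_voice_sends` is a hypothetical helper, not something this PR adds.

```python
from typing import Dict, List

# Channels that already render the transcript under the voice bubble,
# taken from the tuple used in the diff above.
NATIVE_STT_CHANNELS = ("wechatcom_app",)


def plan_voice_sends(context: Dict[str, object]) -> List[str]:
    """Return the ordered message kinds to send for a VOICE reply."""
    text = context.get("voice_reply_text")
    skip_text = (
        not text
        or bool(context.get("feishu_streamed"))
        or context.get("channel_type") in NATIVE_STT_CHANNELS
    )
    return ["voice"] if skip_text else ["text", "voice"]
```

Expressed this way, each exclusion (no saved text, text already streamed, native transcript) can be unit-tested separately from the channel send path.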
--- a/channel/dingtalk/dingtalk_channel.py
+++ b/channel/dingtalk/dingtalk_channel.py
@@ -86,6 +86,8 @@ def _check(func):

@singleton
 class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
+    NOT_SUPPORT_REPLYTYPE = []
+
    dingtalk_client_id = conf().get('dingtalk_client_id')
    dingtalk_client_secret = conf().get('dingtalk_client_secret')

@@ -870,6 +872,48 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
                    self.reply_text("抱歉，文件上传失败", incoming_message)
            return
        
+        # Native sampleAudio. Upload only accepts ogg/amr, so convert TTS mp3/wav to amr.
+        elif reply.type == ReplyType.VOICE:
+            logger.info(f"[DingTalk] Sending voice: {reply.content}")
+            access_token = self.get_access_token()
+            if not access_token:
+                logger.error("[DingTalk] Cannot get access token for voice")
+                self.reply_text("抱歉，语音发送失败（无法获取token）", incoming_message)
+                return
+
+            voice_path = reply.content
+            if voice_path.startswith("file://"):
+                voice_path = voice_path[7:]
+
+            amr_path = voice_path
+            duration_ms = 0
+            if not voice_path.lower().endswith((".amr", ".ogg")):
+                try:
+                    from voice.audio_convert import any_to_amr
+                    amr_path = os.path.splitext(voice_path)[0] + ".amr"
+                    duration_ms = int(any_to_amr(voice_path, amr_path) or 0)
+                except Exception as e:
+                    logger.error(f"[DingTalk] Failed to convert voice to amr: {e}")
+                    self.reply_text("抱歉，语音转码失败", incoming_message)
+                    return
+
+            media_id = self.upload_media(amr_path, media_type="voice")
+            if not media_id:
+                logger.error("[DingTalk] Failed to upload voice media")
+                self.reply_text("抱歉，语音上传失败", incoming_message)
+                return
+
+            msg_param = {
+                "mediaId": media_id,
+                "duration": str(duration_ms or 1000),
+            }
+            success = self._send_file_message(
+                access_token, incoming_message, "sampleAudio", msg_param, isgroup
+            )
+            if not success:
+                self.reply_text("抱歉，语音发送失败", incoming_message)
+            return
+
        # 处理文本消息
        elif reply.type == ReplyType.TEXT:
            logger.info(f"[DingTalk] Sending text message, length={len(reply.content)}")
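Review note: the path handling in the new DingTalk voice branch (strip `file://`, convert anything that is not `.amr`/`.ogg`, fall back to a 1000 ms duration) can be checked in isolation. The sketch below mirrors those rules; `plan_voice_upload` and `duration_field` are hypothetical names, and the actual conversion via `any_to_amr` is deliberately left out.

```python
import os
from typing import Tuple

# Extensions the upload step accepts without conversion, per the diff above.
UPLOADABLE_EXTENSIONS = (".amr", ".ogg")


def plan_voice_upload(content: str) -> Tuple[str, bool]:
    """Return (path to upload, whether a conversion to .amr is needed)."""
    path = content[len("file://"):] if content.startswith("file://") else content
    if path.lower().endswith(UPLOADABLE_EXTENSIONS):
        return path, False
    return os.path.splitext(path)[0] + ".amr", True


def duration_field(duration_ms: int) -> str:
    """Format the duration for msg_param, defaulting to 1000 ms when unknown."""
    return str(duration_ms or 1000)
```

One consequence of the rule is visible here: files that are already `.amr` or `.ogg` skip conversion, so their duration is never measured and the 1000 ms default is sent.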
--- a/channel/feishu/feishu_channel.py
+++ b/channel/feishu/feishu_channel.py
@@ -55,12 +55,186 @@ def _ensure_lark_imported():
    return lark


+def _print_qr_to_terminal(qr_url: str):
+    """Render a QR code as ASCII art and emit it via logger.
+
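+    Goes through logger rather than print: when started in the background via nohup/cow, stdout is
+    block-buffered, so the QR code would be emitted late (and look as if it appeared twice). The logger's
+    StreamHandler is line-buffered, so the output is visible in a foreground terminal and also lands in run.log.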
+    """
+    qr_lines = []
+    try:
+        import qrcode as qr_lib
+        import io
+        qr = qr_lib.QRCode(error_correction=qr_lib.constants.ERROR_CORRECT_L, box_size=1, border=1)
+        qr.add_data(qr_url)
+        qr.make(fit=True)
+        buf = io.StringIO()
+        qr.print_ascii(out=buf, invert=True)
+        qr_lines = buf.getvalue().splitlines()
+    except ImportError:
+        qr_lines = ["(未安装 qrcode 包，无法渲染 ASCII 二维码：pip install qrcode)"]
+    except Exception as e:
+        qr_lines = [f"(渲染二维码失败：{e})"]
+
+    header = "=" * 60
+    banner = [
+        "",
+        header,
+        "  飞书一键创建应用：请使用 飞书 App 扫描下方二维码",
+        "  （二维码 10 分钟内有效，仅供一次扫描）",
+        header,
+    ]
+    footer = [
+        f"  或点击链接创建: {qr_url}",
+        "  等待扫码...",
+        "",
+    ]
+    full = banner + qr_lines + footer
+    logger.info("[FeiShu] One-click 飞书应用创建二维码（请用飞书 App 扫码）：\n" + "\n".join(full))
+
+
+def _persist_feishu_credentials(app_id: str, app_secret: str) -> bool:
+    """Write feishu_app_id / feishu_app_secret + ensure feishu in channel_type into config.json.
+
+    Returns True on success, False on failure (e.g. config.json missing or unwritable).
+    """
+    try:
+        config_path = os.path.join(
+            os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))),
+            "config.json",
+        )
+        if os.path.exists(config_path):
+            with open(config_path, "r", encoding="utf-8") as f:
+                file_cfg = json.load(f)
+        else:
+            file_cfg = {}
+
+        file_cfg["feishu_app_id"] = app_id
+        file_cfg["feishu_app_secret"] = app_secret
+
+        # Ensure channel_type includes feishu (the user may launch a single channel purely via the CLI)
+        ch_type = file_cfg.get("channel_type", conf().get("channel_type", "")) or ""
+        existing = [s.strip() for s in ch_type.split(",") if s.strip()]
+        if "feishu" not in existing:
+            existing.append("feishu")
+            file_cfg["channel_type"] = ",".join(existing)
+
+        with open(config_path, "w", encoding="utf-8") as f:
+            json.dump(file_cfg, f, indent=4, ensure_ascii=False)
+
+        # Sync into the in-memory conf() so the change takes effect in this run
+        conf()["feishu_app_id"] = app_id
+        conf()["feishu_app_secret"] = app_secret
+        if "channel_type" in file_cfg:
+            conf()["channel_type"] = file_cfg["channel_type"]
+
+        try:
+            os.chmod(config_path, 0o600)
+        except Exception:
+            pass
+        return True
+    except Exception as e:
+        logger.error(f"[FeiShu] Failed to persist credentials to config.json: {e}")
+        return False
+
+
+def _register_via_qr_in_terminal() -> bool:
+    """CLI-side one-click app creation via lark_oapi.register_app.
+
+    Blocks the calling thread (typically the channel startup thread) until the user
+    finishes scanning, the QR code expires, or registration is cancelled.
+
+    Returns True if credentials were obtained AND persisted; False otherwise.
+    The caller should fall back to the original "missing credentials" error in that case.
+    """
+    if not LARK_SDK_AVAILABLE:
+        logger.error(
+            "[FeiShu] 缺少 feishu_app_id / feishu_app_secret。"
+            "未安装 lark-oapi SDK，无法在终端发起扫码创建。"
+            "请执行 pip install -U 'lark-oapi>=1.5.5' 后重试，或手动在 config.json 中填入凭据。"
+        )
+        return False
+
+    try:
+        lark_mod = _ensure_lark_imported()
+    except Exception as e:
+        logger.error(f"[FeiShu] Import lark_oapi failed: {e}")
+        return False
+
+    # register_app was introduced in lark-oapi 1.5.5; calling it on older versions raises an
+    # opaque AttributeError. Check explicitly up front and give a clear upgrade hint.
+    if not hasattr(lark_mod, "register_app"):
+        try:
+            from importlib.metadata import version as _pkg_version
+            installed = _pkg_version("lark-oapi")
+        except Exception:
+            installed = "unknown"
+        logger.error(
+            f"[FeiShu] 当前 lark-oapi 版本 ({installed}) 不支持一键创建应用，需要 >= 1.5.5。"
+            "请执行 pip install -U 'lark-oapi>=1.5.5' 后重试，或手动在 config.json 中填入凭据。"
+        )
+        return False
+
+    logger.info("[FeiShu] 检测到尚未配置 feishu_app_id / feishu_app_secret，"
+                "正在向飞书申请一键创建应用...")
+
+    def _on_qr(info):
+        url = info.get("url", "")
+        if url:
+            _print_qr_to_terminal(url)
+
+    def _on_status(info):
+        # Filter out the polling heartbeat (every 5 seconds); keep slow_down / domain_switched etc.
+        status = info.get("status")
+        if status == "polling":
+            return
+        logger.info(f"[FeiShu] register_app status: {info}")
+
+    try:
+        result = lark_mod.register_app(
+            on_qr_code=_on_qr,
+            on_status_change=_on_status,
+            source="cowagent",
+        )
+    except Exception as e:
+        err_cls = e.__class__.__name__
+        if "Expired" in err_cls:
+            logger.error("[FeiShu] 二维码已过期，请重启程序后重试。")
+        elif "Denied" in err_cls:
+            logger.error("[FeiShu] 已取消授权。")
+        else:
+            logger.error(f"[FeiShu] 一键创建失败：{e}")
+        return False
+
+    app_id = result.get("client_id", "")
+    app_secret = result.get("client_secret", "")
+    if not app_id or not app_secret:
+        logger.error("[FeiShu] 创建结果缺少 app_id/app_secret，无法继续。")
+        return False
+
+    if not _persist_feishu_credentials(app_id, app_secret):
+        logger.error(
+            "[FeiShu] 应用创建成功但写入 config.json 失败，请手动复制以下值到配置文件：\n"
+            f"        feishu_app_id     = {app_id}\n"
+            f"        feishu_app_secret = {app_secret}"
+        )
+        return False
+
+    logger.info(f"[FeiShu] 应用创建成功，凭据已写入 config.json (app_id={app_id})。")
+    return True
+
+
@singleton
 class FeiShuChanel(ChatChannel):
    feishu_app_id = conf().get('feishu_app_id')
    feishu_app_secret = conf().get('feishu_app_secret')
    feishu_token = conf().get('feishu_token')
    feishu_event_mode = conf().get('feishu_event_mode', 'websocket')  # webhook 或 websocket
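+    # Override the parent default [ReplyType.VOICE, ReplyType.IMAGE].
+    # Feishu natively supports sending audio (opus format, via the file upload API) and images;
+    # every reply type is handled, so use an empty list to enable voice and image replies.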
+    NOT_SUPPORT_REPLYTYPE = []

    def __init__(self):
        super().__init__()
@@ -86,6 +260,20 @@ class FeiShuChanel(ChatChannel):
        self.feishu_app_secret = conf().get('feishu_app_secret')
        self.feishu_token = conf().get('feishu_token')
        self.feishu_event_mode = conf().get('feishu_event_mode', 'websocket')
+
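+        # CLI startup: when credentials are missing, try lark.register_app to show a QR code in the terminal
+        # and guide the user through creating the app. Web console startup also reaches this point, but console
+        # users have usually already created the app via /api/feishu/register and written it back to config.json.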
+        if not self.feishu_app_id or not self.feishu_app_secret:
+            if _register_via_qr_in_terminal():
+                self.feishu_app_id = conf().get('feishu_app_id')
+                self.feishu_app_secret = conf().get('feishu_app_secret')
+            else:
+                err = "[FeiShu] feishu_app_id 与 feishu_app_secret 缺失，无法启动通道"
+                logger.error(err)
+                self.report_startup_error(err)
+                return
+
        self._fetch_bot_open_id()
        if self.feishu_event_mode == 'websocket':
            self._startup_websocket()
@@ -354,6 +542,32 @@ class FeiShuChanel(ChatChannel):
            # 单张图片不直接处理，等待用户提问
            return

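+        # For file messages, trigger the actual download and cache the file so it is attached to the user's next question.
+        # Aligned with wecom_bot: the file is cached silently after it is sent (the Feishu client shows "read"),
+        # and the user's next text message automatically attaches the file path for the agent.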
+        if feishu_msg.ctype == ContextType.FILE:
+            try:
+                feishu_msg.prepare()
+                # prepare() is idempotent via the _prepared flag, so repeated calls are safe
+                if not os.path.exists(feishu_msg.content):
+                    raise FileNotFoundError(feishu_msg.content)
+            except Exception as e:
+                logger.warning(f"[FeiShu] prepare file failed: {e}")
+                # Notify the user when the download fails, so the file is not silently dropped
+                try:
+                    err_reply = Reply(ReplyType.TEXT, f"⚠️ 文件下载失败，请重新发送：{e}")
+                    self._send(err_reply, self._compose_context(
+                        ContextType.TEXT, "",
+                        isgroup=is_group, msg=feishu_msg,
+                        receive_id_type=receive_id_type, no_need_at=True,
+                    ))
+                except Exception:
+                    pass
+                return
+            file_cache.add(session_id, feishu_msg.content, file_type='file')
+            logger.info(f"[FeiShu] File cached for session {session_id}: {feishu_msg.content}")
+            return
+
        # 如果是文本消息，检查是否有缓存的文件
        if feishu_msg.ctype == ContextType.TEXT:
            cached_files = file_cache.get(session_id)
@@ -384,10 +598,22 @@ class FeiShuChanel(ChatChannel):
            no_need_at=True
        )
        if context:
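+            # Streaming reply mode: inject an on_event callback into context; the agent calls it for each chunk of text it produces.
+            # The callback first creates and sends a streaming card, then updates its content in place through the cardkit PUT endpoint,
+            # producing a typewriter effect. When it finishes it sets context["feishu_streamed"]=True
+            # so that send() skips the final full reply instead of delivering it a second time.
+            # Streaming typewriter replies are on by default. They require the cardkit:card:write permission and Feishu client 7.20+;
+            # if any step fails, the reply automatically degrades to a non-streaming text reply.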
+            if conf().get("feishu_stream_reply", True):
+                context["on_event"] = self._make_feishu_stream_callback(context, feishu_msg.access_token)
            self.produce(context)
        logger.debug(f"[FeiShu] query={feishu_msg.content}, type={feishu_msg.ctype}")

    def send(self, reply: Reply, context: Context):
+        # Skip the send if the text reply was already delivered via streaming
+        if reply.type == ReplyType.TEXT and context.get("feishu_streamed"):
+            logger.debug("[FeiShu] streaming already delivered text reply, skipping send()")
+            return
        msg = context.get("msg")
        is_group = context["isgroup"]
        if msg:
@@ -450,6 +676,16 @@ class FeiShuChanel(ChatChannel):
                msg_type = "file"
                content_key = "file_key"

+        elif reply.type == ReplyType.VOICE:
+            # Voice reply: upload the audio file to Feishu, then send an audio-type message
+            file_key = self._upload_audio(reply.content, access_token)
+            if not file_key:
+                logger.warning("[FeiShu] upload audio failed")
+                return
+            reply_content = file_key
+            msg_type = "audio"
+            content_key = "file_key"
+
        # Check if we can reply to an existing message (need msg_id)
        can_reply = is_group and msg and hasattr(msg, 'msg_id') and msg.msg_id

@@ -481,6 +717,396 @@ class FeiShuChanel(ChatChannel):
        else:
            logger.error(f"[FeiShu] send message failed, code={res.get('code')}, msg={res.get('msg')}")

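+    def _make_feishu_stream_callback(self, context, access_token):
+        """
+        Typewriter-style replies built on Feishu's official "streaming card update" API.
+
+        Flow:
+        1. First message_update arrives → POST /cardkit/v1/cards creates a card entity with streaming_mode,
+           then POST /im/v1/messages (or reply) sends the card by card_id
+        2. Later message_update events → PUT /cardkit/v1/cards/{id}/elements/{eid}/content
+           with the full text of the current round; Feishu computes the delta and renders it with a typewriter effect
+           (streaming mode is not subject to the 10 QPS limit)
+        3. message_end (one round of LLM output finished, and the round triggered tool calls) → finalize the current card
+           and reset state; the next message_update starts a fresh card, so rounds do not run together
+        4. agent_end → overwrite the card with final_response, close streaming_mode via _close_streaming_mode
+           (full-card PUT /cardkit/v1/cards/{id}), and set context["feishu_streamed"]=True so chat_channel skips the normal send()
+
+        Prerequisites:
+        - The bot has the cardkit:card:write permission
+        - Feishu client 7.20+
+
+        Fallback on failure:
+        - Creating the card entity fails (missing permission, network, etc.) → feishu_streamed is not set, so chat_channel
+          takes the normal text reply path; the user gets the full reply without the typewriter effect, and a warning is logged
+        """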
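+        # Shared state (guarded by lock)
+        # In multi-round agent mode, each "interim message" is sent as its own card.
+        # current_text holds only the content of the card currently being streamed; on message_end / agent_end
+        # it is finalized and reset.
+        current_text = [""]                # LLM output accumulated for the current card
+        card_id = [None]                   # entity ID of the current streaming card (one per segment)
+        message_id = [None]                # message ID after the current card is sent (logging only)
+        # The initial send is synchronous, but an in-flight flag stops concurrent message_update
+        # events from each triggering their own create+send and emitting multiple cards.
+        init_in_flight = [False]
+        # Once initialization fails, stay disabled: no further streaming calls for this reply
+        disabled = [False]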
+        lock = threading.Lock()
+
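+        # ---- Async push queue ------------------------------------------------
+        # A synchronous requests.put takes 100~300ms per call and would block the LLM stream thread from reading the next chunk.
+        # Pushes are handed to a dedicated worker thread that consumes the queue; the callback only appends in memory and returns immediately.
+        # The queue holds only snapshots of the latest accumulated text; the worker coalesces them to avoid pushing
+        # stale content (with high-frequency chunks the queue backs up, and pushing only the last snapshot is enough).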
+        import queue as _queue
+        push_queue: "_queue.Queue[str | None]" = _queue.Queue()
+
+        def _push_worker():
+            while True:
+                snapshot = push_queue.get()
+                if snapshot is None:
+                    push_queue.task_done()
+                    return
+                # Coalesce snapshots already queued: push only the last one, saving PUTs and reducing latency
+                merged_count = 1
+                stop = False
+                while True:
+                    try:
+                        nxt = push_queue.get_nowait()
+                    except _queue.Empty:
+                        break
+                    merged_count += 1
+                    if nxt is None:
+                        stop = True
+                        break
+                    snapshot = nxt
+                try:
+                    _stream_update_text(snapshot)
+                finally:
+                    for _ in range(merged_count):
+                        push_queue.task_done()
+                if stop:
+                    return
+
+        push_thread = threading.Thread(target=_push_worker, daemon=True, name="feishu-stream-push")
+        push_thread.start()
+
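+        def _drain_push_queue():
+            """Wait until every PUT currently queued has completed. message_end/agent_end must drain before finalizing,
+            otherwise stale snapshots still queued in the worker could arrive after the final_text PUT and overwrite the final content."""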
+            try:
+                push_queue.join()
+            except Exception:
+                pass
+
+        msg = context.get("msg")
+        is_group = context.get("isgroup", False)
+        receiver = context.get("receiver")
+        receive_id_type = context.get("receive_id_type", "open_id")
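+        # Client-side typewriter rendering parameters (how fast the Feishu app actually "types"):
+        #   - print_freq_ms: interval between refreshes
+        #   - print_step: number of characters rendered per refresh
+        # Currently 40ms × 4 chars ≈ 100 chars/sec, close to the pace of the ChatGPT/DeepSeek web UIs.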
+        print_freq_ms = 40
+        print_step = 4
+        print_strategy = "fast"
+
+        headers = {
+            "Authorization": "Bearer " + access_token,
+            "Content-Type": "application/json; charset=utf-8",
+        }
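+        # element_id of the rich-text component in the card; all later streaming PUT updates target it
+        ELEMENT_ID = "stream_md"
+        # Operation sequence number; must strictly increase on every PUT (required by Feishu)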
+        sequence = [0]
+
+        def _next_sequence():
+            sequence[0] += 1
+            return sequence[0]
+
+        def _build_card_json():
+            """Card JSON 2.0 structure + streaming_mode + a single markdown component"""
+            return json.dumps({
+                "schema": "2.0",
+                "config": {
+                    "streaming_mode": True,
+                    "summary": {"content": "[正在生成回复...]"},
+                    "streaming_config": {
+                        "print_frequency_ms": {"default": print_freq_ms},
+                        "print_step": {"default": print_step},
+                        "print_strategy": print_strategy,
+                    },
+                },
+                "body": {
+                    "elements": [
+                        {
+                            "tag": "markdown",
+                            "content": "...",
+                            "element_id": ELEMENT_ID,
+                        }
+                    ],
+                },
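+                # Note: JSON 2.0 does not support a custom fallback field (passing one raises an error).
+                # On clients < 7.20, Feishu automatically shows a "please upgrade" placeholder; no configuration needed.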
+            }, ensure_ascii=False)
+
+        def _create_and_send_card():
+            """Runs synchronously: create the card entity → send the message. If either step fails, set disabled=True to trigger the fallback"""
+            try:
+                # Step 1: create the card entity
+                create_url = "https://open.feishu.cn/open-apis/cardkit/v1/cards"
+                create_body = {"type": "card_json", "data": _build_card_json()}
+                res = requests.post(
+                    create_url, headers=headers, json=create_body, timeout=(5, 10)
+                )
+                res_json = res.json()
+                if res_json.get("code") != 0:
+                    logger.warning(
+                        f"[FeiShu] Stream: create card failed "
+                        f"(code={res_json.get('code')}, msg={res_json.get('msg')}). "
+                        f"本次回复已自动降级为普通文本回复（一次性返回完整内容）。"
+                        f"如需开启流式打字机效果与完整 Markdown 渲染，请到飞书开放平台 "
+                        f"https://open.feishu.cn/app 给机器人开通 cardkit:card:write 权限"
+                        f"（创建与更新卡片）并重新发布版本，同时确保飞书客户端 >= 7.20。"
+                    )
+                    with lock:
+                        disabled[0] = True
+                    return
+                cid = res_json["data"]["card_id"]
+                with lock:
+                    card_id[0] = cid
+
+                # Step 2: send the message by card_id (group chats prefer reply; direct chats use send)
+                content_payload = json.dumps(
+                    {"type": "card", "data": {"card_id": cid}}, ensure_ascii=False
+                )
+                can_reply = is_group and msg and hasattr(msg, "msg_id") and msg.msg_id
+                if can_reply:
+                    send_url = (
+                        f"https://open.feishu.cn/open-apis/im/v1/messages/"
+                        f"{msg.msg_id}/reply"
+                    )
+                    send_body = {"msg_type": "interactive", "content": content_payload}
+                    send_res = requests.post(
+                        send_url, headers=headers, json=send_body, timeout=(5, 10)
+                    )
+                else:
+                    send_url = "https://open.feishu.cn/open-apis/im/v1/messages"
+                    params = {"receive_id_type": receive_id_type}
+                    send_body = {
+                        "receive_id": receiver,
+                        "msg_type": "interactive",
+                        "content": content_payload,
+                    }
+                    send_res = requests.post(
+                        send_url, headers=headers, params=params, json=send_body,
+                        timeout=(5, 10),
+                    )
+                send_json = send_res.json()
+                if send_json.get("code") != 0:
+                    logger.warning(
+                        f"[FeiShu] Stream: send card failed: {send_json}. 降级为普通文本。"
+                    )
+                    with lock:
+                        disabled[0] = True
+                    return
+                mid = send_json["data"]["message_id"]
+                with lock:
+                    message_id[0] = mid
+                logger.info(
+                    f"[FeiShu] Stream: card created and sent, "
+                    f"card_id={cid}, message_id={mid}"
+                )
+            except Exception as e:
+                logger.warning(
+                    f"[FeiShu] Stream: create/send card exception: {e}. 降级为普通文本。"
+                )
+                with lock:
+                    disabled[0] = True
+            finally:
+                with lock:
+                    init_in_flight[0] = False
+
+        def _stream_update_text(full_text):
+            """Streaming PUT update of the text component. content must be the component's full current text."""
+            with lock:
+                cid = card_id[0]
+            if not cid:
+                return
+            url = (
+                f"https://open.feishu.cn/open-apis/cardkit/v1/cards/"
+                f"{cid}/elements/{ELEMENT_ID}/content"
+            )
+            body = {
+                "content": full_text,
+                "sequence": _next_sequence(),
+            }
+            try:
+                res = requests.put(url, headers=headers, json=body, timeout=(5, 10))
+                res_json = res.json()
+                if res_json.get("code") != 0:
+                    logger.warning(
+                        f"[FeiShu] Stream: update text failed: {res_json}"
+                    )
+            except Exception as e:
+                logger.warning(f"[FeiShu] Stream: update text exception: {e}")
+
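+        def _close_streaming_mode(final_text: str = ""):
+            """Close streaming mode (the card becomes a "normal" card that can be forwarded).
+
+            Also rewrites summary to a preview of the final content via the full-card update endpoint; otherwise the Feishu
+            conversation list keeps showing the placeholder summary set at card creation ("[正在生成回复...]").
+            """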
+            with lock:
+                cid = card_id[0]
+            if not cid:
+                return
+
+            # 1) Use the full-card update endpoint to turn streaming_mode off and rewrite summary
+            #    (the settings endpoint's config rejects the summary field with code=2200)
+            preview_src = (final_text or "").strip().replace("\n", " ")
+            preview = preview_src[:30] if preview_src else ""
+            full_card = {
+                "schema": "2.0",
+                "config": {
+                    "streaming_mode": False,
+                    "summary": {"content": preview or " "},
+                },
+                "body": {
+                    "elements": [
+                        {
+                            "tag": "markdown",
+                            "content": final_text or " ",
+                            "element_id": ELEMENT_ID,
+                        }
+                    ],
+                },
+            }
+            put_url = f"https://open.feishu.cn/open-apis/cardkit/v1/cards/{cid}"
+            put_body = {
+                "card": {"type": "card_json", "data": json.dumps(full_card, ensure_ascii=False)},
+                "sequence": _next_sequence(),
+            }
+            try:
+                res = requests.put(put_url, headers=headers, json=put_body, timeout=(5, 10))
+                res_json = res.json()
+                if res_json.get("code") != 0:
+                    logger.warning(
+                        f"[FeiShu] Stream: finalize card (close+summary) failed: {res_json}"
+                    )
+            except Exception as e:
+                logger.warning(
+                    f"[FeiShu] Stream: finalize card exception: {e}"
+                )
+
+        def on_event(event: dict):
+            event_type = event.get("type")
+            data = event.get("data", {})
+
+            # Once degraded, do no more streaming work for this reply
+            with lock:
+                if disabled[0]:
+                    return
+
+            if event_type == "message_update":
+                delta = data.get("delta", "")
+                if not delta:
+                    return
+
+                # Part 1: decide whether initialization is needed (create card + send)
+                need_init = False
+                with lock:
+                    if card_id[0] is None and not init_in_flight[0]:
+                        init_in_flight[0] = True
+                        need_init = True
+
+                if need_init:
+                    _create_and_send_card()
+                    # A failed init has already set disabled; return now (later events exit early too).
+                    with lock:
+                        if disabled[0]:
+                            return
+
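+                # Step 2: accumulate the text and hand a snapshot to the push worker for async delivery.
+                # Do not call requests.put here directly: it would block the LLM stream thread from reading
+                # the next chunk (measured ~150ms per PUT with DeepSeek's frequent small chunks, which adds up to visible lag).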
+                snapshot = ""
+                should_push = False
+                with lock:
+                    current_text[0] += delta
+                    if card_id[0]:
+                        snapshot = current_text[0]
+                        should_push = True
+
+                if should_push:
+                    push_queue.put(snapshot)
+
+            elif event_type == "message_end":
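+                # One round of LLM output has ended. If this round triggered tool calls, its text is an
+                # interim message (e.g. "Let me take a look!") and should be finalized as its own card,
+                # with a fresh card created for the next round. The user ends up seeing:
+                #   [card 1: interim message 1]
+                #   [card 2: interim message 2]
+                #   ...
+                #   [card N: final reply]
+                # This matches the multi-message streaming experience of wecom_bot.
+                tool_calls = data.get("tool_calls", []) or []
+                if not tool_calls:
+                    # No tool calls: this round is the final reply; leave it to agent_end.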
+                    return
+
+                with lock:
+                    text_to_finalize = current_text[0].rstrip()
+                    current_text[0] = ""
+
+                if not text_to_finalize:
+                    return
+
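+                # Wait for queued snapshots to flush so none arrives after the final text and overwrites it.
+                _drain_push_queue()
+                # Overwrite the current card with the final text and close streaming mode (freezing it into a
+                # regular card, so the chat list summary shows a preview instead of "generating reply...").
+                _stream_update_text(text_to_finalize)
+                _close_streaming_mode(text_to_finalize)
+
+                # Reset card state; the next message_update will trigger creation of a new card.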
+                with lock:
+                    card_id[0] = None
+                    message_id[0] = None
+                    sequence[0] = 0
+
+            elif event_type == "agent_end":
+                # Final reply: overwrite the current streaming card with final_response, then close streaming mode.
+                final_response = data.get("final_response", "")
+                if not final_response:
+                    return
+                final_text = str(final_response)
+                # Mark as streamed so chat_channel skips send().
+                context["feishu_streamed"] = True
+
+                with lock:
+                    has_card = card_id[0] is not None
+                    init_busy = init_in_flight[0]
+
+                # Rare case: no card exists yet when agent_end fires (very fast reply or no
+                # message_update), so create one to carry final_text.
+                if not has_card and not init_busy:
+                    with lock:
+                        init_in_flight[0] = True
+                    _create_and_send_card()
+                    with lock:
+                        if disabled[0]:
+                            return
+
+                _drain_push_queue()
+                _stream_update_text(final_text)
+                _close_streaming_mode(final_text)
+                # Tell the push worker to exit (this reply is completely done).
+                push_queue.put(None)
+
+        return on_event
+
    def fetch_access_token(self) -> str:
        url = "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal/"
        headers = {
@@ -687,6 +1313,66 @@ class FeiShuChanel(ChatChannel):
                except Exception as e:
                    logger.warning(f"[FeiShu] Failed to remove temp file {temp_file}: {e}")

+    def _upload_audio(self, audio_path, access_token):
+        """
+        Upload a local audio file to Feishu and return file_key.
+        audio_path is a plain local file path (no file:// prefix).
+        Feishu audio messages only support opus format; non-opus files are converted first.
+        """
+        logger.debug(f"[FeiShu] start upload audio, path={audio_path}")
+
+        if not os.path.exists(audio_path):
+            logger.error(f"[FeiShu] audio file not found: {audio_path}")
+            return None
+
+        # Feishu only plays audio messages in opus format.
+        # Convert if the TTS engine produced a different format (e.g. mp3 from OpenAI TTS).
+        upload_path = audio_path
+        if not audio_path.lower().endswith('.opus'):
+            opus_path = os.path.splitext(audio_path)[0] + '.opus'
+            try:
+                from pydub import AudioSegment
+                audio = AudioSegment.from_file(audio_path)
+                audio.export(opus_path, format='opus')
+                upload_path = opus_path
+                logger.info(f"[FeiShu] Converted audio to opus: {opus_path}")
+            except Exception as e:
+                logger.warning(f"[FeiShu] Failed to convert audio to opus, uploading original: {e}")
+                upload_path = audio_path
+
+        file_name = os.path.splitext(os.path.basename(upload_path))[0] + '.opus'
+        upload_url = "https://open.feishu.cn/open-apis/im/v1/files"
+        data = {'file_type': 'opus', 'file_name': file_name}
+        headers = {'Authorization': f'Bearer {access_token}'}
+
+        try:
+            with open(upload_path, "rb") as f:
+                upload_response = requests.post(
+                    upload_url,
+                    files={"file": f},
+                    data=data,
+                    headers=headers,
+                    timeout=(5, 30)
+                )
+                logger.info(
+                    f"[FeiShu] upload audio response, status={upload_response.status_code}, res={upload_response.content}")
+                response_data = upload_response.json()
+                if response_data.get("code") == 0:
+                    return response_data.get("data").get("file_key")
+                else:
+                    logger.error(f"[FeiShu] upload audio failed: {response_data}")
+                    return None
+        except Exception as e:
+            logger.error(f"[FeiShu] upload audio exception: {e}")
+            return None
+        finally:
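+            # Always remove the temporary opus file created by conversion, whether or not the upload succeeded, so failed uploads do not pile up files on disk.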
+            if upload_path != audio_path and os.path.exists(upload_path):
+                try:
+                    os.remove(upload_path)
+                except Exception as e:
+                    logger.warning(f"[FeiShu] Failed to remove temp opus file {upload_path}: {e}")
+
    def _upload_file_url(self, file_url, access_token):
        """
        Upload file to Feishu
@@ -829,10 +1515,16 @@ class FeiShuChanel(ChatChannel):
            else:
                context.type = ContextType.TEXT
            context.content = content.strip()
+            # Text input opts into voice replies only when the always-on toggle is set.
+            if "desire_rtype" not in context and conf().get("always_reply_voice"):
+                context["desire_rtype"] = ReplyType.VOICE

        elif context.type == ContextType.VOICE:
-            # 2.语音请求
-            if "desire_rtype" not in context and conf().get("voice_reply_voice"):
+            # 2. Voice request: voice input replies with voice if either
+            # voice_reply_voice (mirror reply) or always_reply_voice is on.
+            if "desire_rtype" not in context and (
+                conf().get("voice_reply_voice") or conf().get("always_reply_voice")
+            ):
                context["desire_rtype"] = ReplyType.VOICE

        return context
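
Reviewer sketch (not part of the patch): the hunk above changes when `desire_rtype` is set to voice. The rule can be read as a small pure function. This is a minimal illustration under stated assumptions: `desired_reply_type` and the stand-in `ReplyType` enum are hypothetical names, and plain dicts replace the project's `Context` and `conf()` objects.

```python
from enum import Enum


class ReplyType(Enum):
    # Stand-in for the project's ReplyType; only the members needed here.
    TEXT = 1
    VOICE = 2


def desired_reply_type(is_voice_input, context, config):
    """Return the value to store under "desire_rtype", or None to leave it unset."""
    if "desire_rtype" in context:
        # An explicit choice made earlier always wins.
        return None
    if config.get("always_reply_voice"):
        # The always-on toggle applies to both text and voice input.
        return ReplyType.VOICE
    if is_voice_input and config.get("voice_reply_voice"):
        # Mirror reply: voice in, voice out.
        return ReplyType.VOICE
    return None


if __name__ == "__main__":
    print(desired_reply_type(False, {}, {"voice_reply_voice": True}))
    print(desired_reply_type(True, {}, {"voice_reply_voice": True}))
```

With only `voice_reply_voice` enabled, text input keeps its default reply type while voice input is answered with voice; `always_reply_voice` covers both cases.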
--- a/channel/feishu/feishu_message.py
+++ b/channel/feishu/feishu_message.py
@@ -144,7 +144,14 @@ class FeishuMessage(ChatMessage):
            file_key = content.get("file_key")
            file_name = content.get("file_name")

-            self.content = TmpDir().path() + file_key + "." + utils.get_path_suffix(file_name)
+            # Save under agent_workspace/tmp (absolute path), consistent with image handling;
+            # otherwise the relative path ./tmp cannot be found when read from the agent workspace.
+            workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
+            tmp_dir = os.path.join(workspace_root, "tmp")
+            os.makedirs(tmp_dir, exist_ok=True)
+            self.content = os.path.join(
+                tmp_dir, f"{file_key}.{utils.get_path_suffix(file_name)}"
+            )

            def _download_file():
                # 如果响应状态码是200，则将响应内容写入本地文件
@@ -162,6 +169,42 @@ class FeishuMessage(ChatMessage):
                else:
                    logger.info(f"[FeiShu] Failed to download file, key={file_key}, res={response.text}")
            self._prepare_fn = _download_file
+        elif msg_type == "audio":
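+            # Voice messages sent by Feishu users have type "audio"; the file is opus-encoded.
+            # Map it to ContextType.VOICE so chat_channel's speech-to-text (STT) flow handles it.
+            # The file is downloaded lazily via _prepare_fn, only when chat_channel calls cmsg.prepare().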
+            self.ctype = ContextType.VOICE
+            content = json.loads(msg.get("content"))
+            file_key = content.get("file_key")
+
+            # Save under agent_workspace/tmp (absolute path) so the voice STT flow can read it.
+            workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
+            tmp_dir = os.path.join(workspace_root, "tmp")
+            os.makedirs(tmp_dir, exist_ok=True)
+            self.content = os.path.join(tmp_dir, f"{file_key}.opus")
+            logger.info(f"[FeiShu] audio message: file_key={file_key}, save_path={self.content}")
+
+            def _download_audio():
+                logger.info(f"[FeiShu] downloading audio: file_key={file_key}, msg_id={self.msg_id}")
+                url = f"https://open.feishu.cn/open-apis/im/v1/messages/{self.msg_id}/resources/{file_key}"
+                headers = {
+                    "Authorization": "Bearer " + access_token,
+                }
+                params = {
+                    "type": "file"
+                }
+                try:
+                    response = requests.get(url=url, headers=headers, params=params, timeout=(5, 30))
+                    logger.info(f"[FeiShu] download audio response: status={response.status_code}, size={len(response.content)} bytes")
+                    if response.status_code == 200:
+                        with open(self.content, "wb") as f:
+                            f.write(response.content)
+                        logger.info(f"[FeiShu] audio saved to: {self.content}")
+                    else:
+                        logger.error(f"[FeiShu] Failed to download audio, key={file_key}, status={response.status_code}, res={response.text}")
+                except Exception as e:
+                    logger.error(f"[FeiShu] Exception downloading audio, key={file_key}: {e}", exc_info=True)
+            self._prepare_fn = _download_audio
        else:
            raise NotImplementedError("Unsupported message type: Type:{} ".format(msg_type))

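
Reviewer sketch (not part of the patch): both the file and audio branches above build the download target under `<agent_workspace>/tmp` as an absolute path. A minimal illustration of that path construction follows. It assumes `os.path.expanduser` as a stand-in for the project's `expand_path` helper, and `build_download_path` is a hypothetical name.

```python
import os
import tempfile


def build_download_path(workspace_root, file_key, suffix):
    """Return an absolute path under <workspace_root>/tmp, creating the directory if needed."""
    # expanduser stands in for the project's expand_path helper (assumption).
    root = os.path.abspath(os.path.expanduser(workspace_root))
    tmp_dir = os.path.join(root, "tmp")
    os.makedirs(tmp_dir, exist_ok=True)
    return os.path.join(tmp_dir, f"{file_key}.{suffix}")


if __name__ == "__main__":
    # Use a throwaway directory so the example does not touch a real workspace.
    with tempfile.TemporaryDirectory() as root:
        print(build_download_path(root, "file_abc", "opus"))
```

Because the result is absolute, a later reader running with a different working directory (for example inside the agent workspace) still resolves the same file.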
--- a/channel/web/chat.html
+++ b/channel/web/chat.html
@@ -5,20 +5,20 @@
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>CowAgent Console</title>
    <link rel="icon" href="assets/favicon.ico" type="image/x-icon">
-    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
-    <link rel="preconnect" href="https://fonts.googleapis.com">
-    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
-    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&display=swap" rel="stylesheet">
-    <script src="https://cdn.tailwindcss.com"></script>
-    <script src="https://cdn.jsdelivr.net/npm/markdown-it@13.0.1/dist/markdown-it.min.js"></script>
-    <link id="hljs-light" rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/github.min.css">
-    <link id="hljs-dark" rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/github-dark.min.css" disabled>
-    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js"></script>
-    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/python.min.js"></script>
-    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/javascript.min.js"></script>
-    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/java.min.js"></script>
-    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/go.min.js"></script>
-    <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/bash.min.js"></script>
+    <!-- Vendored third-party assets (no external CDN dependency).
+         See channel/web/static/vendor/README.md for sources & versions. -->
+    <link rel="stylesheet" href="assets/vendor/fontawesome/css/all.min.css">
+    <link rel="stylesheet" href="assets/vendor/fonts/inter/inter.css">
+    <script src="assets/vendor/tailwind/tailwind.min.js"></script>
+    <script src="assets/vendor/markdown-it/markdown-it.min.js"></script>
+    <link id="hljs-light" rel="stylesheet" href="assets/vendor/highlightjs/styles/github.min.css">
+    <link id="hljs-dark" rel="stylesheet" href="assets/vendor/highlightjs/styles/github-dark.min.css" disabled>
+    <script src="assets/vendor/highlightjs/highlight.min.js"></script>
+    <script src="assets/vendor/highlightjs/languages/python.min.js"></script>
+    <script src="assets/vendor/highlightjs/languages/javascript.min.js"></script>
+    <script src="assets/vendor/highlightjs/languages/java.min.js"></script>
+    <script src="assets/vendor/highlightjs/languages/go.min.js"></script>
+    <script src="assets/vendor/highlightjs/languages/bash.min.js"></script>
    <script>
    tailwind.config = {
        darkMode: 'class',
@@ -50,16 +50,53 @@
    (function() {
        var theme = localStorage.getItem('cow_theme') || 'dark';
        if (theme === 'dark') document.documentElement.classList.add('dark');
+        var lang = localStorage.getItem('cow_lang') || 'zh';
+        document.documentElement.setAttribute('lang', lang);
    })();
    </script>
 </head>
 <body class="h-screen overflow-hidden bg-gray-50 dark:bg-[#111111] text-slate-800 dark:text-slate-200 font-sans">
+
+    <!-- Login Overlay -->
+    <div id="login-overlay" class="fixed inset-0 z-[200] bg-gray-50 dark:bg-[#111111] flex items-center justify-center hidden">
+        <div class="w-full max-w-sm mx-4">
+            <div class="flex flex-col items-center mb-8">
+                <img src="assets/logo.jpg" alt="CowAgent" class="w-16 h-16 rounded-2xl mb-4 shadow-lg">
+                <h1 class="text-xl font-bold text-slate-800 dark:text-slate-100">CowAgent</h1>
+                <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" id="login-subtitle">请输入密码以访问控制台</p>
+            </div>
+            <form id="login-form" class="space-y-4" onsubmit="return false;">
+                <div class="relative">
+                    <input id="login-password" type="password" autocomplete="current-password"
+                           placeholder="Password"
+                           class="w-full px-4 py-3 rounded-xl border border-slate-200 dark:border-white/10
+                                  bg-white dark:bg-[#1A1A1A] text-slate-800 dark:text-slate-200
+                                  placeholder-slate-400 dark:placeholder-slate-500
+                                  focus:outline-none focus:ring-2 focus:ring-primary-400/50 focus:border-primary-400
+                                  transition-all duration-150 text-sm">
+                    <button type="button" id="login-toggle-pwd"
+                            class="absolute right-3 top-1/2 -translate-y-1/2 text-slate-400 hover:text-slate-600
+                                   dark:hover:text-slate-300 cursor-pointer transition-colors"
+                            onclick="toggleLoginPassword()">
+                        <i class="fas fa-eye text-sm"></i>
+                    </button>
+                </div>
+                <p id="login-error" class="text-sm text-red-500 hidden"></p>
+                <button id="login-btn" type="submit"
+                        class="w-full py-3 rounded-xl bg-primary-500 hover:bg-primary-600 text-white font-medium
+                               text-sm cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed">
+                    登录
+                </button>
+            </form>
+        </div>
+    </div>
+
    <div id="app" class="flex h-screen">

        <!-- ================================================================ -->
        <!-- SIDEBAR                                                          -->
        <!-- ================================================================ -->
-        <aside id="sidebar" class="fixed inset-y-0 left-0 z-50 w-64 bg-[#0A0A0A] text-neutral-400 flex flex-col
+        <aside id="sidebar" class="fixed inset-y-0 left-0 z-50 w-52 bg-[#0A0A0A] text-neutral-400 flex flex-col
                                    transform -translate-x-full lg:relative lg:translate-x-0
                                    transition-transform duration-300 ease-in-out">
            <!-- Logo -->
@@ -67,7 +104,7 @@
                <img src="assets/logo.jpg" alt="CowAgent" class="w-8 h-8 rounded-lg flex-shrink-0">
                <div class="flex flex-col min-w-0">
                    <span class="text-white font-semibold text-sm truncate">CowAgent</span>
-                    <span class="text-neutral-500 text-xs" data-i18n="console">Console</span>
+                    <span class="text-neutral-500 text-xs" data-i18n="console">控制台</span>
                </div>
            </div>

@@ -77,13 +114,13 @@
                <div class="menu-group open" data-group="chat">
                    <button class="w-full flex items-center gap-2 px-3 py-2 text-xs font-semibold uppercase tracking-wider text-neutral-500 hover:text-neutral-300 cursor-pointer transition-colors duration-150">
                        <i class="fas fa-chevron-right text-[10px] chevron"></i>
-                        <span data-i18n="nav_chat">Chat</span>
+                        <span data-i18n="nav_chat">对话</span>
                    </button>
                    <div class="menu-group-items pl-2">
                        <a class="sidebar-item active flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
                           data-view="chat">
                            <i class="fas fa-message item-icon text-xs w-5 text-center"></i>
-                            <span data-i18n="menu_chat">Chat</span>
+                            <span data-i18n="menu_chat">对话</span>
                        </a>
                    </div>
                </div>
@@ -92,33 +129,43 @@
                <div class="menu-group open" data-group="manage">
                    <button class="w-full flex items-center gap-2 px-3 py-2 text-xs font-semibold uppercase tracking-wider text-neutral-500 hover:text-neutral-300 cursor-pointer transition-colors duration-150">
                        <i class="fas fa-chevron-right text-[10px] chevron"></i>
-                        <span data-i18n="nav_manage">Management</span>
+                        <span data-i18n="nav_manage">管理</span>
                    </button>
                    <div class="menu-group-items pl-2">
                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
                           data-view="config">
                            <i class="fas fa-sliders item-icon text-xs w-5 text-center"></i>
-                            <span data-i18n="menu_config">Config</span>
+                            <span data-i18n="menu_config">配置</span>
+                        </a>
+                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
+                           data-view="models">
+                            <i class="fas fa-microchip item-icon text-xs w-5 text-center"></i>
+                            <span data-i18n="menu_models">模型</span>
                        </a>
                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
                           data-view="skills">
                            <i class="fas fa-bolt item-icon text-xs w-5 text-center"></i>
-                            <span data-i18n="menu_skills">Skills</span>
+                            <span data-i18n="menu_skills">技能</span>
                        </a>
                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
                           data-view="memory">
                            <i class="fas fa-brain item-icon text-xs w-5 text-center"></i>
-                            <span data-i18n="menu_memory">Memory</span>
+                            <span data-i18n="menu_memory">记忆</span>
+                        </a>
+                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
+                           data-view="knowledge">
+                            <i class="fas fa-book item-icon text-xs w-5 text-center"></i>
+                            <span data-i18n="menu_knowledge">知识</span>
                        </a>
                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
                           data-view="channels">
                            <i class="fas fa-tower-broadcast item-icon text-xs w-5 text-center"></i>
-                            <span data-i18n="menu_channels">Channels</span>
+                            <span data-i18n="menu_channels">通道</span>
                        </a>
                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
                           data-view="tasks">
                            <i class="fas fa-clock item-icon text-xs w-5 text-center"></i>
-                            <span data-i18n="menu_tasks">Tasks</span>
+                            <span data-i18n="menu_tasks">定时</span>
                        </a>
                    </div>
                </div>
@@ -127,13 +174,13 @@
                <div class="menu-group open" data-group="monitor">
                    <button class="w-full flex items-center gap-2 px-3 py-2 text-xs font-semibold uppercase tracking-wider text-neutral-500 hover:text-neutral-300 cursor-pointer transition-colors duration-150">
                        <i class="fas fa-chevron-right text-[10px] chevron"></i>
-                        <span data-i18n="nav_monitor">Monitor</span>
+                        <span data-i18n="nav_monitor">监控</span>
                    </button>
                    <div class="menu-group-items pl-2">
                        <a class="sidebar-item flex items-center gap-3 px-3 py-2 rounded-lg cursor-pointer transition-all duration-150 hover:bg-white/5 hover:text-neutral-200 text-[14px]"
                           data-view="logs">
                            <i class="fas fa-terminal item-icon text-xs w-5 text-center"></i>
-                            <span data-i18n="menu_logs">Logs</span>
+                            <span data-i18n="menu_logs">日志</span>
                        </a>
                    </div>
                </div>
@@ -154,6 +201,26 @@
        <!-- Mobile Overlay -->
        <div id="sidebar-overlay" class="fixed inset-0 bg-black/50 z-40 hidden lg:hidden cursor-pointer" onclick="toggleSidebar()"></div>

+        <!-- ================================================================ -->
+        <!-- SESSION PANEL (collapsible)                                      -->
+        <!-- ================================================================ -->
+        <aside id="session-panel" class="session-panel hidden">
+            <div class="session-panel-header">
+                <span class="session-panel-title" data-i18n="session_history">历史会话</span>
+                <button class="session-panel-close" onclick="toggleSessionPanel()" title="Close">
+                    <i class="fas fa-times"></i>
+                </button>
+            </div>
+            <button class="session-panel-new" onclick="newChat()">
+                <i class="fas fa-plus"></i>
+                <span data-i18n="new_chat">新对话</span>
+            </button>
+            <div id="session-list" class="session-list"></div>
+        </aside>
+
+        <!-- Mobile overlay for session panel (click to close) -->
+        <div id="session-panel-overlay" class="session-panel-overlay hidden" onclick="closeSessionPanel()"></div>
+
        <!-- ================================================================ -->
        <!-- MAIN CONTENT                                                     -->
        <!-- ================================================================ -->
@@ -166,11 +233,17 @@
                    <i class="fas fa-bars text-slate-600 dark:text-slate-300"></i>
                </button>

+                <!-- Session panel toggle -->
+                <button id="session-toggle-btn" class="p-2 rounded-lg hover:bg-slate-100 dark:hover:bg-white/10 cursor-pointer transition-colors duration-150"
+                        onclick="toggleSessionPanel()">
+                    <i class="fas fa-clock-rotate-left text-slate-500 dark:text-slate-400"></i>
+                </button>
+
                <!-- Breadcrumb (hidden on mobile) -->
                <div class="hidden lg:flex items-center gap-2 text-sm min-w-0">
-                    <span id="breadcrumb-group" class="text-slate-400 dark:text-slate-500 truncate" data-i18n="nav_chat">Chat</span>
+                    <span id="breadcrumb-group" class="text-slate-400 dark:text-slate-500 truncate" data-i18n="nav_chat">对话</span>
                    <i class="fas fa-chevron-right text-[10px] text-slate-300 dark:text-slate-600"></i>
-                    <span id="breadcrumb-page" class="font-medium text-slate-700 dark:text-slate-200 truncate" data-i18n="menu_chat">Chat</span>
+                    <span id="breadcrumb-page" class="font-medium text-slate-700 dark:text-slate-200 truncate" data-i18n="menu_chat">对话</span>
                </div>

                <div class="flex-1"></div>
@@ -220,26 +293,26 @@
                <!-- ====================================================== -->
                <!-- VIEW: Chat                                              -->
                <!-- ====================================================== -->
-                <div id="view-chat" class="view active">
+                <div id="view-chat" class="view active relative">
                    <!-- Messages -->
                    <div id="chat-messages" class="flex-1 overflow-y-auto">
                        <!-- Welcome Screen -->
-                        <div id="welcome-screen" class="flex flex-col items-center justify-center h-full px-6 py-12">
+                        <div id="welcome-screen" class="flex flex-col items-center justify-center h-full px-6 pb-16" style="padding-top: 6vh">
                            <img src="assets/logo.jpg" alt="CowAgent" class="w-16 h-16 rounded-2xl mb-6 shadow-lg shadow-primary-500/20">
                            <h1 id="welcome-title" class="text-2xl font-bold text-slate-800 dark:text-slate-100 mb-3">CowAgent</h1>
                            <p id="welcome-subtitle" class="text-slate-500 dark:text-slate-400 text-center max-w-lg mb-10 leading-relaxed"
-                               data-i18n-html="welcome_subtitle">I can help you answer questions, manage your computer, create and execute skills,<br>and keep growing through long-term memory.</p>
+                               data-i18n-html="welcome_subtitle">我可以帮你解答问题、管理计算机、创造和执行技能，并通过<br>长期记忆和知识库不断成长</p>

-                            <div class="grid grid-cols-1 sm:grid-cols-3 gap-4 w-full max-w-2xl">
+                            <div class="grid grid-cols-2 sm:grid-cols-3 gap-3 w-full max-w-2xl">
                                <div class="example-card group bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-xl p-4
                                            cursor-pointer hover:border-primary-300 dark:hover:border-primary-600 hover:shadow-md transition-all duration-200">
                                    <div class="flex items-center gap-2 mb-2">
                                        <div class="w-7 h-7 rounded-lg bg-blue-50 dark:bg-blue-900/30 flex items-center justify-center">
                                            <i class="fas fa-folder-open text-blue-500 text-xs"></i>
                                        </div>
-                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_sys_title">System</span>
+                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_sys_title">系统管理</span>
                                    </div>
-                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_sys_text">Show me the files in the workspace</p>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_sys_text">查看工作空间里有哪些文件</p>
                                </div>
                                <div class="example-card group bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-xl p-4
                                            cursor-pointer hover:border-primary-300 dark:hover:border-primary-600 hover:shadow-md transition-all duration-200">
@@ -247,9 +320,9 @@
                                        <div class="w-7 h-7 rounded-lg bg-amber-50 dark:bg-amber-900/30 flex items-center justify-center">
                                            <i class="fas fa-clock text-amber-500 text-xs"></i>
                                        </div>
-                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_task_title">Smart Task</span>
+                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_task_title">定时任务</span>
                                    </div>
-                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_task_text">Remind me to check the server in 5 minutes</p>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_task_text">1分钟后提醒我检查服务器</p>
                                </div>
                                <div class="example-card group bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-xl p-4
                                            cursor-pointer hover:border-primary-300 dark:hover:border-primary-600 hover:shadow-md transition-all duration-200">
@@ -257,14 +330,57 @@
                                        <div class="w-7 h-7 rounded-lg bg-emerald-50 dark:bg-emerald-900/30 flex items-center justify-center">
                                            <i class="fas fa-code text-emerald-500 text-xs"></i>
                                        </div>
-                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_code_title">Coding</span>
+                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_code_title">编程助手</span>
                                    </div>
-                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_code_text">Write a Python web scraper script</p>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_code_text">搜索AI资讯并生成可视化网页报告</p>
+                                </div>
+                                <div class="example-card group bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-xl p-4
+                                            cursor-pointer hover:border-primary-300 dark:hover:border-primary-600 hover:shadow-md transition-all duration-200">
+                                    <div class="flex items-center gap-2 mb-2">
+                                        <div class="w-7 h-7 rounded-lg bg-violet-50 dark:bg-violet-900/30 flex items-center justify-center">
+                                            <i class="fas fa-book text-violet-500 text-xs"></i>
+                                        </div>
+                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_knowledge_title">知识库</span>
+                                    </div>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_knowledge_text">查看知识库当前文档情况</p>
+                                </div>
+                                <div class="example-card group bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-xl p-4
+                                            cursor-pointer hover:border-primary-300 dark:hover:border-primary-600 hover:shadow-md transition-all duration-200">
+                                    <div class="flex items-center gap-2 mb-2">
+                                        <div class="w-7 h-7 rounded-lg bg-rose-50 dark:bg-rose-900/30 flex items-center justify-center">
+                                            <i class="fas fa-puzzle-piece text-rose-500 text-xs"></i>
+                                        </div>
+                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_skill_title">技能系统</span>
+                                    </div>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_skill_text">查看所有支持的工具和技能</p>
+                                </div>
+                                <div class="example-card group bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-xl p-4
+                                            cursor-pointer hover:border-primary-300 dark:hover:border-primary-600 hover:shadow-md transition-all duration-200"
+                                     data-send="/help">
+                                    <div class="flex items-center gap-2 mb-2">
+                                        <div class="w-7 h-7 rounded-lg bg-slate-100 dark:bg-slate-800 flex items-center justify-center">
+                                            <i class="fas fa-terminal text-slate-500 text-xs"></i>
+                                        </div>
+                                        <span class="font-medium text-sm text-slate-700 dark:text-slate-200" data-i18n="example_web_title">指令中心</span>
+                                    </div>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed" data-i18n="example_web_text">查看全部命令</p>
                                </div>
                            </div>
                        </div>
                    </div>

+                    <!-- Scroll-to-bottom FAB -->
+                    <button id="scroll-to-bottom-btn"
+                            class="hidden absolute right-5 bottom-[80px] z-10
+                                   w-9 h-9 rounded-full shadow-lg
+                                   bg-white dark:bg-[#2A2A2A] border border-slate-200 dark:border-white/15
+                                   text-slate-500 dark:text-slate-400 hover:text-primary-500 dark:hover:text-primary-400
+                                   flex items-center justify-center cursor-pointer transition-all duration-200
+                                   hover:shadow-xl hover:scale-105"
+                            onclick="_autoScrollEnabled = true; scrollChatToBottom(true);">
+                        <i class="fas fa-chevron-down text-sm"></i>
+                    </button>
+
                    <!-- Chat Input -->
                    <div class="flex-shrink-0 border-t border-slate-200 dark:border-white/10 bg-white dark:bg-[#1A1A1A] px-4 py-3">
                        <div class="max-w-3xl mx-auto">
@@ -274,29 +390,56 @@
                                <div class="flex items-center flex-shrink-0">
                                    <button id="new-chat-btn" class="w-9 h-10 flex items-center justify-center rounded-lg
                                                                     text-slate-400 hover:text-primary-500 hover:bg-primary-50 dark:hover:bg-primary-900/20
-                                                                     cursor-pointer transition-colors duration-150" title="New Chat"
+                                                                     cursor-pointer transition-colors duration-150"
                                            onclick="newChat()">
                                        <i class="fas fa-plus text-base"></i>
                                    </button>
+                                    <button id="clear-context-btn" class="w-9 h-10 flex items-center justify-center rounded-lg
+                                                                          text-slate-400 hover:text-amber-500 hover:bg-amber-50 dark:hover:bg-amber-900/20
+                                                                          cursor-pointer transition-colors duration-150"
+                                            onclick="clearContext()">
+                                        <i class="fas fa-trash-can text-base"></i>
+                                    </button>
                                    <button id="attach-btn" class="w-9 h-10 flex items-center justify-center rounded-lg
                                                                   text-slate-400 hover:text-primary-500 hover:bg-primary-50 dark:hover:bg-primary-900/20
                                                                   cursor-pointer transition-colors duration-150"
-                                            title="Attach file" onclick="document.getElementById('file-input').click()">
+                                            type="button"
+                                            onclick="toggleAttachMenu(event)">
                                        <i class="fas fa-paperclip text-base"></i>
                                    </button>
                                </div>
                                <input type="file" id="file-input" class="hidden" multiple
                                       accept="image/*,.pdf,.doc,.docx,.xls,.xlsx,.ppt,.pptx,.txt,.csv,.json,.xml,.zip,.rar,.7z,.py,.js,.ts,.java,.c,.cpp,.go,.rs,.md">
+                                <input type="file" id="folder-input" class="hidden" multiple webkitdirectory directory>
+                                <div id="attach-menu" class="attach-menu hidden">
+                                    <button id="attach-file-option" type="button" class="attach-menu-item" onclick="triggerFileUpload()">
+                                        <i class="fas fa-file-arrow-up"></i>
+                                        <span data-i18n="attach_menu_file">上传文件</span>
+                                    </button>
+                                    <button id="attach-folder-option" type="button" class="attach-menu-item" onclick="triggerFolderUpload()">
+                                        <i class="fas fa-folder-plus"></i>
+                                        <span data-i18n="attach_menu_folder">上传文件夹</span>
+                                    </button>
+                                </div>
                                <div id="slash-menu" class="slash-menu hidden"></div>
-                                <textarea id="chat-input"
-                                          class="flex-1 min-w-0 px-4 py-[10px] rounded-xl border border-slate-200 dark:border-slate-600
-                                                 bg-slate-50 dark:bg-white/5 text-slate-800 dark:text-slate-100
-                                                 placeholder:text-slate-400 dark:placeholder:text-slate-500
-                                                 focus:outline-none focus:ring-0 focus:border-primary-600
-                                                 text-sm leading-relaxed"
-                                          rows="1"
-                                          data-i18n-placeholder="input_placeholder"
-                                          placeholder="Type a message, or press / for commands"></textarea>
+                                <div class="flex-1 min-w-0 relative flex items-center">
+                                    <textarea id="chat-input"
+                                              class="w-full pl-4 pr-11 py-[10px] rounded-xl border border-slate-200 dark:border-slate-600
+                                                     bg-slate-50 dark:bg-white/5 text-slate-800 dark:text-slate-100
+                                                     placeholder:text-slate-400 dark:placeholder:text-slate-500
+                                                     focus:outline-none focus:ring-0 focus:border-primary-600
+                                                     text-sm leading-relaxed"
+                                              rows="1"
+                                              data-i18n-placeholder="input_placeholder"
+                                              placeholder="输入消息，或输入 / 使用指令"></textarea>
+                                    <button id="mic-btn" type="button"
+                                            class="absolute right-2 top-1/2 -translate-y-1/2 w-8 h-8 flex items-center justify-center rounded-lg
+                                                   text-slate-400 hover:text-primary-500 hover:bg-primary-50 dark:hover:bg-primary-900/20
+                                                   cursor-pointer transition-colors duration-150"
+                                            data-i18n-title="mic_idle_title" title="点击录音 / 再按一次结束">
+                                        <i class="fas fa-microphone text-sm"></i>
+                                    </button>
+                                </div>
                                <button id="send-btn"
                                        class="flex-shrink-0 w-10 h-10 flex items-center justify-center rounded-lg
                                               bg-primary-400 text-white hover:bg-primary-500
@@ -318,8 +461,8 @@
                        <div class="max-w-4xl mx-auto">
                            <div class="flex items-center justify-between mb-6">
                                <div>
-                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="config_title">Configuration</h2>
-                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="config_desc">Manage model and agent settings</p>
+                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="config_title">配置管理</h2>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="config_desc">管理模型和 Agent 配置</p>
                                </div>
                            </div>
                            <div class="grid gap-6">
@@ -330,12 +473,17 @@
                                        <div class="w-9 h-9 rounded-lg bg-primary-50 dark:bg-primary-900/30 flex items-center justify-center">
                                            <i class="fas fa-microchip text-primary-500 text-sm"></i>
                                        </div>
-                                        <h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_model">Model Configuration</h3>
+                                        <h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_model">模型配置</h3>
+                                        <a class="ml-auto text-xs text-slate-500 dark:text-slate-400 hover:text-primary-500 dark:hover:text-primary-400 cursor-pointer transition-colors flex items-center gap-1"
+                                           onclick="navigateTo('models')">
+                                            <span data-i18n="config_model_advanced">高级配置</span>
+                                            <i class="fas fa-arrow-right text-[10px]"></i>
+                                        </a>
                                    </div>
                                    <div class="space-y-5">
                                        <!-- Provider -->
                                        <div>
-                                            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_provider">Provider</label>
+                                            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_provider">模型厂商</label>
                                            <div id="cfg-provider" class="cfg-dropdown" tabindex="0">
                                                <div class="cfg-dropdown-selected">
                                                    <span class="cfg-dropdown-text">--</span>
@@ -343,10 +491,13 @@
                                                </div>
                                                <div class="cfg-dropdown-menu"></div>
                                            </div>
+                                            <div id="cfg-custom-tip" class="mt-1.5 text-xs text-slate-400 dark:text-slate-500 hidden">
+                                                <i class="fas fa-info-circle mr-1"></i><span data-i18n="config_custom_tip">接口需遵循 OpenAI API 协议</span>
+                                            </div>
                                        </div>
                                        <!-- Model -->
                                        <div>
-                                            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_model_name">Model</label>
+                                            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_model_name">模型</label>
                                            <div id="cfg-model-select" class="cfg-dropdown" tabindex="0">
                                                <div class="cfg-dropdown-selected">
                                                    <span class="cfg-dropdown-text">--</span>
@@ -359,7 +510,7 @@
                                                       class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
                                                              bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
                                                              focus:outline-none focus:border-primary-500 font-mono transition-colors"
-                                                       data-i18n-placeholder="config_custom_model_hint" placeholder="Enter custom model name">
+                                                       data-i18n-placeholder="config_custom_model_hint" placeholder="输入自定义模型名称">
                                            </div>
                                        </div>
                                        <!-- API Key -->
@@ -394,7 +545,7 @@
                                            <button id="cfg-model-save"
                                                    class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
                                                           cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
-                                                    onclick="saveModelConfig()" data-i18n="config_save">Save</button>
+                                                    onclick="saveModelConfig()" data-i18n="config_save">保存</button>
                                        </div>
                                    </div>
                                </div>
@@ -405,36 +556,86 @@
                                        <div class="w-9 h-9 rounded-lg bg-emerald-50 dark:bg-emerald-900/30 flex items-center justify-center">
                                            <i class="fas fa-robot text-emerald-500 text-sm"></i>
                                        </div>
-                                        <h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_agent">Agent Configuration</h3>
+                                        <h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_agent">Agent 配置</h3>
                                    </div>
                                    <div class="space-y-4">
                                        <div>
-                                            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_max_tokens">Max Context Tokens</label>
+                                            <label class="flex items-center gap-1.5 text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">
+                                                <span data-i18n="config_max_tokens">最大上下文 Token</span>
+                                                <span class="cfg-tip" data-tip-key="config_max_tokens_hint"><i class="fas fa-circle-question"></i></span>
+                                            </label>
                                            <input id="cfg-max-tokens" type="number" min="1000" max="200000" step="1000"
                                                   class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
                                                          bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
                                                          focus:outline-none focus:border-primary-500 font-mono transition-colors">
                                        </div>
                                        <div>
-                                            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_max_turns">Max Context Turns</label>
+                                            <label class="flex items-center gap-1.5 text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">
+                                                <span data-i18n="config_max_turns">最大记忆轮次</span>
+                                                <span class="cfg-tip" data-tip-key="config_max_turns_hint"><i class="fas fa-circle-question"></i></span>
+                                            </label>
                                            <input id="cfg-max-turns" type="number" min="1" max="100" step="1"
                                                   class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
                                                          bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
                                                          focus:outline-none focus:border-primary-500 font-mono transition-colors">
                                        </div>
                                        <div>
-                                            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_max_steps">Max Steps</label>
+                                            <label class="flex items-center gap-1.5 text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">
+                                                <span data-i18n="config_max_steps">最大执行步数</span>
+                                                <span class="cfg-tip" data-tip-key="config_max_steps_hint"><i class="fas fa-circle-question"></i></span>
+                                            </label>
                                            <input id="cfg-max-steps" type="number" min="1" max="50" step="1"
                                                   class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
                                                          bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
                                                          focus:outline-none focus:border-primary-500 font-mono transition-colors">
                                        </div>
+                                        <div class="flex items-center justify-between">
+                                            <label class="flex items-center gap-1.5 text-sm font-medium text-slate-600 dark:text-slate-400">
+                                                <span data-i18n="config_enable_thinking">Deep Thinking</span>
+                                                <span class="cfg-tip" data-tip-key="config_enable_thinking_hint"><i class="fas fa-circle-question"></i></span>
+                                            </label>
+                                            <label class="relative inline-flex items-center cursor-pointer">
+                                                <input id="cfg-enable-thinking" type="checkbox" class="sr-only peer">
+                                                <div class="w-9 h-5 bg-slate-200 dark:bg-slate-700 peer-checked:bg-primary-400 rounded-full
+                                                            after:content-[''] after:absolute after:top-[2px] after:left-[2px] after:bg-white
+                                                            after:rounded-full after:h-4 after:w-4 after:transition-all peer-checked:after:translate-x-full"></div>
+                                            </label>
+                                        </div>
                                        <div class="flex items-center justify-end gap-3 pt-1">
                                            <span id="cfg-agent-status" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
                                            <button id="cfg-agent-save"
                                                    class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
                                                           cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
-                                                    onclick="saveAgentConfig()" data-i18n="config_save">Save</button>
+                                                    onclick="saveAgentConfig()" data-i18n="config_save">保存</button>
+                                        </div>
+                                    </div>
+                                </div>
+
+                                <!-- Security Config Card -->
+                                <div class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 p-6">
+                                    <div class="flex items-center gap-3 mb-5">
+                                        <div class="w-9 h-9 rounded-lg bg-amber-50 dark:bg-amber-900/30 flex items-center justify-center">
+                                            <i class="fas fa-lock text-amber-500 text-sm"></i>
+                                        </div>
+                                        <h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_security">安全设置</h3>
+                                    </div>
+                                    <div class="space-y-4">
+                                        <div>
+                                            <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_password">访问密码</label>
+                                            <input id="cfg-password" type="password" autocomplete="new-password"
+                                                   class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
+                                                          bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
+                                                          focus:outline-none focus:border-primary-500 font-mono transition-colors
+                                                          cfg-key-masked"
+                                                   data-masked="1">
+                                            <p class="text-xs text-slate-400 dark:text-slate-500 mt-1.5" data-i18n="config_password_hint">留空则不启用密码保护</p>
+                                        </div>
+                                        <div class="flex items-center justify-end gap-3 pt-1">
+                                            <span id="cfg-password-status" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
+                                            <button id="cfg-password-save"
+                                                    class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
+                                                           cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
+                                                    onclick="savePasswordConfig()" data-i18n="config_save">保存</button>
                                        </div>
                                    </div>
                                </div>
@@ -452,25 +653,25 @@
                        <div class="max-w-4xl mx-auto">
                            <div class="flex items-center justify-between mb-6">
                                <div>
-                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="skills_title">Skills</h2>
-                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="skills_desc">View, enable, or disable agent skills</p>
+                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="skills_title">技能管理</h2>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="skills_desc">查看、启用或禁用 Agent 技能</p>
                                </div>
                                <a href="https://skills.cowagent.ai/" target="_blank"
                                   class="inline-flex items-center gap-1.5 px-3 py-1.5 rounded-lg text-xs font-medium text-primary-500 bg-primary-50 dark:bg-primary-900/20 hover:bg-primary-100 dark:hover:bg-primary-900/30 transition-colors">
                                    <i class="fas fa-puzzle-piece text-[10px]"></i>
-                                    <span data-i18n="skills_hub_btn">Skill Hub</span>
+                                    <span data-i18n="skills_hub_btn">探索技能广场</span>
                                </a>
                            </div>

                            <!-- Built-in Tools Section -->
                            <div class="mb-8">
                                <div class="flex items-center gap-2 mb-3">
-                                    <span class="text-xs font-semibold uppercase tracking-wider text-slate-400 dark:text-slate-500" data-i18n="tools_section_title">Built-in Tools</span>
+                                    <span class="text-xs font-semibold uppercase tracking-wider text-slate-400 dark:text-slate-500" data-i18n="tools_section_title">内置工具</span>
                                    <span id="tools-count-badge" class="hidden px-2 py-0.5 rounded-full text-xs bg-slate-100 dark:bg-white/10 text-slate-500 dark:text-slate-400"></span>
                                </div>
                                <div id="tools-empty" class="flex items-center gap-2 py-4 text-slate-400 dark:text-slate-500 text-sm">
                                    <i class="fas fa-spinner fa-spin text-xs"></i>
-                                    <span data-i18n="tools_loading">Loading tools...</span>
+                                    <span data-i18n="tools_loading">加载工具中...</span>
                                </div>
                                <div id="tools-list" class="grid gap-3 sm:grid-cols-2 hidden"></div>
                            </div>
@@ -478,15 +679,15 @@
                            <!-- Skills Section -->
                            <div>
                                <div class="flex items-center gap-2 mb-3">
-                                    <span class="text-xs font-semibold uppercase tracking-wider text-slate-400 dark:text-slate-500" data-i18n="skills_section_title">Skills</span>
+                                    <span class="text-xs font-semibold uppercase tracking-wider text-slate-400 dark:text-slate-500" data-i18n="skills_section_title">技能</span>
                                    <span id="skills-count-badge" class="hidden px-2 py-0.5 rounded-full text-xs bg-slate-100 dark:bg-white/10 text-slate-500 dark:text-slate-400"></span>
                                </div>
                                <div id="skills-empty" class="flex flex-col items-center justify-center py-12">
                                    <div class="w-14 h-14 rounded-2xl bg-amber-50 dark:bg-amber-900/20 flex items-center justify-center mb-3">
                                        <i class="fas fa-bolt text-amber-400 text-lg"></i>
                                    </div>
-                                    <p class="text-slate-500 dark:text-slate-400 font-medium" data-i18n="skills_loading">Loading skills...</p>
-                                    <p class="text-sm text-slate-400 dark:text-slate-500 mt-1" data-i18n="skills_loading_desc">Skills will be displayed here after loading</p>
+                                    <p class="text-slate-500 dark:text-slate-400 font-medium" data-i18n="skills_loading">加载技能中...</p>
+                                    <p class="text-sm text-slate-400 dark:text-slate-500 mt-1" data-i18n="skills_loading_desc">技能加载后将显示在此处</p>
                                </div>
                                <div id="skills-list" class="grid gap-4 sm:grid-cols-2"></div>
                            </div>
@@ -505,26 +706,36 @@
                            <div id="memory-panel-list">
                                <div class="flex items-center justify-between mb-6">
                                    <div>
-                                        <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="memory_title">Memory</h2>
-                                        <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="memory_desc">View agent memory files and contents</p>
+                                        <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="memory_title">记忆管理</h2>
+                                        <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="memory_desc">查看 Agent 记忆文件和内容</p>
+                                    </div>
+                                    <div class="flex items-center bg-slate-100 dark:bg-white/10 rounded-lg p-0.5">
+                                        <button id="memory-tab-files" onclick="switchMemoryTab('files')"
+                                                class="memory-tab px-3 py-1.5 rounded-md text-xs font-medium cursor-pointer transition-colors duration-150 active">
+                                            <i class="fas fa-file-lines mr-1.5"></i><span data-i18n="memory_tab_files">记忆文件</span>
+                                        </button>
+                                        <button id="memory-tab-dreams" onclick="switchMemoryTab('dreams')"
+                                                class="memory-tab px-3 py-1.5 rounded-md text-xs font-medium cursor-pointer transition-colors duration-150">
+                                            <i class="fas fa-moon mr-1.5"></i><span data-i18n="memory_tab_dreams">梦境日记</span>
+                                        </button>
                                    </div>
                                </div>
                                <div id="memory-empty" class="flex flex-col items-center justify-center py-20">
                                    <div class="w-16 h-16 rounded-2xl bg-purple-50 dark:bg-purple-900/20 flex items-center justify-center mb-4">
                                        <i class="fas fa-brain text-purple-400 text-xl"></i>
                                    </div>
-                                    <p class="text-slate-500 dark:text-slate-400 font-medium" data-i18n="memory_loading">Loading memory files...</p>
-                                    <p class="text-sm text-slate-400 dark:text-slate-500 mt-1" data-i18n="memory_loading_desc">Memory files will be displayed here</p>
+                                    <p class="text-slate-500 dark:text-slate-400 font-medium" data-i18n="memory_loading">加载记忆文件中...</p>
+                                    <p class="text-sm text-slate-400 dark:text-slate-500 mt-1" data-i18n="memory_loading_desc">记忆文件将显示在此处</p>
                                </div>
                                <div id="memory-list" class="hidden">
                                    <div class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 overflow-hidden">
                                        <table class="w-full">
                                            <thead>
                                                <tr class="border-b border-slate-200 dark:border-white/10">
-                                                    <th class="text-left px-4 py-3 text-xs font-semibold uppercase tracking-wider text-slate-500 dark:text-slate-400" data-i18n="memory_col_name">Filename</th>
-                                                    <th class="text-left px-4 py-3 text-xs font-semibold uppercase tracking-wider text-slate-500 dark:text-slate-400" data-i18n="memory_col_type">Type</th>
-                                                    <th class="text-left px-4 py-3 text-xs font-semibold uppercase tracking-wider text-slate-500 dark:text-slate-400" data-i18n="memory_col_size">Size</th>
-                                                    <th class="text-left px-4 py-3 text-xs font-semibold uppercase tracking-wider text-slate-500 dark:text-slate-400" data-i18n="memory_col_updated">Updated</th>
+                                                    <th class="text-left px-4 py-3 text-xs font-semibold uppercase tracking-wider text-slate-500 dark:text-slate-400" data-i18n="memory_col_name">文件名</th>
+                                                    <th class="text-left px-4 py-3 text-xs font-semibold uppercase tracking-wider text-slate-500 dark:text-slate-400" data-i18n="memory_col_type">类型</th>
+                                                    <th class="text-left px-4 py-3 text-xs font-semibold uppercase tracking-wider text-slate-500 dark:text-slate-400" data-i18n="memory_col_size">大小</th>
+                                                    <th class="text-left px-4 py-3 text-xs font-semibold uppercase tracking-wider text-slate-500 dark:text-slate-400" data-i18n="memory_col_updated">更新时间</th>
                                                </tr>
                                            </thead>
                                            <tbody id="memory-table-body"></tbody>
@@ -542,7 +753,7 @@
                                                   text-slate-500 dark:text-slate-400 hover:bg-slate-100 dark:hover:bg-white/10
                                                   border border-slate-200 dark:border-white/10 transition-colors cursor-pointer">
                                        <i class="fas fa-arrow-left text-xs"></i>
-                                        <span data-i18n="memory_back">Back</span>
+                                        <span data-i18n="memory_back">返回列表</span>
                                    </button>
                                    <h2 id="memory-viewer-title"
                                        class="text-base font-semibold text-slate-800 dark:text-slate-100 font-mono truncate"></h2>
@@ -558,6 +769,141 @@
                    </div>
                </div>

+                <!-- ====================================================== -->
+                <!-- VIEW: Knowledge                                         -->
+                <!-- ====================================================== -->
+                <div id="view-knowledge" class="view">
+                    <div class="flex-1 overflow-y-auto p-4 md:p-8 lg:p-10">
+                        <div class="w-full max-w-[1600px] mx-auto">
+
+                            <!-- Header -->
+                            <div class="flex flex-col sm:flex-row sm:items-center justify-between gap-3 mb-4 md:mb-6">
+                                <div>
+                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="knowledge_title">知识库</h2>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="knowledge_desc">浏览和探索你的知识库</p>
+                                </div>
+                                <div class="flex items-center gap-2">
+                                    <span id="knowledge-stats" class="text-xs text-slate-400 dark:text-slate-500 hidden sm:inline"></span>
+                                    <div class="flex items-center bg-slate-100 dark:bg-white/10 rounded-lg p-0.5">
+                                        <button id="knowledge-tab-docs" onclick="switchKnowledgeTab('docs')"
+                                                class="knowledge-tab px-3 py-1.5 rounded-md text-xs font-medium cursor-pointer transition-colors duration-150 active">
+                                            <i class="fas fa-folder-tree mr-1.5"></i><span data-i18n="knowledge_tab_docs">文档</span>
+                                        </button>
+                                        <button id="knowledge-tab-graph" onclick="switchKnowledgeTab('graph')"
+                                                class="knowledge-tab px-3 py-1.5 rounded-md text-xs font-medium cursor-pointer transition-colors duration-150">
+                                            <i class="fas fa-diagram-project mr-1.5"></i><span data-i18n="knowledge_tab_graph">图谱</span>
+                                        </button>
+                                    </div>
+                                </div>
+                            </div>
+
+                            <!-- Empty state -->
+                            <div id="knowledge-empty" class="flex flex-col items-center justify-center py-20">
+                                <div class="w-16 h-16 rounded-2xl bg-emerald-50 dark:bg-emerald-900/20 flex items-center justify-center mb-4">
+                                    <i class="fas fa-book text-emerald-400 text-xl"></i>
+                                </div>
+                                <p class="text-slate-500 dark:text-slate-400 font-medium" data-i18n="knowledge_loading">加载知识库中...</p>
+                                <p class="text-sm text-slate-400 dark:text-slate-500 mt-1" data-i18n="knowledge_loading_desc">知识页面将显示在这里</p>
+                                <div id="knowledge-empty-guide" class="hidden mt-6 max-w-sm text-center">
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mb-4" data-i18n="knowledge_empty_guide">在对话中发送文档、链接或主题给 Agent，它会自动整理到你的知识库中。</p>
+                                    <button onclick="navigateTo('chat')"
+                                            class="inline-flex items-center gap-2 px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600
+                                                   text-white text-sm font-medium cursor-pointer transition-colors duration-150">
+                                        <i class="fas fa-message text-xs"></i>
+                                        <span data-i18n="knowledge_go_chat">开始对话</span>
+                                    </button>
+                                </div>
+                            </div>
+
+                            <!-- Documents panel -->
+                            <div id="knowledge-panel-docs" class="hidden">
+                                <div class="flex flex-col md:flex-row gap-4 md:gap-6" style="min-height: calc(100vh - 220px)">
+                                    <!-- File tree -->
+                                    <div id="knowledge-sidebar" class="w-full md:w-72 lg:w-80 flex-shrink-0">
+                                        <div class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 overflow-hidden">
+                                            <div class="px-4 py-3 border-b border-slate-200 dark:border-white/10">
+                                                <div class="relative">
+                                                    <i class="fas fa-search absolute left-3 top-1/2 -translate-y-1/2 text-slate-400 text-xs"></i>
+                                                    <input id="knowledge-search" type="text" placeholder="Search..."
+                                                           class="w-full pl-8 pr-3 py-1.5 text-xs bg-slate-50 dark:bg-white/5 border border-slate-200 dark:border-white/10 rounded-lg text-slate-700 dark:text-slate-200 placeholder-slate-400 dark:placeholder-slate-500 focus:outline-none focus:ring-1 focus:ring-primary-400/50"
+                                                           oninput="filterKnowledgeTree(this.value)">
+                                                </div>
+                                            </div>
+                                            <div id="knowledge-tree" class="p-2 overflow-y-auto max-h-[50vh] md:max-h-[calc(100vh-300px)]"></div>
+                                        </div>
+                                    </div>
+                                    <!-- Content viewer -->
+                                    <div class="flex-1 min-w-0">
+                                        <div id="knowledge-content-placeholder"
+                                             class="flex flex-col items-center justify-center py-20 text-slate-400 dark:text-slate-500">
+                                            <i class="fas fa-file-lines text-3xl mb-3 opacity-40"></i>
+                                            <p class="text-sm" data-i18n="knowledge_select_hint">选择一个文档查看</p>
+                                        </div>
+                                        <div id="knowledge-content-viewer" class="hidden">
+                                            <div class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 overflow-hidden">
+                                                <div class="flex items-center gap-3 px-4 md:px-5 py-3 border-b border-slate-200 dark:border-white/10">
+                                                    <button onclick="knowledgeMobileBack()" class="md:hidden p-1 -ml-1 text-slate-400 hover:text-slate-600 dark:hover:text-slate-300 cursor-pointer">
+                                                        <i class="fas fa-arrow-left text-xs"></i>
+                                                    </button>
+                                                    <i class="fas fa-file-lines text-slate-400 text-sm hidden md:inline"></i>
+                                                    <span id="knowledge-viewer-title" class="text-sm font-medium text-slate-700 dark:text-slate-200 truncate"></span>
+                                                    <span id="knowledge-viewer-path" class="text-xs text-slate-400 dark:text-slate-500 ml-auto font-mono truncate hidden md:inline"></span>
+                                                </div>
+                                                <div id="knowledge-viewer-body"
+                                                     class="p-4 md:p-5 overflow-y-auto text-sm msg-content text-slate-700 dark:text-slate-200"
+                                                     style="max-height: calc(100vh - 280px)"></div>
+                                            </div>
+                                        </div>
+                                    </div>
+                                </div>
+                            </div>
+
+                            <!-- Graph panel -->
+                            <div id="knowledge-panel-graph" class="hidden">
+                                <div class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 overflow-hidden">
+                                    <div id="knowledge-graph-container" class="w-full h-[60vh] md:h-[calc(100vh-220px)]"></div>
+                                </div>
+                            </div>
+
+                        </div>
+                    </div>
+                </div>
+
+                <!-- ====================================================== -->
+                <!-- VIEW: Models                                            -->
+                <!-- ====================================================== -->
+                <div id="view-models" class="view">
+                    <!-- Tailwind JIT safelist: capability-card icon colors are
+                         emitted from JS template strings. Listing them on this
+                         hidden (display:none) element ensures the Tailwind CDN
+                         script's JIT compiler generates them regardless of render timing. -->
+                    <div class="hidden bg-blue-50 dark:bg-blue-900/30 text-blue-500
+                                       bg-orange-50 dark:bg-orange-900/30 text-orange-500
+                                       bg-purple-50 dark:bg-purple-900/30 text-purple-500
+                                       bg-amber-50 dark:bg-amber-900/30 text-amber-500
+                                       bg-primary-50 dark:bg-primary-900/30 text-primary-500"></div>
+                    <div class="flex-1 overflow-y-auto p-6">
+                        <div class="max-w-4xl mx-auto">
+                            <div class="flex items-center justify-between mb-6">
+                                <div>
+                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="models_title">模型管理</h2>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="models_desc">统一管理对话、视觉、语音、向量、图像、搜索能力</p>
+                                </div>
+                                <button id="models-add-vendor-btn" onclick="openVendorModal('')"
+                                        class="flex items-center gap-2 px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600
+                                               text-white text-sm font-medium cursor-pointer transition-colors duration-150">
+                                    <i class="fas fa-plus text-xs"></i>
+                                    <span data-i18n="models_add_vendor">添加厂商</span>
+                                </button>
+                            </div>
+                            <div id="models-loading" class="flex items-center gap-2 py-12 justify-center text-slate-400 dark:text-slate-500 text-sm">
+                                <i class="fas fa-spinner fa-spin text-xs"></i><span>Loading...</span>
+                            </div>
+                            <div id="models-content" class="grid gap-6 hidden"></div>
+                        </div>
+                    </div>
+                </div>
+
                <!-- ====================================================== -->
                <!-- VIEW: Channels                                          -->
                <!-- ====================================================== -->
@@ -566,14 +912,14 @@
                        <div class="max-w-4xl mx-auto">
                            <div class="flex items-center justify-between mb-6">
                                <div>
-                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="channels_title">Channels</h2>
-                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="channels_desc">View and manage messaging channels</p>
+                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="channels_title">通道管理</h2>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="channels_desc">管理已接入的消息通道</p>
                                </div>
                                <button id="add-channel-btn" onclick="openAddChannelPanel()"
                                        class="flex items-center gap-2 px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600
                                               text-white text-sm font-medium cursor-pointer transition-colors duration-150">
                                    <i class="fas fa-plus text-xs"></i>
-                                    <span data-i18n="channels_add">Connect</span>
+                                    <span data-i18n="channels_add">接入通道</span>
                                </button>
                            </div>
                            <div id="channels-content" class="grid gap-4"></div>
@@ -590,8 +936,8 @@
                        <div class="max-w-4xl mx-auto">
                            <div class="flex items-center justify-between mb-6">
                                <div>
-                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="tasks_title">Scheduled Tasks</h2>
-                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="tasks_desc">View and manage scheduled tasks</p>
+                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="tasks_title">定时任务</h2>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="tasks_desc">查看和管理定时任务</p>
                                </div>
                            </div>
                            <div id="tasks-empty" class="flex flex-col items-center justify-center py-20">
@@ -613,8 +959,8 @@
                        <div class="max-w-5xl mx-auto">
                            <div class="flex items-center justify-between mb-6">
                                <div>
-                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="logs_title">Logs</h2>
-                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="logs_desc">Real-time log output (run.log)</p>
+                                    <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="logs_title">日志</h2>
+                                    <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="logs_desc">实时日志输出 (run.log)</p>
                                </div>
                            </div>
                            <!-- Log Terminal -->
@@ -627,13 +973,35 @@
                                    </div>
                                    <span class="text-xs text-slate-400 ml-2 font-mono">run.log</span>
                                    <div class="flex-1"></div>
+                                    <div class="flex items-center gap-3 mr-2">
+                                        <label class="flex items-center gap-1 cursor-pointer select-none">
+                                            <input type="checkbox" class="log-filter-cb" data-level="debug" checked>
+                                            <span class="text-xs text-slate-400">DEBUG</span>
+                                        </label>
+                                        <label class="flex items-center gap-1 cursor-pointer select-none">
+                                            <input type="checkbox" class="log-filter-cb" data-level="info" checked>
+                                            <span class="text-xs text-blue-400">INFO</span>
+                                        </label>
+                                        <label class="flex items-center gap-1 cursor-pointer select-none">
+                                            <input type="checkbox" class="log-filter-cb" data-level="warning" checked>
+                                            <span class="text-xs text-yellow-400">WARNING</span>
+                                        </label>
+                                        <label class="flex items-center gap-1 cursor-pointer select-none">
+                                            <input type="checkbox" class="log-filter-cb" data-level="error" checked>
+                                            <span class="text-xs text-red-400">ERROR</span>
+                                        </label>
+                                        <label class="flex items-center gap-1 cursor-pointer select-none">
+                                            <input type="checkbox" class="log-filter-cb" data-level="critical" checked>
+                                            <span class="text-xs text-white font-bold">CRITICAL</span>
+                                        </label>
+                                    </div>
                                    <div class="flex items-center gap-1.5">
                                        <span class="w-2 h-2 rounded-full bg-emerald-500 animate-pulse"></span>
-                                        <span class="text-xs text-slate-500" data-i18n="logs_live">Live</span>
+                                        <span class="text-xs text-slate-500" data-i18n="logs_live">实时</span>
                                    </div>
                                </div>
                                <div id="log-output" class="p-4 overflow-y-auto font-mono text-xs leading-relaxed text-slate-300 whitespace-pre-wrap break-all" style="height: calc(100vh - 272px)">
-                                    <p class="text-slate-500" data-i18n="logs_coming_msg">Log streaming will be available here. Connects to run.log for real-time output similar to tail -f.</p>
+                                    <p class="text-slate-500" data-i18n="logs_coming_msg">日志流即将在此提供。将连接 run.log 实现类似 tail -f 的实时输出。</p>
                                </div>
                            </div>
                        </div>
@@ -645,7 +1013,7 @@
    </div><!-- /app -->

    <!-- Confirm Dialog -->
-    <div id="confirm-dialog-overlay" class="fixed inset-0 bg-black/50 z-[100] hidden flex items-center justify-center">
+    <div id="confirm-dialog-overlay" class="fixed inset-0 bg-black/50 z-[200] hidden flex items-center justify-center">
        <div class="bg-white dark:bg-[#1A1A1A] rounded-2xl border border-slate-200 dark:border-white/10 shadow-xl
                    w-full max-w-sm mx-4 overflow-hidden">
            <div class="p-6">
@@ -670,6 +1038,77 @@
        </div>
    </div>

-    <script src="assets/js/console.js"></script>
+    <!-- Vendor Credentials Modal -->
+    <div id="vendor-modal-overlay" class="fixed inset-0 bg-black/50 z-[100] hidden flex items-center justify-center">
+        <div class="bg-white dark:bg-[#1A1A1A] rounded-2xl border border-slate-200 dark:border-white/10 shadow-xl
+                    w-full max-w-md mx-4">
+            <div class="p-6">
+                <div class="flex items-center gap-3 mb-5">
+                    <div class="w-10 h-10 rounded-xl bg-primary-50 dark:bg-primary-900/20 flex items-center justify-center flex-shrink-0">
+                        <i class="fas fa-key text-primary-500"></i>
+                    </div>
+                    <div class="min-w-0 flex-1">
+                        <h3 id="vendor-modal-title" class="font-semibold text-slate-800 dark:text-slate-100 text-base"></h3>
+                        <p id="vendor-modal-subtitle" class="text-xs text-slate-500 dark:text-slate-400 mt-0.5 font-mono"></p>
+                    </div>
+                </div>
+
+                <!-- Provider selector (only visible when adding via top button) -->
+                <div id="vendor-modal-picker-wrap" class="mb-4 hidden">
+                    <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="models_provider">厂商</label>
+                    <div id="vendor-modal-picker" class="cfg-dropdown" tabindex="0">
+                        <div class="cfg-dropdown-selected">
+                            <span class="cfg-dropdown-text">--</span>
+                            <i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
+                        </div>
+                        <div class="cfg-dropdown-menu"></div>
+                    </div>
+                </div>
+
+                <div class="space-y-4">
+                    <div>
+                        <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">API Key</label>
+                        <input id="vendor-modal-key" type="text" autocomplete="off" data-1p-ignore data-lpignore="true"
+                               class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
+                                      bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
+                                      focus:outline-none focus:border-primary-500 font-mono transition-colors"
+                               placeholder="sk-...">
+                    </div>
+                    <div id="vendor-modal-base-wrap">
+                        <label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">API Base</label>
+                        <input id="vendor-modal-base" type="text"
+                               class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
+                                      bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
+                                      focus:outline-none focus:border-primary-500 font-mono transition-colors"
+                               placeholder="https://...../v1">
+                        <p id="vendor-modal-base-hint" class="mt-1.5 text-xs text-slate-400 dark:text-slate-500 hidden">
+                            <i class="fas fa-info-circle mr-1"></i><span data-i18n="models_base_default_hint">留空将使用官方默认地址</span>
+                        </p>
+                    </div>
+                </div>
+            </div>
+            <div class="flex items-center justify-between gap-3 px-6 py-4 border-t border-slate-100 dark:border-white/5 rounded-b-2xl">
+                <button id="vendor-modal-clear"
+                        class="px-3 py-2 rounded-lg text-xs
+                               text-red-500 dark:text-red-400 hover:bg-red-50 dark:hover:bg-red-900/20
+                               cursor-pointer transition-colors duration-150 hidden"
+                        data-i18n="models_clear_credential">清除凭据</button>
+                <span id="vendor-modal-status"
+                      class="flex-1 text-xs text-primary-500 opacity-0 transition-opacity duration-300 text-center"></span>
+                <button id="vendor-modal-cancel"
+                        class="px-4 py-2 rounded-lg border border-slate-200 dark:border-white/10
+                               text-slate-600 dark:text-slate-300 text-sm font-medium
+                               hover:bg-slate-50 dark:hover:bg-white/5
+                               cursor-pointer transition-colors duration-150"
+                        data-i18n="cancel">取消</button>
+                <button id="vendor-modal-save"
+                        class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
+                               cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
+                        data-i18n="save">保存</button>
+            </div>
+        </div>
+    </div>
+
+    <script defer src="assets/js/console.js"></script>
 </body>
 </html>
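
Review note on the log-level checkboxes added to the Logs view (`.log-filter-cb` with `data-level` values `debug`, `info`, `warning`, `error`, `critical`): the markup alone does not show how lines are filtered, since that logic lives in `console.js` and is not part of this hunk. The sketch below is one possible way to wire it up, not the project's actual implementation. The helper names `detectLevel`, `filterLogLines`, and `readEnabledLevels` are hypothetical, and the assumption that each line of `run.log` carries its level in square brackets (for example `[INFO]`) is a guess about the log format.

```javascript
// Hypothetical helpers for illustration only; they are not taken from console.js.

// Assumption: a log line carries its level in square brackets, e.g. "[INFO] started".
// The real run.log format may differ, in which case this pattern needs adjusting.
function detectLevel(line) {
  const match = /\[(DEBUG|INFO|WARNING|ERROR|CRITICAL)\]/.exec(line);
  return match ? match[1].toLowerCase() : null;
}

// Keep only the lines whose level is enabled.
// Lines with no recognizable level (such as traceback continuations) are kept,
// so multi-line error output is not silently dropped.
function filterLogLines(lines, enabledLevels) {
  const enabled = new Set(enabledLevels);
  return lines.filter((line) => {
    const level = detectLevel(line);
    return level === null || enabled.has(level);
  });
}

// Browser-only: collect the data-level values of the checked boxes.
// `root` is any element or document that contains the .log-filter-cb inputs.
function readEnabledLevels(root) {
  return Array.from(root.querySelectorAll('.log-filter-cb'))
    .filter((cb) => cb.checked)
    .map((cb) => cb.dataset.level);
}
```

In the page, `filterLogLines(bufferedLines, readEnabledLevels(document))` could be re-run from a `change` listener on each checkbox before re-rendering `#log-output`. Keeping the filter a pure function of the lines and the enabled set makes it testable without a DOM.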
--- a/channel/web/static/css/console.css
+++ b/channel/web/static/css/console.css
@@ -17,6 +17,45 @@
 .dark ::-webkit-scrollbar-thumb { background: #475569; }
 .dark ::-webkit-scrollbar-thumb:hover { background: #64748b; }

+/* Generic Tooltip (via data-tooltip attribute) */
+[data-tooltip] {
+    position: relative;
+}
+[data-tooltip]::after {
+    content: attr(data-tooltip);
+    position: absolute;
+    left: 50%;
+    bottom: calc(100% + 8px);
+    transform: translateX(-50%);
+    padding: 5px 10px;
+    border-radius: 6px;
+    font-size: 12px;
+    font-weight: 400;
+    line-height: 1.4;
+    white-space: nowrap;
+    background: #1e293b;
+    color: #e2e8f0;
+    box-shadow: 0 4px 12px rgba(0, 0, 0, 0.15);
+    opacity: 0;
+    pointer-events: none;
+    transition: opacity 0.15s ease;
+    z-index: 100;
+}
+[data-tooltip-pos="bottom"]::after {
+    bottom: auto;
+    top: calc(100% + 8px);
+}
+.dark [data-tooltip]::after {
+    background: #334155;
+    color: #f1f5f9;
+}
+[data-tooltip]:hover::after {
+    opacity: 1;
+}
+[data-tooltip=""]:hover::after {
+    display: none;
+}
+
 /* Sidebar */
 .sidebar-item.active {
    background: rgba(255, 255, 255, 0.08);
@@ -24,9 +63,317 @@
 }
 .sidebar-item.active .item-icon { color: #4ABE6E; }

+/* Session Panel */
+.session-panel {
+    width: 220px;
+    flex-shrink: 0;
+    display: flex;
+    flex-direction: column;
+    background: #fafafa;
+    border-right: 1px solid #e5e7eb;
+    height: 100vh;
+    overflow: hidden;
+    transition: width 0.2s ease;
+}
+.dark .session-panel {
+    background: #111111;
+    border-right-color: rgba(255, 255, 255, 0.08);
+}
+.session-panel.hidden { display: none; }
+.session-panel-header {
+    display: flex;
+    align-items: center;
+    justify-content: space-between;
+    padding: 12px 16px;
+    border-bottom: 1px solid #e5e7eb;
+    flex-shrink: 0;
+}
+.dark .session-panel-header { border-bottom-color: rgba(255, 255, 255, 0.08); }
+.session-panel-title {
+    font-size: 14px;
+    font-weight: 600;
+    color: #374151;
+}
+.dark .session-panel-title { color: #d1d5db; }
+.session-panel-close {
+    width: 28px;
+    height: 28px;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    border-radius: 6px;
+    border: none;
+    background: none;
+    color: #9ca3af;
+    cursor: pointer;
+    transition: background 0.15s, color 0.15s;
+    font-size: 12px;
+}
+.session-panel-close:hover {
+    background: #f3f4f6;
+    color: #374151;
+}
+.dark .session-panel-close:hover {
+    background: rgba(255, 255, 255, 0.08);
+    color: #e5e5e5;
+}
+.session-panel-new {
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    margin: 10px 12px;
+    padding: 8px 14px;
+    border-radius: 8px;
+    border: 1px dashed #d1d5db;
+    background: none;
+    color: #6b7280;
+    font-size: 13px;
+    cursor: pointer;
+    transition: border-color 0.15s, color 0.15s, background 0.15s;
+    flex-shrink: 0;
+}
+.session-panel-new:hover {
+    border-color: #9ca3af;
+    color: #374151;
+    background: #f9fafb;
+}
+.dark .session-panel-new {
+    border-color: rgba(255, 255, 255, 0.12);
+    color: #9ca3af;
+}
+.dark .session-panel-new:hover {
+    border-color: rgba(255, 255, 255, 0.25);
+    color: #e5e5e5;
+    background: rgba(255, 255, 255, 0.04);
+}
+
+/* Session List */
+.session-list {
+    flex: 1;
+    overflow-y: auto;
+    padding: 4px 8px;
+    scrollbar-width: none;
+}
+.session-list:hover { scrollbar-width: thin; }
+.session-list::-webkit-scrollbar { width: 4px; background: transparent; }
+.session-list::-webkit-scrollbar-thumb { background: transparent; border-radius: 2px; }
+.session-list:hover::-webkit-scrollbar-thumb { background: rgba(0,0,0,0.2); }
+.dark .session-list:hover::-webkit-scrollbar-thumb { background: rgba(255,255,255,0.15); }
+.session-group-label {
+    padding: 10px 8px 4px;
+    font-size: 11px;
+    font-weight: 600;
+    text-transform: uppercase;
+    letter-spacing: 0.05em;
+    color: #9ca3af;
+}
+.dark .session-group-label { color: #525252; }
+.session-empty {
+    padding: 20px 12px;
+    text-align: center;
+    font-size: 13px;
+    color: #9ca3af;
+}
+.session-item {
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    padding: 8px 10px;
+    margin: 1px 0;
+    border-radius: 8px;
+    cursor: pointer;
+    transition: background 0.15s, color 0.15s;
+    color: #6b7280;
+    font-size: 13px;
+    position: relative;
+}
+.dark .session-item { color: #a3a3a3; }
+.session-item:hover {
+    background: #f3f4f6;
+    color: #111827;
+}
+.dark .session-item:hover {
+    background: rgba(255, 255, 255, 0.05);
+    color: #e5e5e5;
+}
+.session-item.active {
+    background: #e5e7eb;
+    color: #111827;
+}
+.dark .session-item.active {
+    background: rgba(255, 255, 255, 0.1);
+    color: #ffffff;
+}
+.session-icon {
+    flex-shrink: 0;
+    font-size: 11px;
+    color: #9ca3af;
+    width: 16px;
+    text-align: center;
+}
+.dark .session-icon { color: #525252; }
+.session-item.active .session-icon { color: #4ABE6E; }
+.session-title {
+    flex: 1;
+    min-width: 0;
+    overflow: hidden;
+    text-overflow: ellipsis;
+    white-space: nowrap;
+}
+.session-delete {
+    flex-shrink: 0;
+    width: 22px;
+    height: 22px;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    border-radius: 5px;
+    font-size: 10px;
+    color: #9ca3af;
+    opacity: 0;
+    transition: opacity 0.15s, color 0.15s, background 0.15s;
+    cursor: pointer;
+    background: none;
+    border: none;
+    padding: 0;
+}
+.session-item:hover .session-delete { opacity: 1; }
+.session-delete:hover {
+    color: #ef4444;
+    background: rgba(239, 68, 68, 0.1);
+}
+.dark .session-delete:hover { background: rgba(239, 68, 68, 0.15); }
+
+/* Context Divider */
+.context-divider {
+    display: flex;
+    align-items: center;
+    gap: 12px;
+    padding: 12px 24px;
+    color: #9ca3af;
+}
+.context-divider::before, .context-divider::after {
+    content: '';
+    flex: 1;
+    height: 1px;
+    background: linear-gradient(to right, transparent, #d1d5db, transparent);
+}
+.dark .context-divider::before, .dark .context-divider::after {
+    background: linear-gradient(to right, transparent, rgba(255,255,255,0.12), transparent);
+}
+.context-divider span {
+    font-size: 12px;
+    white-space: nowrap;
+    color: #9ca3af;
+}
+
+/* Confirm Modal */
+.confirm-overlay {
+    position: fixed;
+    inset: 0;
+    z-index: 9999;
+    display: flex;
+    align-items: center;
+    justify-content: center;
+    background: rgba(0, 0, 0, 0.4);
+    opacity: 0;
+    transition: opacity 0.2s ease;
+}
+.confirm-overlay.visible { opacity: 1; }
+.confirm-modal {
+    background: #fff;
+    border-radius: 14px;
+    width: 380px;
+    max-width: 90vw;
+    padding: 28px 24px 20px;
+    box-shadow: 0 20px 60px rgba(0, 0, 0, 0.18);
+    transform: scale(0.92);
+    transition: transform 0.2s ease;
+}
+.confirm-overlay.visible .confirm-modal { transform: scale(1); }
+.dark .confirm-modal {
+    background: #1e1e1e;
+    box-shadow: 0 20px 60px rgba(0, 0, 0, 0.5);
+}
+.confirm-title {
+    font-size: 16px;
+    font-weight: 600;
+    color: #1f2937;
+    margin-bottom: 8px;
+}
+.dark .confirm-title { color: #e5e7eb; }
+.confirm-message {
+    font-size: 14px;
+    color: #6b7280;
+    line-height: 1.5;
+    margin-bottom: 24px;
+}
+.dark .confirm-message { color: #9ca3af; }
+.confirm-actions {
+    display: flex;
+    justify-content: flex-end;
+    gap: 10px;
+}
+.confirm-btn {
+    padding: 8px 20px;
+    border-radius: 8px;
+    font-size: 14px;
+    font-weight: 500;
+    cursor: pointer;
+    border: none;
+    transition: all 0.15s ease;
+}
+.confirm-btn-cancel {
+    background: #f3f4f6;
+    color: #374151;
+}
+.confirm-btn-cancel:hover { background: #e5e7eb; }
+.dark .confirm-btn-cancel {
+    background: rgba(255, 255, 255, 0.08);
+    color: #d1d5db;
+}
+.dark .confirm-btn-cancel:hover { background: rgba(255, 255, 255, 0.14); }
+.confirm-btn-ok {
+    background: #ef4444;
+    color: #fff;
+}
+.confirm-btn-ok:hover { background: #dc2626; }
+
+/* Session panel overlay (mobile only, click to close) */
+.session-panel-overlay {
+    display: none;
+}
+@media (max-width: 768px) {
+    .session-panel-overlay {
+        display: block;
+        position: fixed;
+        inset: 0;
+        z-index: 44;
+        background: rgba(0, 0, 0, 0.3);
+    }
+    .session-panel-overlay.hidden {
+        display: none;
+    }
+}
+
+/* Mobile: session panel as overlay */
+@media (max-width: 768px) {
+    .session-panel {
+        position: fixed;
+        top: 0;
+        left: 0;
+        z-index: 45;
+        width: 220px;
+        box-shadow: 4px 0 24px rgba(0, 0, 0, 0.15);
+    }
+    .dark .session-panel {
+        box-shadow: 4px 0 24px rgba(0, 0, 0, 0.4);
+    }
+}
+
 /* Menu Groups */
 .menu-group-items { max-height: 0; overflow: hidden; transition: max-height 0.25s ease-out; }
-.menu-group.open .menu-group-items { max-height: 500px; transition: max-height 0.35s ease-in; }
+.menu-group.open .menu-group-items { max-height: 2000px; transition: max-height 0.35s ease-in; }
 .menu-group .chevron { transition: transform 0.25s ease; }
 .menu-group.open .chevron { transform: rotate(90deg); }

@@ -45,7 +392,8 @@
 .msg-content h1 { font-size: 1.4em; }
 .msg-content h2 { font-size: 1.25em; }
 .msg-content h3 { font-size: 1.1em; }
-.msg-content ul, .msg-content ol { margin: 0.5em 0; padding-left: 1.8em; }
+.msg-content ul { margin: 0.5em 0; padding-left: 1.8em; list-style: disc; }
+.msg-content ol { margin: 0.5em 0; padding-left: 1.8em; list-style: decimal; }
 .msg-content li { margin: 0.25em 0; }
 .msg-content pre {
    border-radius: 8px; overflow-x: auto; margin: 0.8em 0;
@@ -124,9 +472,8 @@
    cursor: pointer;
    user-select: none;
 }
-.agent-thinking-step .thinking-header.no-toggle { cursor: default; }
-.agent-thinking-step .thinking-header:not(.no-toggle):hover { color: #64748b; }
-.dark .agent-thinking-step .thinking-header:not(.no-toggle):hover { color: #cbd5e1; }
+.agent-thinking-step .thinking-header:hover { color: #64748b; }
+.dark .agent-thinking-step .thinking-header:hover { color: #cbd5e1; }
 .agent-thinking-step .thinking-header i:first-child { font-size: 0.625rem; margin-top: 1px; }
 .agent-thinking-step .thinking-chevron {
    font-size: 0.5rem;
@@ -146,7 +493,7 @@
    font-size: 0.75rem;
    line-height: 1.5;
    color: #94a3b8;
-    max-height: 200px;
+    max-height: 300px;
    overflow-y: auto;
 }
 .dark .agent-thinking-step .thinking-full {
@@ -157,6 +504,41 @@
 .agent-thinking-step .thinking-full p { margin: 0.25em 0; }
 .agent-thinking-step .thinking-full p:first-child { margin-top: 0; }
 .agent-thinking-step .thinking-full p:last-child { margin-bottom: 0; }
+.agent-thinking-step .thinking-duration {
+    font-size: 0.625rem;
+    color: #b0b8c4;
+    margin-bottom: 0.375rem;
+}
+/* Streaming reasoning: render as plain pre to avoid expensive markdown
+   re-parsing on every chunk. Wrap long lines so the bubble width is
+   respected and use the same font size/color as the rendered version. */
+.agent-thinking-step .thinking-stream-pre {
+    margin: 0;
+    padding: 0;
+    background: transparent;
+    border: 0;
+    font-family: inherit;
+    font-size: inherit;
+    line-height: 1.5;
+    color: inherit;
+    white-space: pre-wrap;
+    word-break: break-word;
+    overflow-wrap: anywhere;
+}
+
+/* Content step - real text output frozen before tool calls */
+.agent-content-step {
+    font-size: 0.875rem;
+    line-height: 1.6;
+    color: inherit;
+    margin-bottom: 0.5rem;
+    padding-bottom: 0.5rem;
+    border-bottom: 1px dashed rgba(0, 0, 0, 0.06);
+}
+.dark .agent-content-step { border-bottom-color: rgba(255, 255, 255, 0.06); }
+.agent-content-step .agent-content-body p { margin: 0.25em 0; }
+.agent-content-step .agent-content-body p:first-child { margin-top: 0; }
+.agent-content-step .agent-content-body p:last-child { margin-bottom: 0; }

 /* Tool step - collapsible */
 .agent-tool-step .tool-header {
@@ -224,6 +606,14 @@
 }
 .tool-error-text { color: #f87171; }

+/* Log level highlighting */
+.log-line { display: block; }
+.log-line-debug    { color: #94a3b8; }
+.log-line-info     { background-color: rgba(59, 130, 246, 0.08); }
+.log-line-warning  { background-color: rgba(234, 179, 8, 0.15); color: #fde68a; }
+.log-line-error    { background-color: rgba(239, 68, 68, 0.15); color: #fca5a5; }
+.log-line-critical { background-color: rgba(239, 68, 68, 0.35); color: #ff4444; font-weight: bold; }
+
 /* Tool failed state */
 .agent-tool-step.tool-failed .tool-name { color: #f87171; }

@@ -335,6 +725,58 @@
    background: rgba(74, 190, 110, 0.15);
    color: #74E9A4;
 }
+/* When an item carries a hint (e.g. brand alias next to a technical model
+   id), label/hint are split into two spans so the hint sits on the right in
+   a dim, smaller weight. Without a hint the row stays a plain text node and
+   uses the default ellipsis behaviour, so no layout regressions for old call
+   sites. */
+.cfg-dropdown-label {
+    flex: 1 1 auto;
+    min-width: 0;
+    overflow: hidden;
+    text-overflow: ellipsis;
+}
+.cfg-dropdown-hint {
+    flex-shrink: 0;
+    margin-left: auto;
+    padding-left: 12px;
+    color: #94a3b8;
+    font-size: 12px;
+    font-weight: 400;
+}
+.dark .cfg-dropdown-hint {
+    color: #64748b;
+}
+.cfg-dropdown-item.active .cfg-dropdown-hint {
+    /* Tint the hint toward the brand colour on the active row so it doesn't
+       fight with the highlighted label tone. */
+    color: rgba(34, 133, 71, 0.65);
+}
+.dark .cfg-dropdown-item.active .cfg-dropdown-hint {
+    color: rgba(116, 233, 164, 0.6);
+}
+/* The active row gets a trailing brand-green checkmark via a Font Awesome
+   pseudo-element so every dropdown (chat / vision / image / asr / tts / etc.)
+   surfaces "this is what's currently selected" without per-call JS plumbing.
+   When a hint is present, the ✓ sits to its right with a small gap; without
+   a hint, margin-left:auto pushes the ✓ flush against the right edge. */
+.cfg-dropdown-item.active::after {
+    content: '\f00c';                  /* FontAwesome check glyph */
+    font-family: 'Font Awesome 6 Free', 'Font Awesome 5 Free', 'FontAwesome';
+    font-weight: 900;
+    margin-left: auto;
+    padding-left: 12px;
+    color: #4abe6e;
+    font-size: 11px;
+    flex-shrink: 0;
+}
+.cfg-dropdown-item.active:has(.cfg-dropdown-hint)::after {
+    /* When hint occupies the auto-margin slot, the ✓ no longer benefits
+       from `margin-left: auto`; replace it with a small fixed gap so the
+       ✓ trails the hint cleanly. */
+    margin-left: 0;
+    padding-left: 10px;
+}

 /* API Key masking via CSS (avoids browser password prompts) */
 .cfg-key-masked {
@@ -342,6 +784,77 @@
    text-security: disc;
 }

+/* Provider logo image — vendors flagged as `provider-logo-invert-dark`
+   ship a black wordmark that disappears on the dark canvas; we invert their
+   luminance only in dark mode so the brand stays recognizable without
+   touching multi-color marks like Google/MiniMax. */
+.provider-logo-img {
+    object-fit: contain;
+    object-position: center;
+}
+.dark .provider-logo-invert-dark {
+    filter: invert(1) brightness(1.15);
+}
+
+/* Models page — provider dropdown rows.
+   Configured rows look like ordinary picker entries; the .active row's
+   trailing brand-green ✓ already announces "this is what's selected"
+   (handled globally by .cfg-dropdown-item.active::after above).
+   Unconfigured rows are visually subdued and carry a trailing gear icon
+   as a "click to set up" affordance. */
+.cap-provider-label {
+    flex: 1 1 auto;
+    overflow: hidden;
+    text-overflow: ellipsis;
+}
+.cap-provider-gear {
+    margin-left: auto;
+    padding-left: 12px;
+    color: #94a3b8;
+    font-size: 11px;
+    flex-shrink: 0;
+}
+.cap-provider-item.cap-provider-unconfigured {
+    color: #94a3b8;
+}
+.dark .cap-provider-item.cap-provider-unconfigured {
+    color: #64748b;
+}
+.cap-provider-item.cap-provider-unconfigured:hover {
+    color: #475569;
+}
+.dark .cap-provider-item.cap-provider-unconfigured:hover {
+    color: #cbd5e1;
+}
+.cap-provider-item.cap-provider-unconfigured:hover .cap-provider-gear {
+    color: #475569;
+}
+.dark .cap-provider-item.cap-provider-unconfigured:hover .cap-provider-gear {
+    color: #cbd5e1;
+}
+/* If the active row ever lands on an unconfigured vendor (defensive — the
+   click handler normally diverts to the modal), suppress the global ✓ so
+   the gear remains the sole trailing icon and the row keeps reading as
+   "needs setup" rather than "already selected". */
+.cap-provider-item.cap-provider-unconfigured.active::after {
+    content: none;
+}
+
+/* "Add vendor" modal picker — each configured row carries a static
+   brand-green ✓ via decorateVendorModalPicker so users can see what's set
+   up at a glance. The active row's global ✓ is suppressed here to avoid
+   showing two checks side by side on configured + selected rows. */
+.vendor-picker-item.active::after {
+    content: none;
+}
+.vendor-picker-configured-mark {
+    margin-left: auto;
+    padding-left: 12px;
+    color: #4abe6e;
+    font-size: 11px;
+    flex-shrink: 0;
+}
+
 /* Chat Input */
 #chat-input {
    resize: none; height: 42px; max-height: 180px;
@@ -358,6 +871,46 @@
 }
 .attachment-preview.hidden { display: none; }

+.attach-menu {
+    position: absolute;
+    left: 72px;
+    bottom: calc(100% + 6px);
+    min-width: 148px;
+    padding: 6px;
+    border-radius: 12px;
+    background: #fff;
+    border: 1px solid #e2e8f0;
+    box-shadow: 0 8px 30px -6px rgba(0, 0, 0, 0.1), 0 2px 8px -2px rgba(0, 0, 0, 0.04);
+    z-index: 55;
+    animation: slashMenuIn 0.15s ease-out;
+}
+.attach-menu.hidden { display: none; }
+.attach-menu-item {
+    width: 100%;
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    padding: 8px 10px;
+    border: none;
+    border-radius: 8px;
+    background: transparent;
+    color: #334155;
+    font-size: 13px;
+    cursor: pointer;
+    transition: background 0.12s ease, color 0.12s ease;
+    text-align: left;
+}
+.attach-menu-item:hover {
+    background: #EDFDF3;
+    color: #228547;
+}
+.attach-menu-item i {
+    width: 14px;
+    text-align: center;
+    color: #64748b;
+}
+.attach-menu-item:hover i { color: inherit; }
+
 .att-thumb {
    position: relative;
    width: 64px; height: 64px;
@@ -535,3 +1088,282 @@
 .dark .slash-menu-item .desc {
    color: #64748b;
 }
+
+.dark .attach-menu {
+    background: #1A1A1A;
+    border-color: rgba(255, 255, 255, 0.1);
+    box-shadow: 0 8px 30px -6px rgba(0, 0, 0, 0.35), 0 2px 8px -2px rgba(0, 0, 0, 0.15);
+}
+.dark .attach-menu-item {
+    color: #e2e8f0;
+}
+.dark .attach-menu-item i {
+    color: #94a3b8;
+}
+.dark .attach-menu-item:hover {
+    background: rgba(74, 190, 110, 0.1);
+    color: #4ABE6E;
+}
+
+/* ============================================================
+   Knowledge View
+   ============================================================ */
+
+/* Tab toggle */
+.knowledge-tab, .memory-tab {
+    color: #64748b;
+}
+.knowledge-tab.active, .memory-tab.active {
+    background: #fff;
+    color: #334155;
+    box-shadow: 0 1px 3px rgba(0,0,0,0.08);
+}
+.dark .knowledge-tab.active, .dark .memory-tab.active {
+    background: rgba(255,255,255,0.1);
+    color: #e2e8f0;
+}
+
+/* File tree */
+.knowledge-tree-group {
+    margin-bottom: 2px;
+}
+.knowledge-tree-group-btn {
+    display: flex;
+    align-items: center;
+    gap: 6px;
+    width: 100%;
+    padding: 6px 8px;
+    border-radius: 6px;
+    font-size: 12px;
+    font-weight: 600;
+    color: #64748b;
+    cursor: pointer;
+    border: none;
+    background: none;
+    transition: background 0.15s, color 0.15s;
+    text-transform: capitalize;
+}
+.knowledge-tree-group-btn:hover {
+    background: rgba(0,0,0,0.04);
+    color: #334155;
+}
+.dark .knowledge-tree-group-btn:hover {
+    background: rgba(255,255,255,0.06);
+    color: #e2e8f0;
+}
+.knowledge-tree-group-btn i.chevron {
+    font-size: 8px;
+    transition: transform 0.15s;
+}
+.knowledge-tree-group.open > .knowledge-tree-group-btn .chevron {
+    transform: rotate(90deg);
+}
+.knowledge-tree-group-items {
+    display: none;
+}
+.knowledge-tree-group.open > .knowledge-tree-group-items {
+    display: block;
+}
+
+.knowledge-tree-file {
+    display: flex;
+    align-items: center;
+    gap: 6px;
+    padding: 5px 8px 5px 24px;
+    border-radius: 6px;
+    font-size: 12px;
+    color: #64748b;
+    cursor: pointer;
+    border: none;
+    background: none;
+    width: 100%;
+    text-align: left;
+    transition: background 0.15s, color 0.15s;
+    white-space: nowrap;
+    overflow: hidden;
+    text-overflow: ellipsis;
+}
+.knowledge-tree-file:hover {
+    background: rgba(0,0,0,0.04);
+    color: #334155;
+}
+.knowledge-tree-file.active {
+    background: #EDFDF3;
+    color: #228547;
+}
+.dark .knowledge-tree-file:hover {
+    background: rgba(255,255,255,0.06);
+    color: #e2e8f0;
+}
+.dark .knowledge-tree-file.active {
+    background: rgba(74, 190, 110, 0.1);
+    color: #4ABE6E;
+}
+
+/* Graph legend */
+.knowledge-graph-legend {
+    position: absolute;
+    top: 12px;
+    right: 12px;
+    display: flex;
+    flex-wrap: wrap;
+    gap: 8px;
+    font-size: 11px;
+    color: #64748b;
+    z-index: 10;
+}
+.knowledge-graph-legend-item {
+    display: flex;
+    align-items: center;
+    gap: 4px;
+}
+.knowledge-graph-legend-dot {
+    width: 8px;
+    height: 8px;
+    border-radius: 50%;
+}
+
+/* Graph tooltip */
+.knowledge-graph-tooltip {
+    position: absolute;
+    padding: 6px 10px;
+    background: #fff;
+    border: 1px solid #e2e8f0;
+    border-radius: 8px;
+    font-size: 12px;
+    color: #334155;
+    box-shadow: 0 4px 12px rgba(0,0,0,0.08);
+    pointer-events: none;
+    opacity: 0;
+    transition: opacity 0.15s;
+    z-index: 20;
+}
+.dark .knowledge-graph-tooltip {
+    background: #1A1A1A;
+    border-color: rgba(255,255,255,0.1);
+    color: #e2e8f0;
+}
+
+/* Config field tooltip */
+.cfg-tip {
+    position: relative;
+    display: inline-flex;
+    align-items: center;
+    color: #94a3b8;
+    cursor: help;
+    font-size: 12px;
+}
+.cfg-tip:hover { color: #64748b; }
+.dark .cfg-tip:hover { color: #cbd5e1; }
+/* Floating tooltip portal — appended to <body> by JS so it isn't clipped
+   by overflow:hidden ancestors. */
+.cfg-tip-floating {
+    position: fixed;
+    padding: 6px 10px;
+    border-radius: 8px;
+    font-size: 12px;
+    font-weight: 400;
+    line-height: 1.4;
+    white-space: nowrap;
+    background: #1e293b;
+    color: #e2e8f0;
+    box-shadow: 0 4px 12px rgba(0,0,0,0.15);
+    opacity: 0;
+    pointer-events: none;
+    transition: opacity 0.15s;
+    z-index: 9999;
+}
+.dark .cfg-tip-floating {
+    background: #334155;
+    color: #f1f5f9;
+}
+.cfg-tip-floating.show {
+    opacity: 1;
+}
+
+/* Example cards: equal height via flex stretch + fixed 2-line description area */
+.example-card {
+    display: flex;
+    flex-direction: column;
+}
+.example-card > p {
+    flex: 1;
+    display: -webkit-box;
+    -webkit-line-clamp: 2;
+    -webkit-box-orient: vertical;
+    overflow: hidden;
+    min-height: 2.5em;  /* ~2 lines at text-sm leading-relaxed */
+}
+
+/* --------------------------------------------------------------------
+ * Voice pill — compact custom audio player used by mic uploads and TTS
+ * replies. Replaces the bulky native <audio controls> with a play/pause
+ * icon + thin progress bar + duration counter so it blends into chat
+ * bubbles without the chrome-grey browser default look.
+ * ------------------------------------------------------------------ */
+.voice-pill {
+    display: inline-flex;
+    align-items: center;
+    gap: 8px;
+    padding: 6px 10px;
+    border-radius: 999px;
+    background: rgba(15, 23, 42, 0.05);
+    color: rgb(71, 85, 105);
+    font-size: 12px;
+    line-height: 1;
+    max-width: 240px;
+    user-select: none;
+    cursor: default;
+}
+.dark .voice-pill {
+    background: rgba(255, 255, 255, 0.08);
+    color: rgb(203, 213, 225);
+}
+.voice-pill[data-loading="1"] {
+    opacity: 0.65;
+}
+.voice-pill-btn {
+    width: 22px;
+    height: 22px;
+    border-radius: 999px;
+    display: inline-flex;
+    align-items: center;
+    justify-content: center;
+    background: var(--color-primary-500, #2563eb);
+    color: #fff;
+    flex-shrink: 0;
+    cursor: pointer;
+    transition: transform 0.1s ease;
+}
+.voice-pill-btn:hover { transform: scale(1.05); }
+.voice-pill-btn i { font-size: 9px; margin-left: 1px; }
+.voice-pill-btn[data-state="play"] i { margin-left: 2px; }
+.voice-pill-btn[data-state="pause"] i { margin-left: 0; }
+.voice-pill-track {
+    flex: 1;
+    height: 3px;
+    border-radius: 999px;
+    background: rgba(100, 116, 139, 0.25);
+    overflow: hidden;
+    min-width: 70px;
+}
+.dark .voice-pill-track {
+    background: rgba(148, 163, 184, 0.25);
+}
+.voice-pill-fill {
+    height: 100%;
+    width: 0%;
+    background: var(--color-primary-500, #2563eb);
+    border-radius: inherit;
+    transition: width 0.1s linear;
+}
+.voice-pill-time {
+    font-variant-numeric: tabular-nums;
+    font-size: 11px;
+    color: inherit;
+    opacity: 0.75;
+    flex-shrink: 0;
+    min-width: 28px;
+    text-align: right;
+}
+.voice-pill audio { display: none; }
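
Reviewer note on the voice pill styles above: `.voice-pill-fill` starts at `width: 0%` and `.voice-pill-time` uses tabular numerals, which suggests that script code updates both while the hidden `<audio>` element plays. The console.js hunk is not shown in this diff, so the following is only a minimal sketch of what such helpers might look like. The function names `formatVoiceTime` and `voiceFillWidth` are hypothetical and not taken from the project.

```javascript
// Hypothetical helpers; the real console.js implementation is not shown in this diff.

// Format a duration in seconds as m:ss for the .voice-pill-time counter.
function formatVoiceTime(seconds) {
    if (!Number.isFinite(seconds) || seconds < 0) return '0:00';
    const total = Math.floor(seconds);
    const minutes = Math.floor(total / 60);
    const rest = total % 60;
    return minutes + ':' + String(rest).padStart(2, '0');
}

// Compute the CSS width for .voice-pill-fill as a percentage clamped to 0..100.
function voiceFillWidth(currentTime, duration) {
    if (!Number.isFinite(currentTime) || !Number.isFinite(duration) || duration <= 0) {
        return '0%';
    }
    const ratio = Math.min(1, Math.max(0, currentTime / duration));
    return (ratio * 100).toFixed(1) + '%';
}
```

A caller would presumably assign the second result to `fill.style.width` from the audio element's `timeupdate` event. The clamping guards against a non-finite or zero `duration`, which browsers can report before metadata has loaded.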
--- a/channel/web/static/js/console.js
+++ b/channel/web/static/js/console.js
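
Reviewer note: the console.js changes are not shown here, but the `.log-line-debug` through `.log-line-critical` classes added in console.css imply that each log line is tagged with a level-specific class when rendered. A minimal sketch of such a mapping follows. The function name `logLineClass` and the assumption that the level appears as an uppercase token in the line are hypothetical, not confirmed by this diff.

```javascript
// Hypothetical: pick the .log-line-* class from console.css for one log line.
// Assumes the level appears as an uppercase token such as "[INFO]" or "WARNING".
function logLineClass(line) {
    const match = /\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b/.exec(line);
    // Lines without a recognizable level keep only the base class.
    return match ? 'log-line log-line-' + match[1].toLowerCase() : 'log-line';
}
```

Falling back to the bare `log-line` class keeps unrecognized lines readable as block-level rows without applying any highlight.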
--- a/channel/web/static/logos/claudeAPI.svg
+++ b/channel/web/static/logos/claudeAPI.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251656961" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="18432" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M252.8 652.8l167.893333-94.293333 2.773334-8.106667-2.773334-4.48h-8.106666l-28.16-1.706667-96-2.56-83.2-3.413333-80.64-4.266667-20.266667-4.266666L85.333333 504.746667l1.92-12.586667 17.066667-11.52 24.32 2.133333 53.973333 3.626667 81.066667 5.546667 58.666667 3.413333 87.04 9.173333h13.866666l1.92-5.546666-4.693333-3.413334-3.626667-3.413333-83.84-56.746667-90.666666-60.16-47.573334-34.56-25.813333-17.493333-13.013333-16.426667-5.546667-35.84 23.253333-25.813333 31.36 2.133333 7.893334 2.133334 31.786666 24.32 67.84 52.48L401.066667 391.466667l13.013333 10.88 5.12-3.626667 0.64-2.56-5.76-9.813333-48.213333-87.04L314.453333 210.773333l-22.826666-36.693333-5.973334-21.973333a107.861333 107.861333 0 0 1-3.626666-26.026667l26.666666-36.053333L323.413333 85.333333l35.413334 4.693334 14.933333 13.013333 21.973333 50.346667 35.626667 79.36 55.253333 107.733333 16.213334 32 8.746666 29.653333 3.2 9.173334h5.546667v-5.12l4.48-60.8 8.32-74.453334 8.106667-96 2.773333-27.093333 13.44-32.426667 26.666667-17.493333 20.693333 10.026667 17.066667 24.32-2.346667 15.786666-10.24 65.92-19.84 103.253334-13.013333 69.12h7.466666l8.746667-8.746667 34.986667-46.506667 58.666666-73.386666 26.026667-29.226667 30.293333-32.213333 19.413334-15.36h36.693333l27.093333 40.106666-12.16 41.386667-37.76 48-31.36 40.533333-45.013333 60.586667-28.16 48.426667 2.56 3.84 6.613333-0.64 101.546667-21.546667 54.826667-10.026667 65.493333-11.306666 29.653333 13.866666 3.2 14.08-11.733333 28.8-69.973333 17.28-82.133334 16.426667-122.24 29.013333-1.493333 1.066667 1.706667 2.133333 55.04 5.12 23.466666 1.28h57.6l107.306667 7.893334 28.16 18.56 16.853333 22.613333-2.773333 
17.28-43.306667 21.973333-58.24-13.866666-136.106666-32.426667-46.72-11.733333h-6.4v3.84l38.826666 37.973333 71.253334 64.426667 89.173333 82.986666 4.48 20.48-11.52 16.213334-12.16-1.706667-78.506667-58.88-30.293333-26.666667-68.48-57.6h-4.48v5.973334l15.786667 23.04 83.413333 125.226666 4.266667 38.4-5.973334 12.586667-21.546666 7.466667-23.68-4.266667-48.853334-68.48-50.346666-77.226667-40.533334-69.12-4.906666 2.773334-23.893334 258.133333-11.306666 13.226667-26.026667 10.026666-21.546667-16.426666-11.52-26.666667 11.52-52.48 13.866667-68.48 11.306667-54.4 10.24-67.626667 5.973333-22.4-0.426667-1.493333-4.906666 0.64-50.986667 69.973333-77.653333 104.746667-61.44 65.706667-14.72 5.76-25.386667-13.226667 2.346667-23.466667 14.293333-20.906666 84.906667-107.946667 51.2-66.986667 33.066666-38.613333v-5.546667h-2.133333l-225.493333 146.56-40.106667 5.12-17.28-16.213333 2.133333-26.666667 8.106667-8.746666 67.84-46.72h-0.213333l0.853333 0.853333z" fill="#D97757" p-id="18433"></path></svg>
--- a/channel/web/static/logos/custom.svg
+++ b/channel/web/static/logos/custom.svg
@@ -0,0 +1,10 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" width="200" height="200" fill="none" stroke="#475569" stroke-width="1.8" stroke-linecap="round" stroke-linejoin="round">
+  <!-- Horizontal slider tracks -->
+  <line x1="4" y1="7" x2="20" y2="7"/>
+  <line x1="4" y1="12" x2="20" y2="12"/>
+  <line x1="4" y1="17" x2="20" y2="17"/>
+  <!-- Knobs (filled circles) -->
+  <circle cx="9" cy="7"  r="2.2" fill="#475569" stroke="none"/>
+  <circle cx="15" cy="12" r="2.2" fill="#475569" stroke="none"/>
+  <circle cx="7" cy="17"  r="2.2" fill="#475569" stroke="none"/>
+</svg>
--- a/channel/web/static/logos/dashscope.svg
+++ b/channel/web/static/logos/dashscope.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251621200" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="17444" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M1019.364785 620.816931L891.797142 397.807295 946.450846 293.15069a29.097778 29.097778 0 0 0 6.399732-36.393472l-70.184053-126.586684a30.078737 30.078737 0 0 0-24.574968-13.652427H597.4945L539.171949 14.549389a27.348852 27.348852 0 0 0-20.906122-14.549389H380.628607a29.139776 29.139776 0 0 0-24.616967 14.549389v5.545767L225.797108 243.062793H100.919352a29.182775 29.182775 0 0 0-25.513928 13.653427L3.428446 384.11187a32.766624 32.766624 0 0 0 0 29.182775L132.831012 638.096205 74.508461 740.064923a32.766624 32.766624 0 0 0 0 29.05478l66.514207 116.561105a29.905744 29.905744 0 0 0 25.513929 14.505391H427.132654l62.845361 109.222414A30.078737 30.078737 0 0 0 512.762058 1024H660.382859a29.139776 29.139776 0 0 0 24.574968-14.549389l128.463606-224.843558h114.76818a31.91366 31.91366 0 0 0 24.660965-15.444352l66.471208-117.414069a28.158818 28.158818 0 0 0 0-30.9747l0.042999 0.042999z m-161.273228 14.591387L791.57735 512.490479 518.265827 993.964261l-74.748861-122.87484h-273.268525l65.618244-119.205994h139.386147L101.856313 272.244568h143.055993L380.671605 30.121735l68.34913 119.247993-70.184053 122.87484H925.501726l-69.202094 121.936879 137.594222 241.183873H858.134555z" fill="#605BEC" p-id="17445"></path><path d="M499.962596 699.320634l174.371677-274.719464H324.694955z" fill="#605BEC" p-id="17446"></path></svg>
--- a/channel/web/static/logos/deepseek.svg
+++ b/channel/web/static/logos/deepseek.svg
--- a/channel/web/static/logos/doubao.svg
+++ b/channel/web/static/logos/doubao.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779261485522" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="5381" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M958.976 439.808C804.864 336.896 642.56 321.536 642.56 321.536s8.192 235.008-10.752 306.176c-0.512 9.728-11.776 75.264-43.008 157.696-10.752 28.16-24.064 55.296-39.424 81.408-40.96 74.24-89.6 127.488-89.6 127.488 119.808-48.64 205.312-92.672 309.76-175.616 122.88-96.768 229.376-254.464 189.44-378.88z" fill="#37E1BE" p-id="5382"></path><path d="M329.728 395.776c158.208-100.864 308.736-78.848 312.32-74.752 0.512 0.512 1.024 0.512 1.024 0.512 0-14.336-6.656-60.928-13.312-106.496-11.776-60.928-22.528-124.928-23.04-133.632-170.496-139.264-356.864-78.336-448 25.6-61.44 70.144-103.424 169.984-102.4 224.256V762.88c0.512-12.8 1.536-20.48 2.048-20.48 17.92-197.12 271.36-346.624 271.36-346.624z" fill="#A569FF" p-id="5383"></path><path d="M792.064 272.384c-41.984-43.52-87.552-88.576-122.368-125.44-33.28-34.816-59.392-60.928-62.976-65.536 0.512 8.704 11.264 72.704 23.04 133.632 6.656 45.568 12.8 92.672 13.312 106.496 0 0 162.304 15.36 316.416 118.272-0.512 0-83.456-80.384-167.424-167.424zM549.888 866.816c-2.56 1.024-198.656 107.008-292.352-30.72-20.992-30.72-31.744-68.096-33.28-106.496-3.072-74.752 5.12-227.84 105.472-333.824 0 0-253.44 149.504-270.848 346.624-0.512 0.512-2.048 8.192-2.048 20.48-1.024 32.768 4.608 98.304 43.008 155.136 52.224 78.336 193.024 138.752 328.192 85.504l33.28-9.728c-1.024 0.512 47.616-52.224 88.576-126.976z" fill="#1E37FC" p-id="5384"></path></svg>
--- a/channel/web/static/logos/gemini.svg
+++ b/channel/web/static/logos/gemini.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251750646" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="29551" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M214.101333 512c0-32.512 5.546667-63.701333 15.36-92.928L57.173333 290.218667A491.861333 491.861333 0 0 0 4.693333 512c0 79.701333 18.858667 154.88 52.394667 221.610667l172.202667-129.066667A290.56 290.56 0 0 1 214.101333 512" fill="#FBBC05" p-id="29552"></path><path d="M516.693333 216.192c72.106667 0 137.258667 25.002667 188.458667 65.962667L854.101333 136.533333C763.349333 59.178667 646.997333 11.392 516.693333 11.392c-202.325333 0-376.234667 113.28-459.52 278.826667l172.373334 128.853333c39.68-118.016 152.832-202.88 287.146666-202.88" fill="#EA4335" p-id="29553"></path><path d="M516.693333 807.808c-134.357333 0-247.509333-84.864-287.232-202.88l-172.288 128.853333c83.242667 165.546667 257.152 278.826667 459.52 278.826667 124.842667 0 244.053333-43.392 333.568-124.757333l-163.584-123.818667c-46.122667 28.458667-104.234667 43.776-170.026666 43.776" fill="#34A853" p-id="29554"></path><path d="M1005.397333 512c0-29.568-4.693333-61.44-11.648-91.008H516.650667V614.4h274.602666c-13.696 65.962667-51.072 116.650667-104.533333 149.632l163.541333 123.818667c93.994667-85.418667 155.136-212.650667 155.136-375.850667" fill="#4285F4" p-id="29555"></path></svg>
--- a/channel/web/static/logos/linkai.svg
+++ b/channel/web/static/logos/linkai.svg
--- a/channel/web/static/logos/minimax.svg
+++ b/channel/web/static/logos/minimax.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251514432" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="11888" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M415.392 475.808v329.984c-22.304 111.744-170.56 82.944-171.2 1.92-0.672-101.824 0-202.976 0-304.064v-117.184c0-14.656-3.2-26.24-16-35.392-24.96-18.72-54.944 3.264-55.584 30.208-1.408 36.16-0.704 71.616-1.408 107.264 0 28.16 0 55.52 0.64 83.648-18.368 123.776-168.32 103.232-171.808 0.704V487.04c0-28.032 54.944-34.624 52.256 7.36-1.792 20.8-0.64 42.272-1.344 62.912-0.64 36.8 55.648 61.6 68.896 1.408 0.64-49.632 0.64-99.264 0.64-149.344 0-62.752 17.824-113.856 84.352-118.624 28.8-2.56 47.968 9.504 66.336 30.304 7.04 7.36 23.68 30.72 24.32 56.16 0 23.456 0.64 46.752 0.64 70.464 0 46.72-0.64 93.76-0.64 140.48 0 30.304 0.64 60.256 0.64 89.856 0 37.536 0 75.552-0.64 113.152-0.64 48.864 58.816 48.16 68.352-0.768 0-57.632 0.64-114.56 0.64-172.192 0-141.984-0.64-283.968-0.64-425.856 0-14.72-2.048-55.584 5.76-70.464 41.504-101.12 167.392-56.96 168.544 26.72 2.432 171.52 0 344.896 0.64 516.8 0 59.616-48.416 46.816-51.104 23.488 0-178.88 0-358.4 0.64-537.024-2.368-44.832-68.832-38.72-72.672-6.592-1.28 36.864-0.64 74.4-1.28 111.232v219.008h0.64l0.448 0.256h-0.064z" fill="#D4367A" p-id="11889"></path><path d="M610.016 473.184v242.336V143.648c21.632-112.512 169.824-83.264 170.464-2.176 0.704 101.12 0 202.912 0.704 304 0 38.784 0 77.728-0.64 116.544 0 15.36 3.776 26.176 16.64 36.032 24.32 18.24 54.24-3.2 55.584-30.592 1.344-35.488 0.64-70.976 0.64-107.328V376.96c18.56-123.776 168.128-103.232 171.264-0.704v310.592c0 28.16-54.304 34.848-51.872-7.296 1.472-21.44 0-267.104 0.768-288.64 1.28-36.16-55.712-61.664-68.928-0.768v148.576c0 63.68-17.856 113.92-84.96 119.36-63.264 1.504-88.704-42.24-90.752-86.432V271.328c0-38.24 0-75.552 0.64-113.088 
0.64-48.864-58.784-48.864-68.896 0.704V831.36c0 14.592 2.048 55.52-5.184 70.432-41.44 101.056-168 56.864-169.152-26.752v-79.616c3.136-53.6 48.416-40.864 50.464-18.176v94.464c2.432 44.928 68.928 39.488 72.064 6.656 1.344-36.896 1.344-73.728 1.344-111.296v-293.824h-0.192v-0.064z" fill="#ED6D48" p-id="11890"></path></svg>
--- a/channel/web/static/logos/moonshot.svg
+++ b/channel/web/static/logos/moonshot.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251592968" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="16416" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M117.9648 684.6464l342.30272 93.57312v75.34592l209.7152 58.5728A428.99456 428.99456 0 0 1 512 942.08c-176.128 0-327.53664-105.8816-394.0352-257.4336zM83.29216 477.42976l407.30624 112.64-9.6256 37.00736-6.0416 35.0208 383.3856 104.96a432.5376 432.5376 0 0 1-65.10592 70.32832l-688.18944-185.9584A429.4656 429.4656 0 0 1 81.92 512c0-11.63264 0.47104-23.1424 1.37216-34.54976z m57.344-182.4768l429.07648 114.21696a279.94112 279.94112 0 0 0-23.06048 35.55328 201.17504 201.17504 0 0 0-14.70464 34.93888l403.08736 110.26432a426.8032 426.8032 0 0 1-23.552 81.7152L86.54848 448.7168a427.25376 427.25376 0 0 1 54.0672-153.76384z m158.47424-156.75392l404.23424 108.31872a190.2592 190.2592 0 0 0-32.80896 24.90368c-9.13408 8.8064-19.8656 21.4016-32.1536 37.74464l285.24544 77.78304c9.216 30.45376 15.03232 61.8496 17.32608 93.5936L156.61056 269.68064a432.27136 432.27136 0 0 1 142.49984-131.4816zM512 81.92c142.90944 0 269.55776 69.71392 347.7504 176.98816L337.26464 118.90688A428.50304 428.50304 0 0 1 512 81.92z" fill="#000000" p-id="16417"></path></svg>
--- a/channel/web/static/logos/openai.svg
+++ b/channel/web/static/logos/openai.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251225589" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="9015" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M881.664 431.488a218.88 218.88 0 0 0-18.176-177.088A218.624 218.624 0 0 0 628.992 149.76c-40.576-45.824-100.288-71.424-162.176-71.424a219.136 219.136 0 0 0-208 150.4 215.68 215.68 0 0 0-144 104.512 218.944 218.944 0 0 0 26.688 254.912 218.752 218.752 0 0 0 19.2 177.152 217.088 217.088 0 0 0 234.624 104.512 219.136 219.136 0 0 0 162.112 72.512 219.136 219.136 0 0 0 208-150.4 215.68 215.68 0 0 0 144-104.512 219.008 219.008 0 0 0-27.712-256z m-324.288 454.4a158.08 158.08 0 0 1-103.424-37.376c1.088-1.088 4.288-2.176 5.376-3.2l171.712-99.2a28.16 28.16 0 0 0 13.824-24.512V479.488l72.576 41.6c1.024 0 1.024 1.024 1.024 2.112v200.512a160.512 160.512 0 0 1-161.088 162.112z m-347.712-148.288c-19.2-33.088-25.6-71.488-19.2-108.8 1.088 1.024 3.2 2.176 5.376 3.2l171.712 99.2a25.984 25.984 0 0 0 27.712 0l210.112-121.6v84.224c0 1.152 0 2.176-1.024 2.176L430.464 796.16c-76.8 44.8-176 18.176-220.8-58.624z m-44.736-375.424c19.2-32.64 48.896-57.856 84.224-71.488v204.8c0 9.6 5.376 19.2 13.888 24.512l210.176 121.6-72.576 41.6c-1.024 0-2.112 1.088-2.112 0L224.64 582.912a160.448 160.448 0 0 1-59.776-220.8h0.064z m597.312 138.688l-210.112-121.6 72.512-41.6c1.088 0 2.176-1.088 2.176 0l173.824 100.224a161.088 161.088 0 0 1-25.6 291.2V525.44a26.304 26.304 0 0 0-12.8-24.512z m71.488-108.8a23.232 23.232 0 0 0-5.312-3.2L656.64 289.536a26.048 26.048 0 0 0-27.712 0l-210.176 121.6V326.912c0-1.088 0-2.176 1.088-2.176l173.824-100.224a161.152 161.152 0 0 1 220.8 59.712c19.2 32 25.6 70.4 19.2 107.776z m-454.4 149.248l-72.64-41.6c-1.024 0-1.024-1.088-1.024-2.176V297.088A162.048 162.048 0 0 1 467.84 135.04a158.08 158.08 0 0 1 103.424 37.312 22.848 22.848 0 0 1-5.312 3.2L394.24 
274.688a28.16 28.16 0 0 0-13.888 24.512v242.112h-1.088z m39.424-85.312l93.824-54.4 93.888 54.4v107.712l-93.888 54.4-93.824-54.4V456z" fill="#000000" p-id="9016"></path></svg>
--- a/channel/web/static/logos/qianfan.svg
+++ b/channel/web/static/logos/qianfan.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251568791" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="14450" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M96.20121136 636.3124965c-0.1472897-113.41305959-0.29457937-226.8261192-0.29457937-340.23917879 0-14.87625845 7.65906378-26.51214381 20.4732666-34.02391789 45.51251353-26.65943349 91.02502705-53.31886698 136.83211997-79.53643141 71.1409192-40.94653321 142.42912809-81.59848704 213.71733698-122.39773055 7.36448439-4.12411126 14.58167909-8.3955122 21.50429441-13.2560719 19.44223878-13.40336159 39.03176725-16.05457598 60.09419263-3.53495252 27.39588193 16.34915535 54.93905355 32.25644163 82.48222516 48.16372793 88.0792333 50.96223197 176.30575629 101.77717426 264.38498958 152.59211653 9.86840908 5.74429781 19.88410785 11.19401627 29.60522725 17.0856038 14.13981003 8.54280189 21.50429441 21.06242535 21.50429443 37.70616007 0 147.73155685 0.29457937 295.46311371-0.1472897 443.19467057 0 15.46541722-7.2171947 28.57419943-21.7988738 36.96971163-34.7603663 20.17868721-70.55176044 38.88447758-104.57567833 59.94690293-48.90017634 30.19438599-100.00969801 56.11737105-148.76258466 86.60633642-29.01606849 18.11663161-59.50503387 34.02391789-89.11026112 50.96223197-13.10878221 7.51177407-26.07027474 15.17083783-39.03176726 22.9771913-13.84523065 8.3955122-27.83775099 8.83738127-41.97756102 0.73644843-56.41195043-32.55102101-112.82390085-65.10204201-169.38314098-97.653063-61.86166887-35.64410444-123.72333775-71.1409192-185.4377169-106.78502365-11.19401627-6.48074626-22.24074286-12.81420285-32.99289009-19.88410785-11.48859565-7.65906378-17.08560379-19.14765941-17.08560378-32.69831069-0.1472897-34.7603663 0.1472897-69.52073264 0.29457938-104.28109895 1.62018657-0.58915875 1.62018657-1.62018657-0.29457938-2.65121438z 
m356.58833414-225.500512c2.20934532-1.76747625 4.41869063-3.68224221 6.77532565-5.15513907 68.93157389-39.62092601 137.86314777-79.24185204 206.94201135-118.86277807 2.79850407-1.62018657 6.48074626-1.62018657 6.62803594-6.18616688 0.1472897-4.8605597-4.12411126-4.71327001-6.77532564-6.18616688-40.65195383-23.56635005-81.59848704-46.83812071-122.10315117-70.84633984-16.79102442-10.01569877-32.84560039-8.54280189-48.45830728 0.58915876-45.9543826 26.51214381-91.46689612 53.61344636-137.27398903 80.42016953-31.96186226 18.70579035-64.21830387 37.11700133-96.32745581 55.67550198-18.41121097 10.60485751-27.54317163 25.33382629-27.24859225 47.72185885 0.88373813 89.55213018 0.58915875 179.10426036 0.14728969 268.65639053-0.1472897 20.17868721 9.27925033 33.58204881 25.33382629 43.15587853 31.3727035 18.70579035 63.18727606 37.11700133 95.14913832 54.93905355 10.89943689 6.03887719 21.06242535 13.99252034 35.79139414 18.41121096V505.51925374c6.48074626 19.58952848 18.55850066 34.02391789 36.67513226 44.6287754 27.83775099 16.20186565 63.18727606 12.51962347 86.31175705-10.45756784 26.95401286-26.65943349 28.72148912-62.89269668 12.81420282-90.14128893-16.34915535-28.42690974-43.59774757-37.55887038-74.38129233-38.73718787z m82.48222517 429.64401928c14.28709972-3.82953187 25.92298506-13.99252034 38.88447758-21.35700473 40.94653321-23.27177067 81.30390766-47.72185885 122.54502023-70.55176046 26.95401286-15.02354815 52.87699792-31.66728287 80.71474891-45.21793415 16.79102442-8.10093283 29.60522723-22.53532223 29.60522726-43.4504579 0.1472897-92.939793 0.29457937-185.73229631 0.14728969-278.6720893 0-11.19401627-5.15513907-13.99252034-13.84523067-7.06990501-26.51214381 20.76784598-57.29568854 34.46578693-86.16446735 51.25681135-54.49718448 31.81457257-109.14165865 63.33456576-163.78613282 95.00184862-8.54280189 4.8605597-11.78317502 10.45756784-11.63588535 20.47326662 0.29457937 96.18016613 0.1472897 192.50762194 0.1472897 288.68778806-0.29457937 3.5349525-1.47289687 
7.65906378 3.38766282 10.8994369z" fill="#066AF3" p-id="14451"></path><path d="M96.20121136 636.3124965c1.91476594 1.03102783 1.91476594 2.06205563 0 3.09308345v-3.09308345z" fill="#4372E0" p-id="14452"></path><path d="M391.3697457 505.37196405c-5.44971845-44.33419602 13.84523065-74.08671296 61.4197998-94.55997955 30.93083443 1.17831749 58.03213699 10.31027814 74.38129233 38.5898982 15.75999659 27.39588193 14.13981003 63.48185543-12.81420282 90.14128893-23.27177067 22.97719129-58.47400606 26.65943349-86.31175705 10.45756783-18.11663161-10.60485751-30.34167568-25.03924691-36.67513226-44.62877541z" fill="#002A9A" p-id="14453"></path></svg>
--- a/channel/web/static/logos/zhipu.svg
+++ b/channel/web/static/logos/zhipu.svg
@@ -0,0 +1 @@
+<?xml version="1.0" standalone="no"?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg t="1779251419020" class="icon" viewBox="0 0 1024 1024" version="1.1" xmlns="http://www.w3.org/2000/svg" p-id="10062" xmlns:xlink="http://www.w3.org/1999/xlink" width="200" height="200"><path d="M520.063496 0v77.563152c0 269.231173-144.758953 414.054122-434.212862 434.340854L86.106618 511.968002H76.827198V255.984001l443.236298-255.984001z" fill="#5B55F6" p-id="10063"></path><path d="M520.063496 1023.936004v-77.563152c0-269.231173-144.758953-414.054122-434.212862-434.340854L86.042622 511.968002H76.827198v255.984001l443.236298 255.984001z" fill="#376AF3" p-id="10064"></path><path d="M520.063496 0v77.563152c0 269.231173 144.758953 414.054122 434.276858 434.340854L954.08437 511.968002h9.215424V255.984001L520.063496 0z" fill="#5B55F6" p-id="10065"></path><path d="M520.063496 1023.936004v-77.563152c0-269.231173 144.758953-414.054122 434.276858-434.340854L954.08437 511.968002h9.27942v255.984001l-443.236298 255.984001z" fill="#376AF3" p-id="10066"></path></svg>
--- a/channel/web/static/vendor/README.md
+++ b/channel/web/static/vendor/README.md
@@ -0,0 +1,41 @@
+# Vendor assets
+
+Third-party frontend assets bundled locally so the Web Console can run in
+fully offline / air-gapped environments (no requests to cloudflare, jsdelivr,
+googleapis, gstatic, etc.).
+
+All files here are vendored copies of upstream releases, except the hand-written
+`fonts/inter/inter.css`. Do not edit vendored files by hand; re-download them from the official source when upgrading.
+
+## Manifest
+
+| Path                                                | Source                                                                                            | Version |
+| --------------------------------------------------- | ------------------------------------------------------------------------------------------------- | ------- |
+| `fontawesome/css/all.min.css`                       | https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css                         | 6.4.0   |
+| `fontawesome/webfonts/fa-{brands,regular,solid,v4compatibility}-*.woff2` | https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/webfonts/              | 6.4.0   |
+| `fonts/inter/inter-latin.woff2`                     | https://fonts.gstatic.com/s/inter/v20/UcC73FwrK3iLTeHuS_nVMrMxCp50SjIa1ZL7.woff2                  | v20     |
+| `fonts/inter/inter.css`                             | Hand-written `@font-face` declaration that maps Inter weights 300-700 to the local woff2          | -       |
+| `tailwind/tailwind.min.js`                          | https://cdn.tailwindcss.com (Play CDN runtime, JIT engine for the browser)                        | latest  |
+| `markdown-it/markdown-it.min.js`                    | https://cdn.jsdelivr.net/npm/markdown-it@13.0.1/dist/markdown-it.min.js                           | 13.0.1  |
+| `highlightjs/highlight.min.js`                      | https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/highlight.min.js                       | 11.9.0  |
+| `highlightjs/styles/github{,-dark}.min.css`         | https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/styles/                                | 11.9.0  |
+| `highlightjs/languages/{python,javascript,java,go,bash}.min.js` | https://cdnjs.cloudflare.com/ajax/libs/highlight.js/11.9.0/languages/                  | 11.9.0  |
+| `d3/d3.min.js`                                      | https://cdn.jsdelivr.net/npm/d3@7/dist/d3.min.js (loaded lazily for the knowledge graph view)     | 7.x     |
+
+Notes:
+
+- Only the latin subset of the Inter font is shipped (CJK characters fall back to
+  the system sans-serif via the font-family chain in `tailwind.config`).
+- Only `woff2` font files are shipped (no `ttf` fallback). woff2 is supported
+  by every mainstream browser version released since roughly 2014-2018 (Chrome
+  36+, Firefox 39+, Safari 12+, Edge, Opera 26+). The only mainstream browser
+  that lacks woff2 support is IE 11, which cannot run the rest of the console
+  anyway. `all.min.css` still lists the ttf paths as a later `src:` fallback;
+  a browser requests them only if the woff2 fails to load, so they are harmless.
+- `tailwind.min.js` is the official Tailwind Play CDN build (an in-browser JIT
+  engine). It must be served as JS to keep the existing `tailwind.config = {}`
+  customization working.
+- One external script remains in `channel/web/static/js/console.js`:
+  `wwcdn.weixin.qq.com/.../wecom-aibot-sdk` — Tencent requires the WeCom Bot
+  SDK to be loaded from their CDN, and it is only fetched when the user opens
+  the WeCom Bot QR-login flow.
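Reviewer sketch (not part of this PR): one way the offline claim in the README above could be spot-checked. The helper name `find_external_hosts` and the `BLOCKED_HOSTS` tuple are hypothetical; the host names themselves are taken from the README text and its manifest. The WeCom CDN is deliberately left out because the README says that script remains external.

```python
import re

# CDN hosts the README and manifest name as upstream sources that the
# console should no longer contact at runtime. (Assumption: an exact
# host-name match is enough for a quick audit.)
BLOCKED_HOSTS = (
    "cdnjs.cloudflare.com",
    "cdn.jsdelivr.net",
    "fonts.googleapis.com",
    "fonts.gstatic.com",
    "cdn.tailwindcss.com",
)

# Matches absolute and protocol-relative URLs and captures the host part.
URL_RE = re.compile(r"(?:https?:)?//([A-Za-z0-9.-]+)")


def find_external_hosts(text: str) -> list[str]:
    """Return the blocked CDN hosts referenced in `text`, sorted and de-duplicated."""
    found = {host.lower() for host in URL_RE.findall(text)}
    return sorted(host for host in found if host in BLOCKED_HOSTS)
```

Running this over the console's HTML, JS and CSS sources and expecting an empty list would be a cheap regression check. It only inspects literal URLs, so hosts assembled at runtime would not be caught.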
--- a/channel/web/static/vendor/d3/d3.min.js
+++ b/channel/web/static/vendor/d3/d3.min.js
--- a/channel/web/static/vendor/fontawesome/css/all.min.css
+++ b/channel/web/static/vendor/fontawesome/css/all.min.css
--- a/channel/web/static/vendor/fontawesome/webfonts/fa-brands-400.woff2
+++ b/channel/web/static/vendor/fontawesome/webfonts/fa-brands-400.woff2
--- a/channel/web/static/vendor/fontawesome/webfonts/fa-regular-400.woff2
+++ b/channel/web/static/vendor/fontawesome/webfonts/fa-regular-400.woff2
--- a/channel/web/static/vendor/fontawesome/webfonts/fa-solid-900.woff2
+++ b/channel/web/static/vendor/fontawesome/webfonts/fa-solid-900.woff2
--- a/channel/web/static/vendor/fontawesome/webfonts/fa-v4compatibility.woff2
+++ b/channel/web/static/vendor/fontawesome/webfonts/fa-v4compatibility.woff2
--- a/channel/web/static/vendor/fonts/inter/inter-latin.woff2
+++ b/channel/web/static/vendor/fonts/inter/inter-latin.woff2
--- a/channel/web/static/vendor/fonts/inter/inter.css
+++ b/channel/web/static/vendor/fonts/inter/inter.css
@@ -0,0 +1,16 @@
+/* Inter font (latin subset only).
+ * Single variable font woff2 that covers weights 300/400/500/600/700.
+ * Non-latin scripts (CJK, etc.) fall back to system sans-serif via the
+ * font-family chain defined in tailwind.config (Inter, system-ui, ...).
+ * Source: Google Fonts (Inter v20), redistributed locally to avoid runtime
+ * dependency on fonts.googleapis.com / fonts.gstatic.com.
+ */
+
+@font-face {
+    font-family: 'Inter';
+    font-style: normal;
+    font-weight: 300 700;
+    font-display: swap;
+    src: url('./inter-latin.woff2') format('woff2');
+    unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, U+02DC, U+0304, U+0308, U+0329, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
+}
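Reviewer sketch (not part of this PR): the `unicode-range` above is what limits this face to latin text, so CJK code points fall through to the next family in the chain. The helpers `parse_unicode_range` and `is_covered` are hypothetical names used only to illustrate that check. Wildcard ranges such as `U+4??` are not handled because this declaration does not use them.

```python
def parse_unicode_range(spec: str) -> list[tuple[int, int]]:
    """Parse a CSS unicode-range value into inclusive (start, end) code point pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        body = part[2:]  # drop the leading "U+"
        start, _, end = body.partition("-")
        lo = int(start, 16)
        hi = int(end, 16) if end else lo  # a single code point is a one-element range
        ranges.append((lo, hi))
    return ranges


def is_covered(ch: str, ranges: list[tuple[int, int]]) -> bool:
    """Return True if the character's code point falls inside any range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


# The unicode-range value copied from the @font-face rule in inter.css.
INTER_LATIN = parse_unicode_range(
    "U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, U+02DC, "
    "U+0304, U+0308, U+0329, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, "
    "U+2193, U+2212, U+2215, U+FEFF, U+FFFD"
)
```

With these ranges, `is_covered("A", INTER_LATIN)` and `is_covered("€", INTER_LATIN)` hold, while `is_covered("中", INTER_LATIN)` does not, which matches the fallback behaviour described in the CSS comment.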
--- a/channel/web/static/vendor/highlightjs/highlight.min.js
+++ b/channel/web/static/vendor/highlightjs/highlight.min.js
--- a/channel/web/static/vendor/highlightjs/languages/bash.min.js
+++ b/channel/web/static/vendor/highlightjs/languages/bash.min.js
@@ -0,0 +1,20 @@
+/*! `bash` grammar compiled for Highlight.js 11.9.0 */
+(()=>{var e=(()=>{"use strict";return e=>{const s=e.regex,t={},n={begin:/\$\{/,
+end:/\}/,contains:["self",{begin:/:-/,contains:[t]}]};Object.assign(t,{
+className:"variable",variants:[{
+begin:s.concat(/\$[\w\d#@][\w\d_]*/,"(?![\\w\\d])(?![$])")},n]});const a={
+className:"subst",begin:/\$\(/,end:/\)/,contains:[e.BACKSLASH_ESCAPE]},i={
+begin:/<<-?\s*(?=\w+)/,starts:{contains:[e.END_SAME_AS_BEGIN({begin:/(\w+)/,
+end:/(\w+)/,className:"string"})]}},c={className:"string",begin:/"/,end:/"/,
+contains:[e.BACKSLASH_ESCAPE,t,a]};a.contains.push(c);const o={begin:/\$?\(\(/,
+end:/\)\)/,contains:[{begin:/\d+#[0-9a-f]+/,className:"number"},e.NUMBER_MODE,t]
+},r=e.SHEBANG({binary:"(fish|bash|zsh|sh|csh|ksh|tcsh|dash|scsh)",relevance:10
+}),l={className:"function",begin:/\w[\w\d_]*\s*\(\s*\)\s*\{/,returnBegin:!0,
+contains:[e.inherit(e.TITLE_MODE,{begin:/\w[\w\d_]*/})],relevance:0};return{
+name:"Bash",aliases:["sh"],keywords:{$pattern:/\b[a-z][a-z0-9._-]+\b/,
+keyword:["if","then","else","elif","fi","for","while","until","in","do","done","case","esac","function","select"],
+literal:["true","false"],
+built_in:["break","cd","continue","eval","exec","exit","export","getopts","hash","pwd","readonly","return","shift","test","times","trap","umask","unset","alias","bind","builtin","caller","command","declare","echo","enable","help","let","local","logout","mapfile","printf","read","readarray","source","type","typeset","ulimit","unalias","set","shopt","autoload","bg","bindkey","bye","cap","chdir","clone","comparguments","compcall","compctl","compdescribe","compfiles","compgroups","compquote","comptags","comptry","compvalues","dirs","disable","disown","echotc","echoti","emulate","fc","fg","float","functions","getcap","getln","history","integer","jobs","kill","limit","log","noglob","popd","print","pushd","pushln","rehash","sched","setcap","setopt","stat","suspend","ttyctl","unfunction","unhash","unlimit","unsetopt","vared","wait","whence","where","which","zcompile","zformat","zftp","zle","zmodload","zparseopts","zprof","zpty","zregexparse","zsocket","zstyle","ztcp","chcon","chgrp","chown","chmod","cp","dd","df","dir","dircolors","ln","ls","mkdir","mkfifo","mknod","mktemp","mv","realpath","rm","rmdir","shred","sync","touch","truncate","vdir","b2sum","base32","base64","cat","cksum","comm","csplit","cut","expand","fmt","fold","head","join","md5sum","nl","numfmt","od","paste","ptx","pr","sha1sum","sha224sum","sha256sum","sha384sum","sha512sum","shuf","sort","split","sum","tac","tail","tr","tsort","unexpand","uniq","wc","arch","basename","chroot","date","dirname","du","echo","env","expr","factor","groups","hostid","id","link","logname","nice","nohup","nproc","pathchk","pinky","printenv","printf","pwd","readlink","runcon","seq","sleep","stat","stdbuf","stty","tee","test","timeout","tty","uname","unlink","uptime","users","who","whoami","yes"]
+},contains:[r,e.SHEBANG(),l,o,e.HASH_COMMENT_MODE,i,{match:/(\/[a-z._-]+)+/},c,{
+match:/\\"/},{className:"string",begin:/'/,end:/'/},{match:/\\'/},t]}}})()
+;hljs.registerLanguage("bash",e)})();
--- a/channel/web/static/vendor/highlightjs/languages/go.min.js
+++ b/channel/web/static/vendor/highlightjs/languages/go.min.js
@@ -0,0 +1,14 @@
+/*! `go` grammar compiled for Highlight.js 11.9.0 */
+(()=>{var e=(()=>{"use strict";return e=>{const n={
+keyword:["break","case","chan","const","continue","default","defer","else","fallthrough","for","func","go","goto","if","import","interface","map","package","range","return","select","struct","switch","type","var"],
+type:["bool","byte","complex64","complex128","error","float32","float64","int8","int16","int32","int64","string","uint8","uint16","uint32","uint64","int","uint","uintptr","rune"],
+literal:["true","false","iota","nil"],
+built_in:["append","cap","close","complex","copy","imag","len","make","new","panic","print","println","real","recover","delete"]
+};return{name:"Go",aliases:["golang"],keywords:n,illegal:"</",
+contains:[e.C_LINE_COMMENT_MODE,e.C_BLOCK_COMMENT_MODE,{className:"string",
+variants:[e.QUOTE_STRING_MODE,e.APOS_STRING_MODE,{begin:"`",end:"`"}]},{
+className:"number",variants:[{begin:e.C_NUMBER_RE+"[i]",relevance:1
+},e.C_NUMBER_MODE]},{begin:/:=/},{className:"function",beginKeywords:"func",
+end:"\\s*(\\{|$)",excludeEnd:!0,contains:[e.TITLE_MODE,{className:"params",
+begin:/\(/,end:/\)/,endsParent:!0,keywords:n,illegal:/["']/}]}]}}})()
+;hljs.registerLanguage("go",e)})();
--- a/channel/web/static/vendor/highlightjs/languages/java.min.js
+++ b/channel/web/static/vendor/highlightjs/languages/java.min.js
@@ -0,0 +1,38 @@
+/*! `java` grammar compiled for Highlight.js 11.9.0 */
+(()=>{var e=(()=>{"use strict"
+;var e="[0-9](_*[0-9])*",a=`\\.(${e})`,n="[0-9a-fA-F](_*[0-9a-fA-F])*",s={
+className:"number",variants:[{
+begin:`(\\b(${e})((${a})|\\.)?|(${a}))[eE][+-]?(${e})[fFdD]?\\b`},{
+begin:`\\b(${e})((${a})[fFdD]?\\b|\\.([fFdD]\\b)?)`},{begin:`(${a})[fFdD]?\\b`
+},{begin:`\\b(${e})[fFdD]\\b`},{
+begin:`\\b0[xX]((${n})\\.?|(${n})?\\.(${n}))[pP][+-]?(${e})[fFdD]?\\b`},{
+begin:"\\b(0|[1-9](_*[0-9])*)[lL]?\\b"},{begin:`\\b0[xX](${n})[lL]?\\b`},{
+begin:"\\b0(_*[0-7])*[lL]?\\b"},{begin:"\\b0[bB][01](_*[01])*[lL]?\\b"}],
+relevance:0};function t(e,a,n){return-1===n?"":e.replace(a,(s=>t(e,a,n-1)))}
+return e=>{
+const a=e.regex,n="[\xc0-\u02b8a-zA-Z_$][\xc0-\u02b8a-zA-Z_$0-9]*",i=n+t("(?:<"+n+"~~~(?:\\s*,\\s*"+n+"~~~)*>)?",/~~~/g,2),r={
+keyword:["synchronized","abstract","private","var","static","if","const ","for","while","strictfp","finally","protected","import","native","final","void","enum","else","break","transient","catch","instanceof","volatile","case","assert","package","default","public","try","switch","continue","throws","protected","public","private","module","requires","exports","do","sealed","yield","permits"],
+literal:["false","true","null"],
+type:["char","boolean","long","float","int","byte","short","double"],
+built_in:["super","this"]},l={className:"meta",begin:"@"+n,contains:[{
+begin:/\(/,end:/\)/,contains:["self"]}]},c={className:"params",begin:/\(/,
+end:/\)/,keywords:r,relevance:0,contains:[e.C_BLOCK_COMMENT_MODE],endsParent:!0}
+;return{name:"Java",aliases:["jsp"],keywords:r,illegal:/<\/|#/,
+contains:[e.COMMENT("/\\*\\*","\\*/",{relevance:0,contains:[{begin:/\w+@/,
+relevance:0},{className:"doctag",begin:"@[A-Za-z]+"}]}),{
+begin:/import java\.[a-z]+\./,keywords:"import",relevance:2
+},e.C_LINE_COMMENT_MODE,e.C_BLOCK_COMMENT_MODE,{begin:/"""/,end:/"""/,
+className:"string",contains:[e.BACKSLASH_ESCAPE]
+},e.APOS_STRING_MODE,e.QUOTE_STRING_MODE,{
+match:[/\b(?:class|interface|enum|extends|implements|new)/,/\s+/,n],className:{
+1:"keyword",3:"title.class"}},{match:/non-sealed/,scope:"keyword"},{
+begin:[a.concat(/(?!else)/,n),/\s+/,n,/\s+/,/=(?!=)/],className:{1:"type",
+3:"variable",5:"operator"}},{begin:[/record/,/\s+/,n],className:{1:"keyword",
+3:"title.class"},contains:[c,e.C_LINE_COMMENT_MODE,e.C_BLOCK_COMMENT_MODE]},{
+beginKeywords:"new throw return else",relevance:0},{
+begin:["(?:"+i+"\\s+)",e.UNDERSCORE_IDENT_RE,/\s*(?=\()/],className:{
+2:"title.function"},keywords:r,contains:[{className:"params",begin:/\(/,
+end:/\)/,keywords:r,relevance:0,
+contains:[l,e.APOS_STRING_MODE,e.QUOTE_STRING_MODE,s,e.C_BLOCK_COMMENT_MODE]
+},e.C_LINE_COMMENT_MODE,e.C_BLOCK_COMMENT_MODE]},s,l]}}})()
+;hljs.registerLanguage("java",e)})();
--- a/channel/web/static/vendor/highlightjs/languages/javascript.min.js
+++ b/channel/web/static/vendor/highlightjs/languages/javascript.min.js
@@ -0,0 +1,80 @@
+/*! `javascript` grammar compiled for Highlight.js 11.9.0 */
+(()=>{var e=(()=>{"use strict"
+;const e="[A-Za-z$_][0-9A-Za-z$_]*",n=["as","in","of","if","for","while","finally","var","new","function","do","return","void","else","break","catch","instanceof","with","throw","case","default","try","switch","continue","typeof","delete","let","yield","const","class","debugger","async","await","static","import","from","export","extends"],a=["true","false","null","undefined","NaN","Infinity"],t=["Object","Function","Boolean","Symbol","Math","Date","Number","BigInt","String","RegExp","Array","Float32Array","Float64Array","Int8Array","Uint8Array","Uint8ClampedArray","Int16Array","Int32Array","Uint16Array","Uint32Array","BigInt64Array","BigUint64Array","Set","Map","WeakSet","WeakMap","ArrayBuffer","SharedArrayBuffer","Atomics","DataView","JSON","Promise","Generator","GeneratorFunction","AsyncFunction","Reflect","Proxy","Intl","WebAssembly"],s=["Error","EvalError","InternalError","RangeError","ReferenceError","SyntaxError","TypeError","URIError"],r=["setInterval","setTimeout","clearInterval","clearTimeout","require","exports","eval","isFinite","isNaN","parseFloat","parseInt","decodeURI","decodeURIComponent","encodeURI","encodeURIComponent","escape","unescape"],c=["arguments","this","super","console","window","document","localStorage","sessionStorage","module","global"],i=[].concat(r,t,s)
+;return o=>{const l=o.regex,b=e,d={begin:/<[A-Za-z0-9\\._:-]+/,
+end:/\/[A-Za-z0-9\\._:-]+>|\/>/,isTrulyOpeningTag:(e,n)=>{
+const a=e[0].length+e.index,t=e.input[a]
+;if("<"===t||","===t)return void n.ignoreMatch();let s
+;">"===t&&(((e,{after:n})=>{const a="</"+e[0].slice(1)
+;return-1!==e.input.indexOf(a,n)})(e,{after:a})||n.ignoreMatch())
+;const r=e.input.substring(a)
+;((s=r.match(/^\s*=/))||(s=r.match(/^\s+extends\s+/))&&0===s.index)&&n.ignoreMatch()
+}},g={$pattern:e,keyword:n,literal:a,built_in:i,"variable.language":c
+},u="[0-9](_?[0-9])*",m=`\\.(${u})`,E="0|[1-9](_?[0-9])*|0[0-7]*[89][0-9]*",A={
+className:"number",variants:[{
+begin:`(\\b(${E})((${m})|\\.)?|(${m}))[eE][+-]?(${u})\\b`},{
+begin:`\\b(${E})\\b((${m})\\b|\\.)?|(${m})\\b`},{
+begin:"\\b(0|[1-9](_?[0-9])*)n\\b"},{
+begin:"\\b0[xX][0-9a-fA-F](_?[0-9a-fA-F])*n?\\b"},{
+begin:"\\b0[bB][0-1](_?[0-1])*n?\\b"},{begin:"\\b0[oO][0-7](_?[0-7])*n?\\b"},{
+begin:"\\b0[0-7]+n?\\b"}],relevance:0},y={className:"subst",begin:"\\$\\{",
+end:"\\}",keywords:g,contains:[]},h={begin:"html`",end:"",starts:{end:"`",
+returnEnd:!1,contains:[o.BACKSLASH_ESCAPE,y],subLanguage:"xml"}},N={
+begin:"css`",end:"",starts:{end:"`",returnEnd:!1,
+contains:[o.BACKSLASH_ESCAPE,y],subLanguage:"css"}},_={begin:"gql`",end:"",
+starts:{end:"`",returnEnd:!1,contains:[o.BACKSLASH_ESCAPE,y],
+subLanguage:"graphql"}},f={className:"string",begin:"`",end:"`",
+contains:[o.BACKSLASH_ESCAPE,y]},v={className:"comment",
+variants:[o.COMMENT(/\/\*\*(?!\/)/,"\\*/",{relevance:0,contains:[{
+begin:"(?=@[A-Za-z]+)",relevance:0,contains:[{className:"doctag",
+begin:"@[A-Za-z]+"},{className:"type",begin:"\\{",end:"\\}",excludeEnd:!0,
+excludeBegin:!0,relevance:0},{className:"variable",begin:b+"(?=\\s*(-)|$)",
+endsParent:!0,relevance:0},{begin:/(?=[^\n])\s/,relevance:0}]}]
+}),o.C_BLOCK_COMMENT_MODE,o.C_LINE_COMMENT_MODE]
+},p=[o.APOS_STRING_MODE,o.QUOTE_STRING_MODE,h,N,_,f,{match:/\$\d+/},A]
+;y.contains=p.concat({begin:/\{/,end:/\}/,keywords:g,contains:["self"].concat(p)
+});const S=[].concat(v,y.contains),w=S.concat([{begin:/\(/,end:/\)/,keywords:g,
+contains:["self"].concat(S)}]),R={className:"params",begin:/\(/,end:/\)/,
+excludeBegin:!0,excludeEnd:!0,keywords:g,contains:w},O={variants:[{
+match:[/class/,/\s+/,b,/\s+/,/extends/,/\s+/,l.concat(b,"(",l.concat(/\./,b),")*")],
+scope:{1:"keyword",3:"title.class",5:"keyword",7:"title.class.inherited"}},{
+match:[/class/,/\s+/,b],scope:{1:"keyword",3:"title.class"}}]},k={relevance:0,
+match:l.either(/\bJSON/,/\b[A-Z][a-z]+([A-Z][a-z]*|\d)*/,/\b[A-Z]{2,}([A-Z][a-z]+|\d)+([A-Z][a-z]*)*/,/\b[A-Z]{2,}[a-z]+([A-Z][a-z]+|\d)*([A-Z][a-z]*)*/),
+className:"title.class",keywords:{_:[...t,...s]}},I={variants:[{
+match:[/function/,/\s+/,b,/(?=\s*\()/]},{match:[/function/,/\s*(?=\()/]}],
+className:{1:"keyword",3:"title.function"},label:"func.def",contains:[R],
+illegal:/%/},x={
+match:l.concat(/\b/,(T=[...r,"super","import"],l.concat("(?!",T.join("|"),")")),b,l.lookahead(/\(/)),
+className:"title.function",relevance:0};var T;const C={
+begin:l.concat(/\./,l.lookahead(l.concat(b,/(?![0-9A-Za-z$_(])/))),end:b,
+excludeBegin:!0,keywords:"prototype",className:"property",relevance:0},M={
+match:[/get|set/,/\s+/,b,/(?=\()/],className:{1:"keyword",3:"title.function"},
+contains:[{begin:/\(\)/},R]
+},B="(\\([^()]*(\\([^()]*(\\([^()]*\\)[^()]*)*\\)[^()]*)*\\)|"+o.UNDERSCORE_IDENT_RE+")\\s*=>",$={
+match:[/const|var|let/,/\s+/,b,/\s*/,/=\s*/,/(async\s*)?/,l.lookahead(B)],
+keywords:"async",className:{1:"keyword",3:"title.function"},contains:[R]}
+;return{name:"JavaScript",aliases:["js","jsx","mjs","cjs"],keywords:g,exports:{
+PARAMS_CONTAINS:w,CLASS_REFERENCE:k},illegal:/#(?![$_A-z])/,
+contains:[o.SHEBANG({label:"shebang",binary:"node",relevance:5}),{
+label:"use_strict",className:"meta",relevance:10,
+begin:/^\s*['"]use (strict|asm)['"]/
+},o.APOS_STRING_MODE,o.QUOTE_STRING_MODE,h,N,_,f,v,{match:/\$\d+/},A,k,{
+className:"attr",begin:b+l.lookahead(":"),relevance:0},$,{
+begin:"("+o.RE_STARTERS_RE+"|\\b(case|return|throw)\\b)\\s*",
+keywords:"return throw case",relevance:0,contains:[v,o.REGEXP_MODE,{
+className:"function",begin:B,returnBegin:!0,end:"\\s*=>",contains:[{
+className:"params",variants:[{begin:o.UNDERSCORE_IDENT_RE,relevance:0},{
+className:null,begin:/\(\s*\)/,skip:!0},{begin:/\(/,end:/\)/,excludeBegin:!0,
+excludeEnd:!0,keywords:g,contains:w}]}]},{begin:/,/,relevance:0},{match:/\s+/,
+relevance:0},{variants:[{begin:"<>",end:"</>"},{
+match:/<[A-Za-z0-9\\._:-]+\s*\/>/},{begin:d.begin,
+"on:begin":d.isTrulyOpeningTag,end:d.end}],subLanguage:"xml",contains:[{
+begin:d.begin,end:d.end,skip:!0,contains:["self"]}]}]},I,{
+beginKeywords:"while if switch catch for"},{
+begin:"\\b(?!function)"+o.UNDERSCORE_IDENT_RE+"\\([^()]*(\\([^()]*(\\([^()]*\\)[^()]*)*\\)[^()]*)*\\)\\s*\\{",
+returnBegin:!0,label:"func.def",contains:[R,o.inherit(o.TITLE_MODE,{begin:b,
+className:"title.function"})]},{match:/\.\.\./,relevance:0},C,{match:"\\$"+b,
+relevance:0},{match:[/\bconstructor(?=\s*\()/],className:{1:"title.function"},
+contains:[R]},x,{relevance:0,match:/\b[A-Z][A-Z_0-9]+\b/,
+className:"variable.constant"},O,M,{match:/\$[(.]/}]}}})()
+;hljs.registerLanguage("javascript",e)})();
--- a/channel/web/static/vendor/highlightjs/languages/python.min.js
+++ b/channel/web/static/vendor/highlightjs/languages/python.min.js
@@ -0,0 +1,41 @@
+/*! `python` grammar compiled for Highlight.js 11.9.0 */
+(()=>{var e=(()=>{"use strict";return e=>{
+const n=e.regex,a=/[\p{XID_Start}_]\p{XID_Continue}*/u,i=["and","as","assert","async","await","break","case","class","continue","def","del","elif","else","except","finally","for","from","global","if","import","in","is","lambda","match","nonlocal|10","not","or","pass","raise","return","try","while","with","yield"],s={
+$pattern:/[A-Za-z]\w+|__\w+__/,keyword:i,
+built_in:["__import__","abs","all","any","ascii","bin","bool","breakpoint","bytearray","bytes","callable","chr","classmethod","compile","complex","delattr","dict","dir","divmod","enumerate","eval","exec","filter","float","format","frozenset","getattr","globals","hasattr","hash","help","hex","id","input","int","isinstance","issubclass","iter","len","list","locals","map","max","memoryview","min","next","object","oct","open","ord","pow","print","property","range","repr","reversed","round","set","setattr","slice","sorted","staticmethod","str","sum","super","tuple","type","vars","zip"],
+literal:["__debug__","Ellipsis","False","None","NotImplemented","True"],
+type:["Any","Callable","Coroutine","Dict","List","Literal","Generic","Optional","Sequence","Set","Tuple","Type","Union"]
+},t={className:"meta",begin:/^(>>>|\.\.\.) /},r={className:"subst",begin:/\{/,
+end:/\}/,keywords:s,illegal:/#/},l={begin:/\{\{/,relevance:0},b={
+className:"string",contains:[e.BACKSLASH_ESCAPE],variants:[{
+begin:/([uU]|[bB]|[rR]|[bB][rR]|[rR][bB])?'''/,end:/'''/,
+contains:[e.BACKSLASH_ESCAPE,t],relevance:10},{
+begin:/([uU]|[bB]|[rR]|[bB][rR]|[rR][bB])?"""/,end:/"""/,
+contains:[e.BACKSLASH_ESCAPE,t],relevance:10},{
+begin:/([fF][rR]|[rR][fF]|[fF])'''/,end:/'''/,
+contains:[e.BACKSLASH_ESCAPE,t,l,r]},{begin:/([fF][rR]|[rR][fF]|[fF])"""/,
+end:/"""/,contains:[e.BACKSLASH_ESCAPE,t,l,r]},{begin:/([uU]|[rR])'/,end:/'/,
+relevance:10},{begin:/([uU]|[rR])"/,end:/"/,relevance:10},{
+begin:/([bB]|[bB][rR]|[rR][bB])'/,end:/'/},{begin:/([bB]|[bB][rR]|[rR][bB])"/,
+end:/"/},{begin:/([fF][rR]|[rR][fF]|[fF])'/,end:/'/,
+contains:[e.BACKSLASH_ESCAPE,l,r]},{begin:/([fF][rR]|[rR][fF]|[fF])"/,end:/"/,
+contains:[e.BACKSLASH_ESCAPE,l,r]},e.APOS_STRING_MODE,e.QUOTE_STRING_MODE]
+},o="[0-9](_?[0-9])*",c=`(\\b(${o}))?\\.(${o})|\\b(${o})\\.`,d="\\b|"+i.join("|"),g={
+className:"number",relevance:0,variants:[{
+begin:`(\\b(${o})|(${c}))[eE][+-]?(${o})[jJ]?(?=${d})`},{begin:`(${c})[jJ]?`},{
+begin:`\\b([1-9](_?[0-9])*|0+(_?0)*)[lLjJ]?(?=${d})`},{
+begin:`\\b0[bB](_?[01])+[lL]?(?=${d})`},{begin:`\\b0[oO](_?[0-7])+[lL]?(?=${d})`
+},{begin:`\\b0[xX](_?[0-9a-fA-F])+[lL]?(?=${d})`},{begin:`\\b(${o})[jJ](?=${d})`
+}]},p={className:"comment",begin:n.lookahead(/# type:/),end:/$/,keywords:s,
+contains:[{begin:/# type:/},{begin:/#/,end:/\b\B/,endsWithParent:!0}]},m={
+className:"params",variants:[{className:"",begin:/\(\s*\)/,skip:!0},{begin:/\(/,
+end:/\)/,excludeBegin:!0,excludeEnd:!0,keywords:s,
+contains:["self",t,g,b,e.HASH_COMMENT_MODE]}]};return r.contains=[b,g,t],{
+name:"Python",aliases:["py","gyp","ipython"],unicodeRegex:!0,keywords:s,
+illegal:/(<\/|\?)|=>/,contains:[t,g,{begin:/\bself\b/},{beginKeywords:"if",
+relevance:0},b,p,e.HASH_COMMENT_MODE,{match:[/\bdef/,/\s+/,a],scope:{
+1:"keyword",3:"title.function"},contains:[m]},{variants:[{
+match:[/\bclass/,/\s+/,a,/\s*/,/\(\s*/,a,/\s*\)/]},{match:[/\bclass/,/\s+/,a]}],
+scope:{1:"keyword",3:"title.class",6:"title.class.inherited"}},{
+className:"meta",begin:/^[\t ]*@/,end:/(?=#)|$/,contains:[g,m,b]}]}}})()
+;hljs.registerLanguage("python",e)})();
--- a/channel/web/static/vendor/highlightjs/styles/github-dark.min.css
+++ b/channel/web/static/vendor/highlightjs/styles/github-dark.min.css
@@ -0,0 +1,10 @@
+pre code.hljs{display:block;overflow-x:auto;padding:1em}code.hljs{padding:3px 5px}/*!
+  Theme: GitHub Dark
+  Description: Dark theme as seen on github.com
+  Author: github.com
+  Maintainer: @Hirse
+  Updated: 2021-05-15
+
+  Outdated base version: https://github.com/primer/github-syntax-dark
+  Current colors taken from GitHub's CSS
+*/.hljs{color:#c9d1d9;background:#0d1117}.hljs-doctag,.hljs-keyword,.hljs-meta .hljs-keyword,.hljs-template-tag,.hljs-template-variable,.hljs-type,.hljs-variable.language_{color:#ff7b72}.hljs-title,.hljs-title.class_,.hljs-title.class_.inherited__,.hljs-title.function_{color:#d2a8ff}.hljs-attr,.hljs-attribute,.hljs-literal,.hljs-meta,.hljs-number,.hljs-operator,.hljs-selector-attr,.hljs-selector-class,.hljs-selector-id,.hljs-variable{color:#79c0ff}.hljs-meta .hljs-string,.hljs-regexp,.hljs-string{color:#a5d6ff}.hljs-built_in,.hljs-symbol{color:#ffa657}.hljs-code,.hljs-comment,.hljs-formula{color:#8b949e}.hljs-name,.hljs-quote,.hljs-selector-pseudo,.hljs-selector-tag{color:#7ee787}.hljs-subst{color:#c9d1d9}.hljs-section{color:#1f6feb;font-weight:700}.hljs-bullet{color:#f2cc60}.hljs-emphasis{color:#c9d1d9;font-style:italic}.hljs-strong{color:#c9d1d9;font-weight:700}.hljs-addition{color:#aff5b4;background-color:#033a16}.hljs-deletion{color:#ffdcd7;background-color:#67060c}
--- a/channel/web/static/vendor/highlightjs/styles/github.min.css
+++ b/channel/web/static/vendor/highlightjs/styles/github.min.css
@@ -0,0 +1,10 @@
+pre code.hljs{display:block;overflow-x:auto;padding:1em}code.hljs{padding:3px 5px}/*!
+  Theme: GitHub
+  Description: Light theme as seen on github.com
+  Author: github.com
+  Maintainer: @Hirse
+  Updated: 2021-05-15
+
+  Outdated base version: https://github.com/primer/github-syntax-light
+  Current colors taken from GitHub's CSS
+*/.hljs{color:#24292e;background:#fff}.hljs-doctag,.hljs-keyword,.hljs-meta .hljs-keyword,.hljs-template-tag,.hljs-template-variable,.hljs-type,.hljs-variable.language_{color:#d73a49}.hljs-title,.hljs-title.class_,.hljs-title.class_.inherited__,.hljs-title.function_{color:#6f42c1}.hljs-attr,.hljs-attribute,.hljs-literal,.hljs-meta,.hljs-number,.hljs-operator,.hljs-selector-attr,.hljs-selector-class,.hljs-selector-id,.hljs-variable{color:#005cc5}.hljs-meta .hljs-string,.hljs-regexp,.hljs-string{color:#032f62}.hljs-built_in,.hljs-symbol{color:#e36209}.hljs-code,.hljs-comment,.hljs-formula{color:#6a737d}.hljs-name,.hljs-quote,.hljs-selector-pseudo,.hljs-selector-tag{color:#22863a}.hljs-subst{color:#24292e}.hljs-section{color:#005cc5;font-weight:700}.hljs-bullet{color:#735c0f}.hljs-emphasis{color:#24292e;font-style:italic}.hljs-strong{color:#24292e;font-weight:700}.hljs-addition{color:#22863a;background-color:#f0fff4}.hljs-deletion{color:#b31d28;background-color:#ffeef0}
--- a/channel/web/static/vendor/markdown-it/markdown-it.min.js
+++ b/channel/web/static/vendor/markdown-it/markdown-it.min.js
--- a/channel/web/static/vendor/tailwind/tailwind.min.js
+++ b/channel/web/static/vendor/tailwind/tailwind.min.js
--- a/channel/web/web_channel.py
+++ b/channel/web/web_channel.py
--- a/channel/wecom_bot/wecom_bot_channel.py
+++ b/channel/wecom_bot/wecom_bot_channel.py
@@ -34,9 +34,55 @@ HEARTBEAT_INTERVAL = 30
 MEDIA_CHUNK_SIZE = 512 * 1024  # 512KB per chunk (before base64 encoding)


+def _escape_control_chars_inside_json_strings(s: str) -> str:
+    """Escape U+0000–U+001F inside JSON string values so json.loads accepts WeCom payloads.
+
+    The server occasionally emits raw newlines/tabs inside quoted fields, which is
+    invalid strict JSON but recoverable without touching escapes like \\n or \\".
+    """
+    out = []
+    in_string = False
+    escape = False
+    for c in s:
+        if escape:
+            out.append(c)
+            escape = False
+            continue
+        if in_string and c == "\\":
+            out.append(c)
+            escape = True
+            continue
+        if c == '"':
+            out.append(c)
+            in_string = not in_string
+            continue
+        if in_string and ord(c) < 32:
+            out.append("\\u%04x" % ord(c))
+            continue
+        out.append(c)
+    return "".join(out)
+
+
+def _loads_wecom_ws_json(raw):
+    """Parse WebSocket JSON; tolerate unescaped control characters inside strings."""
+    if isinstance(raw, bytes):
+        raw = raw.decode("utf-8", errors="replace")
+    if not isinstance(raw, str):
+        raw = str(raw)
+    try:
+        return json.loads(raw)
+    except json.JSONDecodeError as e:
+        msg = str(e).lower()
+        if "control character" in msg:
+            return json.loads(_escape_control_chars_inside_json_strings(raw))
+        raise
+
+
 @singleton
 class WecomBotChannel(ChatChannel):

+    NOT_SUPPORT_REPLYTYPE = []
+
    def __init__(self):
        super().__init__()
        self.bot_id = ""
@@ -93,7 +139,7 @@ class WecomBotChannel(ChatChannel):

        def _on_message(ws, raw):
            try:
-                data = json.loads(raw)
+                data = _loads_wecom_ws_json(raw)
                self._handle_ws_message(data)
            except Exception as e:
                logger.error(f"[WecomBot] Failed to handle ws message: {e}", exc_info=True)
@@ -428,6 +474,8 @@ class WecomBotChannel(ChatChannel):
            else:
                context.type = ContextType.TEXT
            context.content = content.strip()
+            if "desire_rtype" not in context and conf().get("always_reply_voice"):
+                context["desire_rtype"] = ReplyType.VOICE

        return context

@@ -454,6 +502,8 @@ class WecomBotChannel(ChatChannel):
            self._send_file(reply.content, receiver, is_group, req_id)
        elif reply.type == ReplyType.VIDEO or reply.type == ReplyType.VIDEO_URL:
            self._send_file(reply.content, receiver, is_group, req_id, media_type="video")
+        elif reply.type == ReplyType.VOICE:
+            self._send_voice(reply.content, receiver, is_group, req_id)
        else:
            logger.warning(f"[WecomBot] Unsupported reply type: {reply.type}, falling back to text")
            self._send_text(str(reply.content), receiver, is_group, req_id)
@@ -686,6 +736,65 @@ class WecomBotChannel(ChatChannel):
                },
            })

+    def _send_voice(self, voice_path: str, receiver: str, is_group: bool, req_id: str = None):
+        """Send native voice reply. WeCom voice media must be amr."""
+        local_path = voice_path
+        if local_path.startswith("file://"):
+            local_path = local_path[7:]
+
+        if local_path.startswith(("http://", "https://")):
+            try:
+                resp = requests.get(local_path, timeout=60)
+                resp.raise_for_status()
+                ext = os.path.splitext(local_path.split("?", 1)[0])[1] or ".mp3"  # ignore any URL query string
+                tmp_path = f"/tmp/wecom_voice_{uuid.uuid4().hex[:8]}{ext}"
+                with open(tmp_path, "wb") as f:
+                    f.write(resp.content)
+                local_path = tmp_path
+            except Exception as e:
+                logger.error(f"[WecomBot] Failed to download voice for sending: {e}")
+                return
+
+        if not os.path.exists(local_path):
+            logger.error(f"[WecomBot] Voice file not found: {local_path}")
+            return
+
+        amr_path = local_path
+        if not local_path.lower().endswith(".amr"):
+            try:
+                from voice.audio_convert import any_to_amr
+                amr_path = os.path.splitext(local_path)[0] + ".amr"
+                any_to_amr(local_path, amr_path)
+            except Exception as e:
+                logger.error(f"[WecomBot] Failed to convert voice to amr: {e}")
+                return
+
+        media_id = self._upload_media(amr_path, "voice")
+        if not media_id:
+            logger.error("[WecomBot] Failed to upload voice media")
+            return
+
+        if req_id:
+            self._ws_send({
+                "cmd": "aibot_respond_msg",
+                "headers": {"req_id": req_id},
+                "body": {
+                    "msgtype": "voice",
+                    "voice": {"media_id": media_id},
+                },
+            })
+        else:
+            self._ws_send({
+                "cmd": "aibot_send_msg",
+                "headers": {"req_id": self._gen_req_id()},
+                "body": {
+                    "chatid": receiver,
+                    "chat_type": 2 if is_group else 1,
+                    "msgtype": "voice",
+                    "voice": {"media_id": media_id},
+                },
+            })
+
    def _active_send_markdown(self, content: str, receiver: str, is_group: bool):
        """Proactively send markdown message (for scheduled tasks, no req_id)."""
        self._ws_send({
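The tolerant JSON parsing added in this file can be exercised on its own. The block below is a minimal sketch that copies the escaping logic from `_escape_control_chars_inside_json_strings` into a standalone function (the public name and the sample payload are illustrative assumptions, not part of the PR) and shows a payload with a raw newline inside a string failing strict parsing, then parsing after repair.

```python
import json


def escape_control_chars_inside_json_strings(s: str) -> str:
    """Escape raw U+0000-U+001F characters that occur inside JSON string values."""
    out = []
    in_string = False  # True while the scanner is between an opening and closing quote
    escape = False     # True right after a backslash inside a string
    for c in s:
        if escape:
            # Character following a backslash: copy it verbatim.
            out.append(c)
            escape = False
            continue
        if in_string and c == "\\":
            out.append(c)
            escape = True
            continue
        if c == '"':
            out.append(c)
            in_string = not in_string
            continue
        if in_string and ord(c) < 32:
            # Raw control character inside a string: emit a \uXXXX escape instead.
            out.append("\\u%04x" % ord(c))
            continue
        out.append(c)
    return "".join(out)


# Hypothetical payload: a raw newline inside a quoted field is invalid strict JSON.
raw = '{"text": "line one\nline two", "n": 1}'

try:
    json.loads(raw)
    strict_ok = True
except json.JSONDecodeError:
    strict_ok = False

repaired = json.loads(escape_control_chars_inside_json_strings(raw))
```

The standard library's `json.loads(raw, strict=False)` also accepts control characters inside strings. The explicit escaping pass keeps strict parsing as the first attempt and only rewrites the payload when the decoder reports a control-character error.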
--- a/channel/weixin/weixin_channel.py
+++ b/channel/weixin/weixin_channel.py
@@ -60,6 +60,9 @@ def _save_credentials(cred_path: str, data: dict):
 @singleton
 class WeixinChannel(ChatChannel):

+    # ilink bot protocol has no outbound voice item; deliver TTS as a file.
+    NOT_SUPPORT_REPLYTYPE = []
+
    LOGIN_STATUS_IDLE = "idle"
    LOGIN_STATUS_WAITING = "waiting_scan"
    LOGIN_STATUS_SCANNED = "scanned"
@@ -464,6 +467,14 @@ class WeixinChannel(ChatChannel):
            else:
                context.type = ContextType.TEXT
            context.content = content.strip()
+            if "desire_rtype" not in context and conf().get("always_reply_voice"):
+                context["desire_rtype"] = ReplyType.VOICE
+
+        elif ctype == ContextType.VOICE:
+            if "desire_rtype" not in context and (
+                conf().get("voice_reply_voice") or conf().get("always_reply_voice")
+            ):
+                context["desire_rtype"] = ReplyType.VOICE

        return context

@@ -486,6 +497,9 @@ class WeixinChannel(ChatChannel):
            self._send_file(reply.content, receiver, context_token)
        elif reply.type in (ReplyType.VIDEO, ReplyType.VIDEO_URL):
            self._send_video(reply.content, receiver, context_token)
+        elif reply.type == ReplyType.VOICE:
+            # ilink has no outbound voice item; deliver TTS as a file attachment.
+            self._send_file(reply.content, receiver, context_token)
        else:
            logger.warning(f"[Weixin] Unsupported reply type: {reply.type}, fallback to text")
            self._send_text(str(reply.content), receiver, context_token)
--- a/cli/VERSION
+++ b/cli/VERSION
@@ -1 +1 @@
-2.0.5
+2.0.9
--- a/cli/cli.py
+++ b/cli/cli.py
@@ -6,6 +6,7 @@ from cli.commands.skill import skill
 from cli.commands.process import start, stop, restart, update, status, logs
 from cli.commands.context import context
 from cli.commands.install import install_browser
+from cli.commands.knowledge import knowledge


 HELP_TEXT = """Usage: cow COMMAND [ARGS]...
@@ -22,9 +23,11 @@ Commands:
  status   Show CowAgent running status.
  logs     View CowAgent logs.
  skill    Manage CowAgent skills.
+  knowledge  Manage knowledge base.
  install-browser  Install browser tool (Playwright + Chromium).

-Tip: You can also send /help, /skill list, etc. in agent chat."""
+Tip: Memory index management lives in chat — send /memory status or
+/memory rebuild-index to the running agent."""


 class CowCLI(click.Group):
@@ -69,6 +72,7 @@ main.add_command(update)
 main.add_command(status)
 main.add_command(logs)
 main.add_command(context)
+main.add_command(knowledge)
 main.add_command(install_browser)


--- a/cli/commands/knowledge.py
+++ b/cli/commands/knowledge.py
@@ -0,0 +1,121 @@
+"""cow knowledge - Knowledge base management commands."""
+
+import os
+
+import click
+
+from cli.utils import get_project_root
+
+
+def _get_knowledge_dir():
+    """Resolve the knowledge directory path from config or default."""
+    try:
+        import sys
+        sys.path.insert(0, get_project_root())
+        from config import conf
+        from common.utils import expand_path
+        workspace = expand_path(conf().get("agent_workspace", "~/cow"))
+    except Exception:
+        workspace = os.path.expanduser("~/cow")
+    return os.path.join(workspace, "knowledge")
+
+
+def _get_knowledge_enabled():
+    try:
+        import sys
+        sys.path.insert(0, get_project_root())
+        from config import conf
+        return conf().get("knowledge", True)
+    except Exception:
+        return True
+
+
+@click.group(invoke_without_command=True)
+@click.pass_context
+def knowledge(ctx):
+    """Manage CowAgent knowledge base."""
+    if ctx.invoked_subcommand is None:
+        click.echo(_stats())
+
+
+@knowledge.command("list")
+def knowledge_list():
+    """Display knowledge base file tree."""
+    click.echo(_tree())
+
+
+def _stats() -> str:
+    knowledge_dir = _get_knowledge_dir()
+    if not os.path.isdir(knowledge_dir):
+        return "Knowledge base directory not found."
+
+    enabled = _get_knowledge_enabled()
+    total_files = 0
+    total_bytes = 0
+    cat_count = {}
+
+    for root, dirs, files in os.walk(knowledge_dir):
+        dirs[:] = [d for d in dirs if not d.startswith(".")]
+        rel_root = os.path.relpath(root, knowledge_dir)
+        category = rel_root.split(os.sep)[0] if rel_root != "." else "root"
+        for f in files:
+            if f.endswith(".md") and f not in ("index.md", "log.md"):
+                total_files += 1
+                total_bytes += os.path.getsize(os.path.join(root, f))
+                cat_count[category] = cat_count.get(category, 0) + 1
+
+    status_icon = click.style("enabled", fg="green") if enabled else click.style("disabled", fg="red")
+    lines = [
+        f"\n  Knowledge Base  [{status_icon}]",
+        "",
+        f"  Pages:  {total_files}",
+        f"  Size:   {total_bytes / 1024:.1f} KB",
+        "",
+    ]
+    if cat_count:
+        lines.append("  Categories:")
+        for cat in sorted(cat_count.keys()):
+            lines.append(f"    {cat}/  ({cat_count[cat]} pages)")
+        lines.append("")
+
+    lines.append(f"  Path: {knowledge_dir}")
+    lines.append("")
+    return "\n".join(lines)
+
+
+def _tree() -> str:
+    knowledge_dir = _get_knowledge_dir()
+    if not os.path.isdir(knowledge_dir):
+        return "Knowledge base directory not found."
+
+    tree_lines = ["  knowledge/"]
+
+    subdirs = sorted([
+        d for d in os.listdir(knowledge_dir)
+        if os.path.isdir(os.path.join(knowledge_dir, d)) and not d.startswith(".")
+    ])
+
+    for i, subdir in enumerate(subdirs):
+        is_last_dir = (i == len(subdirs) - 1)
+        branch = "└── " if is_last_dir else "├── "
+        subdir_path = os.path.join(knowledge_dir, subdir)
+        md_files = sorted([
+            f for f in os.listdir(subdir_path)
+            if f.endswith(".md") and not f.startswith(".")
+        ])
+        tree_lines.append(f"  {branch}{subdir}/ ({len(md_files)})")
+
+        child_prefix = "      " if is_last_dir else "  │   "
+        max_show = 15
+        for j, fname in enumerate(md_files[:max_show]):
+            is_last_file = (j == len(md_files[:max_show]) - 1) and len(md_files) <= max_show
+            fb = "└── " if is_last_file else "├── "
+            name = os.path.splitext(fname)[0]  # strip only the trailing ".md"
+            tree_lines.append(f"{child_prefix}{fb}{name}")
+        if len(md_files) > max_show:
+            tree_lines.append(f"{child_prefix}└── ... +{len(md_files) - max_show} more")
+
+    if not subdirs:
+        tree_lines.append("  (empty)")
+
+    return "\n" + "\n".join(tree_lines) + "\n"
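The per-category counting in `_stats` can be checked against a throwaway directory. The following is a rough sketch, assuming a hypothetical helper name `count_pages` and an invented file layout. It mirrors the walk above: hidden directories are pruned, `index.md` and `log.md` are skipped, and nested pages are attributed to their top-level directory.

```python
import os
import tempfile


def count_pages(knowledge_dir: str) -> dict:
    """Count .md pages per top-level category, skipping index.md and log.md."""
    cat_count = {}
    for root, dirs, files in os.walk(knowledge_dir):
        # Prune hidden directories in place so os.walk does not descend into them.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        rel_root = os.path.relpath(root, knowledge_dir)
        category = rel_root.split(os.sep)[0] if rel_root != "." else "root"
        for f in files:
            if f.endswith(".md") and f not in ("index.md", "log.md"):
                cat_count[category] = cat_count.get(category, 0) + 1
    return cat_count


with tempfile.TemporaryDirectory() as tmp:
    # Invented layout used only for this illustration.
    for rel in ("index.md", "notes/a.md", "notes/deep/b.md", "people/c.md", ".cache/d.md"):
        path = os.path.join(tmp, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write("# page\n")
    counts = count_pages(tmp)
```

Note that `_tree` lists only the `.md` files directly inside each top-level subdirectory, so its per-directory numbers can be lower than the recursive counts produced here.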
--- a/cli/commands/process.py
+++ b/cli/commands/process.py
@@ -269,7 +269,7 @@ def status():
            channel = ", ".join(channel)
        click.echo(f"  通道: {channel}")
        click.echo(f"  模型: {cfg.get('model', 'unknown')}")
-        mode = "Agent" if cfg.get("agent") else "Chat"
+        mode = "Chat" if cfg.get("agent") is False else "Agent"
        click.echo(f"  模式: {mode}")


--- a/cli/commands/skill.py
+++ b/cli/commands/skill.py
@@ -644,32 +644,52 @@ def _list_local():
    skills_dir = get_skills_dir()
    builtin_dir = get_builtin_skills_dir()

+    # Merge builtin skills that are on disk but missing from config
+    _merge_builtin_into_config(config, builtin_dir, skills_dir)
+
    if not config:
-        # Fallback: scan directories directly
-        entries = []
-        for d in [builtin_dir, skills_dir]:
-            if not os.path.isdir(d):
-                continue
-            source = "builtin" if d == builtin_dir else "custom"
-            for name in sorted(os.listdir(d)):
-                skill_path = os.path.join(d, name)
-                if os.path.isdir(skill_path) and not name.startswith("."):
-                    has_skill_md = os.path.exists(os.path.join(skill_path, "SKILL.md"))
-                    if has_skill_md:
-                        entries.append({"name": name, "source": source, "enabled": True, "description": ""})
-        if not entries:
-            click.echo("No skills installed.")
-            return
-        _print_skill_table(entries)
+        click.echo("No skills installed.")
        return

    entries = sorted(config.values(), key=lambda x: x.get("name", ""))
-    if not entries:
-        click.echo("No skills installed.")
-        return
    _print_skill_table(entries)


+def _merge_builtin_into_config(config: dict, builtin_dir: str, skills_dir: str):
+    """Scan builtin and custom dirs, add any new skills into config dict."""
+    dirty = False
+    for d, source in [(builtin_dir, "builtin"), (skills_dir, "custom")]:
+        if not os.path.isdir(d):
+            continue
+        for name in os.listdir(d):
+            if name.startswith(".") or name in ("skills_config.json",):
+                continue
+            skill_path = os.path.join(d, name)
+            if not os.path.isdir(skill_path):
+                continue
+            if not os.path.isfile(os.path.join(skill_path, "SKILL.md")):
+                continue
+            if name in config:
+                continue
+            desc = _read_skill_description(skill_path)
+            config[name] = {
+                "name": name,
+                "description": desc,
+                "source": source,
+                "enabled": True,
+                "category": "skill",
+            }
+            dirty = True
+    if dirty:
+        config_path = os.path.join(skills_dir, "skills_config.json")
+        try:
+            os.makedirs(skills_dir, exist_ok=True)
+            with open(config_path, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=4, ensure_ascii=False)
+        except Exception:
+            pass
+
+
 def _print_skill_table(entries):
    """Print skills as a formatted table."""
    def _display_label(e):
--- a/common/cloud_client.py
+++ b/common/cloud_client.py
@@ -54,7 +54,9 @@ class CloudClient(LinkAIClient):
        self.channel_mgr = None
        self._skill_service = None
        self._memory_service = None
+        self._knowledge_service = None
        self._chat_service = None
+        self._session_service = None

    @property
    def skill_service(self):
@@ -88,6 +90,21 @@ class CloudClient(LinkAIClient):
                logger.error(f"[CloudClient] Failed to init MemoryService: {e}")
        return self._memory_service

+    @property
+    def knowledge_service(self):
+        """Lazy-init KnowledgeService."""
+        if self._knowledge_service is None:
+            try:
+                from agent.knowledge.service import KnowledgeService
+                from config import conf
+                from common.utils import expand_path
+                workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
+                self._knowledge_service = KnowledgeService(workspace_root)
+                logger.debug("[CloudClient] KnowledgeService initialised")
+            except Exception as e:
+                logger.error(f"[CloudClient] Failed to init KnowledgeService: {e}")
+        return self._knowledge_service
+
    @property
    def chat_service(self):
        """Lazy-init ChatService (requires AgentBridge via Bridge singleton)."""
@@ -102,6 +119,18 @@ class CloudClient(LinkAIClient):
                logger.error(f"[CloudClient] Failed to init ChatService: {e}")
        return self._chat_service

+    @property
+    def session_service(self):
+        """Lazy-init SessionService."""
+        if self._session_service is None:
+            try:
+                from agent.chat.session_service import SessionService
+                self._session_service = SessionService()
+                logger.debug("[CloudClient] SessionService initialised")
+            except Exception as e:
+                logger.error(f"[CloudClient] Failed to init SessionService: {e}")
+        return self._session_service
+
    # ------------------------------------------------------------------
    # message push callback
    # ------------------------------------------------------------------
@@ -468,6 +497,27 @@ class CloudClient(LinkAIClient):

        return svc.dispatch(action, payload)

+    # ------------------------------------------------------------------
+    # knowledge callback
+    # ------------------------------------------------------------------
+    def on_knowledge(self, data: dict) -> dict:
+        """
+        Handle KNOWLEDGE messages from the cloud console.
+        Delegates to KnowledgeService.dispatch for the actual operations.
+
+        :param data: message data with 'action', 'clientId', 'payload'
+        :return: response dict
+        """
+        action = data.get("action", "")
+        payload = data.get("payload")
+        logger.info(f"[CloudClient] on_knowledge: action={action}")
+
+        svc = self.knowledge_service
+        if svc is None:
+            return {"action": action, "code": 500, "message": "KnowledgeService not available", "payload": None}
+
+        return svc.dispatch(action, payload)
+
    # ------------------------------------------------------------------
    # chat callback
    # ------------------------------------------------------------------
@@ -509,12 +559,23 @@ class CloudClient(LinkAIClient):
    # ------------------------------------------------------------------
    # history callback
    # ------------------------------------------------------------------
+    # Session-related actions handled via the HISTORY channel
+    _SESSION_ACTIONS = {
+        "list_sessions", "delete_session", "rename_session",
+        "clear_context", "generate_title",
+    }
+
    def on_history(self, data: dict) -> dict:
        """
        Handle HISTORY messages from the cloud console.
-        Returns paginated conversation history for a session.

-        :param data: message data with 'action' and 'payload' (session_id, page, page_size)
+        Supports both history query and session management actions
+        through a unified HISTORY message channel:
+          - query: paginated conversation history
+          - list_sessions / delete_session / rename_session /
+            clear_context / generate_title: session lifecycle
+
+        :param data: message data with 'action' and 'payload'
        :return: response dict
        """
        action = data.get("action", "query")
@@ -524,8 +585,19 @@ class CloudClient(LinkAIClient):
        if action == "query":
            return self._query_history(payload)

+        if action in self._SESSION_ACTIONS:
+            return self._dispatch_session(action, payload)
+
        return {"action": action, "code": 404, "message": f"unknown action: {action}", "payload": None}

+    def _dispatch_session(self, action: str, payload: dict) -> dict:
+        """Delegate session actions to SessionService."""
+        svc = self.session_service
+        if svc is None:
+            return {"action": action, "code": 500,
+                    "message": "SessionService not available", "payload": None}
+        return svc.dispatch(action, payload)
+
    def _query_history(self, payload: dict) -> dict:
        """Query paginated conversation history using ConversationStore."""
        session_id = payload.get("session_id", "")
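The routing that `on_history` now performs can be sketched in isolation. This is an illustration under stated assumptions: `StubSessionService`, the `payload or {}` fallback, and the `code: 200` / `"success"` response values are hypothetical stand-ins, since the real `SessionService` and `_query_history` bodies are not part of this hunk. Only the action set and the 404/500 responses are taken from the diff.

```python
SESSION_ACTIONS = {
    "list_sessions", "delete_session", "rename_session",
    "clear_context", "generate_title",
}


class StubSessionService:
    """Hypothetical stand-in for SessionService; echoes the request back."""

    def dispatch(self, action, payload):
        return {"action": action, "code": 200, "message": "success", "payload": payload}


def on_history(data, session_service):
    """Route a HISTORY message to the history query or to session management."""
    action = data.get("action", "query")
    payload = data.get("payload") or {}
    if action == "query":
        # Placeholder for the paginated history lookup.
        return {"action": action, "code": 200, "message": "success", "payload": {"messages": []}}
    if action in SESSION_ACTIONS:
        if session_service is None:
            return {"action": action, "code": 500,
                    "message": "SessionService not available", "payload": None}
        return session_service.dispatch(action, payload)
    return {"action": action, "code": 404, "message": f"unknown action: {action}", "payload": None}


renamed = on_history({"action": "rename_session", "payload": {"session_id": "s1"}}, StubSessionService())
unavailable = on_history({"action": "list_sessions"}, None)
unknown = on_history({"action": "export"}, StubSessionService())
```

Keeping the session actions in one set means a new lifecycle action only needs an entry there plus a handler in the service, while unknown actions continue to fall through to the 404 response.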
--- a/common/const.py
+++ b/common/const.py
@@ -3,6 +3,7 @@ OPEN_AI = "openAI"
 OPENAI = "openai"
 CHATGPT = "chatGPT"  # legacy alias for OPENAI, kept for backward compatibility
 BAIDU = "baidu"
+QIANFAN = "qianfan"
 XUNFEI = "xunfei"
 CHATGPTONAZURE = "chatGPTOnAzure"
 LINKAI = "linkai"
@@ -14,6 +15,7 @@ ZHIPU_AI = "zhipu"
 MOONSHOT = "moonshot"
 MiniMax = "minimax"
 DEEPSEEK = "deepseek"
+CUSTOM = "custom"  # custom OpenAI-compatible API, bot_type won't auto-switch on model change
 MODELSCOPE = "modelscope"

 # Model list
@@ -27,6 +29,7 @@ CLAUDE_35_SONNET = "claude-3-5-sonnet-latest"  # 带 latest 标签的模型名
 CLAUDE_35_SONNET_1022 = "claude-3-5-sonnet-20241022"  # model name with a specific date; pinned to the model released on that date
 CLAUDE_35_SONNET_0620 = "claude-3-5-sonnet-20240620"
 CLAUDE_4_OPUS = "claude-opus-4-0"
+CLAUDE_4_7_OPUS = "claude-opus-4-7"      # Claude Opus 4.7
 CLAUDE_4_6_OPUS = "claude-opus-4-6"      # Claude Opus 4.6 - Agent recommended model
 CLAUDE_4_SONNET = "claude-sonnet-4-0"    # Claude Sonnet 4.0
 CLAUDE_4_5_SONNET = "claude-sonnet-4-5"  # Claude Sonnet 4.5 - Agent recommended model
@@ -44,6 +47,7 @@ GEMINI_3_FLASH_PRE = "gemini-3-flash-preview"  # Gemini 3 Flash Preview - Agent
 GEMINI_3_PRO_PRE = "gemini-3-pro-preview"  # Gemini 3 Pro Preview
 GEMINI_31_PRO_PRE = "gemini-3.1-pro-preview"  # Gemini 3.1 Pro Preview - Agent recommended model
 GEMINI_31_FLASH_LITE_PRE = "gemini-3.1-flash-lite-preview"  # Gemini 3.1 Flash Lite Preview - Agent recommended model
+GEMINI_35_FLASH = "gemini-3.5-flash"  # Gemini 3.5 Flash - Agent recommended model

 # OpenAI
 GPT35 = "gpt-3.5-turbo"
@@ -71,6 +75,7 @@ GPT_5_NANO = "gpt-5-nano"
 GPT_54 = "gpt-5.4"  # GPT-5.4 - Agent recommended model
 GPT_54_MINI = "gpt-5.4-mini"
 GPT_54_NANO = "gpt-5.4-nano"
+GPT_55 = "gpt-5.5"  # GPT-5.5 - top-tier (expensive), not default
 O1 = "o1-preview"
 O1_MINI = "o1-mini"
 WHISPER_1 = "whisper-1"
@@ -80,6 +85,18 @@ TTS_1_HD = "tts-1-hd"
 # DeepSeek
 DEEPSEEK_CHAT = "deepseek-chat"  # DeepSeek-V3 chat model
 DEEPSEEK_REASONER = "deepseek-reasoner"  # DeepSeek-R1 model
+DEEPSEEK_V4_FLASH = "deepseek-v4-flash"  # DeepSeek V4 Flash - default recommendation (thinking mode + tool calling)
+DEEPSEEK_V4_PRO = "deepseek-v4-pro"  # DeepSeek V4 Pro - stronger on complex tasks (thinking mode + tool calling)
+
+# Baidu Qianfan / ERNIE
+ERNIE_5_1 = "ernie-5.1"  # ERNIE 5.1 - default recommendation, latest flagship
+ERNIE_5 = "ernie-5.0"  # ERNIE 5.0
+ERNIE_X1_1 = "ernie-x1.1"  # ERNIE X1.1 - reasoning-focused, multimodal
+ERNIE_45_TURBO_128K = "ernie-4.5-turbo-128k"
+ERNIE_45_TURBO_32K = "ernie-4.5-turbo-32k"
+ERNIE_4_TURBO_8K = "ERNIE-4.0-Turbo-8K"
+ERNIE_45_TURBO_VL = "ernie-4.5-turbo-vl"
+ERNIE_45_TURBO_VL_32K = "ernie-4.5-turbo-vl-32k"

 # Qwen (Tongyi Qianwen - Alibaba Cloud DashScope)
 QWEN_TURBO = "qwen-turbo"
@@ -89,10 +106,13 @@ QWEN_LONG = "qwen-long"
 QWEN3_MAX = "qwen3-max"  # Qwen3 Max - Agent recommended model
 QWEN35_PLUS = "qwen3.5-plus"  # Qwen3.5 Plus - Omni model (MultiModalConversation)
 QWEN36_PLUS = "qwen3.6-plus"  # Qwen3.6 Plus - Omni model (MultiModalConversation)
+QWEN37_MAX = "qwen3.7-max"  # Qwen3.7 Max - Agent recommended model
 QWQ_PLUS = "qwq-plus"

 # MiniMax
 MINIMAX_M2_7 = "MiniMax-M2.7"  # MiniMax M2.7 - Latest
+MINIMAX_TEXT_01 = "MiniMax-Text-01"  # MiniMax multimodal (vision)
+MINIMAX_M2_7_HIGHSPEED = "MiniMax-M2.7-highspeed"  # MiniMax M2.7 highspeed
 MINIMAX_M2_5 = "MiniMax-M2.5"  # MiniMax M2.5
 MINIMAX_M2_1 = "MiniMax-M2.1"  # MiniMax M2.1
 MINIMAX_M2_1_LIGHTNING = "MiniMax-M2.1-lightning"  # MiniMax M2.1 high-speed edition
@@ -100,8 +120,10 @@ MINIMAX_M2 = "MiniMax-M2"  # MiniMax M2
 MINIMAX_ABAB6_5 = "abab6.5-chat"  # MiniMax abab6.5

 # GLM (Zhipu AI)
-GLM_5_TURBO = "glm-5-turbo"  # Zhipu GLM-5-Turbo - Latest
+GLM_5_1 = "glm-5.1"  # Zhipu GLM-5.1 - Agent recommended model (default)
+GLM_5_TURBO = "glm-5-turbo"  # Zhipu GLM-5-Turbo
 GLM_5 = "glm-5"  # Zhipu GLM-5
+GLM_5V_TURBO = "glm-5v-turbo"  # Zhipu multimodal (vision)
 GLM_4 = "glm-4"
 GLM_4_PLUS = "glm-4-plus"
 GLM_4_flash = "glm-4-flash"
@@ -116,6 +138,7 @@ GLM_4_7 = "glm-4.7"  # 智谱 GLM-4.7 - Agent推荐模型
 MOONSHOT = "moonshot"
 KIMI_K2 = "kimi-k2"
 KIMI_K2_5 = "kimi-k2.5"
+KIMI_K2_6 = "kimi-k2.6"  # Kimi K2.6 - Agent recommended model (default)

 # Doubao (Volcengine Ark)
 DOUBAO = "doubao"
@@ -149,15 +172,25 @@ MODELSCOPE_MODEL_LIST = ["deepseek-ai/DeepSeek-R1-0528", "deepseek-ai/DeepSeek-R


 MODEL_LIST = [
+              # DeepSeek
+              DEEPSEEK_V4_FLASH, DEEPSEEK_V4_PRO, DEEPSEEK_CHAT, DEEPSEEK_REASONER,
+
+              # Baidu Qianfan / ERNIE
+              QIANFAN, ERNIE_5_1, ERNIE_5, ERNIE_X1_1, ERNIE_45_TURBO_128K, ERNIE_45_TURBO_32K, ERNIE_4_TURBO_8K,
+              ERNIE_45_TURBO_VL, ERNIE_45_TURBO_VL_32K,
+
+              # MiniMax
+              MiniMax, MINIMAX_M2_7, MINIMAX_M2_7_HIGHSPEED, MINIMAX_M2_5, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,
+
              # Claude
-              CLAUDE3, CLAUDE_4_6_SONNET, CLAUDE_4_6_OPUS, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229, 
-              CLAUDE_35_SONNET, CLAUDE_35_SONNET_1022, CLAUDE_35_SONNET_0620, CLAUDE_3_SONNET, CLAUDE_3_HAIKU, 
+              CLAUDE3, CLAUDE_4_6_SONNET, CLAUDE_4_7_OPUS, CLAUDE_4_6_OPUS, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229,
+              CLAUDE_35_SONNET, CLAUDE_35_SONNET_1022, CLAUDE_35_SONNET_0620, CLAUDE_3_SONNET, CLAUDE_3_HAIKU,
              "claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3.5-sonnet",
-              
+
              # Gemini
-              GEMINI_31_FLASH_LITE_PRE, GEMINI_31_PRO_PRE, GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
+              GEMINI_35_FLASH, GEMINI_31_FLASH_LITE_PRE, GEMINI_31_PRO_PRE, GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
              GEMINI_20_FLASH, GEMINI_20_flash_exp, GEMINI_15_PRO, GEMINI_15_flash, GEMINI_PRO, GEMINI,
-              
+
              # OpenAI
              GPT35, GPT35_0125, GPT35_1106, "gpt-3.5-turbo-16k",
              GPT4, GPT4_06_13, GPT4_32k, GPT4_32k_06_13,
@@ -165,33 +198,31 @@ MODEL_LIST = [
              GPT_4o, GPT_4O_0806, GPT_4o_MINI,
              GPT_41, GPT_41_MINI, GPT_41_NANO,
              GPT_5, GPT_5_MINI, GPT_5_NANO,
-              GPT_54, GPT_54_MINI, GPT_54_NANO,
+              GPT_54, GPT_55, GPT_54_MINI, GPT_54_NANO,
              O1, O1_MINI,
-              
-              # DeepSeek
-              DEEPSEEK_CHAT, DEEPSEEK_REASONER,
-              
-              # Qwen
-              QWEN36_PLUS, QWEN35_PLUS, QWEN3_MAX, QWEN_MAX, QWEN_PLUS, QWEN_TURBO, QWEN_LONG,
-              
-              # MiniMax
-              MiniMax, MINIMAX_M2_7, MINIMAX_M2_5, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,

-              # GLM
-              ZHIPU_AI, GLM_5_TURBO, GLM_5, GLM_4, GLM_4_PLUS, GLM_4_flash, GLM_4_LONG, GLM_4_ALLTOOLS,
+              # GLM (Zhipu AI)
+              ZHIPU_AI, GLM_5_1, GLM_5_TURBO, GLM_5, GLM_4, GLM_4_PLUS, GLM_4_flash, GLM_4_LONG, GLM_4_ALLTOOLS,
              GLM_4_0520, GLM_4_AIR, GLM_4_AIRX, GLM_4_7,

-              # Kimi
-              MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
-              KIMI_K2, KIMI_K2_5,
+              # Qwen (Tongyi Qianwen)
+              QWEN37_MAX, QWEN36_PLUS, QWEN35_PLUS, QWEN3_MAX, QWEN_MAX, QWEN_PLUS, QWEN_TURBO, QWEN_LONG,

-              # Doubao
+              # Doubao (豆包)
              DOUBAO, DOUBAO_SEED_2_CODE, DOUBAO_SEED_2_PRO, DOUBAO_SEED_2_LITE, DOUBAO_SEED_2_MINI,

+              # Kimi (Moonshot)
+              MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
+              KIMI_K2_6, KIMI_K2_5, KIMI_K2,
+
+              # ModelScope
+              MODELSCOPE,
+
+              # LinkAI
+              LINKAI_35, LINKAI_4_TURBO, LINKAI_4o,
+
               # Other models
              WEN_XIN, WEN_XIN_4, XUNFEI,
-              LINKAI_35, LINKAI_4_TURBO, LINKAI_4o,
-              MODELSCOPE
            ]

 MODEL_LIST = MODEL_LIST + GITEE_AI_MODEL_LIST + MODELSCOPE_MODEL_LIST
--- a/config-template.json
+++ b/config-template.json
@@ -1,6 +1,10 @@
 {
  "channel_type": "weixin",
-  "model": "MiniMax-M2.7",
+  "model": "deepseek-v4-flash",
+  "deepseek_api_key": "",
+  "deepseek_api_base": "https://api.deepseek.com/v1",
+  "qianfan_api_key": "",
+  "qianfan_api_base": "https://qianfan.baidubce.com/v2",
  "minimax_api_key": "",
  "zhipu_ai_api_key": "",
  "ark_api_key": "",
@@ -12,8 +16,8 @@
  "open_ai_api_base": "https://api.openai.com/v1",
  "gemini_api_key": "",
  "gemini_api_base": "https://generativelanguage.googleapis.com",
-  "voice_to_text": "openai",
-  "text_to_voice": "openai",
+  "voice_to_text": "",
+  "text_to_voice": "",
  "voice_reply_voice": false,
  "speech_recognition": true,
  "group_speech_recognition": false,
@@ -22,12 +26,18 @@
  "linkai_app_code": "",
  "feishu_app_id": "",
  "feishu_app_secret": "",
+  "feishu_stream_reply": true,
  "dingtalk_client_id": "",
-  "dingtalk_client_secret":"",
+  "dingtalk_client_secret": "",
  "wecom_bot_id": "",
  "wecom_bot_secret": "",
+  "web_host": "",
+  "web_password": "",
  "agent": true,
-  "agent_max_context_tokens": 40000,
+  "agent_max_context_tokens": 50000,
  "agent_max_context_turns": 20,
-  "agent_max_steps": 15
+  "agent_max_steps": 20,
+  "enable_thinking": false,
+  "reasoning_effort": "high",
+  "knowledge": true
 }
--- a/config.py
+++ b/config.py
@@ -17,10 +17,12 @@ available_setting = {
    "open_ai_api_base": "https://api.openai.com/v1",
    "claude_api_base": "https://api.anthropic.com/v1",  # claude api base
    "gemini_api_base": "https://generativelanguage.googleapis.com",  # gemini api base
+    "custom_api_key": "",  # custom OpenAI-compatible provider api key (used when bot_type is "custom")
+    "custom_api_base": "",  # custom OpenAI-compatible provider api base (used when bot_type is "custom")
    "proxy": "",  # openai使用的代理
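     # chatgpt model; when use_azure_chatgpt is true, this is the name of the model deployment on Azure
     "model": "gpt-3.5-turbo",  # options include gpt-4o, gpt-4o-mini, gpt-4-turbo, claude-3-sonnet, wenxin, moonshot, qwen-turbo, xunfei, glm-4, minimax, gemini, etc.; see common/const.py for the full list
-    "bot_type": "",  # optional; set to "openai" when using a third-party OpenAI-compatible service (the legacy value "chatGPT" is still accepted). See common/const.py for bot names; if empty, it is inferred from the model name
+    "bot_type": "",  # optional; set to "openai" or "custom" when using a third-party OpenAI-compatible service (in custom mode, switching models does not switch bot_type automatically). See common/const.py for bot names; if empty, it is inferred from the model name
     "use_azure_chatgpt": False,  # whether to use Azure ChatGPT
     "azure_deployment_id": "",  # Azure model deployment name
     "azure_api_version": "",  # Azure API version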
@@ -74,6 +76,9 @@ available_setting = {
    "baidu_wenxin_api_key": "",  # Baidu api key
    "baidu_wenxin_secret_key": "",  # Baidu secret key
    "baidu_wenxin_prompt_enabled": False,  # Enable prompt if you are using ernie character model
+    # Baidu Qianfan / ERNIE OpenAI-compatible API
+    "qianfan_api_key": "",  # Baidu Qianfan API key in bce-v3 format
+    "qianfan_api_base": "https://qianfan.baidubce.com/v2",  # Qianfan OpenAI-compatible API base
     # iFlytek Spark API
     "xunfei_app_id": "",  # iFlytek app ID
     "xunfei_api_key": "",  # iFlytek API key
@@ -95,6 +100,10 @@ available_setting = {
    "dashscope_api_key": "",
    # Google Gemini Api Key
    "gemini_api_key": "",
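+    # Embedding model settings
+    "embedding_provider": "",  # explicit provider: openai / linkai / dashscope / doubao / zhipu (same naming as bot_type)
+    "embedding_model": "",     # leave empty to use the provider's default model
+    "embedding_dimensions": 0, # empty/0 uses the provider's default dimensions (1024 recommended for consistency)
     # Voice settings
     "speech_recognition": True,  # whether to enable speech recognition
     "group_speech_recognition": False,  # whether to enable speech recognition in group chats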
@@ -121,10 +130,13 @@ available_setting = {
    "chat_start_time": "00:00",  # 服务开始时间
    "chat_stop_time": "24:00",  # 服务结束时间
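     # Translation API
-    "translate": "baidu",  # translation API; supports baidu
+    "translate": "baidu",  # translation API; supports baidu, youdao
     # Baidu translation API settings
     "baidu_translate_app_id": "",  # Baidu translation API app ID
     "baidu_translate_app_key": "",  # Baidu translation API secret key
+    # Youdao translation API settings
+    "youdao_translate_app_key": "",  # Youdao translation API app ID
+    "youdao_translate_app_secret": "",  # Youdao translation API app secret
     # wechatmp settings
     "wechatmp_token": "",  # WeChat Official Account platform token
     "wechatmp_port": 8080,  # WeChat Official Account platform port; must be forwarded to 80 or 443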
@@ -140,12 +152,13 @@ available_setting = {
    "wechatcomapp_agent_id": "",  # 企业微信app的agent_id
    "wechatcomapp_aes_key": "",  # 企业微信app的aes_key
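     # Feishu settings
-    "feishu_port": 80,  # Feishu bot listening port
+    "feishu_port": 80,  # Feishu bot listening port; only needed in webhook mode
     "feishu_app_id": "",  # Feishu bot app ID
     "feishu_app_secret": "",  # Feishu bot app secret
-    "feishu_token": "",  # Feishu verification token
-    "feishu_bot_name": "",  # Feishu bot name
+    "feishu_token": "",  # Feishu verification token; only needed in webhook mode
     "feishu_event_mode": "websocket",  # Feishu event receiving mode: webhook (HTTP server) or websocket (long-lived connection)
+    # Feishu streaming replies (built on the official cardkit streaming card API; the bot needs the cardkit:card:write permission and the Feishu client must be 7.20+)
+    "feishu_stream_reply": True,  # whether to enable streaming replies (typewriter effect). On failure or with older clients it falls back automatically to a non-streaming reply or an upgrade prompt
     # DingTalk settings
     "dingtalk_client_id": "",  # DingTalk bot Client ID
     "dingtalk_client_secret": "",  # DingTalk bot Client Secret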
@@ -180,26 +193,36 @@ available_setting = {
     # Doubao (Volcengine Ark) platform settings
     "ark_api_key": "",
     "ark_base_url": "https://ark.cn-beijing.volces.com/api/v3",
-    #ModelScope community platform settings
+    # ModelScope community platform settings
     "modelscope_api_key": "",
     "modelscope_base_url": "https://api-inference.modelscope.cn/v1/chat/completions",
     # LinkAI platform settings
     "use_linkai": False,
     "linkai_api_key": "",
     "linkai_app_code": "",
-    "linkai_api_base": "https://api.link-ai.tech",  # LinkAI service address
+    "linkai_api_base": "https://api.link-ai.tech",
    "cloud_host": "client.link-ai.tech",
    "cloud_port": None,
    "cloud_deployment_id": "",
    "minimax_api_key": "",
    "Minimax_group_id": "",
    "Minimax_base_url": "",
+    "deepseek_api_key": "",
+    "deepseek_api_base": "https://api.deepseek.com/v1",
+    "web_host": "",  # Web console bind address; empty means auto
    "web_port": 9899,
+    "web_password": "",  # Web console password; empty means no authentication required
+    "web_session_expire_days": 30,  # Auth session expiry in days
     "agent": True,  # whether to enable Agent mode
     "agent_workspace": "~/cow",  # Agent workspace path, used to store skills, memory, etc.
     "agent_max_context_tokens": 50000,  # max context tokens in Agent mode
-    "agent_max_context_turns": 30,  # max context memory turns in Agent mode
-    "agent_max_steps": 15,  # max decision steps per run in Agent mode
+    "agent_max_context_turns": 20,  # max context memory turns in Agent mode
+    "agent_max_steps": 20,  # max decision steps per run in Agent mode
+    "enable_thinking": False,  # Enable deep-thinking mode for thinking-capable models
+    "reasoning_effort": "high",  # Reasoning depth under thinking mode: "high" or "max"
+    "knowledge": True,  # whether to enable the knowledge base feature
+    "skill": {},  # Per-skill runtime config; nested keys flatten to SKILL_<NAME>_<KEY> env vars at startup
+    "mcp_servers": [],  # MCP server list; each entry supports type "stdio" (local process) or "sse" (remote URL)
 }


@@ -214,15 +237,9 @@ class Config(dict):
        self.user_datas = {}

    def __getitem__(self, key):
-        # Skip comment fields that start with an underscore
-        if not key.startswith("_") and key not in available_setting:
-            logger.warning("[Config] key '{}' not in available_setting, may not take effect".format(key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
-        # Skip comment fields that start with an underscore
-        if not key.startswith("_") and key not in available_setting:
-            logger.warning("[Config] key '{}' not in available_setting, may not take effect".format(key))
        return super().__setitem__(key, value)

    def get(self, key, default=None):
@@ -230,7 +247,7 @@ class Config(dict):
        if key.startswith("_"):
            return super().get(key, default)
        
-        # If the key is not in available_setting, return default directly
+        # If the key is not in available_setting, fall through to dict.get and return the value actually loaded from config.json (or default if it is absent)
        if key not in available_setting:
            return super().get(key, default)
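The comment change in this hunk describes `Config.get` falling through to `dict.get` for keys that are not declared in `available_setting`. A minimal sketch of that lookup order, assuming a two-key stand-in for `available_setting` and a simplified tail (the `try`/`except KeyError` branch is an assumption, since the rest of the method lies outside this hunk):

```python
# Stand-in for the real table in config.py (assumption: only two keys here).
available_setting = {"model": "gpt-3.5-turbo", "web_port": 9899}


class Config(dict):
    def get(self, key, default=None):
        # Underscore-prefixed keys are treated as comment fields.
        if key.startswith("_"):
            return super().get(key, default)
        # Undeclared keys: return whatever config.json actually supplied.
        if key not in available_setting:
            return super().get(key, default)
        # Declared keys (assumed tail): look up, fall back to the default.
        try:
            return self[key]
        except KeyError:
            return default


# "my_plugin_flag" is a hypothetical key that is not in available_setting.
conf = Config({"model": "deepseek-v4-flash", "my_plugin_flag": True})
print(conf.get("my_plugin_flag"))   # value loaded from the config dict
print(conf.get("missing", 1))       # undeclared and absent: the default
print(conf.get("web_port", 9899))   # declared but absent: the default
```

The practical effect is that plugin or skill keys unknown to `available_setting` can still be read through `get`.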
        
@@ -313,8 +330,18 @@ def load_config():
    config_str = read_file(config_path)
    logger.debug("[INIT] config str: {}".format(drag_sensitive(config_str)))

-    # Deserialize the JSON string into a dict
-    config = Config(json.loads(config_str))
+    # Deserialize the JSON string into a dict.
+    # `object_pairs_hook` lets us catch users who accidentally typed the
+    # same key twice (e.g. two `"tools"` blocks) — json.loads would
+    # otherwise silently drop all but the last occurrence.
+    config = Config(json.loads(config_str, object_pairs_hook=_merge_duplicate_keys))
+
+    # Migrate legacy singular keys (`tool`, `skill`) into the canonical
+    # plural buckets so the rest of the codebase only reads one schema.
+    # Deep-merge so existing `tools`/`skills` entries are preserved and
+    # only missing namespaces are filled in from the legacy section.
+    _merge_legacy_namespace(config, legacy="tool",  canonical="tools")
+    _merge_legacy_namespace(config, legacy="skill", canonical="skills")

    # override config with environment variables.
    # Some online deployment platforms (e.g. Railway) deploy project from github directly. So you shouldn't put your secrets like api key in a config file, instead use environment variables to override the default config.
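The `object_pairs_hook` change above relies on the fact that `json.loads` keeps only the last value when a key is repeated. A minimal sketch of the difference, assuming a simplified hook (`merge_duplicates` is a hypothetical stand-in that merges dicts shallowly, whereas the PR's `_merge_duplicate_keys` merges deeply and logs a warning):

```python
import json


def merge_duplicates(pairs):
    """Simplified object_pairs_hook: merge repeated keys instead of dropping."""
    out = {}
    for key, val in pairs:
        prev = out.get(key)
        if isinstance(prev, dict) and isinstance(val, dict):
            prev.update(val)          # shallow merge, enough for this sketch
        elif isinstance(prev, list) and isinstance(val, list):
            prev.extend(val)          # concatenate repeated lists
        else:
            out[key] = val            # first occurrence, or the later scalar wins
    return out


# Hypothetical config text with "tools" and "model" each typed twice.
raw = (
    '{"tools": {"web": {"enabled": true}}, "model": "a", '
    '"tools": {"bash": {"timeout": 30}}, "model": "b"}'
)

plain = json.loads(raw)
merged = json.loads(raw, object_pairs_hook=merge_duplicates)

print(plain["tools"])    # only the second "tools" block survives
print(merged["tools"])   # both blocks are present
```

Note that `json.loads` calls the hook for every JSON object it decodes, nested ones included, so the merge applies at each nesting level.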
@@ -349,7 +376,7 @@ def load_config():
    logger.info("[INIT] Model: {}".format(config.get("model", "unknown")))

     # Agent mode info
-    if config.get("agent", False):
+    if config.get("agent", True):
        workspace = config.get("agent_workspace", "~/cow")
        logger.info("[INIT] Mode: Agent (workspace: {})".format(workspace))
    else:
@@ -372,12 +399,18 @@ def load_config():
        "gemini_api_base": "GEMINI_API_BASE",
        "minimax_api_key": "MINIMAX_API_KEY",
        "minimax_api_base": "MINIMAX_API_BASE",
+        "deepseek_api_key": "DEEPSEEK_API_KEY",
+        "deepseek_api_base": "DEEPSEEK_API_BASE",
+        "qianfan_api_key": "QIANFAN_API_KEY",
+        "qianfan_api_base": "QIANFAN_API_BASE",
        "zhipu_ai_api_key": "ZHIPU_AI_API_KEY",
        "zhipu_ai_api_base": "ZHIPU_AI_API_BASE",
        "moonshot_api_key": "MOONSHOT_API_KEY",
        "moonshot_api_base": "MOONSHOT_API_BASE",
        "ark_api_key": "ARK_API_KEY",
        "ark_api_base": "ARK_API_BASE",
+        "dashscope_api_key": "DASHSCOPE_API_KEY",
+        "dashscope_api_base": "DASHSCOPE_API_BASE",
        # Channel credentials (used by skills that check env vars)
        "feishu_app_id": "FEISHU_APP_ID",
        "feishu_app_secret": "FEISHU_APP_SECRET",
@@ -398,12 +431,124 @@ def load_config():
            if val:
                os.environ[env_key] = str(val)
                injected += 1
+
+    injected += _sync_skill_config_to_env(config.get("skills", {}))
+
    if injected:
        logger.info("[INIT] Synced {} config values to environment variables".format(injected))

    config.load_user_datas()


+def _deep_merge_dicts(base: dict, incoming: dict) -> dict:
+    """Recursively merge ``incoming`` into ``base`` (incoming wins on leaves)."""
+    for key, val in incoming.items():
+        if (
+            key in base
+            and isinstance(base[key], dict)
+            and isinstance(val, dict)
+        ):
+            _deep_merge_dicts(base[key], val)
+        else:
+            base[key] = val
+    return base
+
+
+def _merge_duplicate_keys(pairs):
+    """object_pairs_hook for json.loads: deep-merge duplicate keys within any
+    JSON object (lists concat, dicts merge, scalars take the latter) instead of dropping."""
+    out = {}
+    duplicates = []
+    for key, val in pairs:
+        if key not in out:
+            out[key] = val
+            continue
+        duplicates.append(key)
+        prev = out[key]
+        if isinstance(prev, dict) and isinstance(val, dict):
+            _deep_merge_dicts(prev, val)
+        elif isinstance(prev, list) and isinstance(val, list):
+            prev.extend(val)
+        else:
+            out[key] = val
+    if duplicates:
+        # logger may not be wired yet — fall back to print so we never lose the warning.
+        unique = sorted(set(duplicates))
+        try:
+            logger.warning("[INIT] config.json has duplicate keys (merged): %s", unique)
+        except Exception:
+            print("[INIT] config.json has duplicate keys (merged):", unique)
+    return out
+
+
+def _merge_legacy_namespace(cfg, legacy: str, canonical: str) -> None:
+    """Fold deprecated singular keys (``tool`` / ``skill``) into their plural
+    canonical counterparts at load time. Canonical entries always win."""
+    legacy_section = cfg.get(legacy)
+    if not isinstance(legacy_section, dict) or not legacy_section:
+        cfg.pop(legacy, None)
+        return
+    canonical_section = cfg.get(canonical)
+    if not isinstance(canonical_section, dict):
+        canonical_section = {}
+    merged_keys = []
+    for name, val in legacy_section.items():
+        if name in canonical_section:
+            if isinstance(canonical_section[name], dict) and isinstance(val, dict):
+                for sub_key, sub_val in val.items():
+                    if (
+                        sub_key in canonical_section[name]
+                        and isinstance(canonical_section[name][sub_key], dict)
+                        and isinstance(sub_val, dict)
+                    ):
+                        _deep_merge_dicts(sub_val, canonical_section[name][sub_key])
+                        canonical_section[name][sub_key] = sub_val
+                    else:
+                        canonical_section[name].setdefault(sub_key, sub_val)
+            continue
+        canonical_section[name] = val
+        merged_keys.append(name)
+    cfg[canonical] = canonical_section
+    cfg.pop(legacy, None)
+    if merged_keys:
+        logger.warning(
+            "[INIT] Legacy config key '{}' is deprecated; merged into '{}': {}. "
+            "Please rename '{}' to '{}' in your config.json.".format(
+                legacy, canonical, merged_keys, legacy, canonical,
+            )
+        )
+
+
+def _sync_skill_config_to_env(skill_section) -> int:
+    """Flatten skill-namespaced config into environment variables.
+
+    Mapping rule: ``config["skills"][<name>][<key>]`` -> ``SKILL_<NAME>_<KEY>``
+    (e.g. ``skills["image-generation"].model`` -> ``SKILL_IMAGE_GENERATION_MODEL``).
+
+    This lets subprocess-based skill scripts read their own settings without
+    importing project code. Existing env vars are NOT overwritten so the
+    real environment always wins.
+
+    Returns the number of variables actually injected.
+    """
+    if not isinstance(skill_section, dict):
+        return 0
+    injected = 0
+    for skill_name, skill_conf in skill_section.items():
+        if not isinstance(skill_conf, dict):
+            continue
+        name_part = str(skill_name).replace("-", "_").upper()
+        for key, val in skill_conf.items():
+            if val is None or val == "":
+                continue
+            env_key = "SKILL_{}_{}".format(name_part, str(key).upper())
+            if env_key in os.environ:
+                continue
+            os.environ[env_key] = str(val)
+            injected += 1
+    return injected
+
+
 def get_root():
    return os.path.dirname(os.path.abspath(__file__))
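The `SKILL_<NAME>_<KEY>` mapping described in `_sync_skill_config_to_env` can be exercised in isolation. A minimal sketch, assuming a standalone copy of the rule that accepts an injectable `env` mapping (that parameter is an addition for testability; the PR's function writes to `os.environ` directly):

```python
import os


def sync_skill_config_to_env(skills, env=None):
    """Flatten skills[<name>][<key>] into SKILL_<NAME>_<KEY> entries."""
    env = os.environ if env is None else env
    injected = 0
    for skill_name, skill_conf in skills.items():
        if not isinstance(skill_conf, dict):
            continue
        prefix = str(skill_name).replace("-", "_").upper()
        for key, val in skill_conf.items():
            if val is None or val == "":
                continue                      # empty values are skipped
            env_key = "SKILL_{}_{}".format(prefix, str(key).upper())
            if env_key in env:
                continue                      # the real environment wins
            env[env_key] = str(val)
            injected += 1
    return injected


# Hypothetical skill config; one variable is already set "in the shell".
env = {"SKILL_IMAGE_GENERATION_MODEL": "from-shell"}
skills = {"image-generation": {"model": "from-config", "size": "1024x1024", "api_key": ""}}
count = sync_skill_config_to_env(skills, env)

print(count)
print(sorted(env.items()))
```

Only `size` is injected here: `model` already exists in the environment, and the empty `api_key` is skipped.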

--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -9,7 +9,9 @@ services:
      - "9899:9899"
    environment:
      CHANNEL_TYPE: 'weixin'
-      MODEL: 'MiniMax-M2.7'
+      MODEL: 'deepseek-v4-flash'
+      DEEPSEEK_API_KEY: ''
+      DEEPSEEK_API_BASE: 'https://api.deepseek.com/v1'
      MINIMAX_API_KEY: ''
      ZHIPU_AI_API_KEY: ''
      ARK_API_KEY: ''
@@ -35,9 +37,12 @@ services:
      DINGTALK_CLIENT_SECRET: ''
      WECOM_BOT_ID: ''
      WECOM_BOT_SECRET: ''
+      # To reach the Web console from the host machine, change this to '0.0.0.0' and set WEB_PASSWORD
+      WEB_HOST: '127.0.0.1'
+      WEB_PASSWORD: ''
      AGENT: 'True'
-      AGENT_MAX_CONTEXT_TOKENS: 40000
+      AGENT_MAX_CONTEXT_TOKENS: 50000
      AGENT_MAX_CONTEXT_TURNS: 20
-      AGENT_MAX_STEPS: 15
+      AGENT_MAX_STEPS: 20
    volumes:
      - ./cow:/home/agent/cow
--- a/docs/channels/feishu.mdx
+++ b/docs/channels/feishu.mdx
@@ -3,67 +3,109 @@ title: 飞书
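 description: Connect CowAgent to a Feishu app
 ---

-Connecting CowAgent to Feishu through a custom app requires a Feishu enterprise account with enterprise admin permissions.
+> Connect CowAgent through a Feishu custom app. Direct chats and group chats (@-mentioning the bot) are supported. The integration uses WebSocket long-connection mode, needs no public IP, and supports streaming typewriter-style replies as well as sending and receiving voice messages.

-## 1. Create an enterprise custom app
+<Note>
+  Integration requires a Feishu enterprise account with enterprise admin permissions.
+</Note>

-### 1. Create the app
+## 1. Integration methods

-Go to the [Feishu Open Platform](https://open.feishu.cn/app/), click **Create Custom App**, fill in the required information, and click **Create**:
+### Method 1: One-click integration by QR code (recommended)
+
+After starting the Cow project, you can complete the QR-code creation flow directly in the terminal. Alternatively, open the Web console (local URL: http://127.0.0.1:9899 ), select the **Channels** menu, click **Connect Channel**, choose **Feishu**, click **Create Feishu App in One Click**, and scan the QR code with the **Feishu app** to create and connect the app automatically: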
+
+
+<img src="https://cdn.link-ai.tech/doc/20260505181126.png" width="800"/>
+
+
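+<Note>
+  1. The `lark-oapi` dependency must be version >=1.5.5
+  2. An app created by QR code comes with all required permissions (message send/receive, card read/write, group chat events, etc.) and event subscriptions preconfigured, so no manual setup in the developer console is needed.
+</Note>
+
+
+### Method 2: Manual creation
+
+First create a custom app on the Feishu Open Platform and configure its permissions, then connect it through the Web console or the configuration file.
+
+**Step 1: Create the app**
+
+1. Go to the [Feishu Open Platform](https://open.feishu.cn/app/) and click **Create Custom App**: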

 <img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-create-app.jpg" width="500"/>

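-### 2. Add the bot capability
-
-In the **Add App Capabilities** menu, add the **Bot** capability to the app:
+2. Under **Add App Capabilities**, add the **Bot** capability to the app:

 <img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-add-bot.jpg" width="800"/>

-### 3. Configure app permissions
-
-Click **Permission Management**, copy the permission string below, paste it into the input box under **Permission Configuration**, select all of the filtered permissions, then click **Batch Enable** and confirm:
+3. Under **Permission Management**, paste the permissions below into the input box, select all, and click **Batch Enable**: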

 ```
-im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource
+im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource,cardkit:card:write
 ```

 <img src="https://cdn.link-ai.tech/doc/feishu-hosting-add-auth2.png" width="800"/>

-## 2. Project configuration
-
-1. Get the `App ID` and `App Secret` from **Credentials & Basic Info**:
+4. Get the `App ID` and `App Secret` from **Credentials & Basic Info**:

 <img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-appid-secret.jpg" width="800"/>

-2. Add the following configuration to the `config.json` file in the project root:
+**Step 2: Connect to CowAgent**

-```json
-{
-  "channel_type": "feishu",
-  "feishu_app_id": "YOUR_APP_ID",
-  "feishu_app_secret": "YOUR_APP_SECRET",
-  "feishu_bot_name": "YOUR_BOT_NAME"
-}
-```
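+<Tabs>
+  <Tab title="Web console">
+    Open the Web console, select the **Channels** menu, click **Connect Channel**, choose **Feishu**, switch to the "Manual entry" tab, enter the App ID and App Secret, and click Connect.
+  </Tab>
+  <Tab title="Configuration file">
+    Add the following configuration to `config.json`, then start the program: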

-| Parameter | Description |
-| --- | --- |
-| `feishu_app_id` | Feishu bot app App ID |
-| `feishu_app_secret` | Feishu bot App Secret |
-| `feishu_bot_name` | Feishu bot name (set when the app was created); group chat use depends on this setting |
+    ```json
+    {
+      "channel_type": "feishu",
+      "feishu_app_id": "YOUR_APP_ID",
+      "feishu_app_secret": "YOUR_APP_SECRET",
+      "feishu_stream_reply": true
+    }
+    ```

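-After completing the configuration, start the project.
+    | Parameter | Description | Default |
+    | --- | --- | --- |
+    | `feishu_app_id` | Feishu app App ID | - |
+    | `feishu_app_secret` | Feishu app App Secret | - |
+    | `feishu_stream_reply` | Whether to enable streaming typewriter-style replies | `true` |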
+  </Tab>
+</Tabs>

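-## 3. Configure event subscriptions
+**Step 3: Publish the app**

-1. Once the project is running, click **Events & Callbacks** on the Feishu Open Platform, choose the **long connection** method, and click Save:
+1. After starting the Cow project, click **Events & Callbacks** on the Feishu Open Platform, choose **long connection** mode, and save:

 <img src="https://cdn.link-ai.tech/doc/202601311731183.png" width="600"/>

-2. Click **Add Event** below, search for "Receive message", select "**Receive message v2.0**", and confirm.
+2. Click **Add Event**, search for "Receive message", select **Receive message v2.0**, and confirm.

-3. Click **Version Management & Release**, create a version and apply for **production release**, then review and approve the request in the Feishu client:
+3. Click **Version Management & Release**, create a version, apply for **production release**, and approve it in the Feishu client: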

 <img src="https://cdn.link-ai.tech/doc/202601311807356.png" width="600"/>

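-Once done, search for the bot's name in Feishu to start chatting.
+## 2. Features
+
+| Feature | Support |
+| --- | --- |
+| Direct chat | ✅ |
+| Group chat (@-mention the bot) | ✅ |
+| Text messages | ✅ send and receive |
+| Image messages | ✅ send and receive |
+| Voice messages | ✅ send and receive |
+| Streaming replies | ✅ (controlled by `feishu_stream_reply`, enabled by default) |
+
+<Note>
+  Streaming replies require the bot to have the `cardkit:card:write` permission (granted by default with one-click creation) and the recipient's Feishu client to be version 7.20 or later. Older clients display an upgrade prompt; when the permission or version requirement is not met, replies fall back automatically to plain text.
+</Note>
+
+## 3. Usage
+
+Once connected, search for the bot's name in Feishu to start a direct chat.
+
+To use it in a group chat, add the bot to the group and @-mention it in a message.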
--- a/docs/channels/index.mdx
+++ b/docs/channels/index.mdx
@@ -0,0 +1,39 @@
+---
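+title: Channel overview
+description: Channels supported by CowAgent and their capability matrix
+---
+
+CowAgent can connect to multiple chat channels, selected at startup through `channel_type`. The Web console is enabled by default and can run alongside other connected channels.
+
+## Capability matrix
+
+The table below summarizes the inbound message types, bot reply types, and group chat support of each channel, to help you choose by scenario.
+
+| Channel | Text | Image | File | Voice | Group chat |
+| --- | :-: | :-: | :-: | :-: | :-: |
+| [WeChat](/channels/weixin) | ✅ | ✅ | ✅ | ✅ | |
+| [Web console](/channels/web) | ✅ | ✅ | ✅ | ✅ | |
+| [Feishu](/channels/feishu) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [DingTalk](/channels/dingtalk) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [WeCom smart bot](/channels/wecom-bot) | ✅ | ✅ | ✅ | ✅ | ✅ |
+| [QQ](/channels/qq) | ✅ | ✅ | ✅ | | ✅ |
+| [WeCom app](/channels/wecom) | ✅ | ✅ | ✅ | ✅ | |
+| [Official Account](/channels/wechatmp) | ✅ | ✅ | | ✅ | |
+
+- The **Image / File / Voice** columns indicate that the channel can send and receive that message type; see each channel's documentation for details
+- The **Group chat** column indicates that the channel can recognize and respond to group messages
+
+<Tip>
+  Each channel's voice and image capabilities depend on the configuration of the corresponding model provider; see the [model overview](/models).
+</Tip>
+
+## Channel list
+
+- [Web console](/channels/web): built-in browser chat and admin panel, enabled by default
+- [WeChat](/channels/weixin): log in by scanning a QR code with a personal WeChat account
+- [Feishu](/channels/feishu): Feishu custom bot
+- [DingTalk](/channels/dingtalk): DingTalk custom bot
+- [WeCom smart bot](/channels/wecom-bot): WeCom (Enterprise WeChat) smart bot
+- [QQ](/channels/qq): QQ official bot open platform
+- [WeCom app](/channels/wecom): WeCom custom app integration
+- [Official Account](/channels/wechatmp): WeChat Official Account (subscription account / service account)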
--- a/docs/channels/web.mdx
+++ b/docs/channels/web.mdx
@@ -10,14 +10,23 @@ Web 控制台是 CowAgent 的默认通道，启动后会自动运行，通过浏
 ```json
 {
  "channel_type": "web",
-  "web_port": 9899
+  "web_host": "0.0.0.0",
+  "web_port": 9899,
+  "web_password": "",
+  "enable_thinking": false
 }
 ```

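 | Parameter | Description | Default |
 | --- | --- | --- |
 | `channel_type` | Set to `web` | `web` |
+| `web_host` | Listen address of the Web service. By default it listens on `127.0.0.1` (local machine only); for public access, change it to `0.0.0.0` and set a password | `""` |
 | `web_port` | Listen port of the Web service | `9899` |
+| `web_password` | Access password; empty means no password protection. Setting one is recommended when listening on `0.0.0.0` | `""` |
+| `web_session_expire_days` | Number of days a login session stays valid | `30` |
+| `enable_thinking` | Whether to enable deep-thinking mode | `false` |
+
+Once a password is configured, you must log in with it before accessing the console. The login state lasts 30 days by default and survives service restarts. The password can also be changed online on the console's "Configuration" page.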

 ## Access URL

@@ -34,15 +43,25 @@ Web 控制台是 CowAgent 的默认通道，启动后会自动运行，通过浏

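 ### Chat interface

-Supports streaming output and shows the Agent's reasoning and tool calls in real time, making its decision process easier to follow:
+Supports streaming output and shows the Agent's reasoning and tool calls in real time, making its decision process easier to follow. Deep thinking can be toggled through the configuration or the "Agent configuration" switch in the console.

 <img width="850" src="https://cdn.link-ai.tech/doc/20260227180120.png" />

+#### Multi-session management
+
+The chat interface supports multiple sessions, and all session records are persisted in a database:
+
+- **Session list**: click the history icon on the left to expand or collapse the session list panel; all past sessions load as you scroll
+- **AI-generated titles**: after the first round of a new session completes, the model is called automatically to generate a short summary title
+- **New session**: click the "New chat" button at the top of the session list or the `+` button in the input area
+- **Delete session**: click the delete button on a session item; after confirmation the session and all its messages are permanently deleted
+- **Clear context**: click the clear button in the input area to insert a divider into the current session; messages above the divider remain visible but are no longer sent to the model as context
+
 ### Model management

-Supports managing model configuration online, without editing the configuration file by hand:
+Supports online management of text, image, voice, and embedding model configuration across different model providers, without editing the configuration file by hand:

-<img width="850" src="https://cdn.link-ai.tech/doc/20260227173811.png" />
+<img width="850" src="https://cdn.link-ai.tech/doc/20260521212949.png" />

 ### Skill management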

--- a/docs/cli/general.mdx
+++ b/docs/cli/general.mdx
@@ -58,17 +58,18 @@ Session: 12 messages | 8 skills loaded
 **Change a configuration item:**

 ```text
-/config model deepseek-chat
+/config model deepseek-v4-flash
 ```

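 **Configurable items:**

 | Item | Description | Example value |
 | --- | --- | --- |
-| `model` | AI model name | `deepseek-chat` |
+| `model` | AI model name | `deepseek-v4-flash` |
 | `agent_max_context_tokens` | Max context tokens | `40000` |
 | `agent_max_context_turns` | Max context memory turns | `30` |
 | `agent_max_steps` | Max decision steps per task | `15` |
+| `enable_thinking` | Whether to enable deep-thinking mode | `true` / `false` |

 <Note>
   When `model` is changed, the system automatically matches the corresponding model invocation method. The setting is written to `config.json` and persisted.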
@@ -1 +1 @@
 .0.5
 .0.9