feat(web): hint API base version path in config placeholder

fix: remove unnecessary API Base URL in run scripts
feat: switch default model to deepseek-v4-flash
2026-06-03 02:27:09 +08:00 · 2026-04-26 17:10:24 +08:00 · 2026-04-26 16:29:08 +08:00 · 2026-04-26 15:54:50 +08:00 · 2026-04-24 16:39:48 +08:00 · 2026-04-24 15:29:43 +08:00
126 changed files with 7428 additions and 1107 deletions
--- a/.github/workflows/deploy-image-arm.yml
+++ b/.github/workflows/deploy-image-arm.yml
@@ -19,7 +19,7 @@ env:

 jobs:
  build-and-push-image:
-    if: github.repository == 'zhayujie/chatgpt-on-wechat'
+    if: github.repository == 'zhayujie/CowAgent'
    runs-on: ubuntu-latest
    permissions:
      contents: read
@@ -51,7 +51,12 @@ jobs:
        uses: docker/metadata-action@v4
        with:
          images: |
-            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+            ${{ env.REGISTRY }}/zhayujie/chatgpt-on-wechat
+            ${{ env.REGISTRY }}/zhayujie/cowagent
+          tags: |
+            type=raw,value=latest-arm64,enable={{is_default_branch}}
+            type=ref,event=branch,suffix=-arm64
+            type=ref,event=tag,suffix=-arm64

      - name: Build and push Docker image
        uses: docker/build-push-action@v3
@@ -60,7 +65,7 @@ jobs:
          push: true
          file: ./docker/Dockerfile.latest
          platforms: linux/arm64
-          tags: ${{ steps.meta.outputs.tags }}-arm64
+          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

      - uses: actions/delete-package-versions@v4
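A note on the `tags` change above: `docker/metadata-action` emits its `tags` output as a newline-separated list, so once two images are configured, appending `-arm64` to the whole output would presumably suffix only the last entry. The Python sketch below illustrates that string behavior only. The sample tag values and both helper names are hypothetical and are not part of the workflow.

```python
def append_suffix_to_output(tags_output: str, suffix: str) -> list:
    """Mimic `${{ steps.meta.outputs.tags }}-arm64`: one concatenation on the whole string."""
    return (tags_output + suffix).splitlines()


def append_suffix_per_tag(tags_output: str, suffix: str) -> list:
    """Mimic a per-tag suffix, which is what `suffix=-arm64` in the tag rules expresses."""
    return [tag + suffix for tag in tags_output.splitlines()]


# Hypothetical multi-line output for two images on the same branch.
SAMPLE_TAGS = (
    "ghcr.io/zhayujie/chatgpt-on-wechat:master\n"
    "ghcr.io/zhayujie/cowagent:master"
)

if __name__ == "__main__":
    print(append_suffix_to_output(SAMPLE_TAGS, "-arm64"))
    print(append_suffix_per_tag(SAMPLE_TAGS, "-arm64"))
```

Under that assumption, moving the suffix into the `tags:` rules keeps every published image name consistent.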
--- a/.github/workflows/deploy-image.yml
+++ b/.github/workflows/deploy-image.yml
@@ -16,10 +16,11 @@ on:
 env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
+  DOCKERHUB_IMAGE: zhayujie/chatgpt-on-wechat

 jobs:
  build-and-push-image:
-    if: github.repository == 'zhayujie/chatgpt-on-wechat'
+    if: github.repository == 'zhayujie/CowAgent'
    runs-on: ubuntu-latest
    permissions:
      contents: read
@@ -47,8 +48,14 @@ jobs:
        uses: docker/metadata-action@v4
        with:
          images: |
-            ${{ env.IMAGE_NAME }}
-            ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+            zhayujie/chatgpt-on-wechat
+            zhayujie/cowagent
+            ${{ env.REGISTRY }}/zhayujie/chatgpt-on-wechat
+            ${{ env.REGISTRY }}/zhayujie/cowagent
+          tags: |
+            type=raw,value=latest,enable={{is_default_branch}}
+            type=ref,event=branch
+            type=ref,event=tag

      - name: Build and push Docker image
        uses: docker/build-push-action@v3
--- a/README.md
+++ b/README.md
@@ -23,13 +23,13 @@
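 > This project is both a ready-to-use super AI assistant and a highly extensible Agent framework. You can tailor it to your needs by extending its model interfaces, channels, built-in tools, and Skills system. Core capabilities:

 -  ✅  **Autonomous task planning:** Understands complex tasks and plans their execution on its own, reasoning and calling tools until the goal is reached
--  ✅  **Long-term memory:** Automatically persists conversation memory to local files and a database, including core memory and daily memory, with keyword and vector retrieval
+-  ✅  **Long-term memory:** Automatically persists conversation memory to local files and a database, including core memory, daily memory, and dream distillation, with keyword and vector retrieval
 -  ✅  **Personal knowledge base:** Automatically organizes structured knowledge, builds a knowledge graph through cross-references, and supports managing and visually browsing the knowledge base through conversation
 -  ✅  **Skill system:** An engine for installing and running Skills; install skills in one step from [Skill Hub](https://skills.cowagent.ai/), GitHub, and other sources, or create Skills through conversation
 -  ✅  **Tool system:** Built-in tools for file read/write, terminal execution, browser operation, scheduled tasks, and more, which the Agent calls on its own to complete complex tasks
 -  ✅  **CLI system:** Provides terminal commands and chat commands for process management, skill installation, configuration changes, and more
 -  ✅  **Multimodal messages:** Parses, processes, generates, and sends text, image, voice, file, and other message types
--  ✅  **Multi-model support:** Supports OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao, and other major model providers in China and abroad
+-  ✅  **Multi-model support:** Supports DeepSeek, MiniMax, Claude, Gemini, OpenAI, GLM, Qwen, Doubao, Kimi, and other major model providers in China and abroad
 -  ✅  **Multi-channel access:** Runs on a local computer or a server, and integrates with WeChat, Feishu, DingTalk, WeCom, QQ, WeChat Official Accounts, and web pages

 ## Disclaimer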
@@ -70,6 +70,10 @@

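 # 🏷 Changelog

+>**2026.04.22:** [Version 2.0.7](https://github.com/zhayujie/CowAgent/releases/tag/2.0.7), built-in image generation skills (GPT Image 2, Nano Banana, and others), new model support (Kimi K2.6, Claude Opus 4.7, GLM 5.1), knowledge base and memory enhancements, and Web console improvements.
+
+>**2026.04.14:** [Version 2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6), knowledge base system, dream memory module, intelligent context compression, multi-session support in the Web console, and various improvements.
+
 >**2026.04.01:** [Version 2.0.5](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5), Cow CLI command system, open-sourced Skill Hub, browser tool, WeCom QR-code creation, and various improvements and fixes.

 >**2026.03.22:** [Version 2.0.4](https://github.com/zhayujie/CowAgent/releases/tag/2.0.4), added a personal WeChat channel (scan the QR code to use it), added the MiniMax-M2.7 and GLM-5-Turbo models, refactored the run.sh script, added Japanese documentation, and various fixes.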
@@ -111,13 +115,13 @@ irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex

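 The project supports model interfaces from major providers in China and abroad. For the available models and configuration notes, see [Model notes](#模型说明).

-> Note: the following models are recommended for Agent mode; choose among them based on quality and cost: MiniMax-M2.7, glm-5-turbo, kimi-k2.5, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview, gpt-5.4, gpt-5.4-mini
+> Note: the following models are recommended for Agent mode; choose among them based on quality and cost: deepseek-v4-flash, MiniMax-M2.7, glm-5.1, kimi-k2.6, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview, gpt-5.4, gpt-5.4-mini

 The **LinkAI platform** interface is also supported. It offers all of the models above along with Agent skills such as knowledge bases, workflows, and plugins; see the [API documentation](https://docs.link-ai.tech/platform/api).

 ### 2. Environment setup

-Supports Linux, macOS, and Windows, and runs on personal computers and servers. `Python` must be installed, version 3.7 to 3.12.
+Supports Linux, macOS, and Windows, and runs on personal computers and servers. `Python` must be installed, version 3.7 to 3.13.

 > Note: running from source is recommended for Agent mode. If you deploy with Docker, you do not need to install a Python environment or download the source code, and can skip ahead to the next section.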

@@ -178,7 +182,9 @@ cow install-browser
 # Example config.json contents
 {
  "channel_type": "weixin",                                   # Channel type; defaults to weixin and can be changed to feishu,dingtalk,wecom_bot,qq,wechatcom_app,wechatmp_service,wechatmp,terminal
-  "model": "MiniMax-M2.7",                                    # Model name
+  "model": "deepseek-v4-flash",                                # Model name
+  "deepseek_api_key": "",                                      # DeepSeek API Key
+  "deepseek_api_base": "https://api.deepseek.com/v1",         # DeepSeek API base URL
  "minimax_api_key": "",                                      # MiniMax API Key
  "zhipu_ai_api_key": "",                                     # Zhipu GLM API Key
  "moonshot_api_key": "",                                     # Kimi/Moonshot API Key
@@ -188,8 +194,6 @@ cow install-browser
  "claude_api_base": "https://api.anthropic.com/v1",          # Claude API base URL; change it to use a third-party proxy platform
  "gemini_api_key": "",                                       # Gemini API Key
  "gemini_api_base": "https://generativelanguage.googleapis.com", # Gemini API base URL
-  "deepseek_api_key": "",                                      # DeepSeek API Key
-  "deepseek_api_base": "https://api.deepseek.com/v1",         # DeepSeek API base URL; can be changed to a third-party proxy
  "open_ai_api_key": "",                                      # OpenAI API Key
  "open_ai_api_base": "https://api.openai.com/v1",            # OpenAI API base URL
  "linkai_api_key": "",                                       # LinkAI API Key
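@@ -203,7 +207,8 @@ cow install-browser
  "agent_workspace": "~/cow",                                 # Agent workspace path, used to store memory, skills, system settings, and more
  "agent_max_context_tokens": 50000,                          # Maximum context tokens in Agent mode; anything beyond this is compressed automatically
  "agent_max_context_turns": 20,                              # Maximum context turns remembered in Agent mode (one question plus one answer is one turn); anything beyond this is compressed automatically
-  "agent_max_steps": 20                                       # Maximum decision steps per task in Agent mode; tool calls stop once this is exceeded
+  "agent_max_steps": 20,                                      # Maximum decision steps per task in Agent mode; tool calls stop once this is exceeded
+  "enable_thinking": false                                    # Whether to enable deep thinking mode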
 }
 ```

@@ -221,7 +226,7 @@ cow install-browser
 <details>
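 <summary>2. Other settings</summary>

-+ `model`: model name. In Agent mode the recommended models are `MiniMax-M2.7`, `glm-5-turbo`, `kimi-k2.5`, `qwen3.6-plus`, `claude-sonnet-4-6`, and `gemini-3.1-pro-preview`; for all model names see [common/const.py](https://github.com/zhayujie/CowAgent/blob/master/common/const.py)
++ `model`: model name. In Agent mode the recommended models are `deepseek-v4-flash`, `MiniMax-M2.7`, `glm-5.1`, `kimi-k2.6`, `qwen3.6-plus`, `claude-sonnet-4-6`, and `gemini-3.1-pro-preview`; for all model names see [common/const.py](https://github.com/zhayujie/CowAgent/blob/master/common/const.py)
 + `character_desc`: the bot's system prompt in normal chat mode. It has no effect in Agent mode, where the prompt is built from files in the workspace.
 + `subscribe_msg`: subscription message; set it for the Official Account and WeCom channels. It is sent as an automatic reply when a user subscribes, and may contain special placeholders. The currently supported placeholder is {trigger_prefix}, which the program replaces with the bot's trigger word.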
 </details>
@@ -309,44 +314,36 @@ sudo docker logs -f chatgpt-on-wechat
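 We recommend managing model configuration online through the Web console, with no need to edit files by hand; see the [model documentation](https://docs.cowagent.ai/models). The notes below describe how to configure models by editing `config.json` manually: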

 <details>
-<summary>OpenAI</summary>
+<summary>DeepSeek</summary>

-1. Create an API Key: create one on the [OpenAI platform](https://platform.openai.com/api-keys)
+1. Create an API Key: create one on the [DeepSeek platform](https://platform.deepseek.com/api_keys)

 2. Fill in the configuration

-```json
-{
-    "model": "gpt-5.4",
-    "open_ai_api_key": "YOUR_API_KEY",
-    "open_ai_api_base": "https://api.openai.com/v1",
-    "bot_type": "openai"
-}
-```
-
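- - `model`: matches the [model parameter](https://platform.openai.com/docs/models) of the OpenAI API; supported models include gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, the o series, and gpt-4.1. `gpt-5.4` and `gpt-5.4-mini` are recommended for Agent mode
- - `open_ai_api_base`: change this parameter to connect through a third-party proxy API
- - `bot_type`: not required when using OpenAI models. Set it to `openai` when using a third-party proxy API to reach non-OpenAI models such as Claude
-</details>
-
-<details>
-<summary>LinkAI</summary>
-
-1. Create an API Key: create one on the [LinkAI platform](https://link-ai.tech/console/interface)
-
-2. Fill in the configuration
+Option 1: official API (recommended):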

 ```json
 {
-    "model": "gpt-5.4-mini",
-    "use_linkai": true,
-    "linkai_api_key": "YOUR API KEY"
+    "model": "deepseek-v4-flash",
+    "deepseek_api_key": "sk-xxxxxxxxxxx"
+}
+```
+
+ - `model`: `deepseek-v4-flash` or `deepseek-v4-pro` is recommended
+ - `deepseek_api_key`: the API Key from the DeepSeek platform
+ - `deepseek_api_base`: optional; defaults to `https://api.deepseek.com/v1` and can be changed to a third-party proxy address
+
+Option 2: OpenAI-compatible API:
+
+```json
+{
+    "model": "deepseek-v4-flash",
+    "bot_type": "openai",
+    "open_ai_api_key": "sk-xxxxxxxxxxx",
+    "open_ai_api_base": "https://api.deepseek.com/v1"
 }
 ```

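-+ `use_linkai`: whether to use the LinkAI API. Off by default; when set to true, you can connect to models on the LinkAI platform and use Agent skills such as knowledge bases, workflows, databases, and plugins
-+ `linkai_api_key`: the API Key from the LinkAI platform, created in the [console](https://link-ai.tech/console/interface)
-+ `model`: any model in the [model list](https://link-ai.tech/console/models) can be used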
 </details>

 <details>
@@ -378,6 +375,56 @@ sudo docker logs -f chatgpt-on-wechat
 - `open_ai_api_key`: the API-KEY from the MiniMax platform
 </details>

+<details>
+<summary>Claude</summary>
+
+1. Create an API Key: create one in the [Claude console](https://console.anthropic.com/settings/keys)
+
+2. Fill in the configuration
+
+```json
+{
+    "model": "claude-sonnet-4-6",
+    "claude_api_key": "YOUR_API_KEY"
+}
+```
+ - `model`: see the [official model IDs](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases); supported values include `claude-sonnet-4-6`, `claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-opus-4-0`, `claude-3-5-sonnet-latest`, and others
+</details>
+
+<details>
+<summary>Gemini</summary>
+
+Create an API Key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn), then configure as follows:
+```json
+{
+    "model": "gemini-3.1-flash-lite-preview",
+    "gemini_api_key": ""
+}
+```
+ - `model`: see the [official model list](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn); supported values include `gemini-3.1-flash-lite-preview`, `gemini-3.1-pro-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, and others
+</details>
+
+<details>
+<summary>OpenAI</summary>
+
+1. Create an API Key: create one on the [OpenAI platform](https://platform.openai.com/api-keys)
+
+2. Fill in the configuration
+
+```json
+{
+    "model": "gpt-5.4",
+    "open_ai_api_key": "YOUR_API_KEY",
+    "open_ai_api_base": "https://api.openai.com/v1",
+    "bot_type": "openai"
+}
+```
+
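+ - `model`: matches the [model parameter](https://platform.openai.com/docs/models) of the OpenAI API; supported models include gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, the o series, and gpt-4.1. `gpt-5.4` and `gpt-5.4-mini` are recommended for Agent mode
+ - `open_ai_api_base`: change this parameter to connect through a third-party proxy API
+ - `bot_type`: not required when using OpenAI models. Set it to `openai` when using a third-party proxy API to reach non-OpenAI models such as Claude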
+</details>
+
 <details>
 <summary>Zhipu AI (GLM)</summary>

@@ -385,24 +432,24 @@ sudo docker logs -f chatgpt-on-wechat

 ```json
 {
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "zhipu_ai_api_key": ""
 }
 ```
- - `model`: accepts `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, `glm-4-airx`, `glm-4-long`, and others; see the [GLM model codes](https://bigmodel.cn/dev/api/normal-model/glm-4)
+ - `model`: accepts `glm-5.1`, `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, `glm-4-airx`, `glm-4-long`, and others; see the [GLM model codes](https://bigmodel.cn/dev/api/normal-model/glm-4)
 - `zhipu_ai_api_key`: the API KEY from the Zhipu AI platform, created in the [console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys)

 Option 2: OpenAI-compatible API, configured as follows:
 ```json
 {
  "bot_type": "openai",
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
  "open_ai_api_key": ""
 }
 ```
 - `bot_type`: OpenAI-compatible mode
-- `model`: accepts `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, `glm-4-airx`, `glm-4-long`, and others
+- `model`: accepts `glm-5.1`, `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, `glm-4-airx`, `glm-4-long`, and others
 - `open_ai_api_base`: the BASE URL of the Zhipu AI platform
 - `open_ai_api_key`: the API KEY from the Zhipu AI platform
 </details>
@@ -436,35 +483,6 @@ sudo docker logs -f chatgpt-on-wechat
 - `open_ai_api_key`: the API-KEY for Tongyi Qianwen (Qwen)
 </details>

-<details>
-<summary>Kimi (Moonshot)</summary>
-
-Option 1: official API, configured as follows:
-
-```json
-{
-    "model": "kimi-k2.5",
-    "moonshot_api_key": ""
-}
-```
- - `model`: accepts `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k`
- - `moonshot_api_key`: the Moonshot API-KEY, created in the [console](https://platform.moonshot.cn/console/api-keys)
- 
-Option 2: OpenAI-compatible API, configured as follows:
-```json
-{
-  "bot_type": "openai",
-  "model": "kimi-k2.5",
-  "open_ai_api_base": "https://api.moonshot.cn/v1",
-  "open_ai_api_key": ""
-}
-```
-- `bot_type`: OpenAI-compatible mode
-- `model`: accepts `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k`
-- `open_ai_api_base`: the Moonshot BASE URL
-- `open_ai_api_key`: the Moonshot API-KEY
-</details>
-
 <details>
 <summary>豆包 (Doubao)</summary>

@@ -484,67 +502,74 @@ sudo docker logs -f chatgpt-on-wechat
 </details>

 <details>
-<summary>Claude</summary>
+<summary>Kimi (Moonshot)</summary>

-1. Create an API Key: create one in the [Claude console](https://console.anthropic.com/settings/keys)
+Option 1: official API, configured as follows:
+
+```json
+{
+    "model": "kimi-k2.6",
+    "moonshot_api_key": ""
+}
+```
+ - `model`: accepts `kimi-k2.6`, `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k`
+ - `moonshot_api_key`: the Moonshot API-KEY, created in the [console](https://platform.moonshot.cn/console/api-keys)
+
+Option 2: OpenAI-compatible API, configured as follows:
+```json
+{
+  "bot_type": "openai",
+  "model": "kimi-k2.6",
+  "open_ai_api_base": "https://api.moonshot.cn/v1",
+  "open_ai_api_key": ""
+}
+```
+- `bot_type`: OpenAI-compatible mode
+- `model`: accepts `kimi-k2.6`, `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k`
+- `open_ai_api_base`: the Moonshot BASE URL
+- `open_ai_api_key`: the Moonshot API-KEY
+</details>
+
+<details>
+<summary>ModelScope</summary>
+
+```json
+{
+  "bot_type": "modelscope",
+  "model": "Qwen/QwQ-32B",
+  "modelscope_api_key": "your_api_key",
+  "modelscope_base_url": "https://api-inference.modelscope.cn/v1/chat/completions",
+  "text_to_image": "MusePublic/489_ckpt_FLUX_1"
+}
+```
+
+- `bot_type`: the modelscope API format
+- `model`: see the [model list](https://www.modelscope.cn/models?filter=inference_type&page=1)
+- `modelscope_api_key`: see the [official docs on access tokens](https://modelscope.cn/docs/accounts/token); create one in the [console](https://modelscope.cn/my/myaccesstoken)
+- `modelscope_base_url`: the BASE URL of the modelscope platform
+- `text_to_image`: the image generation model; see the [model list](https://www.modelscope.cn/models?filter=inference_type&page=1)
+</details>
+
+<details>
+<summary>LinkAI</summary>
+
+1. Create an API Key: create one on the [LinkAI platform](https://link-ai.tech/console/interface)

 2. Fill in the configuration

 ```json
 {
-    "model": "claude-sonnet-4-6",
-    "claude_api_key": "YOUR_API_KEY"
+    "model": "gpt-5.4-mini",
+    "use_linkai": true,
+    "linkai_api_key": "YOUR API KEY"
 }
 ```
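- - `model`: see the [official model IDs](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases); supported values include `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-opus-4-0`, `claude-3-5-sonnet-latest`, and others
+
++ `use_linkai`: whether to use the LinkAI API. Off by default; when set to true, you can connect to models on the LinkAI platform and use Agent skills such as knowledge bases, workflows, databases, and plugins
++ `linkai_api_key`: the API Key from the LinkAI platform, created in the [console](https://link-ai.tech/console/interface)
++ `model`: any model in the [model list](https://link-ai.tech/console/models) can be used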
 </details>

-<details>
-<summary>Gemini</summary>
-
-Create an API Key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn), then configure as follows:
-```json
-{
-    "model": "gemini-3.1-flash-lite-preview",
-    "gemini_api_key": ""
-}
-```
- - `model`: see the [official model list](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn); supported values include `gemini-3.1-flash-lite-preview`, `gemini-3.1-pro-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, and others
-</details>
-
-<details>
-<summary>DeepSeek</summary>
-
-1. Create an API Key: create one on the [DeepSeek platform](https://platform.deepseek.com/api_keys)
-
-2. Fill in the configuration
-
-Option 1: official API (recommended):
-
-```json
-{
-    "model": "deepseek-chat",
-    "deepseek_api_key": "sk-xxxxxxxxxxx"
-}
-```
-
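- - `model`: accepts `deepseek-chat` or `deepseek-reasoner`, which correspond to DeepSeek-V3.2 (non-thinking mode) and DeepSeek-R1 (thinking mode) respectively
- - `deepseek_api_key`: the API Key from the DeepSeek platform
- - `deepseek_api_base`: optional; defaults to `https://api.deepseek.com/v1` and can be changed to a third-party proxy address
-
-Option 2: OpenAI-compatible API: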
-
-```json
-{
-    "model": "deepseek-chat",
-    "bot_type": "openai",
-    "open_ai_api_key": "sk-xxxxxxxxxxx",
-    "open_ai_api_base": "https://api.deepseek.com/v1"
-}
-```
-
- </details>
-
 <details>
 <summary>Azure</summary>

@@ -637,26 +662,6 @@ API Key 创建：在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn
 - `open_ai_api_key`: the [APIPassword](https://console.xfyun.cn/services/bm3) from the iFlytek Spark platform, which varies by model
 </details>

-<details>
-<summary>ModelScope</summary>
-
-```json
-{
-  "bot_type": "modelscope",
-  "model": "Qwen/QwQ-32B",
-  "modelscope_api_key": "your_api_key",
-  "modelscope_base_url": "https://api-inference.modelscope.cn/v1/chat/completions",
-  "text_to_image": "MusePublic/489_ckpt_FLUX_1"
-}
-```
-
-- `bot_type`: the modelscope API format
-- `model`: see the [model list](https://www.modelscope.cn/models?filter=inference_type&page=1)
-- `modelscope_api_key`: see the [official docs on access tokens](https://modelscope.cn/docs/accounts/token); create one in the [console](https://modelscope.cn/my/myaccesstoken)
-- `modelscope_base_url`: the BASE URL of the modelscope platform
-- `text_to_image`: the image generation model; see the [model list](https://www.modelscope.cn/models?filter=inference_type&page=1)
-</details>
-
 <details>
 <summary>Coding Plan</summary>

--- a/agent/chat/session_service.py
+++ b/agent/chat/session_service.py
@@ -0,0 +1,241 @@
+"""
+SessionService - Manages multi-session lifecycle for both web channel and cloud client.
+
+Provides a unified interface for listing, deleting, renaming, clearing context,
+and generating AI titles for conversation sessions. Backed by ConversationStore
+(SQLite) and AgentBridge (in-memory agent instances).
+"""
+
+import re
+from typing import Optional
+
+from common.log import logger
+
+
+def _truncate_fallback_title(user_message: str, max_len: int = 30) -> str:
+    """Pick the first non-empty line of the user message and truncate it."""
+    if not user_message:
+        return "New Chat"
+    first_line = ""
+    for line in user_message.splitlines():
+        line = line.strip()
+        if line:
+            first_line = line
+            break
+    if not first_line:
+        return "New Chat"
+    if len(first_line) > max_len:
+        first_line = first_line[:max_len].rstrip() + "..."
+    return first_line
+
+
+def generate_session_title(user_message: str, assistant_reply: str = "") -> str:
+    """
+    Generate a short session title by calling the current bot's reply_text.
+    Falls back to the first line of the user message if the LLM call fails
+    or returns an obvious error sentinel.
+    """
+    fallback = _truncate_fallback_title(user_message)
+    try:
+        from bridge.bridge import Bridge
+        from models.session_manager import Session
+        bot = Bridge().get_bot("chat")
+
+        prompt_parts = [f"User: {user_message[:300]}"]
+        if assistant_reply:
+            prompt_parts.append(f"Assistant: {assistant_reply[:300]}")
+
+        session = Session("__title_gen__", system_prompt="")
+        session.messages = [
+            {"role": "user", "content": (
+                "Generate a very short title (max 15 characters for Chinese, max 6 words for English) "
+                "summarizing this conversation. Return ONLY the title text, nothing else.\n\n"
+                + "\n".join(prompt_parts)
+            )}
+        ]
+
+        result = bot.reply_text(session) or {}
+        # When bots fail (network error, auth error, rate limit, etc.) they
+        # typically return completion_tokens=0 with a sentinel content like
+        # "请再问我一次吧" / "我现在有点累了". Treat that as failure.
+        completion_tokens = result.get("completion_tokens", 0) or 0
+        raw = (result.get("content") or "").strip()
+        if completion_tokens <= 0:
+            logger.warning(
+                f"[SessionService] Title generation got empty completion "
+                f"(completion_tokens={completion_tokens}, content='{raw[:50]}'), "
+                f"using fallback")
+            return fallback
+
+        title = re.sub(r'<think>.*?</think>', '', raw, flags=re.DOTALL).strip().strip('"\'')
+        logger.info(f"[SessionService] Title generation result: '{title}' (len={len(title)})")
+        if title and len(title) <= 50:
+            return title
+    except Exception as e:
+        logger.warning(f"[SessionService] Title generation failed: {e}")
+    return fallback
+
+
+class SessionService:
+    """
+    High-level service for session lifecycle management.
+
+    Usage:
+        svc = SessionService()
+        result = svc.dispatch("list", {"channel_type": "web", "page": 1})
+    """
+
+    def _get_store(self):
+        from agent.memory import get_conversation_store
+        return get_conversation_store()
+
+    def _remove_agent(self, session_id: str):
+        """Remove the in-memory Agent instance for a session if it exists."""
+        try:
+            from bridge.bridge import Bridge
+            ab = Bridge().get_agent_bridge()
+            if session_id in ab.agents:
+                del ab.agents[session_id]
+                logger.info(f"[SessionService] Removed agent instance: {session_id}")
+        except Exception:
+            pass
+
+    @staticmethod
+    def _normalize_sid(session_id: str) -> str:
+        if session_id and not session_id.startswith("session_"):
+            return f"session_{session_id}"
+        return session_id
+
+    # ------------------------------------------------------------------
+    # actions
+    # ------------------------------------------------------------------
+    def list_sessions(self, channel_type: Optional[str] = None,
+                      page: int = 1, page_size: int = 50) -> dict:
+        store = self._get_store()
+        return store.list_sessions(
+            channel_type=channel_type,
+            page=page,
+            page_size=page_size,
+        )
+
+    def delete_session(self, session_id: str) -> None:
+        if not session_id:
+            raise ValueError("session_id required")
+        session_id = self._normalize_sid(session_id)
+
+        store = self._get_store()
+        store.clear_session(session_id)
+        self._remove_agent(session_id)
+        logger.info(f"[SessionService] Session deleted: {session_id}")
+
+    def rename_session(self, session_id: str, title: str) -> None:
+        if not session_id:
+            raise ValueError("session_id required")
+        if not title:
+            raise ValueError("title required")
+        session_id = self._normalize_sid(session_id)
+
+        store = self._get_store()
+        found = store.rename_session(session_id, title)
+        if not found:
+            raise ValueError("session not found")
+
+    def clear_context(self, session_id: str) -> int:
+        """
+        Set context boundary. Returns the new context_start_seq value.
+        """
+        if not session_id:
+            raise ValueError("session_id required")
+        session_id = self._normalize_sid(session_id)
+
+        store = self._get_store()
+        new_seq = store.clear_context(session_id)
+        self._remove_agent(session_id)
+        return new_seq
+
+    def gen_title(self, session_id: str, user_message: str,
+                  assistant_reply: str = "") -> str:
+        """
+        Generate an AI title and persist it. Returns the generated title.
+        """
+        if not session_id:
+            raise ValueError("session_id required")
+        if not user_message:
+            raise ValueError("user_message required")
+        session_id = self._normalize_sid(session_id)
+
+        title = generate_session_title(user_message, assistant_reply)
+
+        store = self._get_store()
+        updated = store.rename_session(session_id, title)
+        logger.info(f"[SessionService] Title set: sid={session_id}, "
+                     f"title='{title}', db_updated={updated}")
+        return title
+
+    # ------------------------------------------------------------------
+    # dispatch — single entry point for protocol messages
+    # ------------------------------------------------------------------
+    def dispatch(self, action: str, payload: Optional[dict] = None) -> dict:
+        """
+        Dispatch a session management action and return a protocol-compatible
+        response dict.
+
+        Action names are session-specific (most end in ``_session``) so they
+        can coexist with history actions (e.g. ``query``) on the same HISTORY
+        message channel without ambiguity.
+
+        Supported actions:
+          - list_sessions: list sessions with pagination
+          - delete_session: delete a session
+          - rename_session: rename a session title
+          - clear_context: set context boundary
+          - generate_title: AI-generate a session title
+
+        :param action: one of the above action names
+        :param payload: action-specific payload
+        :return: dict with action, code, message, payload
+        """
+        payload = payload or {}
+        try:
+            if action == "list_sessions":
+                result = self.list_sessions(
+                    channel_type=payload.get("channel_type"),
+                    page=int(payload.get("page", 1)),
+                    page_size=int(payload.get("page_size", 50)),
+                )
+                return {"action": action, "code": 200, "message": "success", "payload": result}
+
+            elif action == "delete_session":
+                self.delete_session(payload.get("session_id", ""))
+                return {"action": action, "code": 200, "message": "success", "payload": None}
+
+            elif action == "rename_session":
+                self.rename_session(
+                    payload.get("session_id", ""),
+                    (payload.get("title") or "").strip(),
+                )
+                return {"action": action, "code": 200, "message": "success", "payload": None}
+
+            elif action == "clear_context":
+                new_seq = self.clear_context(payload.get("session_id", ""))
+                return {"action": action, "code": 200, "message": "success",
+                        "payload": {"context_start_seq": new_seq}}
+
+            elif action == "generate_title":
+                title = self.gen_title(
+                    payload.get("session_id", ""),
+                    payload.get("user_message", ""),
+                    payload.get("assistant_reply", ""),
+                )
+                return {"action": action, "code": 200, "message": "success",
+                        "payload": {"title": title}}
+
+            else:
+                return {"action": action, "code": 400,
+                        "message": f"unknown action: {action}", "payload": None}
+
+        except ValueError as e:
+            return {"action": action, "code": 400, "message": str(e), "payload": None}
+        except Exception as e:
+            logger.error(f"[SessionService] dispatch error: action={action}, error={e}")
+            return {"action": action, "code": 500, "message": str(e), "payload": None}
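The title post-processing in `generate_session_title` can be exercised without a bot. The sketch below restates the fallback and cleanup steps from the diff as two standalone functions; the public names `truncate_fallback_title` and `clean_title` are chosen here for illustration and do not exist in the module.

```python
import re


def truncate_fallback_title(user_message: str, max_len: int = 30) -> str:
    """Return the first non-empty line of the message, truncated to max_len."""
    for line in (user_message or "").splitlines():
        line = line.strip()
        if line:
            if len(line) > max_len:
                return line[:max_len].rstrip() + "..."
            return line
    return "New Chat"


def clean_title(raw: str, fallback: str, max_len: int = 50) -> str:
    """Drop <think>...</think> blocks and surrounding quotes from a model reply.

    Falls back when nothing usable remains or the title is too long.
    """
    title = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    title = title.strip().strip("\"'")
    if title and len(title) <= max_len:
        return title
    return fallback


if __name__ == "__main__":
    fallback = truncate_fallback_title("\n  plan a trip to Kyoto  \nmore details")
    print(clean_title('<think>short title needed</think>\n"Kyoto trip"', fallback))
```

Keeping the fallback independent of the model call means a session always gets some title, even when the reply is empty or oversized.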
--- a/agent/knowledge/service.py
+++ b/agent/knowledge/service.py
@@ -34,7 +34,8 @@ class KnowledgeService:
    # ------------------------------------------------------------------
    def list_tree(self) -> dict:
        """
-        Return the knowledge directory tree grouped by category.
+        Return the knowledge directory tree grouped by category,
+        supporting arbitrarily nested sub-directories.

        Returns::

@@ -44,10 +45,20 @@ class KnowledgeService:
                        "dir": "concepts",
                        "files": [
                            {"name": "moe.md", "title": "MoE", "size": 1234},
-                            ...
+                        ],
+                        "children": []
+                    },
+                    {
+                        "dir": "platform",
+                        "files": [],
+                        "children": [
+                            {
+                                "dir": "analysis",
+                                "files": [{"name": "perf.md", ...}],
+                                "children": []
+                            }
                        ]
                    },
-                    ...
                ],
                "stats": {"pages": 15, "size": 32768},
                "enabled": true
@@ -56,37 +67,48 @@ class KnowledgeService:
        if not os.path.isdir(self.knowledge_dir):
            return {"tree": [], "stats": {"pages": 0, "size": 0}, "enabled": conf().get("knowledge", True)}

-        tree = []
-        total_files = 0
-        total_bytes = 0
-        for name in sorted(os.listdir(self.knowledge_dir)):
-            full = os.path.join(self.knowledge_dir, name)
-            if not os.path.isdir(full) or name.startswith("."):
-                continue
-            files = []
-            for fname in sorted(os.listdir(full)):
-                if fname.endswith(".md") and not fname.startswith("."):
-                    fpath = os.path.join(full, fname)
-                    size = os.path.getsize(fpath)
-                    total_files += 1
-                    total_bytes += size
-                    title = fname.replace(".md", "")
-                    try:
-                        with open(fpath, "r", encoding="utf-8") as f:
-                            first_line = f.readline().strip()
-                        if first_line.startswith("# "):
-                            title = first_line[2:].strip()
-                    except Exception:
-                        pass
-                    files.append({"name": fname, "title": title, "size": size})
-            tree.append({"dir": name, "files": files})
+        stats = {"pages": 0, "size": 0}
+        root_files, tree = self._scan_dir(self.knowledge_dir, stats, is_root=True)

        return {
+            "root_files": root_files,
            "tree": tree,
-            "stats": {"pages": total_files, "size": total_bytes},
+            "stats": stats,
            "enabled": conf().get("knowledge", True),
        }

+    def _scan_dir(self, dir_path: str, stats: dict, is_root: bool = False) -> tuple:
+        """
+        Recursively scan a directory.
+
+        :return: (files, children) where files is a list of .md file dicts
+                 in this directory and children is a list of sub-directory nodes.
+        """
+        files = []
+        children = []
+        for name in sorted(os.listdir(dir_path)):
+            if name.startswith("."):
+                continue
+            full = os.path.join(dir_path, name)
+            if os.path.isdir(full):
+                sub_files, sub_children = self._scan_dir(full, stats)
+                children.append({"dir": name, "files": sub_files, "children": sub_children})
+            elif name.endswith(".md"):
+                size = os.path.getsize(full)
+                if not is_root:
+                    stats["pages"] += 1
+                    stats["size"] += size
+                title = name.replace(".md", "")
+                try:
+                    with open(full, "r", encoding="utf-8") as f:
+                        first_line = f.readline().strip()
+                    if first_line.startswith("# "):
+                        title = first_line[2:].strip()
+                except Exception:
+                    pass
+                files.append({"name": name, "title": title, "size": size})
+        return files, children
+
    # ------------------------------------------------------------------
    # read — single file content
    # ------------------------------------------------------------------
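Review note: the recursive scan introduced above returns a `(files, children)` pair per directory and accumulates page statistics in a shared dict, skipping root-level files when counting. A minimal standalone sketch of that shape follows. The names `scan_dir` and `build_demo_tree` are hypothetical, and title extraction is omitted for brevity; this is not the service's actual method.

```python
import os
import tempfile


def scan_dir(dir_path, stats, is_root=False):
    """Return (files, children) for dir_path, updating stats in place."""
    files, children = [], []
    for name in sorted(os.listdir(dir_path)):
        if name.startswith("."):
            continue  # skip hidden entries
        full = os.path.join(dir_path, name)
        if os.path.isdir(full):
            sub_files, sub_children = scan_dir(full, stats)
            children.append({"dir": name, "files": sub_files, "children": sub_children})
        elif name.endswith(".md"):
            size = os.path.getsize(full)
            if not is_root:
                # Root-level files are listed but not counted as pages.
                stats["pages"] += 1
                stats["size"] += size
            files.append({"name": name, "size": size})
    return files, children


def build_demo_tree():
    """Create a throwaway directory and scan it (hypothetical demo helper)."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "guides", "advanced"))
        for rel in ("index.md", "guides/intro.md", "guides/advanced/tips.md"):
            with open(os.path.join(root, rel), "w", encoding="utf-8") as f:
                f.write("# Title\n")
        stats = {"pages": 0, "size": 0}
        root_files, tree = scan_dir(root, stats, is_root=True)
        return root_files, tree, stats
```

Passing one mutable `stats` dict down the recursion avoids merging per-directory totals on the way back up, at the cost of a side effect the caller has to know about.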
--- a/agent/memory/conversation_store.py
+++ b/agent/memory/conversation_store.py
@@ -139,6 +139,7 @@ def _extract_tool_results(content: Any) -> Dict[str, str]:

 def _group_into_display_turns(
    rows: List[tuple],
+    include_thinking: bool = True,
 ) -> List[Dict[str, Any]]:
    """
    Convert raw (role, content_json, created_at) DB rows into display turns.
@@ -216,6 +217,8 @@ def _group_into_display_turns(
                            continue
                        btype = block.get("type")
                        if btype == "thinking":
+                            if not include_thinking:
+                                continue
                            txt = block.get("thinking", "").strip()
                            if txt:
                                steps.append({"type": "thinking", "content": txt})
@@ -601,9 +604,17 @@ class ConversationStore:
            finally:
                conn.close()

+        # Honour the current enable_thinking switch when building display turns
+        # so that toggling it off hides previously-saved thinking blocks too.
+        try:
+            from config import conf
+            include_thinking = bool(conf().get("enable_thinking", False))
+        except Exception:
+            include_thinking = False
+
        # Strip seq for display grouping, but record max seq per visible user group
        plain_rows = [(role, content, created_at) for _seq, role, content, created_at in rows]
-        visible = _group_into_display_turns(plain_rows)
+        visible = _group_into_display_turns(plain_rows, include_thinking=include_thinking)

        # Build a mapping: find the seq of each visible user message to annotate context boundary.
        # Walk through rows to find visible user message seqs in order.
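Review note: the `include_thinking` flag filters thinking blocks at display time, so blocks already persisted stay in the database and reappear if the switch is turned back on. A minimal sketch of that filtering, assuming Claude-style content blocks; `build_steps` and the `text` branch are hypothetical simplifications of `_group_into_display_turns`.

```python
def build_steps(blocks, include_thinking=True):
    """Convert content blocks into display steps, optionally hiding thinking."""
    steps = []
    for block in blocks:
        if not isinstance(block, dict):
            continue
        btype = block.get("type")
        if btype == "thinking":
            if not include_thinking:
                continue  # hidden at display time only; stored data is untouched
            txt = block.get("thinking", "").strip()
            if txt:
                steps.append({"type": "thinking", "content": txt})
        elif btype == "text":  # assumed block type, for illustration
            txt = block.get("text", "").strip()
            if txt:
                steps.append({"type": "text", "content": txt})
    return steps


blocks = [
    {"type": "thinking", "thinking": "User wants a summary."},
    {"type": "text", "text": "Here is the summary."},
]
visible = build_steps(blocks, include_thinking=False)
```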
--- a/agent/memory/manager.py
+++ b/agent/memory/manager.py
@@ -401,24 +401,28 @@ class MemoryManager:
        user_id: Optional[str] = None,
        reason: str = "threshold",
        max_messages: int = 10,
+        context_summary_callback=None,
    ) -> bool:
        """
        Flush conversation summary to daily memory file.
-        
+
        Args:
            messages: Conversation message list
            user_id: Optional user ID
            reason: "threshold" | "overflow" | "daily_summary"
            max_messages: Max recent messages to include (0 = all)
-        
+            context_summary_callback: Optional callback(str) invoked with the
+                daily summary text for in-context injection
+
        Returns:
-            True if content was written
+            True if flush was dispatched
        """
        success = self.flush_manager.flush_from_messages(
            messages=messages,
            user_id=user_id,
            reason=reason,
            max_messages=max_messages,
+            context_summary_callback=context_summary_callback,
        )
        if success:
            self._dirty = True
--- a/agent/memory/service.py
+++ b/agent/memory/service.py
@@ -32,68 +32,80 @@ class MemoryService:
    # ------------------------------------------------------------------
    # list — paginated file metadata
    # ------------------------------------------------------------------
-    def list_files(self, page: int = 1, page_size: int = 20) -> dict:
+    def list_files(self, page: int = 1, page_size: int = 20, category: str = "memory") -> dict:
        """
-        List all memory files with metadata (without content).
+        List memory or dream files with metadata (without content).

-        Returns::
-
-            {
-                "page": 1,
-                "page_size": 20,
-                "total": 15,
-                "list": [
-                    {"filename": "MEMORY.md", "type": "global", "size": 2048, "updated_at": "2026-02-20 10:00:00"},
-                    {"filename": "2026-02-20.md", "type": "daily", "size": 512, "updated_at": "2026-02-20 09:30:00"},
-                    ...
-                ]
-            }
+        Args:
+            category: ``"memory"`` (default) — MEMORY.md + daily files;
+                      ``"dream"``  — dream diary files from memory/dreams/
        """
+        if category == "dream":
+            files = self._list_dream_files()
+        else:
+            files = self._list_memory_files()
+
+        total = len(files)
+        start = (page - 1) * page_size
+        end = start + page_size
+
+        return {
+            "page": page,
+            "page_size": page_size,
+            "total": total,
+            "list": files[start:end],
+        }
+
+    def _list_memory_files(self) -> List[dict]:
+        """MEMORY.md + memory/*.md (newest first)."""
        files: List[dict] = []

-        # 1. Global memory — MEMORY.md in workspace root
        global_path = os.path.join(self.workspace_root, "MEMORY.md")
        if os.path.isfile(global_path):
            files.append(self._file_info(global_path, "MEMORY.md", "global"))

-        # 2. Daily memory files — memory/*.md (sorted newest first)
        if os.path.isdir(self.memory_dir):
            daily_files = []
            for name in os.listdir(self.memory_dir):
                full = os.path.join(self.memory_dir, name)
                if os.path.isfile(full) and name.endswith(".md"):
                    daily_files.append((name, full))
-            # Sort by filename descending (newest date first)
            daily_files.sort(key=lambda x: x[0], reverse=True)
            for name, full in daily_files:
                files.append(self._file_info(full, name, "daily"))

-        total = len(files)
+        return files

-        # Paginate
-        start = (page - 1) * page_size
-        end = start + page_size
-        page_items = files[start:end]
+    def _list_dream_files(self) -> List[dict]:
+        """memory/dreams/*.md (newest first)."""
+        files: List[dict] = []
+        dreams_dir = os.path.join(self.memory_dir, "dreams")

-        return {
-            "page": page,
-            "page_size": page_size,
-            "total": total,
-            "list": page_items,
-        }
+        if os.path.isdir(dreams_dir):
+            entries = []
+            for name in os.listdir(dreams_dir):
+                full = os.path.join(dreams_dir, name)
+                if os.path.isfile(full) and name.endswith(".md"):
+                    entries.append((name, full))
+            entries.sort(key=lambda x: x[0], reverse=True)
+            for name, full in entries:
+                files.append(self._file_info(full, name, "dream"))
+
+        return files

    # ------------------------------------------------------------------
    # content — read a single file
    # ------------------------------------------------------------------
-    def get_content(self, filename: str) -> dict:
+    def get_content(self, filename: str, category: str = "memory") -> dict:
        """
-        Read the full content of a memory file.
+        Read the full content of a memory or dream file.

-        :param filename: File name, e.g. ``MEMORY.md`` or ``2026-02-20.md``
+        :param filename: File name, e.g. ``MEMORY.md`` or ``2026-02-20.md``
+        :param category: ``"memory"`` or ``"dream"``
        :return: dict with ``filename`` and ``content``
        :raises FileNotFoundError: if the file does not exist
        """
-        path = self._resolve_path(filename)
+        path = self._resolve_path(filename, category)
        if not os.path.isfile(path):
            raise FileNotFoundError(f"Memory file not found: {filename}")

@@ -113,7 +125,7 @@ class MemoryService:
        Dispatch a memory management action.

        :param action: ``list`` or ``content``
-        :param payload: action-specific payload
+        :param payload: action-specific payload (supports ``category``: ``"memory"`` | ``"dream"``)
        :return: protocol-compatible response dict
        """
        payload = payload or {}
@@ -121,14 +133,16 @@ class MemoryService:
            if action == "list":
                page = payload.get("page", 1)
                page_size = payload.get("page_size", 20)
-                result_payload = self.list_files(page=page, page_size=page_size)
+                category = payload.get("category", "memory")
+                result_payload = self.list_files(page=page, page_size=page_size, category=category)
                return {"action": action, "code": 200, "message": "success", "payload": result_payload}

            elif action == "content":
                filename = payload.get("filename")
                if not filename:
                    return {"action": action, "code": 400, "message": "filename is required", "payload": None}
-                result_payload = self.get_content(filename)
+                category = payload.get("category", "memory")
+                result_payload = self.get_content(filename, category=category)
                return {"action": action, "code": 200, "message": "success", "payload": result_payload}

            else:
@@ -145,18 +159,20 @@ class MemoryService:
    # ------------------------------------------------------------------
    # internal helpers
    # ------------------------------------------------------------------
-    def _resolve_path(self, filename: str) -> str:
+    def _resolve_path(self, filename: str, category: str = "memory") -> str:
        """
        Safely resolve a filename to its absolute path within the allowed directory.

        - ``MEMORY.md`` → ``{workspace_root}/MEMORY.md``
-        - ``2026-02-20.md`` → ``{workspace_root}/memory/2026-02-20.md``
+        - ``2026-02-20.md`` (memory) → ``{workspace_root}/memory/2026-02-20.md``
+        - ``2026-02-20.md`` (dream) → ``{workspace_root}/memory/dreams/2026-02-20.md``

-        Raises ValueError if the resolved path escapes the allowed directory
-        (path traversal protection).
+        Raises ValueError if the resolved path escapes the allowed directory.
        """
        if filename == "MEMORY.md":
            base_dir = self.workspace_root
+        elif category == "dream":
+            base_dir = os.path.join(self.memory_dir, "dreams")
        else:
            base_dir = self.memory_dir
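Review note: the remainder of `_resolve_path` is outside this hunk, so the traversal guard below is only an assumption about how "escapes the allowed directory" could be enforced. `resolve_path` is a hypothetical free function that mirrors the base-directory selection shown above and adds one possible containment check.

```python
import os


def resolve_path(workspace_root, filename, category="memory"):
    """Map (filename, category) to an absolute path inside its base directory."""
    memory_dir = os.path.join(workspace_root, "memory")
    if filename == "MEMORY.md":
        base_dir = workspace_root
    elif category == "dream":
        base_dir = os.path.join(memory_dir, "dreams")
    else:
        base_dir = memory_dir

    base = os.path.realpath(base_dir)
    resolved = os.path.realpath(os.path.join(base, filename))
    # Assumed guard: the resolved path must stay under the base directory.
    if os.path.commonpath([base, resolved]) != base:
        raise ValueError(f"Path escapes allowed directory: {filename}")
    return resolved
```

Note that in this ordering `MEMORY.md` resolves to the workspace root even when `category="dream"` is passed, which matches the branch order in the diff.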

--- a/agent/memory/summarizer.py
+++ b/agent/memory/summarizer.py
@@ -1,13 +1,12 @@
 """
-Memory flush manager (with Light Dream)
+Memory flush manager with Deep Dream distillation

 Handles memory persistence when conversation context is trimmed or overflows:
- Uses LLM to summarize discarded messages into concise key-information entries
+- Uses LLM to summarize discarded messages into concise daily records
 - Writes to daily memory files (lazy creation)
- Light Dream: extracts long-term memories to MEMORY.md in the same LLM call
 - Deduplicates trim flushes to avoid repeated writes
 - Runs summarization asynchronously to avoid blocking normal replies
- Provides daily summary interface for scheduler
+- Deep Dream: periodically distills daily memories → refined MEMORY.md + dream diary
 """

 import threading
@@ -17,43 +16,78 @@ from datetime import datetime
 from common.log import logger


-SUMMARIZE_SYSTEM_PROMPT = """你是一个记忆提取助手。你的任务是从对话记录中提炼出两种记忆：
+SUMMARIZE_SYSTEM_PROMPT = """你是一个对话记录助手。请将对话内容归纳为当天的日常记录。

-## 第一部分：日常记录（[DAILY]）
+## 要求

-按「事件」维度归纳当天发生的事，不要按对话轮次逐条记录：
+按「事件」维度归纳发生的事，不要按对话轮次逐条记录：
 - 每条一行，用 "- " 开头
 - 合并同一件事的多轮对话
 - 只记录有意义的事件，忽略闲聊和问候
+- 保留关键的决策、结论和待办事项

-## 第二部分：长期记忆（[MEMORY]）
+当对话没有任何记录价值（仅含问候或无意义内容），直接回复"无"。"""

-提取值得**永久记住**的关键信息，这些信息在未来的对话中仍然有价值：
- 用户的偏好、习惯、风格（如"用户偏好中文回复"、"用户喜欢简洁风格"）
- 重要的决策或约定（如"项目决定使用 PostgreSQL"）
- 关键人物信息（如"张总是用户的上级"）
- 用户明确要求记住的内容
- 重要的教训或经验总结
+SUMMARIZE_USER_PROMPT = """请归纳以下对话的日常记录：

-**如果没有值得永久记住的信息，[MEMORY] 部分留空即可。**
+{conversation}"""
+
+# ---------------------------------------------------------------------------
+# Deep Dream prompts — distill daily memories → MEMORY.md + dream diary
+# ---------------------------------------------------------------------------
+
+DREAM_SYSTEM_PROMPT = """你是一个记忆整理助手，负责定期整理用户的长期记忆。
+
+你将收到两份材料：
+1. **当前长期记忆** — MEMORY.md 的全部现有内容
+2. **今日日记** — 当天的日常记录
+
+MEMORY.md 会注入每次对话的系统提示词中，因此必须保持精炼，只存放有价值和值得记忆的内容。
+
+**重要：只能基于提供的材料进行整理，严禁编造、推测或添加材料中不存在的信息。**
+
+## 任务
+
+### Part 1: 更新后的长期记忆（[MEMORY]）
+
+在现有记忆基础上进行整理和提炼，输出完整的更新后内容：
+- **合并提炼**：将含义相近的多条合并为一条高密度表述，而非简单罗列
+- **新增萃取**：从今日日记中提取值得永久记住的新信息（偏好、决策、人物、规则、经验）
+- **冲突更新**：当新信息与旧条目矛盾时，以新信息为准，替换旧条目
+- **清理无效**：删除临时性记录、空白条目、格式残留、无意义、重复内容等
+- **删除冗余**：已被更精炼表述涵盖的旧条目应删除，避免信息重复
+- 每条一行，用 "- " 开头，不带日期前缀
+- 可用 "## 标题" 对相关条目分组，使结构更清晰
+- 目标：控制在 50 条以内，每条尽量一句话概括
+
+### Part 2: 梦境日记（[DREAM]）
+
+用简洁的叙事风格写一篇短日记，记录这次整理的发现，保持格式美观易读：
+- 发现了哪些重复或矛盾
+- 从日记中提取了什么新洞察
+- 做了哪些清理和优化
+- 整体感受和观察

 ## 输出格式（严格遵守）

 ```
-[DAILY]
- 事件1的摘要
- 事件2的摘要
-
 [MEMORY]
- 值得永久记住的信息1
- 值得永久记住的信息2
-```
+- 记忆条目1
+- 记忆条目2
+...

-当对话没有任何记录价值（仅含问候或无意义内容），直接回复"无"。"""
+[DREAM]
+梦境日记内容...
+```"""

-SUMMARIZE_USER_PROMPT = """请从以下对话记录中提取记忆（按 [DAILY] 和 [MEMORY] 两部分输出）：
+DREAM_USER_PROMPT = """## 当前长期记忆（MEMORY.md）
+
+{memory_content}
+
+## 近期日记（最近 {days} 天）
+
+{daily_content}"""

-{conversation}"""


 class MemoryFlushManager:
@@ -81,6 +115,8 @@ class MemoryFlushManager:
        self.last_flush_timestamp: Optional[datetime] = None
        self._trim_flushed_hashes: set = set()  # Content hashes of already-flushed messages
        self._last_flushed_content_hash: str = ""  # Content hash at last flush, for daily dedup
+        self._last_dream_input_hash: str = ""  # Hash of dream input, for dedup
+        self._last_flush_thread: Optional[threading.Thread] = None
    
    def get_today_memory_file(self, user_id: Optional[str] = None, ensure_exists: bool = False) -> Path:
        """Get today's memory file path: memory/YYYY-MM-DD.md"""
@@ -124,21 +160,19 @@ class MemoryFlushManager:
        user_id: Optional[str] = None,
        reason: str = "trim",
        max_messages: int = 0,
+        context_summary_callback: Optional[Callable[[str], None]] = None,
    ) -> bool:
        """
        Asynchronously summarize and flush messages to daily memory.
-        
+
        Deduplication runs synchronously, then LLM summarization + file write
        run in a background thread so the main reply flow is never blocked.
-        
-        Args:
-            messages: Conversation message list (OpenAI/Claude format)
-            user_id: Optional user ID for user-scoped memory
-            reason: Why flush was triggered ("trim" | "overflow" | "daily_summary")
-            max_messages: Max recent messages to summarize (0 = all)
-        
-        Returns:
-            True if flush was dispatched
+
+        If *context_summary_callback* is provided, it is called on the
+        background thread with the cleaned daily summary text once the LLM
+        call completes. The caller can use this to inject the summary into
+        the live message list for context continuity, so one LLM call serves
+        both disk persistence and in-context injection.
        """
        try:
            import hashlib
@@ -153,18 +187,19 @@ class MemoryFlushManager:
                    deduped.append(m)
            if not deduped:
                return False
-            
+
            import copy
            snapshot = copy.deepcopy(deduped)
            thread = threading.Thread(
                target=self._flush_worker,
-                args=(snapshot, user_id, reason, max_messages),
+                args=(snapshot, user_id, reason, max_messages, context_summary_callback),
                daemon=True,
            )
            thread.start()
            logger.info(f"[MemoryFlush] Async flush dispatched (reason={reason}, msgs={len(snapshot)})")
+            self._last_flush_thread = thread
            return True
-            
+
        except Exception as e:
            logger.warning(f"[MemoryFlush] Failed to dispatch flush (reason={reason}): {e}")
            return False
@@ -175,43 +210,41 @@ class MemoryFlushManager:
        user_id: Optional[str],
        reason: str,
        max_messages: int,
+        context_summary_callback: Optional[Callable[[str], None]] = None,
    ):
-        """Background worker: summarize with LLM, write daily file + MEMORY.md (Light Dream)."""
+        """Background worker: summarize with LLM, write daily memory file."""
        try:
            raw_summary = self._summarize_messages(messages, max_messages)
            if not raw_summary or not raw_summary.strip() or raw_summary.strip() == "无":
                logger.info(f"[MemoryFlush] No valuable content to flush (reason={reason})")
                return

-            daily_part, memory_part = self._parse_dual_output(raw_summary)
+            # Strip legacy [DAILY]/[MEMORY] markers if model still outputs them
+            daily_part = self._clean_summary_output(raw_summary)
+            if not daily_part:
+                return

            # --- Write daily memory ---
-            if daily_part:
-                daily_file = ensure_daily_memory_file(self.workspace_dir, user_id)
+            daily_file = ensure_daily_memory_file(self.workspace_dir, user_id)

-                if reason == "overflow":
-                    header = f"## Context Overflow Recovery ({datetime.now().strftime('%H:%M')})"
-                    note = "The following conversation was trimmed due to context overflow:\n"
-                elif reason == "trim":
-                    header = f"## Trimmed Context ({datetime.now().strftime('%H:%M')})"
-                    note = ""
-                elif reason == "daily_summary":
-                    header = f"## Daily Summary ({datetime.now().strftime('%H:%M')})"
-                    note = ""
-                else:
-                    header = f"## Session Notes ({datetime.now().strftime('%H:%M')})"
-                    note = ""
+            headers = {
+                "overflow": f"## Context Overflow Recovery ({datetime.now().strftime('%H:%M')})",
+                "trim": f"## Trimmed Context ({datetime.now().strftime('%H:%M')})",
+                "daily_summary": f"## Daily Summary ({datetime.now().strftime('%H:%M')})",
+            }
+            header = headers.get(reason, f"## Session Notes ({datetime.now().strftime('%H:%M')})")

-                flush_entry = f"\n{header}\n\n{note}{daily_part}\n"
+            with open(daily_file, "a", encoding="utf-8") as f:
+                f.write(f"\n{header}\n\n{daily_part}\n")

-                with open(daily_file, "a", encoding="utf-8") as f:
-                    f.write(flush_entry)
+            logger.info(f"[MemoryFlush] Wrote daily memory to {daily_file.name} (reason={reason}, chars={len(daily_part)})")

-                logger.info(f"[MemoryFlush] Wrote daily memory to {daily_file.name} (reason={reason}, chars={len(daily_part)})")
-
-            # --- Light Dream: write long-term memory to MEMORY.md ---
-            if memory_part:
-                self._append_to_main_memory(memory_part, user_id)
+            # --- Inject context summary into live messages (if callback provided) ---
+            if context_summary_callback:
+                try:
+                    context_summary_callback(daily_part)
+                except Exception as e:
+                    logger.warning(f"[MemoryFlush] Context summary callback failed: {e}")

            self.last_flush_timestamp = datetime.now()

@@ -219,67 +252,26 @@ class MemoryFlushManager:
            logger.warning(f"[MemoryFlush] Async flush failed (reason={reason}): {e}")

    @staticmethod
-    def _parse_dual_output(raw: str) -> tuple:
-        """
-        Parse LLM output into (daily_part, memory_part).
-        Handles both new [DAILY]/[MEMORY] format and legacy single-section format.
-        """
+    def _clean_summary_output(raw: str) -> str:
+        """Strip legacy [DAILY]/[MEMORY] markers if present, return clean daily text."""
        raw = raw.strip()
+        if not raw or raw == "无":
+            return ""

-        if "[DAILY]" in raw or "[MEMORY]" in raw:
-            daily_part = ""
-            memory_part = ""
+        # Strip [DAILY] marker
+        if "[DAILY]" in raw:
+            start = raw.index("[DAILY]") + len("[DAILY]")
+            end = raw.index("[MEMORY]") if "[MEMORY]" in raw else len(raw)
+            raw = raw[start:end].strip()

-            # Extract [DAILY] section
-            if "[DAILY]" in raw:
-                start = raw.index("[DAILY]") + len("[DAILY]")
-                end = raw.index("[MEMORY]") if "[MEMORY]" in raw else len(raw)
-                daily_part = raw[start:end].strip()
+        # Remove stray [MEMORY] section entirely
+        if "[MEMORY]" in raw:
+            raw = raw[:raw.index("[MEMORY]")].strip()

-            # Extract [MEMORY] section
-            if "[MEMORY]" in raw:
-                start = raw.index("[MEMORY]") + len("[MEMORY]")
-                memory_part = raw[start:].strip()
+        # Remove markdown code fences
+        raw = raw.replace("```", "").strip()

-            # Filter out empty markers
-            if memory_part and all(
-                not line.strip() or line.strip() == "-"
-                for line in memory_part.split("\n")
-            ):
-                memory_part = ""
-
-            return daily_part, memory_part
-
-        # Legacy format: treat entire output as daily, no memory extraction
-        return raw, ""
-
-    def _append_to_main_memory(self, memory_entries: str, user_id: Optional[str] = None):
-        """Append extracted long-term memories to MEMORY.md with date stamp."""
-        try:
-            main_file = self.get_main_memory_file(user_id)
-            today = datetime.now().strftime("%Y-%m-%d")
-
-            # Add date prefix to each entry line
-            stamped_lines = []
-            for line in memory_entries.strip().split("\n"):
-                line = line.strip()
-                if line.startswith("- "):
-                    stamped_lines.append(f"- ({today}) {line[2:]}")
-                elif line:
-                    stamped_lines.append(f"- ({today}) {line}")
-
-            if not stamped_lines:
-                return
-
-            stamped_text = "\n".join(stamped_lines)
-
-            with open(main_file, "a", encoding="utf-8") as f:
-                f.write(f"\n{stamped_text}\n")
-
-            logger.info(f"[LightDream] Appended {len(stamped_lines)} entries to MEMORY.md")
-
-        except Exception as e:
-            logger.warning(f"[LightDream] Failed to append to MEMORY.md: {e}")
+        return raw

    def create_daily_summary(
        self,
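Review note: `_clean_summary_output` keeps older `[DAILY]`/`[MEMORY]` style responses usable after the prompt change. A standalone sketch of the same stripping logic, under the assumption that only the daily text should survive; `clean_summary_output` is a hypothetical free-function copy, and the fence marker is built with `"`" * 3` purely to keep the example readable.

```python
FENCE = "`" * 3  # markdown code fence marker


def clean_summary_output(raw):
    """Return only the daily text from a possibly legacy-formatted summary."""
    raw = raw.strip()
    if not raw or raw == "无":
        return ""
    if "[DAILY]" in raw:
        start = raw.index("[DAILY]") + len("[DAILY]")
        end = raw.index("[MEMORY]") if "[MEMORY]" in raw else len(raw)
        raw = raw[start:end].strip()
    if "[MEMORY]" in raw:
        # Drop a stray long-term section; Deep Dream owns MEMORY.md now.
        raw = raw[:raw.index("[MEMORY]")].strip()
    return raw.replace(FENCE, "").strip()


legacy = "[DAILY]\n- fixed the login bug\n\n[MEMORY]\n- user prefers tea"
cleaned = clean_summary_output(legacy)
```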
@@ -306,12 +298,187 @@ class MemoryFlushManager:
            reason="daily_summary",
            max_messages=0,
        )
-    
+
+    # ---- Deep Dream (memory distillation) ----
+
+    def deep_dream(self, user_id: Optional[str] = None, lookback_days: int = 1, force: bool = False) -> bool:
+        """
+        Distill recent daily memories into MEMORY.md and generate a dream diary.
+
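+        Args:
+            user_id: Optional user ID; selects the user-scoped memory files
+            lookback_days: How many days of daily files to read (default 1 for scheduled, 3 for manual)
+            force: Skip input-hash dedup check (used by manual /memory dream trigger)
+
+        Returns:
+            True if MEMORY.md was rewritten, False if the run was skipped or failed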
+        """
+        if not self.llm_model:
+            logger.warning("[DeepDream] No LLM model available, skipping")
+            return False
+
+        logger.info(f"[DeepDream] Starting memory distillation (lookback={lookback_days} days)")
+
+        # Collect materials
+        memory_content = self._read_main_memory(user_id)
+        daily_content, has_content = self._read_recent_dailies(user_id, lookback_days)
+
+        if not has_content:
+            logger.info("[DeepDream] No recent daily records, skipping to preserve existing MEMORY.md")
+            return False
+
+        # Dedup: skip if input materials haven't changed since last dream
+        import hashlib
+        input_hash = hashlib.md5((memory_content + daily_content).encode("utf-8")).hexdigest()
+        if not force and input_hash == self._last_dream_input_hash:
+            logger.debug("[DeepDream] Input unchanged since last dream, skipping")
+            return False
+        self._last_dream_input_hash = input_hash
+
+        logger.info(
+            f"[DeepDream] Materials collected: "
+            f"MEMORY.md={len(memory_content)} chars, "
+            f"daily={len(daily_content)} chars"
+        )
+
+        # Call LLM for distillation
+        import time as _time
+        t0 = _time.monotonic()
+        try:
+            user_msg = DREAM_USER_PROMPT.format(
+                memory_content=memory_content or "(empty)",
+                days=lookback_days,
+                daily_content=daily_content or "(no recent daily records)",
+            )
+            from agent.protocol.models import LLMRequest
+            # Scale max_tokens based on input size to avoid truncating large MEMORY.md
+            input_chars = len(memory_content) + len(daily_content)
+            dream_max_tokens = max(2000, min(input_chars, 8000))
+            request = LLMRequest(
+                messages=[{"role": "user", "content": user_msg}],
+                temperature=0.3,
+                max_tokens=dream_max_tokens,
+                stream=False,
+                system=DREAM_SYSTEM_PROMPT,
+            )
+            response = self.llm_model.call(request)
+            raw = self._extract_response_text(response)
+            elapsed = _time.monotonic() - t0
+            if not raw or not raw.strip():
+                logger.warning(f"[DeepDream] LLM returned empty response ({elapsed:.1f}s)")
+                self._last_dream_input_hash = ""  # allow a retry with the same input
+                return False
+            logger.info(f"[DeepDream] LLM distillation completed ({elapsed:.1f}s, {len(raw)} chars)")
+        except Exception as e:
+            elapsed = _time.monotonic() - t0
+            logger.warning(f"[DeepDream] LLM call failed ({elapsed:.1f}s): {e}")
+            self._last_dream_input_hash = ""  # allow a retry with the same input
+            return False
+
+        # Parse [MEMORY] and [DREAM] sections
+        new_memory, dream_diary = self._parse_dream_output(raw)
+
+        if not new_memory:
+            logger.warning("[DeepDream] No [MEMORY] section in LLM output, skipping overwrite")
+            return False
+
+        # Overwrite MEMORY.md
+        try:
+            main_file = self.get_main_memory_file(user_id)
+            old_size = len(memory_content)
+            main_file.write_text(new_memory + "\n", encoding="utf-8")
+            logger.info(
+                f"[DeepDream] Updated MEMORY.md "
+                f"({old_size} → {len(new_memory)} chars)"
+            )
+        except Exception as e:
+            logger.warning(f"[DeepDream] Failed to write MEMORY.md: {e}")
+            return False
+
+        # Write dream diary
+        if dream_diary:
+            try:
+                self._write_dream_diary(dream_diary, user_id)
+            except Exception as e:
+                logger.warning(f"[DeepDream] Failed to write dream diary: {e}")
+
+        logger.info("[DeepDream] ✅ Deep Dream completed successfully")
+        return True
+
+    def _read_main_memory(self, user_id: Optional[str] = None) -> str:
+        """Read current MEMORY.md content."""
+        main_file = self.get_main_memory_file(user_id)
+        if main_file.exists():
+            return main_file.read_text(encoding="utf-8").strip()
+        return ""
+
+    def _read_recent_dailies(
+        self, user_id: Optional[str] = None, lookback_days: int = 1
+    ) -> tuple:
+        """
+        Read recent daily memory files.
+
+        Returns:
+            (combined_text, has_content) tuple
+        """
+        from datetime import timedelta
+
+        parts = []
+        has_content = False
+        today = datetime.now().date()
+
+        for offset in range(lookback_days):
+            day = today - timedelta(days=offset)
+            date_str = day.strftime("%Y-%m-%d")
+            if user_id:
+                daily_file = self.memory_dir / "users" / user_id / f"{date_str}.md"
+            else:
+                daily_file = self.memory_dir / f"{date_str}.md"
+
+            if daily_file.exists():
+                content = daily_file.read_text(encoding="utf-8").strip()
+                if content:
+                    parts.append(f"### {date_str}\n\n{content}")
+                    has_content = True
+            else:
+                parts.append(f"### {date_str}\n\n(no records)")
+
+        return "\n\n".join(parts), has_content
+
+    @staticmethod
+    def _parse_dream_output(raw: str) -> tuple:
+        """Parse LLM output into (new_memory, dream_diary)."""
+        raw = raw.strip().replace("```", "")
+        new_memory = ""
+        dream_diary = ""
+
+        if "[MEMORY]" in raw:
+            start = raw.index("[MEMORY]") + len("[MEMORY]")
+            end = raw.index("[DREAM]") if "[DREAM]" in raw else len(raw)
+            new_memory = raw[start:end].strip()
+
+        if "[DREAM]" in raw:
+            start = raw.index("[DREAM]") + len("[DREAM]")
+            dream_diary = raw[start:].strip()
+
+        return new_memory, dream_diary
+
+    def _write_dream_diary(self, content: str, user_id: Optional[str] = None):
+        """Write dream diary to memory/dreams/YYYY-MM-DD.md."""
+        dreams_dir = self.memory_dir / "dreams"
+        if user_id:
+            dreams_dir = self.memory_dir / "users" / user_id / "dreams"
+        dreams_dir.mkdir(parents=True, exist_ok=True)
+
+        today = datetime.now().strftime("%Y-%m-%d")
+        diary_file = dreams_dir / f"{today}.md"
+        diary_file.write_text(
+            f"# Dream Diary: {today}\n\n{content}\n",
+            encoding="utf-8",
+        )
+        logger.info(f"[DeepDream] Wrote dream diary to {diary_file}")
+
    # ---- Internal helpers ----
    
    def _summarize_messages(self, messages: List[Dict], max_messages: int = 0) -> str:
        """
-        Summarize conversation messages using LLM, with rule-based fallback.
+        Summarize conversation messages using LLM.
+        Returns empty string if LLM deems content not worth recording.
+        Rule-based fallback is used only when no LLM model is available
+        or the LLM call raises an exception.
        """
        conversation_text = self._format_conversation_for_summary(messages, max_messages)
        if not conversation_text.strip():
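Review note: Deep Dream overwrites MEMORY.md only when the response contains a `[MEMORY]` section, so the parser's behaviour on partial output matters. A standalone sketch of the two-section split; `parse_dream_output` is a hypothetical free-function copy of `_parse_dream_output`, and the sample text is invented for illustration.

```python
FENCE = "`" * 3  # markdown code fence marker


def parse_dream_output(raw):
    """Split LLM output into (new_memory, dream_diary)."""
    raw = raw.strip().replace(FENCE, "")
    new_memory = ""
    dream_diary = ""
    if "[MEMORY]" in raw:
        start = raw.index("[MEMORY]") + len("[MEMORY]")
        end = raw.index("[DREAM]") if "[DREAM]" in raw else len(raw)
        new_memory = raw[start:end].strip()
    if "[DREAM]" in raw:
        start = raw.index("[DREAM]") + len("[DREAM]")
        dream_diary = raw[start:].strip()
    return new_memory, dream_diary


sample = (
    f"{FENCE}\n[MEMORY]\n- prefers concise replies\n- project uses PostgreSQL\n\n"
    f"[DREAM]\nMerged two duplicate entries.\n{FENCE}"
)
memory, diary = parse_dream_output(sample)
```

A response with no `[MEMORY]` marker yields an empty first element, which is the case `deep_dream` treats as "skip overwrite".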
@@ -322,13 +489,14 @@ class MemoryFlushManager:
                summary = self._call_llm_for_summary(conversation_text)
                if summary and summary.strip() and summary.strip() != "无":
                    return summary.strip()
-                logger.info(f"[MemoryFlush] LLM returned empty or '无', using fallback")
+                logger.info("[MemoryFlush] LLM returned empty or '无', skipping write")
+                return ""
            except Exception as e:
                logger.warning(f"[MemoryFlush] LLM summarization failed, using fallback: {e}")
+                return self._extract_summary_fallback(messages, max_messages)
        else:
            logger.info("[MemoryFlush] No LLM model available, using rule-based fallback")
-        
-        return self._extract_summary_fallback(messages, max_messages)
+            return self._extract_summary_fallback(messages, max_messages)

    def _format_conversation_for_summary(self, messages: List[Dict], max_messages: int = 0) -> str:
        """Format messages into readable conversation text for LLM summarization."""
@@ -346,6 +514,52 @@ class MemoryFlushManager:
                lines.append(f"助手: {text[:500]}")
        return "\n".join(lines)

+    @staticmethod
+    def _extract_response_text(response) -> str:
+        """
+        Extract text from LLM response regardless of format.
+
+        Handles:
+        - Generator (MiniMax _handle_sync_response yields Claude-format dicts)
+        - Claude format: {"role":"assistant","content":[{"type":"text","text":"..."}]}
+        - OpenAI format: {"choices":[{"message":{"content":"..."}}]}
+        - OpenAI SDK response object with .choices attribute
+        """
+        import types
+
+        # Unwrap generator — consume first yielded item
+        if isinstance(response, types.GeneratorType):
+            try:
+                response = next(response)
+            except StopIteration:
+                return ""
+
+        if not response:
+            return ""
+
+        if isinstance(response, dict):
+            # Check for error
+            if response.get("error"):
+                raise RuntimeError(response.get("message", "LLM call failed"))
+
+            # Claude format: content is a list of blocks
+            content = response.get("content")
+            if isinstance(content, list):
+                for block in content:
+                    if isinstance(block, dict) and block.get("type") == "text":
+                        return block.get("text", "")
+
+            # OpenAI format
+            choices = response.get("choices", [])
+            if choices:
+                return choices[0].get("message", {}).get("content", "") or ""
+
+        # OpenAI SDK response object
+        if hasattr(response, "choices") and response.choices:
+            return response.choices[0].message.content or ""
+
+        return ""
+
    def _call_llm_for_summary(self, conversation_text: str) -> str:
        """Call LLM to generate a concise summary of the conversation."""
        from agent.protocol.models import LLMRequest
@@ -359,27 +573,31 @@ class MemoryFlushManager:
        )
        
        response = self.llm_model.call(request)
-        
-        if isinstance(response, dict):
-            if response.get("error"):
-                raise RuntimeError(response.get("message", "LLM call failed"))
-            # OpenAI format
-            choices = response.get("choices", [])
-            if choices:
-                return choices[0].get("message", {}).get("content", "")
-        
-        # Handle response object with attribute access (e.g. OpenAI SDK response)
-        if hasattr(response, "choices") and response.choices:
-            return response.choices[0].message.content or ""
-        
-        return ""
+        return self._extract_response_text(response)
+
+    @staticmethod
+    def _extract_first_meaningful_line(text: str, max_len: int = 120) -> str:
+        """Extract the first meaningful line from assistant reply, skipping markdown noise."""
+        import re
+        for line in text.split("\n"):
+            line = line.strip()
+            if not line:
+                continue
+            # Skip markdown headings, horizontal rules, code fences, pure emoji/symbols
+            if re.match(r'^(#{1,4}\s|```|---|\*\*\*|[-*]\s*$|[^\w\u4e00-\u9fff]{1,5}$)', line):
+                continue
+            # Strip leading markdown bold/emoji decorations
+            cleaned = re.sub(r'^[\*#>\-\s]+', '', line).strip()
+            cleaned = re.sub(r'^[\U0001f300-\U0001f9ff\u2600-\u27bf\s]+', '', cleaned).strip()
+            if len(cleaned) >= 5:
+                return cleaned[:max_len]
+        return text.split("\n")[0].strip()[:max_len]

    @staticmethod
    def _extract_summary_fallback(messages: List[Dict], max_messages: int = 0) -> str:
        """
-        Rule-based fallback when LLM is unavailable.
-        Groups consecutive user+assistant messages into events instead of
-        listing each message individually.
+        Rule-based summary of discarded messages.
+        Format: "- 用户: X → 回复: Y" per event, compact and readable.
        """
        msgs = messages if max_messages == 0 else messages[-max_messages * 2:]

@@ -393,19 +611,19 @@ class MemoryFlushManager:
            text = text.strip()

            if role == "user":
-                if len(text) <= 5:
+                if len(text) <= 3:
                    continue
-                current_user_text = text[:150]
+                current_user_text = text[:120]
            elif role == "assistant" and current_user_text:
-                first_line = text.split("\n")[0].strip()
-                if len(first_line) > 10:
-                    events.append(f"- {current_user_text} → {first_line[:150]}")
+                reply_summary = MemoryFlushManager._extract_first_meaningful_line(text)
+                if reply_summary:
+                    events.append(f"- 用户: {current_user_text} → 回复: {reply_summary}")
                else:
-                    events.append(f"- {current_user_text}")
+                    events.append(f"- 用户: {current_user_text}")
                current_user_text = ""

        if current_user_text:
-            events.append(f"- {current_user_text}")
+            events.append(f"- 用户: {current_user_text}")

        return "\n".join(events[:10])
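Review note: the `_extract_response_text` helper added above normalizes several response shapes into a plain string. The sketch below is a simplified stand-in rather than the project's code; `extract_text` and `fake_stream` are hypothetical names, and the SDK-object branch is omitted for brevity.

```python
import types
from typing import Any


def extract_text(response: Any) -> str:
    """Illustrative stand-in for _extract_response_text (hypothetical name)."""
    # A generator is unwrapped by taking only its first yielded item.
    if isinstance(response, types.GeneratorType):
        response = next(response, None)
    if not response:
        return ""
    if isinstance(response, dict):
        if response.get("error"):
            raise RuntimeError(response.get("message", "LLM call failed"))
        content = response.get("content")
        if isinstance(content, list):
            # Claude-style payload: the first text block wins.
            for block in content:
                if isinstance(block, dict) and block.get("type") == "text":
                    return block.get("text") or ""
        choices = response.get("choices") or []
        if choices:
            # OpenAI-style dict: content may be None, so fall back to "".
            return (choices[0].get("message") or {}).get("content") or ""
    return ""


def fake_stream():
    # Hypothetical provider that yields a single Claude-style dict.
    yield {"role": "assistant", "content": [{"type": "text", "text": "summary A"}]}


claude_text = extract_text(fake_stream())
openai_text = extract_text({"choices": [{"message": {"content": "summary B"}}]})
empty_text = extract_text({"choices": [{"message": {"content": None}}]})
```

All three shapes end up as a `str`, which is what lets `summarize` treat an empty result as "nothing worth recording".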
    
--- a/agent/prompt/builder.py
+++ b/agent/prompt/builder.py
@@ -291,8 +291,8 @@ def _build_memory_section(memory_manager: Any, tools: Optional[List[Any]], langu
        "",
        "### Memory Recall（mandatory）",
        "",
-        "在回答任何关于过往工作、决策、日期、人物、偏好或待办事项的问题之前，**必须**先检索记忆。",
-        "MEMORY.md 已自动加载在项目上下文中（可能被截断），完整内容和每日记忆需要通过工具检索。",
+        "当用户询问过往事件、引用之前的决定、提到人物关系、偏好、待办、或你对某事不确定时，**必须先检索记忆再回答**。",
+        "如果 MEMORY.md 中已有相关信息则无需重复检索。完整内容和每日记忆需要通过工具检索。",
        "",
        "1. 不确定位置 → `memory_search` 关键词/语义检索",
        "2. 已知位置 → `memory_get` 直接读取对应行",
@@ -307,7 +307,7 @@ def _build_memory_section(memory_manager: Any, tools: Optional[List[Any]], langu
        "",
        "遇到以下情况时，**主动**将信息写入记忆文件（无需告知用户）：",
        "",
-        "- 用户要求记住某些信息",
+        "- 用户要求记住某些信息，或使用了「记住」「以后」「总是」「不要」「偏好」等表达",
        "- 用户分享了重要的个人偏好、习惯、决策",
        "- 对话中产生了重要的结论、方案、约定",
        "- 完成了复杂任务，值得记录关键步骤和结果",
--- a/agent/protocol/agent_stream.py
+++ b/agent/protocol/agent_stream.py
@@ -13,6 +13,37 @@ from agent.tools.base_tool import BaseTool, ToolResult
 from common.log import logger


+# Maximum number of characters of model "reasoning / thinking" content to persist
+# in conversation history. The full reasoning is still streamed to the UI in real
+# time (subject to its own SSE / rendering limits); this bound only controls what
+# is stored in DB and replayed in history. Long reasoning is not useful for later
+# context (the LLM never sees thinking blocks anyway) and bloats DB.
+# Keep aligned with the frontend REASONING_RENDER_CAP and the SSE
+# MAX_REASONING_STREAM_CHARS so that storage / stream / display all match.
+MAX_STORED_REASONING_CHARS = 4 * 1024  # 4096 characters (not bytes)
+
+# Marker inserted between head and tail when reasoning is truncated.
+_REASONING_TRUNCATE_MARKER = "\n\n... [reasoning truncated, {omitted} chars omitted] ...\n\n"
+
+
+def _truncate_reasoning_for_storage(text: str) -> str:
+    """Trim long reasoning to head + tail with an omission marker.
+
+    Keeps the first and last MAX_STORED_REASONING_CHARS // 2 characters so both
+    the initial chain-of-thought and the final conclusions are preserved for UI
+    replay, without storing the entire (often very large) middle.
+    """
+    if not text:
+        return text
+    if len(text) <= MAX_STORED_REASONING_CHARS:
+        return text
+    half = MAX_STORED_REASONING_CHARS // 2
+    head = text[:half]
+    tail = text[-half:]
+    omitted = len(text) - len(head) - len(tail)
+    return head + _REASONING_TRUNCATE_MARKER.format(omitted=omitted) + tail
+
+
 class AgentStreamExecutor:
    """
    Agent Stream Executor
@@ -78,18 +109,48 @@ class AgentStreamExecutor:
            except Exception as e:
                logger.error(f"Event callback error: {e}")
    
+    def _is_thinking_enabled(self) -> bool:
+        """Whether deep-thinking mode is on at the model layer.
+
+        Mirrors the global toggle used by ``bridge.agent_bridge`` when deciding
+        whether to send ``thinking={"type": "enabled"}`` to the model. Used for
+        logging and reasoning-update event emission across all channels.
+        """
+        from config import conf
+        return bool(conf().get("enable_thinking", False))
+
+    def _should_render_thinking_inline(self) -> bool:
+        """Whether ``<think>...</think>`` blocks embedded directly in ``content``
+        (MiniMax, some third-party proxies) should be surfaced to the channel.
+
+        Only the Web console can render them in a collapsible panel. IM channels
+        (WeChat/WeCom/DingTalk/Feishu) must strip them, otherwise users see raw
+        XML tags in their chat.
+        """
+        from config import conf
+        channel_type = getattr(self.model, 'channel_type', '') or ''
+        return bool(conf().get("enable_thinking", False)) and channel_type == 'web'
+
    def _filter_think_tags(self, text: str) -> str:
        """
-        Remove <think> and </think> tags but keep the content inside.
-        Some LLM providers (e.g., MiniMax) may return thinking process wrapped in <think> tags.
-        We only remove the tags themselves, keeping the actual thinking content.
+        Handle <think>...</think> blocks in content returned by some LLM providers
+        (e.g., MiniMax).
+
+        - When inline thinking rendering is allowed (Web + thinking enabled):
+          remove only the tags, keep the content inside.
+        - Otherwise (IM channels, or thinking disabled globally): remove both
+          the tags and the content entirely.
        """
        if not text:
            return text
        import re
-        # Remove only the <think> and </think> tags, keep the content
-        text = re.sub(r'<think>', '', text)
-        text = re.sub(r'</think>', '', text)
+        if self._should_render_thinking_inline():
+            text = re.sub(r'<think>', '', text)
+            text = re.sub(r'</think>', '', text)
+        else:
+            text = re.sub(r'<think>[\s\S]*?</think>', '', text)
+            # Also strip unclosed <think> tag at the end (streaming partial)
+            text = re.sub(r'<think>[\s\S]*$', '', text)
        return text

    def _hash_args(self, args: dict) -> str:
@@ -178,7 +239,10 @@ class AgentStreamExecutor:
            Final response text
        """
        # Log user message with model info
-        logger.info(f"🤖 {self.model.model} | 👤 {user_message}")
+
+        thinking_enabled = self._is_thinking_enabled()
+        thinking_label = " | 💭 thinking" if thinking_enabled else ""
+        logger.info(f"🤖 {self.model.model}{thinking_label} | 👤 {user_message}")
        
        # Add user message (Claude format - use content blocks for consistency)
        self.messages.append({
@@ -227,6 +291,9 @@ class AgentStreamExecutor:
                        if turn > 1:
                            logger.info(f"[Agent] Requesting explicit response from LLM...")
                            
+                            # Remember position so we can remove the injected prompt later
+                            prompt_insert_idx = len(self.messages)
+                            
                            # 添加一条消息，明确要求回复用户
                            self.messages.append({
                                "role": "user",
@@ -240,8 +307,24 @@ class AgentStreamExecutor:
                            assistant_msg, tool_calls = self._call_llm_stream(retry_on_empty=False)
                            final_response = assistant_msg
                            
-                            # 如果还是空，才使用 fallback
-                            if not assistant_msg and not tool_calls:
+                            # Remove the injected prompt from history so it doesn't
+                            # appear as a user message in persisted conversations.
+                            # _call_llm_stream may have appended an assistant message
+                            # after the prompt, so we locate and remove only the prompt.
+                            if (prompt_insert_idx < len(self.messages)
+                                    and self.messages[prompt_insert_idx].get("role") == "user"):
+                                self.messages.pop(prompt_insert_idx)
+                                logger.debug("[Agent] Removed injected explicit-response prompt from message history")
+                            
+                            # If LLM responded with tool_calls instead of text, fall through
+                            # to the tool execution path below (don't break the loop).
+                            if tool_calls:
+                                logger.info(
+                                    "[Agent] LLM returned tool_calls in explicit-response retry, "
+                                    "continuing to execute tools instead of breaking"
+                                )
+                            elif not assistant_msg:
+                                # Still empty (no text and no tool_calls): use fallback
                                logger.warning(f"[Agent] Still empty after explicit request")
                                final_response = (
                                    "抱歉，我暂时无法生成回复。请尝试换一种方式描述你的需求，或稍后再试。"
@@ -256,20 +339,28 @@ class AgentStreamExecutor:
                    else:
                        logger.info(f"💭 {assistant_msg[:150]}{'...' if len(assistant_msg) > 150 else ''}")
                    
-                    logger.debug(f"✅ 完成 (无工具调用)")
-                    self._emit_event("turn_end", {
-                        "turn": turn,
-                        "has_tool_calls": False
-                    })
-                    break
+                    # If the explicit-response retry produced tool_calls, skip the break
+                    # and continue down to the tool execution branch in this same iteration.
+                    if not tool_calls:
+                        logger.debug("✅ Done (no tool calls)")
+                        self._emit_event("turn_end", {
+                            "turn": turn,
+                            "has_tool_calls": False
+                        })
+                        break

-                # Log tool calls with arguments
+                # Log tool calls with arguments (truncate long values like base64)
                tool_calls_str = []
                for tc in tool_calls:
-                    # Safely handle None or missing arguments
                    args = tc.get('arguments') or {}
                    if isinstance(args, dict):
-                        args_str = ', '.join([f"{k}={v}" for k, v in args.items()])
+                        parts = []
+                        for k, v in args.items():
+                            v_str = str(v)
+                            if len(v_str) > 200:
+                                v_str = v_str[:200] + f"...({len(v_str)} chars)"
+                            parts.append(f"{k}={v_str}")
+                        args_str = ', '.join(parts)
                        if args_str:
                            tool_calls_str.append(f"{tc['name']}({args_str})")
                        else:
@@ -588,7 +679,8 @@ class AgentStreamExecutor:
                    reasoning_delta = delta.get("reasoning_content") or ""
                    if reasoning_delta:
                        full_reasoning += reasoning_delta
-                        self._emit_event("reasoning_update", {"delta": reasoning_delta})
+                        if self._is_thinking_enabled():
+                            self._emit_event("reasoning_update", {"delta": reasoning_delta})

                    # Handle text content
                    content_delta = delta.get("content") or ""
@@ -622,8 +714,11 @@ class AgentStreamExecutor:
                                    tool_calls_buffer[index]["arguments"] += func["arguments"]

                    # Preserve _gemini_raw_parts for Gemini thoughtSignature round-trip
+                    # (direct Gemini: list of parts; LinkAI proxy: base64 string of JSON parts)
                    if "_gemini_raw_parts" in delta:
                        gemini_raw_parts = delta["_gemini_raw_parts"]
+                    elif isinstance(choice, dict) and choice.get("_gemini_raw_parts"):
+                        gemini_raw_parts = choice["_gemini_raw_parts"]

        except Exception as e:
            error_str = str(e)
@@ -790,9 +885,15 @@ class AgentStreamExecutor:
        assistant_msg = {"role": "assistant", "content": []}

        if full_reasoning:
+            stored_reasoning = _truncate_reasoning_for_storage(full_reasoning)
+            if len(stored_reasoning) < len(full_reasoning):
+                logger.info(
+                    f"[reasoning] truncated for storage: "
+                    f"{len(full_reasoning)} -> {len(stored_reasoning)} chars"
+                )
            assistant_msg["content"].append({
                "type": "thinking",
-                "thinking": full_reasoning
+                "thinking": stored_reasoning
            })

        if full_content:
@@ -1198,6 +1299,56 @@ class AgentStreamExecutor:
        logger.warning("🔧 Aggressive trim: nothing to trim, will clear history")
        return False

+    def _build_context_summary_callback(self, discarded_turns: list, kept_turns: list):
+        """
+        Build a callback that injects an LLM summary into the first user
+        message of *kept_turns*. Returns None if no valid injection target.
+
+        The callback is passed to flush_from_messages so that the same LLM
+        call that writes daily memory also provides the in-context summary.
+        """
+        if not kept_turns:
+            return None
+
+        # Find the first user text block in kept_turns as injection target
+        target_block = None
+        for turn in kept_turns:
+            for msg in turn["messages"]:
+                if msg.get("role") == "user":
+                    content = msg.get("content", [])
+                    if isinstance(content, list):
+                        for block in content:
+                            if isinstance(block, dict) and block.get("type") == "text":
+                                target_block = block
+                                break
+                    if target_block:
+                        break
+            if target_block:
+                break
+
+        if not target_block:
+            return None
+
+        turn_count = len(discarded_turns)
+        original_text = target_block["text"]
+
+        def _on_summary_ready(summary: str):
+            if not summary or not summary.strip():
+                return
+            target_block["text"] = (
+                f"[System: Previous conversation summary — "
+                f"{turn_count} turns were compacted]\n\n"
+                f"{summary.strip()}\n\n"
+                f"The recent conversation continues below.\n\n---\n\n"
+                f"{original_text}"
+            )
+            logger.info(
+                f"📝 Context summary injected "
+                f"({len(summary)} chars, {turn_count} turns)"
+            )
+
+        return _on_summary_ready
+
    def _trim_messages(self):
        """
        智能清理消息历史，保持对话完整性
@@ -1224,25 +1375,28 @@ class AgentStreamExecutor:
            removed_count = len(turns) // 2
            keep_count = len(turns) - removed_count
            
-            # Flush discarded turns to daily memory
-            if self.agent.memory_manager:
-                discarded_messages = []
-                for turn in turns[:removed_count]:
-                    discarded_messages.extend(turn["messages"])
-                if discarded_messages:
-                    user_id = getattr(self.agent, '_current_user_id', None)
-                    self.agent.memory_manager.flush_memory(
-                        messages=discarded_messages, user_id=user_id,
-                        reason="trim", max_messages=0
-                    )
-            
+            discarded_turns = turns[:removed_count]
            turns = turns[-keep_count:]
-            
+
            logger.info(
                f"💾 上下文轮次超限: {keep_count + removed_count} > {self.max_context_turns}，"
                f"裁剪至 {keep_count} 轮（移除 {removed_count} 轮）"
            )

+            # Flush to daily memory + inject context summary (single async LLM call)
+            if self.agent.memory_manager:
+                discarded_messages = []
+                for turn in discarded_turns:
+                    discarded_messages.extend(turn["messages"])
+                if discarded_messages:
+                    user_id = getattr(self.agent, '_current_user_id', None)
+                    cb = self._build_context_summary_callback(discarded_turns, turns)
+                    self.agent.memory_manager.flush_memory(
+                        messages=discarded_messages, user_id=user_id,
+                        reason="trim", max_messages=0,
+                        context_summary_callback=cb,
+                    )
+
        # Step 3: Token 限制 - 保留完整轮次
        # Get context window from agent (based on model)
        context_window = self.agent._get_model_context_window()
@@ -1318,6 +1472,7 @@ class AgentStreamExecutor:
        # --- Many turns (>=5): discard the older half, keep the newer half ---
        removed_count = len(turns) // 2
        keep_count = len(turns) - removed_count
+        discarded_turns = turns[:removed_count]
        kept_turns = turns[-keep_count:]
        kept_tokens = sum(self._estimate_turn_tokens(t) for t in kept_turns)

@@ -1328,13 +1483,15 @@ class AgentStreamExecutor:

        if self.agent.memory_manager:
            discarded_messages = []
-            for turn in turns[:removed_count]:
+            for turn in discarded_turns:
                discarded_messages.extend(turn["messages"])
            if discarded_messages:
                user_id = getattr(self.agent, '_current_user_id', None)
+                cb = self._build_context_summary_callback(discarded_turns, kept_turns)
                self.agent.memory_manager.flush_memory(
                    messages=discarded_messages, user_id=user_id,
-                    reason="trim", max_messages=0
+                    reason="trim", max_messages=0,
+                    context_summary_callback=cb,
                )

        new_messages = []
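Review note: both trim paths above now pass a `context_summary_callback` into `flush_memory`. The sketch below illustrates the closure pattern in isolation, assuming turns are already flattened into a message list. `build_summary_callback` and the shortened banner text are hypothetical, not the project's exact strings.

```python
from typing import Callable, Optional


def build_summary_callback(
    kept_messages: list, turn_count: int
) -> Optional[Callable[[str], None]]:
    """Return a callback that prepends a summary to the first user text block."""
    target = None
    for msg in kept_messages:
        if msg.get("role") != "user":
            continue
        content = msg.get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "text":
                target = block
                break
        if target is not None:
            break
    if target is None:
        return None  # no valid injection target

    # Capture the text now so a late callback still prepends to the original.
    original_text = target["text"]

    def on_summary_ready(summary: str) -> None:
        if not summary or not summary.strip():
            return  # an empty summary leaves the message untouched
        target["text"] = (
            f"[Summary of {turn_count} compacted turns]\n"
            f"{summary.strip()}\n---\n{original_text}"
        )

    return on_summary_ready


kept = [{"role": "user", "content": [{"type": "text", "text": "What next?"}]}]
cb = build_summary_callback(kept, turn_count=3)
cb("  User chose plan B.  ")
injected = kept[0]["content"][0]["text"]
no_target = build_summary_callback([{"role": "assistant", "content": "hi"}], 1)
```

Because the callback mutates the block in place, the summary appears in the kept history whenever the asynchronous LLM call finishes, without the trim path having to wait for it.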
--- a/agent/tools/bash/bash.py
+++ b/agent/tools/bash/bash.py
@@ -169,10 +169,16 @@ SAFETY:
                except Exception as retry_err:
                    logger.warning(f"[Bash] Retry failed: {retry_err}")

-            # Combine stdout and stderr
-            output = result.stdout
-            if result.stderr:
-                output += "\n" + result.stderr
+            # When command succeeds with stdout, keep output clean (stderr goes to server log only).
+            # When command fails or stdout is empty, include stderr so the agent can diagnose.
+            if result.returncode == 0 and result.stdout.strip():
+                output = result.stdout
+                if result.stderr:
+                    logger.info(f"[Bash] stderr (not forwarded): {result.stderr[:500]}")
+            else:
+                output = result.stdout
+                if result.stderr:
+                    output += "\n" + result.stderr

            # Check if we need to save full output to temp file
            temp_file_path = None
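Review note: the new stdout/stderr rule in the hunk above can be read as a small pure function. This is a sketch for reasoning about the three cases; `select_output` is a hypothetical name, and the server-side logging of suppressed stderr is left out.

```python
def select_output(returncode: int, stdout: str, stderr: str) -> str:
    """Mirror the branch above: hide stderr only on a successful, non-empty run."""
    if returncode == 0 and stdout.strip():
        return stdout  # success with output: stderr is not forwarded
    output = stdout
    if stderr:
        output += "\n" + stderr  # failure or empty stdout: include stderr
    return output


clean = select_output(0, "ok\n", "deprecation warning")
failed = select_output(1, "", "boom")
silent_success = select_output(0, "", "note")
```

One consequence worth noting: a command that exits 0 but prints only to stderr still has that stderr returned to the agent, since the empty-stdout case falls into the second branch.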
--- a/agent/tools/utils/truncate.py
+++ b/agent/tools/utils/truncate.py
@@ -8,7 +8,10 @@ Truncation is based on two independent limits - whichever is hit first wins:
 Never returns partial lines (except bash tail truncation edge case).
 """

-from typing import Dict, Any, Optional, Literal, Tuple
+from __future__ import annotations
+from typing import Dict, Any, Optional, Tuple, TYPE_CHECKING
+if TYPE_CHECKING:
+    from typing import Literal


 DEFAULT_MAX_LINES = 2000
--- a/agent/tools/vision/vision.py
+++ b/agent/tools/vision/vision.py
@@ -43,7 +43,7 @@ _MAIN_MODEL_PROVIDER_NAME = "MainModel"
 # Auto-discovered as fallback vision providers when their API key is configured.
 # OpenAI and LinkAI are handled separately (raw HTTP providers), so not listed here.
 _DISCOVERABLE_MODELS = [
-    ("moonshot_api_key", const.MOONSHOT, const.KIMI_K2_5, "Moonshot"),
+    ("moonshot_api_key", const.MOONSHOT, const.KIMI_K2_6, "Moonshot"),
    ("ark_api_key", const.DOUBAO, const.DOUBAO_SEED_2_PRO, "Doubao"),
    ("dashscope_api_key", const.QWEN_DASHSCOPE, const.QWEN36_PLUS, "DashScope"),
    ("claude_api_key", const.CLAUDEAPI, const.CLAUDE_4_6_SONNET, "Claude"),
--- a/app.py
+++ b/app.py
@@ -274,6 +274,39 @@ def sigterm_handler_wrap(_signo):
    signal.signal(_signo, func)


+def _sync_builtin_skills():
+    """Sync builtin skills from project skills/ to workspace skills/ on startup."""
+    import shutil
+    try:
+        workspace = conf().get("agent_workspace", "~/cow")
+        workspace = os.path.expanduser(workspace)
+        project_root = os.path.dirname(os.path.abspath(__file__))
+        builtin_dir = os.path.join(project_root, "skills")
+        custom_dir = os.path.join(workspace, "skills")
+
+        if not os.path.isdir(builtin_dir):
+            return
+
+        os.makedirs(custom_dir, exist_ok=True)
+        synced = 0
+        for name in os.listdir(builtin_dir):
+            src = os.path.join(builtin_dir, name)
+            if not os.path.isdir(src) or not os.path.isfile(os.path.join(src, "SKILL.md")):
+                continue
+            dst = os.path.join(custom_dir, name)
+            try:
+                if os.path.isdir(dst):
+                    shutil.rmtree(dst)
+                shutil.copytree(src, dst)
+                synced += 1
+            except Exception as e:
+                logger.warning(f"[App] Failed to sync builtin skill '{name}': {e}")
+        if synced:
+            logger.info(f"[App] Synced {synced} builtin skill(s) to workspace")
+    except Exception as e:
+        logger.warning(f"[App] Builtin skills sync failed: {e}")
+
+
 def run():
    global _channel_mgr
    try:
@@ -299,6 +332,9 @@ def run():
        if web_console_enabled and "web" not in channel_names:
            channel_names.append("web")

+        # Sync builtin skills to workspace before channels start
+        _sync_builtin_skills()
+
        logger.info(f"[App] Starting channels: {channel_names}")

        _channel_mgr = ChannelManager()
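Review note: `_sync_builtin_skills` replaces each builtin skill directory in the workspace wholesale. The sketch below reproduces that behavior against temporary directories; `sync_skills` and the directory names are hypothetical, and per-skill error handling is omitted.

```python
import os
import shutil
import tempfile


def sync_skills(builtin_dir: str, custom_dir: str) -> int:
    """Copy every builtin skill (a directory containing SKILL.md) into custom_dir."""
    if not os.path.isdir(builtin_dir):
        return 0
    os.makedirs(custom_dir, exist_ok=True)
    synced = 0
    for name in sorted(os.listdir(builtin_dir)):
        src = os.path.join(builtin_dir, name)
        if not os.path.isdir(src) or not os.path.isfile(os.path.join(src, "SKILL.md")):
            continue
        dst = os.path.join(custom_dir, name)
        if os.path.isdir(dst):
            shutil.rmtree(dst)  # drop the old copy, including stale files
        shutil.copytree(src, dst)
        synced += 1
    return synced


with tempfile.TemporaryDirectory() as root:
    builtin = os.path.join(root, "project", "skills")
    workspace = os.path.join(root, "workspace", "skills")
    os.makedirs(os.path.join(builtin, "demo"))
    with open(os.path.join(builtin, "demo", "SKILL.md"), "w", encoding="utf-8") as fh:
        fh.write("v2")
    os.makedirs(os.path.join(builtin, "not_a_skill"))  # no SKILL.md, so skipped
    # Existing workspace state: a stale file in "demo" and a user-only skill.
    os.makedirs(os.path.join(workspace, "demo"))
    with open(os.path.join(workspace, "demo", "stale.txt"), "w", encoding="utf-8") as fh:
        fh.write("old")
    os.makedirs(os.path.join(workspace, "mine"))
    with open(os.path.join(workspace, "mine", "SKILL.md"), "w", encoding="utf-8") as fh:
        fh.write("custom")

    synced_count = sync_skills(builtin, workspace)
    stale_removed = not os.path.exists(os.path.join(workspace, "demo", "stale.txt"))
    skill_copied = os.path.isfile(os.path.join(workspace, "demo", "SKILL.md"))
    custom_kept = os.path.isfile(os.path.join(workspace, "mine", "SKILL.md"))
```

Skills that exist only in the workspace are left alone, but any local edits to a skill that shares a name with a builtin one are overwritten on every startup.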
--- a/bridge/agent_bridge.py
+++ b/bridge/agent_bridge.py
@@ -160,13 +160,23 @@ class AgentLLMModel(LLMModel):
                    kwargs['system'] = system_prompt

                # Pass context metadata to bot
-                channel_type = getattr(self, 'channel_type', None)
+                channel_type = getattr(self, 'channel_type', None) or ''
                if channel_type:
                    kwargs['channel_type'] = channel_type
                session_id = getattr(self, 'session_id', None)
                if session_id:
                    kwargs['session_id'] = session_id

+                # Thinking mode is a global toggle independent of the channel.
+                # IM channels (WeChat/WeCom/DingTalk/Feishu) won't render the
+                # reasoning trace, but still benefit from the higher answer
+                # quality the thinking pass produces.
+                from config import conf
+                kwargs['thinking'] = (
+                    {"type": "enabled"} if conf().get("enable_thinking", False)
+                    else {"type": "disabled"}
+                )
+
                response = self.bot.call_with_tools(**kwargs)
                return self._format_response(response)
            else:
@@ -205,13 +215,23 @@ class AgentLLMModel(LLMModel):
                    kwargs['system'] = system_prompt

                # Pass context metadata to bot
-                channel_type = getattr(self, 'channel_type', None)
+                channel_type = getattr(self, 'channel_type', None) or ''
                if channel_type:
                    kwargs['channel_type'] = channel_type
                session_id = getattr(self, 'session_id', None)
                if session_id:
                    kwargs['session_id'] = session_id

+                # Thinking mode is a global toggle independent of the channel.
+                # IM channels (WeChat/WeCom/DingTalk/Feishu) won't render the
+                # reasoning trace, but still benefit from the higher answer
+                # quality the thinking pass produces.
+                from config import conf
+                kwargs['thinking'] = (
+                    {"type": "enabled"} if conf().get("enable_thinking", False)
+                    else {"type": "disabled"}
+                )
+
                stream = self.bot.call_with_tools(**kwargs)
                
                # Convert stream format to our expected format
@@ -430,7 +450,7 @@ class AgentBridge:
                        except Exception as e:
                            logger.warning(f"[AgentBridge] Failed to clear DB after recovery: {e}")
            
-            # Check if there are files to send (from read tool)
+            # Check if there are files to send (from send/read tool)
            if hasattr(agent, 'stream_executor') and hasattr(agent.stream_executor, 'files_to_send'):
                files_to_send = agent.stream_executor.files_to_send
                if files_to_send:
@@ -592,18 +612,55 @@ class AgentBridge:
            from config import conf
            if not conf().get("conversation_persistence", True):
                return
+            # When deep-thinking display is disabled, strip "thinking" content
+            # blocks before persisting so they don't resurface on history reload.
+            # The in-memory message list keeps them intact for this run's
+            # multi-turn LLM context.
+            thinking_enabled = bool(conf().get("enable_thinking", False))
        except Exception:
-            pass
+            thinking_enabled = False
+
+        messages_to_store = new_messages
+        if not thinking_enabled:
+            messages_to_store = self._strip_thinking_blocks(new_messages)
+
        try:
            from agent.memory import get_conversation_store
            get_conversation_store().append_messages(
-                session_id, new_messages, channel_type=channel_type
+                session_id, messages_to_store, channel_type=channel_type
            )
        except Exception as e:
            logger.warning(
                f"[AgentBridge] Failed to persist messages for session={session_id}: {e}"
            )

+    @staticmethod
+    def _strip_thinking_blocks(messages: list) -> list:
+        """Return a shallow copy of messages with assistant "thinking" blocks removed."""
+        cleaned = []
+        for msg in messages:
+            if not isinstance(msg, dict):
+                cleaned.append(msg)
+                continue
+            if msg.get("role") != "assistant":
+                cleaned.append(msg)
+                continue
+            content = msg.get("content")
+            if not isinstance(content, list):
+                cleaned.append(msg)
+                continue
+            filtered_blocks = [
+                b for b in content
+                if not (isinstance(b, dict) and b.get("type") == "thinking")
+            ]
+            if len(filtered_blocks) == len(content):
+                cleaned.append(msg)
+            else:
+                new_msg = dict(msg)
+                new_msg["content"] = filtered_blocks
+                cleaned.append(new_msg)
+        return cleaned
+
    def clear_session(self, session_id: str):
        """
        Clear a specific session's agent and conversation history
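Review note: `_strip_thinking_blocks` is meant to filter what gets persisted without touching the in-memory history. The sketch below shows that copy-on-change behavior with a condensed, hypothetical `strip_thinking_blocks`; the message shapes follow the Claude-style content blocks used in this diff.

```python
def strip_thinking_blocks(messages: list) -> list:
    """Return a new list where assistant messages have no "thinking" blocks."""
    cleaned = []
    for msg in messages:
        content = msg.get("content") if isinstance(msg, dict) else None
        if (
            not isinstance(msg, dict)
            or msg.get("role") != "assistant"
            or not isinstance(content, list)
        ):
            cleaned.append(msg)  # passed through unchanged
            continue
        kept = [
            b for b in content
            if not (isinstance(b, dict) and b.get("type") == "thinking")
        ]
        # Copy the message only when something was actually removed.
        cleaned.append(msg if len(kept) == len(content) else {**msg, "content": kept})
    return cleaned


history = [
    {"role": "user", "content": [{"type": "text", "text": "hi"}]},
    {"role": "assistant", "content": [
        {"type": "thinking", "thinking": "private reasoning"},
        {"type": "text", "text": "hello"},
    ]},
]
stored = strip_thinking_blocks(history)
```

The original `history` still carries the thinking block after the call, so the current run's multi-turn context is unaffected while the stored copy omits it.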
--- a/bridge/agent_initializer.py
+++ b/bridge/agent_initializer.py
@@ -548,14 +548,17 @@ class AgentInitializer:
        import threading

        def _daily_flush_loop():
+            import random
            while True:
                try:
                    now = datetime.datetime.now()
-                    target = now.replace(hour=23, minute=55, second=0, microsecond=0)
+                    jitter_min = random.randint(50, 55)
+                    jitter_sec = random.randint(0, 59)
+                    target = now.replace(hour=23, minute=jitter_min, second=jitter_sec, microsecond=0)
                    if target <= now:
                        target += datetime.timedelta(days=1)
                    wait_seconds = (target - now).total_seconds()
-                    logger.info(f"[DailyFlush] Next flush at {target.strftime('%Y-%m-%d %H:%M')} (in {wait_seconds/3600:.1f}h)")
+                    logger.info(f"[DailyFlush] Next flush at {target.strftime('%Y-%m-%d %H:%M:%S')} (in {wait_seconds/3600:.1f}h)")
                    time.sleep(wait_seconds)

                    self._flush_all_agents()
@@ -567,7 +570,7 @@ class AgentInitializer:
        t.start()

    def _flush_all_agents(self):
-        """Flush memory for all active agent sessions."""
+        """Flush memory for all active agent sessions, then run Deep Dream."""
        agents = []
        if self.agent_bridge.default_agent:
            agents.append(("default", self.agent_bridge.default_agent))
@@ -577,7 +580,10 @@ class AgentInitializer:
        if not agents:
            return

+        # Phase 1: flush daily summaries
        flushed = 0
+        flush_threads = []
+        dream_candidate = None
        for label, agent in agents:
            try:
                if not agent.memory_manager:
@@ -589,8 +595,26 @@ class AgentInitializer:
                result = agent.memory_manager.flush_manager.create_daily_summary(messages)
                if result:
                    flushed += 1
+                    t = agent.memory_manager.flush_manager._last_flush_thread
+                    if t:
+                        flush_threads.append(t)
+                if dream_candidate is None:
+                    dream_candidate = agent.memory_manager.flush_manager
            except Exception as e:
                logger.warning(f"[DailyFlush] Failed for session {label}: {e}")

        if flushed:
            logger.info(f"[DailyFlush] Flushed {flushed}/{len(agents)} agent session(s)")
+
+        # Wait for all flush threads to finish (up to 60s each) before dreaming
+        for t in flush_threads:
+            t.join(timeout=60)
+
+        # Phase 2: Deep Dream — distill daily memories → MEMORY.md + dream diary
+        if dream_candidate:
+            try:
+                result = dream_candidate.deep_dream()
+                if result:
+                    logger.info("[DeepDream] Memory distillation completed successfully")
+            except Exception as e:
+                logger.warning(f"[DeepDream] Failed: {e}")
--- a/channel/chat_channel.py
+++ b/channel/chat_channel.py
@@ -297,8 +297,12 @@ class ChatChannel(Channel):
                logger.debug("[chat_channel] sending reply: {}, context: {}".format(reply, context))
                
                # If the reply is text, try to extract and send images
-                if reply.type == ReplyType.TEXT:
+                # Web channel renders images/videos inline via renderMarkdown,
+                # so skip the extract-and-send step to avoid duplicate media.
+                if reply.type == ReplyType.TEXT and context.get("channel_type") != "web":
                    self._extract_and_send_images(reply, context)
+                elif reply.type == ReplyType.TEXT:
+                    self._send(reply, context)
                # If the reply is an image that also carries text content, send the text first, then the image
                elif reply.type == ReplyType.IMAGE_URL and hasattr(reply, 'text_content') and reply.text_content:
                    # Send the text first
--- a/channel/web/chat.html
+++ b/channel/web/chat.html
@@ -213,6 +213,9 @@
            <div id="session-list" class="session-list"></div>
        </aside>

+        <!-- Mobile overlay for session panel (click to close) -->
+        <div id="session-panel-overlay" class="session-panel-overlay hidden" onclick="closeSessionPanel()"></div>
+
        <!-- ================================================================ -->
        <!-- MAIN CONTENT                                                     -->
        <!-- ================================================================ -->
@@ -285,7 +288,7 @@
                <!-- ====================================================== -->
                <!-- VIEW: Chat                                              -->
                <!-- ====================================================== -->
-                <div id="view-chat" class="view active">
+                <div id="view-chat" class="view active relative">
                    <!-- Messages -->
                    <div id="chat-messages" class="flex-1 overflow-y-auto">
                        <!-- Welcome Screen -->
@@ -361,6 +364,18 @@
                        </div>
                    </div>

+                    <!-- Scroll-to-bottom FAB -->
+                    <button id="scroll-to-bottom-btn"
+                            class="hidden absolute right-5 bottom-[80px] z-10
+                                   w-9 h-9 rounded-full shadow-lg
+                                   bg-white dark:bg-[#2A2A2A] border border-slate-200 dark:border-white/15
+                                   text-slate-500 dark:text-slate-400 hover:text-primary-500 dark:hover:text-primary-400
+                                   flex items-center justify-center cursor-pointer transition-all duration-200
+                                   hover:shadow-xl hover:scale-105"
+                            onclick="_autoScrollEnabled = true; scrollChatToBottom(true);">
+                        <i class="fas fa-chevron-down text-sm"></i>
+                    </button>
+
                    <!-- Chat Input -->
                    <div class="flex-shrink-0 border-t border-slate-200 dark:border-white/10 bg-white dark:bg-[#1A1A1A] px-4 py-3">
                        <div class="max-w-3xl mx-auto">
@@ -445,6 +460,9 @@
                                                </div>
                                                <div class="cfg-dropdown-menu"></div>
                                            </div>
+                                            <div id="cfg-custom-tip" class="mt-1.5 text-xs text-slate-400 dark:text-slate-500 hidden">
+                                                <i class="fas fa-info-circle mr-1"></i><span data-i18n="config_custom_tip">接口需遵循 OpenAI API 协议</span>
+                                            </div>
                                        </div>
                                        <!-- Model -->
                                        <div>
@@ -540,6 +558,18 @@
                                                          bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
                                                          focus:outline-none focus:border-primary-500 font-mono transition-colors">
                                        </div>
+                                        <div class="flex items-center justify-between">
+                                            <label class="flex items-center gap-1.5 text-sm font-medium text-slate-600 dark:text-slate-400">
+                                                <span data-i18n="config_enable_thinking">Deep Thinking</span>
+                                                <span class="cfg-tip" data-tip-key="config_enable_thinking_hint"><i class="fas fa-circle-question"></i></span>
+                                            </label>
+                                            <label class="relative inline-flex items-center cursor-pointer">
+                                                <input id="cfg-enable-thinking" type="checkbox" class="sr-only peer">
+                                                <div class="w-9 h-5 bg-slate-200 dark:bg-slate-700 peer-checked:bg-primary-400 rounded-full
+                                                            after:content-[''] after:absolute after:top-[2px] after:left-[2px] after:bg-white
+                                                            after:rounded-full after:h-4 after:w-4 after:transition-all peer-checked:after:translate-x-full"></div>
+                                            </label>
+                                        </div>
                                        <div class="flex items-center justify-end gap-3 pt-1">
                                            <span id="cfg-agent-status" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
                                            <button id="cfg-agent-save"
@@ -648,6 +678,16 @@
                                        <h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="memory_title">记忆管理</h2>
                                        <p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="memory_desc">查看 Agent 记忆文件和内容</p>
                                    </div>
+                                    <div class="flex items-center bg-slate-100 dark:bg-white/10 rounded-lg p-0.5">
+                                        <button id="memory-tab-files" onclick="switchMemoryTab('files')"
+                                                class="memory-tab px-3 py-1.5 rounded-md text-xs font-medium cursor-pointer transition-colors duration-150 active">
+                                            <i class="fas fa-file-lines mr-1.5"></i><span data-i18n="memory_tab_files">记忆文件</span>
+                                        </button>
+                                        <button id="memory-tab-dreams" onclick="switchMemoryTab('dreams')"
+                                                class="memory-tab px-3 py-1.5 rounded-md text-xs font-medium cursor-pointer transition-colors duration-150">
+                                            <i class="fas fa-moon mr-1.5"></i><span data-i18n="memory_tab_dreams">梦境日记</span>
+                                        </button>
+                                    </div>
                                </div>
                                <div id="memory-empty" class="flex flex-col items-center justify-center py-20">
                                    <div class="w-16 h-16 rounded-2xl bg-purple-50 dark:bg-purple-900/20 flex items-center justify-center mb-4">
--- a/channel/web/static/css/console.css
+++ b/channel/web/static/css/console.css
@@ -339,6 +339,23 @@
 }
 .confirm-btn-ok:hover { background: #dc2626; }

+/* Session panel overlay (mobile only, click to close) */
+.session-panel-overlay {
+    display: none;
+}
+@media (max-width: 768px) {
+    .session-panel-overlay {
+        display: block;
+        position: fixed;
+        inset: 0;
+        z-index: 44;
+        background: rgba(0, 0, 0, 0.3);
+    }
+    .session-panel-overlay.hidden {
+        display: none;
+    }
+}
+
 /* Mobile: session panel as overlay */
 @media (max-width: 768px) {
    .session-panel {
@@ -455,9 +472,8 @@
    cursor: pointer;
    user-select: none;
 }
-.agent-thinking-step .thinking-header.no-toggle { cursor: default; }
-.agent-thinking-step .thinking-header:not(.no-toggle):hover { color: #64748b; }
-.dark .agent-thinking-step .thinking-header:not(.no-toggle):hover { color: #cbd5e1; }
+.agent-thinking-step .thinking-header:hover { color: #64748b; }
+.dark .agent-thinking-step .thinking-header:hover { color: #cbd5e1; }
 .agent-thinking-step .thinking-header i:first-child { font-size: 0.625rem; margin-top: 1px; }
 .agent-thinking-step .thinking-chevron {
    font-size: 0.5rem;
@@ -488,6 +504,27 @@
 .agent-thinking-step .thinking-full p { margin: 0.25em 0; }
 .agent-thinking-step .thinking-full p:first-child { margin-top: 0; }
 .agent-thinking-step .thinking-full p:last-child { margin-bottom: 0; }
+.agent-thinking-step .thinking-duration {
+    font-size: 0.625rem;
+    color: #b0b8c4;
+    margin-bottom: 0.375rem;
+}
+/* Streaming reasoning: render as plain pre to avoid expensive markdown
+   re-parsing on every chunk. Wrap long lines so the bubble width is
+   respected and use the same font size/color as the rendered version. */
+.agent-thinking-step .thinking-stream-pre {
+    margin: 0;
+    padding: 0;
+    background: transparent;
+    border: 0;
+    font-family: inherit;
+    font-size: inherit;
+    line-height: 1.5;
+    color: inherit;
+    white-space: pre-wrap;
+    word-break: break-word;
+    overflow-wrap: anywhere;
+}

 /* Content step - real text output frozen before tool calls */
 .agent-content-step {
@@ -886,15 +923,15 @@
   ============================================================ */

 /* Tab toggle */
-.knowledge-tab {
+.knowledge-tab, .memory-tab {
    color: #64748b;
 }
-.knowledge-tab.active {
+.knowledge-tab.active, .memory-tab.active {
    background: #fff;
    color: #334155;
    box-shadow: 0 1px 3px rgba(0,0,0,0.08);
 }
-.dark .knowledge-tab.active {
+.dark .knowledge-tab.active, .dark .memory-tab.active {
    background: rgba(255,255,255,0.1);
    color: #e2e8f0;
 }
@@ -931,13 +968,13 @@
    font-size: 8px;
    transition: transform 0.15s;
 }
-.knowledge-tree-group.open .chevron {
+.knowledge-tree-group.open > .knowledge-tree-group-btn .chevron {
    transform: rotate(90deg);
 }
 .knowledge-tree-group-items {
    display: none;
 }
-.knowledge-tree-group.open .knowledge-tree-group-items {
+.knowledge-tree-group.open > .knowledge-tree-group-items {
    display: block;
 }

@@ -1031,12 +1068,10 @@
 }
 .cfg-tip:hover { color: #64748b; }
 .dark .cfg-tip:hover { color: #cbd5e1; }
-.cfg-tip::after {
-    content: attr(data-tooltip);
-    position: absolute;
-    left: 50%;
-    bottom: calc(100% + 6px);
-    transform: translateX(-50%);
+/* Floating tooltip portal — appended to <body> by JS so it isn't clipped
+   by overflow:hidden ancestors. */
+.cfg-tip-floating {
+    position: fixed;
    padding: 6px 10px;
    border-radius: 8px;
    font-size: 12px;
@@ -1049,13 +1084,13 @@
    opacity: 0;
    pointer-events: none;
    transition: opacity 0.15s;
-    z-index: 50;
+    z-index: 9999;
 }
-.dark .cfg-tip::after {
+.dark .cfg-tip-floating {
    background: #334155;
    color: #f1f5f9;
 }
-.cfg-tip:hover::after {
+.cfg-tip-floating.show {
    opacity: 1;
 }

--- a/channel/web/static/js/console.js
+++ b/channel/web/static/js/console.js
@@ -38,12 +38,14 @@ const I18N = {
        config_max_tokens: '最大上下文 Token', config_max_tokens_hint: '对话中 Agent 能输入的最大 Token 长度，超过后会智能压缩处理',
        config_max_turns: '最大记忆轮次', config_max_turns_hint: '一问一答为一轮，超过后会智能压缩处理',
        config_max_steps: '最大执行步数', config_max_steps_hint: '单次对话中 Agent 最多调用工具的次数',
+        config_enable_thinking: '深度思考', config_enable_thinking_hint: '是否启用深度思考模式',
        config_channel_type: '通道类型',
        config_provider: '模型厂商', config_model_name: '模型',
        config_custom_model_hint: '输入自定义模型名称',
        config_save: '保存', config_saved: '已保存',
        config_save_error: '保存失败',
        config_custom_option: '自定义...',
+        config_custom_tip: '接口需遵循 OpenAI API 协议',
        config_security: '安全设置', config_password: '访问密码',
        config_password_hint: '留空则不启用密码保护',
        config_password_changed: '密码已更新，请重新登录',
@@ -54,6 +56,7 @@ const I18N = {
        skills_section_title: '技能', skill_enable: '启用', skill_disable: '禁用',
        skill_toggle_error: '操作失败，请稍后再试',
        memory_title: '记忆管理', memory_desc: '查看 Agent 记忆文件和内容',
+        memory_tab_files: '记忆文件', memory_tab_dreams: '梦境日记',
        memory_loading: '加载记忆文件中...', memory_loading_desc: '记忆文件将显示在此处',
        memory_back: '返回列表',
        memory_col_name: '文件名', memory_col_type: '类型', memory_col_size: '大小', memory_col_updated: '更新时间',
@@ -92,6 +95,7 @@ const I18N = {
        confirm_yes: '确认',
        confirm_cancel: '取消',
        error_send: '发送失败，请稍后再试。', error_timeout: '请求超时，请再试一次。',
+        thinking_in_progress: '思考中...', thinking_done: '已深度思考', thinking_duration: '耗时',
    },
    en: {
        console: 'Console',
@@ -120,12 +124,14 @@ const I18N = {
        config_max_tokens: 'Max Context Tokens', config_max_tokens_hint: 'Max tokens the Agent can input per conversation, auto-compressed when exceeded',
        config_max_turns: 'Max Memory Turns', config_max_turns_hint: 'One Q&A pair = one turn, auto-compressed when exceeded',
        config_max_steps: 'Max Steps', config_max_steps_hint: 'Max tool calls the Agent can make in a single conversation',
+        config_enable_thinking: 'Deep Thinking', config_enable_thinking_hint: 'Enable deep thinking mode',
        config_channel_type: 'Channel Type',
        config_provider: 'Provider', config_model_name: 'Model',
        config_custom_model_hint: 'Enter custom model name',
        config_save: 'Save', config_saved: 'Saved',
        config_save_error: 'Save failed',
        config_custom_option: 'Custom...',
+        config_custom_tip: 'Endpoint must follow the OpenAI API protocol',
        config_security: 'Security', config_password: 'Password',
        config_password_hint: 'Leave empty to disable password protection',
        config_password_changed: 'Password updated, please re-login',
@@ -136,6 +142,7 @@ const I18N = {
        skills_section_title: 'Skills', skill_enable: 'Enable', skill_disable: 'Disable',
        skill_toggle_error: 'Operation failed, please try again',
        memory_title: 'Memory', memory_desc: 'View agent memory files and contents',
+        memory_tab_files: 'Memory Files', memory_tab_dreams: 'Dream Diary',
        memory_loading: 'Loading memory files...', memory_loading_desc: 'Memory files will be displayed here',
        memory_back: 'Back to list',
        memory_col_name: 'Filename', memory_col_type: 'Type', memory_col_size: 'Size', memory_col_updated: 'Updated',
@@ -174,6 +181,7 @@ const I18N = {
        confirm_yes: 'Confirm',
        confirm_cancel: 'Cancel',
        error_send: 'Failed to send. Please try again.', error_timeout: 'Request timeout. Please try again.',
+        thinking_in_progress: 'Thinking...', thinking_done: 'Thought', thinking_duration: 'Duration',
    }
 };

@@ -196,8 +204,9 @@ function applyI18n() {
    document.querySelectorAll('[data-tip-key]').forEach(el => {
        el.setAttribute('data-tooltip', t(el.dataset.tipKey));
    });
+    installCfgTipPortal();
    const langLabel = document.getElementById('lang-label');
-    if (langLabel) langLabel.textContent = currentLang === 'zh' ? 'EN' : '中文';
+    if (langLabel) langLabel.textContent = currentLang === 'zh' ? '中文' : 'EN';
 }

 function toggleLanguage() {
@@ -207,6 +216,54 @@ function toggleLanguage() {
    _applyInputTooltips();
 }

+// Floating tooltip portal for [data-tip-key] elements. Tooltip nodes are
+// appended to <body> so they aren't clipped by overflow:hidden ancestors
+// (e.g. the config panel's scroll container).
+let _cfgTipPortalEl = null;
+let _cfgTipPortalInstalled = false;
+function installCfgTipPortal() {
+    if (_cfgTipPortalInstalled) return;
+    _cfgTipPortalInstalled = true;
+
+    const showTip = (target) => {
+        const text = target.getAttribute('data-tooltip');
+        if (!text) return;
+        if (!_cfgTipPortalEl) {
+            _cfgTipPortalEl = document.createElement('div');
+            _cfgTipPortalEl.className = 'cfg-tip-floating';
+            document.body.appendChild(_cfgTipPortalEl);
+        }
+        _cfgTipPortalEl.textContent = text;
+        const rect = target.getBoundingClientRect();
+        // Render once to measure, then position above the target, centered.
+        _cfgTipPortalEl.style.left = '0px';
+        _cfgTipPortalEl.style.top = '0px';
+        _cfgTipPortalEl.classList.add('show');
+        const tipRect = _cfgTipPortalEl.getBoundingClientRect();
+        let left = rect.left + rect.width / 2 - tipRect.width / 2;
+        // Clamp horizontally to the viewport with an 8px gutter.
+        left = Math.max(8, Math.min(left, window.innerWidth - tipRect.width - 8));
+        const top = rect.top - tipRect.height - 6;
+        _cfgTipPortalEl.style.left = left + 'px';
+        _cfgTipPortalEl.style.top = top + 'px';
+    };
+    const hideTip = () => {
+        if (_cfgTipPortalEl) _cfgTipPortalEl.classList.remove('show');
+    };
+
+    document.addEventListener('mouseover', (e) => {
+        const target = e.target.closest('[data-tip-key]');
+        if (target) showTip(target);
+    });
+    document.addEventListener('mouseout', (e) => {
+        const target = e.target.closest('[data-tip-key]');
+        if (target) hideTip();
+    });
+    // Hide on scroll/resize so the tooltip doesn't drift away from its anchor.
+    window.addEventListener('scroll', hideTip, true);
+    window.addEventListener('resize', hideTip);
+}
+
 // =====================================================================
 // Theme
 // =====================================================================
@@ -331,18 +388,59 @@ function createMd() {
 const md = createMd();

 const VIDEO_EXT_RE = /\.(?:mp4|webm|mov|avi|mkv)$/i;  // tested against URL without query string
+const IMAGE_EXT_RE = /\.(?:jpg|jpeg|png|gif|webp|bmp|svg)$/i;  // tested against URL without query string
+
+function _toWebUrl(url) {
+    if (/^\/[A-Za-z]/.test(url) && !url.startsWith('/api/')) {
+        return '/api/file?path=' + encodeURIComponent(url);
+    }
+    if (/^file:\/\/\//i.test(url)) {
+        return '/api/file?path=' + encodeURIComponent(url.replace(/^file:\/\/\//i, '/'));
+    }
+    return url;
+}

 function _buildVideoHtml(url) {
+    const webUrl = _toWebUrl(url);
    const fileName = url.split('/').pop().split('?')[0];
    return `<div style="margin:10px 0;">` +
        `<video controls preload="metadata" ` +
        `style="max-width:100%;border-radius:10px;box-shadow:0 2px 8px rgba(0,0,0,0.15);display:block;">` +
-        `<source src="${url}"></video>` +
-        `<a href="${url}" target="_blank" ` +
+        `<source src="${webUrl}"></video>` +
+        `<a href="${webUrl}" target="_blank" ` +
        `style="display:inline-flex;align-items:center;gap:4px;margin-top:4px;font-size:12px;color:#8b8fa8;text-decoration:none;">` +
        `<i class="fas fa-download"></i> ${escapeHtml(fileName)}</a></div>`;
 }

+function _openImageLightbox(src) {
+    let overlay = document.getElementById('cow-lightbox');
+    if (!overlay) {
+        overlay = document.createElement('div');
+        overlay.id = 'cow-lightbox';
+        overlay.style.cssText = 'position:fixed;inset:0;z-index:9999;background:rgba(0,0,0,0.85);display:flex;align-items:center;justify-content:center;cursor:zoom-out;opacity:0;transition:opacity .2s';
+        overlay.onclick = () => { overlay.style.opacity = '0'; setTimeout(() => overlay.style.display = 'none', 200); };
+        const img = document.createElement('img');
+        img.id = 'cow-lightbox-img';
+        img.style.cssText = 'max-width:92vw;max-height:92vh;border-radius:8px;box-shadow:0 4px 24px rgba(0,0,0,0.5);object-fit:contain;';
+        img.onclick = (e) => e.stopPropagation();
+        overlay.appendChild(img);
+        document.body.appendChild(overlay);
+    }
+    overlay.querySelector('#cow-lightbox-img').src = src;
+    overlay.style.display = 'flex';
+    requestAnimationFrame(() => overlay.style.opacity = '1');
+}
+
+function _buildImageHtml(url) {
+    const webUrl = _toWebUrl(url);
+    const safeUrl = webUrl.replace(/"/g, '&quot;');
+    return `<div style="margin:10px 0;">` +
+        `<img src="${safeUrl}" alt="image" loading="lazy" ` +
+        `onclick="_openImageLightbox(this.src)" ` +
+        `style="max-width:520px;width:100%;border-radius:10px;box-shadow:0 2px 8px rgba(0,0,0,0.15);display:block;cursor:zoom-in;">` +
+        `</div>`;
+}
+
 function injectVideoPlayers(html) {
    // Step 1: replace markdown-it anchor tags whose href points to a video file.
    const step1 = html.replace(
@@ -361,10 +459,43 @@ function injectVideoPlayers(html) {
    }).join('');
 }

+// Convert image URLs into inline <img> previews. Mirrors injectVideoPlayers but for images.
+// Covers two of the three shapes markdown-it can produce:
+//   1. <a href="...image.jpg">...</a>  (bare URL or autolink that linkify turned into an anchor)
+//   2. raw URL still present in a text node, only as a safety net
+// Existing <img src="..."> tags (markdown image syntax) are not touched here;
+// _rewriteLocalImgSrc rewrites their src and adds the lightbox click handler.
+function injectImagePreviews(html) {
+    // Step 1: anchor whose href points to an image file -> replace with <img> preview.
+    const step1 = html.replace(
+        /<a\s+href="(https?:\/\/[^"]+)"[^>]*>[^<]*<\/a>/gi,
+        (match, url) => IMAGE_EXT_RE.test(url.split('?')[0]) ? _buildImageHtml(url) : match
+    );
+    // Step 2: bare image URLs left in text nodes (rare — markdown-it's linkify usually catches them).
+    return step1.split(/(<[^>]+>)/).map((chunk, idx) => {
+        if (idx % 2 !== 0) return chunk;
+        return chunk.replace(/https?:\/\/\S+/gi, (url) => {
+            const bare = url.replace(/[),.\s]+$/, '');
+            return IMAGE_EXT_RE.test(bare.split('?')[0]) ? _buildImageHtml(bare) : url;
+        });
+    }).join('');
+}
+
+function _rewriteLocalImgSrc(html) {
+    return html.replace(/<img\s([^>]*?)src="([^"]+)"([^>]*?)>/gi, (match, pre, src, post) => {
+        const webSrc = _toWebUrl(src);
+        const safeSrc = webSrc.replace(/"/g, '&quot;');
+        const hasClick = /onclick/i.test(pre + post);
+        const clickAttr = hasClick ? '' : ` onclick="_openImageLightbox(this.src)" style="cursor:zoom-in;"`;
+        return `<img ${pre}src="${safeSrc}"${post}${clickAttr}>`;
+    });
+}
+
 function renderMarkdown(text) {
    try {
-        const html = md.render(text);
-        return injectVideoPlayers(html);
+        let html = md.render(text);
+        html = _rewriteLocalImgSrc(html);
+        // Order matters: video first (more specific), then image.
+        return injectImagePreviews(injectVideoPlayers(html));
    }
    catch (e) { return text.replace(/\n/g, '<br>'); }
 }
@@ -421,6 +552,46 @@ const chatInput = document.getElementById('chat-input');
 const sendBtn = document.getElementById('send-btn');
 const messagesDiv = document.getElementById('chat-messages');
 const fileInput = document.getElementById('file-input');
+
+// Smart auto-scroll: pause when user scrolls up, resume when near bottom
+let _autoScrollEnabled = true;
+const _SCROLL_THRESHOLD = 80; // px from bottom to re-enable auto-scroll
+
+messagesDiv.addEventListener('scroll', () => {
+    const distFromBottom = messagesDiv.scrollHeight - messagesDiv.scrollTop - messagesDiv.clientHeight;
+    _autoScrollEnabled = distFromBottom <= _SCROLL_THRESHOLD;
+    _updateScrollToBottomBtn();
+});
+
+// Handle copy-button clicks and intercept internal navigation links in chat messages
+messagesDiv.addEventListener('click', (e) => {
+    const copyBtn = e.target.closest('.copy-msg-btn');
+    if (copyBtn) {
+        e.preventDefault();
+        const msgRoot = copyBtn.closest('.flex.gap-3');
+        const answerEl = msgRoot && msgRoot.querySelector('.answer-content');
+        const rawMd = answerEl && answerEl.dataset.rawMd;
+        if (rawMd) {
+            navigator.clipboard.writeText(rawMd).then(() => {
+                const icon = copyBtn.querySelector('i');
+                if (icon) { icon.className = 'fas fa-check'; setTimeout(() => { icon.className = 'fas fa-copy'; }, 1500); }
+            });
+        }
+        return;
+    }
+    const a = e.target.closest('a');
+    if (!a) return;
+    const href = a.getAttribute('href') || '';
+    if (href === '/memory/dreams') {
+        e.preventDefault();
+        navigateTo('memory');
+        setTimeout(() => switchMemoryTab('dreams'), 50);
+    } else if (href === '/memory/MEMORY.md') {
+        e.preventDefault();
+        navigateTo('memory');
+        setTimeout(() => { switchMemoryTab('files'); openMemoryFile('MEMORY.md', 'memory'); }, 50);
+    }
+});
 const attachmentPreview = document.getElementById('attachment-preview');

 // Pending attachments: [{file_path, file_name, file_type, preview_url}]
@@ -560,6 +731,7 @@ const SLASH_COMMANDS = [
    { cmd: '/skill info ',         desc: '查看技能详情' },
    { cmd: '/skill enable ',       desc: '启用技能' },
    { cmd: '/skill disable ',      desc: '禁用技能' },
+    { cmd: '/memory dream ',        desc: '手动触发记忆蒸馏 (可指定天数, 默认3)' },
    { cmd: '/knowledge',            desc: '查看知识库统计' },
    { cmd: '/knowledge list',      desc: '查看知识库文件树' },
    { cmd: '/knowledge on',        desc: '开启知识库' },
@@ -892,6 +1064,7 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
    let currentToolEl = null;
    let currentReasoningEl = null;  // live reasoning bubble
    let reasoningText = '';
+    let reasoningStartTime = 0;
    let done = false;

    const MAX_RECONNECTS = 10;
@@ -912,7 +1085,12 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                    <div class="answer-content sse-streaming"></div>
                    <div class="media-content"></div>
                </div>
-                <div class="text-xs text-slate-400 dark:text-slate-500 mt-1.5">${formatTime(timestamp)}</div>
+                <div class="flex items-center gap-2 mt-1.5">
+                    <span class="text-xs text-slate-400 dark:text-slate-500">${formatTime(timestamp)}</span>
+                    <button class="copy-msg-btn text-xs text-slate-300 dark:text-slate-600 hover:text-slate-500 dark:hover:text-slate-400 transition-colors cursor-pointer" title="${currentLang === 'zh' ? '复制' : 'Copy'}" style="display:none">
+                        <i class="fas fa-copy"></i>
+                    </button>
+                </div>
            </div>
        `;
        messagesDiv.appendChild(botEl);
@@ -936,28 +1114,68 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                ensureBotEl();
                reasoningText += item.content;
                if (!currentReasoningEl) {
+                    reasoningStartTime = Date.now();
                    currentReasoningEl = document.createElement('div');
                    currentReasoningEl.className = 'agent-step agent-thinking-step';
+                    // During streaming, use a <pre> with a single text node and
+                    // append-only updates. This avoids re-parsing markdown and
+                    // re-setting innerHTML on every chunk, which is what causes
+                    // the page to crash on long chains-of-thought.
                    currentReasoningEl.innerHTML = `
                        <div class="thinking-header" onclick="this.parentElement.classList.toggle('expanded')">
                            <i class="fas fa-lightbulb text-amber-400 flex-shrink-0"></i>
-                            <span class="thinking-summary"></span>
+                            <span class="thinking-summary">${t('thinking_in_progress')}</span>
                            <i class="fas fa-chevron-right thinking-chevron"></i>
                        </div>
-                        <div class="thinking-full"></div>`;
+                        <div class="thinking-full"><pre class="thinking-stream-pre"></pre></div>`;
                    stepsEl.appendChild(currentReasoningEl);
+                    const preEl = currentReasoningEl.querySelector('.thinking-stream-pre');
+                    preEl.appendChild(document.createTextNode(''));
+                    currentReasoningEl._streamTextNode = preEl.firstChild;
+                    currentReasoningEl._streamPendingText = '';
+                    currentReasoningEl._streamRafScheduled = false;
+                    currentReasoningEl._streamCharsRendered = 0;
+                    currentReasoningEl._streamCapped = false;
+                }
+                // Hard cap: once REASONING_RENDER_CAP chars are in the DOM, stop
+                // appending further deltas. The full text is still kept in
+                // `reasoningText` for finalize-time head+tail rendering.
+                if (!currentReasoningEl._streamCapped) {
+                    currentReasoningEl._streamPendingText += item.content;
+                    if (!currentReasoningEl._streamRafScheduled) {
+                        currentReasoningEl._streamRafScheduled = true;
+                        const elRef = currentReasoningEl;
+                        requestAnimationFrame(() => {
+                            elRef._streamRafScheduled = false;
+                            if (!elRef.isConnected || !elRef._streamTextNode) return;
+                            let pending = elRef._streamPendingText;
+                            elRef._streamPendingText = '';
+                            if (!pending) return;
+                            const remaining = REASONING_RENDER_CAP - elRef._streamCharsRendered;
+                            if (pending.length > remaining) {
+                                // Clamp to the cap (remaining is 0 when an earlier chunk
+                                // ended exactly on it) and stop streaming further deltas.
+                                pending = pending.slice(0, Math.max(0, remaining));
+                                elRef._streamCapped = true;
+                            }
+                            if (pending) {
+                                elRef._streamTextNode.appendData(pending);
+                                elRef._streamCharsRendered += pending.length;
+                            }
+                            if (elRef._streamCapped) {
+                                elRef._streamTextNode.appendData(
+                                    '\n\n... [reasoning truncated for display] ...'
+                                );
+                            }
+                            scrollChatToBottom();
+                        });
+                    }
                }
-                const oneLine = reasoningText.trim().replace(/\n+/g, ' ');
-                currentReasoningEl.querySelector('.thinking-summary').textContent =
-                    oneLine.length > 80 ? oneLine.substring(0, 80) + '…' : oneLine;
-                currentReasoningEl.querySelector('.thinking-full').innerHTML = renderMarkdown(reasoningText);
-                scrollChatToBottom();

            } else if (item.type === 'delta') {
                ensureBotEl();
                if (currentReasoningEl) {
-                    if (reasoningText.trim().replace(/\n+/g, ' ').length <= 80)
-                        currentReasoningEl.classList.add('no-expand');
+                    finalizeThinking(currentReasoningEl, reasoningStartTime, reasoningText);
                    currentReasoningEl = null;
                    reasoningText = '';
                }
@@ -980,8 +1198,7 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
            } else if (item.type === 'tool_start') {
                ensureBotEl();
                if (currentReasoningEl) {
-                    if (reasoningText.trim().replace(/\n+/g, ' ').length <= 80)
-                        currentReasoningEl.classList.add('no-expand');
+                    finalizeThinking(currentReasoningEl, reasoningStartTime, reasoningText);
                    currentReasoningEl = null;
                    reasoningText = '';
                }
@@ -1040,8 +1257,8 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                const imgEl = document.createElement('img');
                imgEl.src = item.content;
                imgEl.alt = 'screenshot';
-                imgEl.style.cssText = 'max-width:600px;border-radius:8px;margin:8px 0;cursor:pointer;box-shadow:0 1px 4px rgba(0,0,0,0.1);';
-                imgEl.onclick = () => window.open(item.content, '_blank');
+                imgEl.style.cssText = 'max-width:600px;border-radius:8px;margin:8px 0;cursor:zoom-in;box-shadow:0 1px 4px rgba(0,0,0,0.1);';
+                imgEl.onclick = () => _openImageLightbox(imgEl.src);
                mediaEl.appendChild(imgEl);
                scrollChatToBottom();

@@ -1096,8 +1313,10 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
                    addBotMessage(finalText, new Date((item.timestamp || Date.now() / 1000) * 1000), requestId);
                } else if (botEl) {
                    contentEl.classList.remove('sse-streaming');
-                    // Only update text content when there is something new to show.
                    if (finalText) contentEl.innerHTML = renderMarkdown(finalText);
+                    contentEl.dataset.rawMd = finalText || '';
+                    const copyBtn = botEl.querySelector('.copy-msg-btn');
+                    if (copyBtn && finalText) copyBtn.style.display = '';
                    applyHighlighting(botEl);
                }
                scrollChatToBottom();
@@ -1125,8 +1344,7 @@ function startSSE(requestId, loadingEl, timestamp, titleInfo) {
            if (done) return;

            if (currentReasoningEl) {
-                if (reasoningText.trim().replace(/\n+/g, ' ').length <= 80)
-                    currentReasoningEl.classList.add('no-expand');
+                finalizeThinking(currentReasoningEl, reasoningStartTime, reasoningText);
                currentReasoningEl = null;
                reasoningText = '';
            }
@@ -1250,28 +1468,54 @@ function renderToolCallsHtml(toolCalls) {
    }).join('');
 }

+// Cap for rendering reasoning content in the bubble. Beyond this size,
+// we skip markdown rendering entirely and show plain text head + tail to
+// keep the page responsive (very long chains-of-thought can otherwise
+// stall or crash the browser when re-parsed by marked.js).
+// Keep this in sync with backend MAX_STORED_REASONING_CHARS and
+// MAX_REASONING_STREAM_CHARS so storage / SSE / display stay aligned.
+const REASONING_RENDER_CAP = 4 * 1024; // 4096 chars (string length, not bytes)
+
+function _truncateReasoningForDisplay(text) {
+    if (!text || text.length <= REASONING_RENDER_CAP) return { text: text || '', truncated: false, omitted: 0 };
+    const half = Math.floor(REASONING_RENDER_CAP / 2);
+    const head = text.slice(0, half);
+    const tail = text.slice(-half);
+    return {
+        text: head + '\n\n... [' + (text.length - head.length - tail.length) + ' chars omitted] ...\n\n' + tail,
+        truncated: true,
+        omitted: text.length - head.length - tail.length,
+    };
+}
+
+function _renderReasoningBody(text) {
+    // For short reasoning, render as markdown. For long ones, fall back to
+    // an escaped <pre> block to avoid expensive markdown parsing.
+    const { text: shown, truncated } = _truncateReasoningForDisplay(text);
+    if (truncated || shown.length > REASONING_RENDER_CAP) {
+        return '<pre class="thinking-stream-pre">' + escapeHtml(shown) + '</pre>';
+    }
+    return renderMarkdown(shown);
+}
+
+function finalizeThinking(el, startTime, text) {
+    const elapsed = ((Date.now() - startTime) / 1000).toFixed(1);
+    el.querySelector('.thinking-summary').textContent = t('thinking_done');
+    const fullDiv = el.querySelector('.thinking-full');
+    fullDiv.innerHTML = `<div class="thinking-duration">${t('thinking_duration')} ${elapsed}s</div>` + _renderReasoningBody(text);
+}
+
 function renderThinkingHtml(text) {
    if (!text || !text.trim()) return '';
    const full = text.trim();
-    const oneLine = full.replace(/\n+/g, ' ');
-    if (oneLine.length > 80) {
-        const truncated = oneLine.substring(0, 80) + '…';
-        return `
+    return `
 <div class="agent-step agent-thinking-step">
    <div class="thinking-header" onclick="this.parentElement.classList.toggle('expanded')">
        <i class="fas fa-lightbulb text-amber-400 flex-shrink-0"></i>
-        <span class="thinking-summary">${escapeHtml(truncated)}</span>
+        <span class="thinking-summary">${t('thinking_done')}</span>
        <i class="fas fa-chevron-right thinking-chevron"></i>
    </div>
-    <div class="thinking-full">${renderMarkdown(full)}</div>
-</div>`;
-    }
-    return `
-<div class="agent-step agent-thinking-step no-expand">
-    <div class="thinking-header no-toggle">
-        <i class="fas fa-lightbulb text-amber-400 flex-shrink-0"></i>
-        <span>${escapeHtml(oneLine)}</span>
-    </div>
+    <div class="thinking-full">${_renderReasoningBody(full)}</div>
 </div>`;
 }

@@ -1318,11 +1562,40 @@ function renderStepsHtml(steps) {
        </div>` : ''}
    </div>
 </div>`;
+            // If this tool sent a file (send/read tool), render the media inline
+            // so it persists across page refreshes (SSE-only file events are not stored).
+            const mediaHtml = _renderSentFileFromToolResult(step);
+            if (mediaHtml) html += mediaHtml;
        }
    }
    return { stepsHtml: html, lastContentText };
 }

+// Extract file-to-send metadata from a tool's result and render an inline preview.
+// Returns '' if the result isn't a file_to_send payload.
+function _renderSentFileFromToolResult(step) {
+    if (!step || !step.result) return '';
+    let payload;
+    try {
+        payload = typeof step.result === 'string' ? JSON.parse(step.result) : step.result;
+    } catch (_) { return ''; }
+    if (!payload || payload.type !== 'file_to_send' || !payload.path) return '';
+    const webUrl = _toWebUrl(payload.path);
+    const fileType = payload.file_type || 'file';
+    const fileName = payload.file_name || payload.path.split(/[\\/]/).pop();
+    if (fileType === 'image') {
+        return `<div class="agent-step">${_buildImageHtml(webUrl)}</div>`;
+    }
+    if (fileType === 'video') {
+        return `<div class="agent-step">${_buildVideoHtml(webUrl)}</div>`;
+    }
+    return `<div class="agent-step"><a href="${escapeHtml(webUrl)}" download="${escapeHtml(fileName)}" target="_blank" ` +
+        `style="display:inline-flex;align-items:center;gap:6px;padding:8px 14px;margin:8px 0;border-radius:8px;` +
+        `background:var(--bg-secondary,#f3f4f6);color:var(--text-primary,#374151);text-decoration:none;font-size:14px;` +
+        `border:1px solid var(--border-color,#e5e7eb);">` +
+        `<i class="fas fa-file-download" style="color:#6b7280;"></i> ${escapeHtml(fileName)}</a></div>`;
+}
+
 function createBotMessageEl(content, timestamp, requestId, msg) {
    const el = document.createElement('div');
    el.className = 'flex gap-3 px-4 sm:px-6 py-3';
@@ -1351,9 +1624,15 @@ function createBotMessageEl(content, timestamp, requestId, msg) {
                ${stepsHtml ? `<div class="agent-steps">${stepsHtml}</div>` : ''}
                <div class="answer-content">${renderMarkdown(displayContent)}</div>
            </div>
-            <div class="text-xs text-slate-400 dark:text-slate-500 mt-1.5">${formatTime(timestamp)}</div>
+            <div class="flex items-center gap-2 mt-1.5">
+                <span class="text-xs text-slate-400 dark:text-slate-500">${formatTime(timestamp)}</span>
+                <button class="copy-msg-btn text-xs text-slate-300 dark:text-slate-600 hover:text-slate-500 dark:hover:text-slate-400 transition-colors cursor-pointer" title="${currentLang === 'zh' ? '复制' : 'Copy'}">
+                    <i class="fas fa-copy"></i>
+                </button>
+            </div>
        </div>
    `;
+    el.querySelector('.answer-content').dataset.rawMd = displayContent;
    applyHighlighting(el);
    bindChatKnowledgeLinks(el);
    return el;
@@ -1362,7 +1641,8 @@ function createBotMessageEl(content, timestamp, requestId, msg) {
 function addUserMessage(content, timestamp, attachments) {
    const el = createUserMessageEl(content, timestamp, attachments);
    messagesDiv.appendChild(el);
-    scrollChatToBottom();
+    _autoScrollEnabled = true;
+    scrollChatToBottom(true);
 }

 function addBotMessage(content, timestamp, requestId) {
@@ -1455,7 +1735,7 @@ function loadHistory(page) {
            if (isFirstLoad) {
                // Use requestAnimationFrame to ensure the DOM has fully rendered
                // before scrolling, otherwise scrollHeight may not reflect new content.
-                requestAnimationFrame(() => scrollChatToBottom());
+                requestAnimationFrame(() => scrollChatToBottom(true));
            } else {
                // Restore scroll position so loading older messages doesn't jump the view
                messagesDiv.scrollTop = messagesDiv.scrollHeight - prevScrollHeight;
@@ -1584,6 +1864,7 @@ function newChat() {
    if (panel && !sessionPanelOpen) {
        sessionPanelOpen = true;
        panel.classList.remove('hidden');
+        _showSessionOverlay();
        _persistPanelState();
    }
    const newSid = sessionId;
@@ -1601,11 +1882,40 @@ function _persistPanelState() {
    localStorage.setItem(SESSION_PANEL_KEY, sessionPanelOpen ? '1' : '0');
 }

+function _isMobileView() {
+    return window.innerWidth <= 768;
+}
+
+function _showSessionOverlay() {
+    if (!_isMobileView()) return;
+    const overlay = document.getElementById('session-panel-overlay');
+    if (overlay) overlay.classList.remove('hidden');
+}
+
+function _hideSessionOverlay() {
+    const overlay = document.getElementById('session-panel-overlay');
+    if (overlay) overlay.classList.add('hidden');
+}
+
+function closeSessionPanel() {
+    const panel = document.getElementById('session-panel');
+    if (!panel || !sessionPanelOpen) return;
+    sessionPanelOpen = false;
+    panel.classList.add('hidden');
+    _hideSessionOverlay();
+    _persistPanelState();
+}
+
 function toggleSessionPanel() {
    const panel = document.getElementById('session-panel');
    if (!panel) return;
    sessionPanelOpen = !sessionPanelOpen;
    panel.classList.toggle('hidden', !sessionPanelOpen);
+    if (sessionPanelOpen) {
+        _showSessionOverlay();
+    } else {
+        _hideSessionOverlay();
+    }
    _persistPanelState();
    if (sessionPanelOpen) loadSessionList();
 }
@@ -1615,6 +1925,7 @@ function openSessionPanel() {
    if (!panel || sessionPanelOpen) return;
    sessionPanelOpen = true;
    panel.classList.remove('hidden');
+    _showSessionOverlay();
    _persistPanelState();
    loadSessionList();
 }
@@ -1622,11 +1933,13 @@ function openSessionPanel() {
 function _restoreSessionPanel() {
    const panel = document.getElementById('session-panel');
    if (!panel) return;
-    if (sessionPanelOpen) {
+    if (sessionPanelOpen && !_isMobileView()) {
        panel.classList.remove('hidden');
+        _showSessionOverlay();
        loadSessionList();
    } else {
        panel.classList.add('hidden');
+        _hideSessionOverlay();
    }
 }

@@ -1818,6 +2131,7 @@ function switchSession(newSessionId) {
        el.classList.toggle('active', el.dataset.sessionId === sessionId);
    });

+    if (_isMobileView()) closeSessionPanel();
    if (currentView !== 'chat') navigateTo('chat');
 }

@@ -1939,8 +2253,17 @@ function formatToolArgs(args) {
    }
 }

-function scrollChatToBottom() {
-    messagesDiv.scrollTop = messagesDiv.scrollHeight;
+function scrollChatToBottom(force) {
+    if (force || _autoScrollEnabled) {
+        messagesDiv.scrollTop = messagesDiv.scrollHeight;
+    }
+}
+
+function _updateScrollToBottomBtn() {
+    const btn = document.getElementById('scroll-to-bottom-btn');
+    if (!btn) return;
+    const distFromBottom = messagesDiv.scrollHeight - messagesDiv.scrollTop - messagesDiv.clientHeight;
+    btn.classList.toggle('hidden', distFromBottom <= _SCROLL_THRESHOLD);
 }

 function applyHighlighting(container) {
@@ -2038,6 +2361,7 @@ function initConfigView(data) {
    document.getElementById('cfg-max-tokens').value = data.agent_max_context_tokens || 50000;
    document.getElementById('cfg-max-turns').value = data.agent_max_context_turns || 20;
    document.getElementById('cfg-max-steps').value = data.agent_max_steps || 20;
+    document.getElementById('cfg-enable-thinking').checked = data.enable_thinking === true;

    const pwdInput = document.getElementById('cfg-password');
    const maskedPwd = data.web_password_masked || '';
@@ -2081,6 +2405,9 @@ function onProviderChange(pid) {
    const p = configProviders[cfgProviderValue];
    if (!p) return;

+    const customTip = document.getElementById('cfg-custom-tip');
+    if (customTip) customTip.classList.toggle('hidden', cfgProviderValue !== 'custom');
+
    const modelEl = document.getElementById('cfg-model-select');
    const modelOpts = (p.models || []).map(m => ({ value: m, label: m }));
    modelOpts.push({ value: '__custom__', label: t('config_custom_option') });
@@ -2129,12 +2456,17 @@ function onProviderChange(pid) {
    }

    // API Base
+    const apiBaseInput = document.getElementById('cfg-api-base');
    if (p.api_base_key) {
        document.getElementById('cfg-api-base-wrap').classList.remove('hidden');
-        document.getElementById('cfg-api-base').value = configApiBases[p.api_base_key] || p.api_base_default || '';
+        apiBaseInput.value = configApiBases[p.api_base_key] || p.api_base_default || '';
+        // Hint the version-path tail (e.g. /v1) so users are reminded to
+        // include it themselves. We don't auto-rewrite anything server-side.
+        apiBaseInput.placeholder = p.api_base_placeholder || 'https://...';
    } else {
        document.getElementById('cfg-api-base-wrap').classList.add('hidden');
-        document.getElementById('cfg-api-base').value = '';
+        apiBaseInput.value = '';
+        apiBaseInput.placeholder = 'https://...';
    }

    onModelSelectChange(modelOpts[0] ? modelOpts[0].value : '');
@@ -2272,6 +2604,7 @@ function saveAgentConfig() {
        agent_max_context_tokens: parseInt(document.getElementById('cfg-max-tokens').value) || 50000,
        agent_max_context_turns: parseInt(document.getElementById('cfg-max-turns').value) || 20,
        agent_max_steps: parseInt(document.getElementById('cfg-max-steps').value) || 20,
+        enable_thinking: document.getElementById('cfg-enable-thinking').checked,
    };

    const btn = document.getElementById('cfg-agent-save');
@@ -2497,12 +2830,20 @@ function toggleSkill(name, currentlyEnabled) {
 // Memory View
 // =====================================================================
 let memoryPage = 1;
+let memoryCategory = 'memory';   // 'memory' | 'dream'
 const memoryPageSize = 10;

+function switchMemoryTab(tab) {
+    document.querySelectorAll('.memory-tab').forEach(el => el.classList.remove('active'));
+    document.getElementById('memory-tab-' + tab)?.classList.add('active');
+    memoryCategory = tab === 'dreams' ? 'dream' : 'memory';
+    loadMemoryView(1);
+}
+
 function loadMemoryView(page) {
    page = page || 1;
    memoryPage = page;
-    fetch(`/api/memory?page=${page}&page_size=${memoryPageSize}`).then(r => r.json()).then(data => {
+    fetch(`/api/memory?page=${page}&page_size=${memoryPageSize}&category=${memoryCategory}`).then(r => r.json()).then(data => {
        if (data.status !== 'success') return;
        const emptyEl = document.getElementById('memory-empty');
        const listEl = document.getElementById('memory-list');
@@ -2510,7 +2851,15 @@ function loadMemoryView(page) {
        const total = data.total || 0;

        if (total === 0) {
-            emptyEl.querySelector('p').textContent = currentLang === 'zh' ? '暂无记忆文件' : 'No memory files';
+            const emptyIcon = emptyEl.querySelector('i');
+            const emptyTitle = emptyEl.querySelector('p');
+            if (memoryCategory === 'dream') {
+                emptyIcon.className = 'fas fa-moon text-purple-400 text-xl';
+                emptyTitle.textContent = currentLang === 'zh' ? '暂无梦境日记' : 'No dream diaries yet';
+            } else {
+                emptyIcon.className = 'fas fa-brain text-purple-400 text-xl';
+                emptyTitle.textContent = currentLang === 'zh' ? '暂无记忆文件' : 'No memory files';
+            }
            emptyEl.classList.remove('hidden');
            listEl.classList.add('hidden');
            return;
@@ -2523,10 +2872,15 @@ function loadMemoryView(page) {
        files.forEach(f => {
            const tr = document.createElement('tr');
            tr.className = 'border-b border-slate-100 dark:border-white/5 hover:bg-slate-50 dark:hover:bg-white/5 cursor-pointer transition-colors';
-            tr.onclick = () => openMemoryFile(f.filename);
-            const typeLabel = f.type === 'global'
-                ? '<span class="px-2 py-0.5 rounded-full text-xs bg-primary-50 dark:bg-primary-900/30 text-primary-600 dark:text-primary-400">Global</span>'
-                : '<span class="px-2 py-0.5 rounded-full text-xs bg-blue-50 dark:bg-blue-900/30 text-blue-600 dark:text-blue-400">Daily</span>';
+            tr.onclick = () => openMemoryFile(f.filename, memoryCategory);
+            let typeLabel;
+            if (f.type === 'global') {
+                typeLabel = '<span class="px-2 py-0.5 rounded-full text-xs bg-primary-50 dark:bg-primary-900/30 text-primary-600 dark:text-primary-400">Global</span>';
+            } else if (f.type === 'dream') {
+                typeLabel = '<span class="px-2 py-0.5 rounded-full text-xs bg-violet-50 dark:bg-violet-900/30 text-violet-600 dark:text-violet-400">Dream</span>';
+            } else {
+                typeLabel = '<span class="px-2 py-0.5 rounded-full text-xs bg-blue-50 dark:bg-blue-900/30 text-blue-600 dark:text-blue-400">Daily</span>';
+            }
            const sizeStr = f.size < 1024 ? f.size + ' B' : (f.size / 1024).toFixed(1) + ' KB';
            tr.innerHTML = `
                <td class="px-4 py-3 text-sm font-mono text-slate-700 dark:text-slate-200">${escapeHtml(f.filename)}</td>
@@ -2548,8 +2902,9 @@ function loadMemoryView(page) {
    }).catch(() => {});
 }

-function openMemoryFile(filename) {
-    fetch(`/api/memory/content?filename=${encodeURIComponent(filename)}`).then(r => r.json()).then(data => {
+function openMemoryFile(filename, category) {
+    category = category || 'memory';
+    fetch(`/api/memory/content?filename=${encodeURIComponent(filename)}&category=${encodeURIComponent(category)}`).then(r => r.json()).then(data => {
        if (data.status !== 'success') return;
        document.getElementById('memory-panel-list').classList.add('hidden');
        const panel = document.getElementById('memory-panel-viewer');
@@ -3446,10 +3801,9 @@ navigateTo = function(viewId) {
    if (viewId === 'config') loadConfigView();
    else if (viewId === 'skills') loadSkillsView();
    else if (viewId === 'memory') {
-        // Always start from the list panel when navigating to memory
        document.getElementById('memory-panel-viewer').classList.add('hidden');
        document.getElementById('memory-panel-list').classList.remove('hidden');
-        loadMemoryView(1);
+        switchMemoryTab('files');
    }
    else if (viewId === 'knowledge') loadKnowledgeView();
    else if (viewId === 'channels') loadChannelsView();
@@ -3461,6 +3815,7 @@ navigateTo = function(viewId) {
 // Knowledge View
 // =====================================================================
 let _knowledgeTreeData = [];
+let _knowledgeRootFiles = [];
 let _knowledgeCurrentFile = null;
 let _knowledgeGraphLoaded = false;

@@ -3478,7 +3833,9 @@ function loadKnowledgeView() {
        const statsEl = document.getElementById('knowledge-stats');

        const tree = data.tree || [];
+        const rootFiles = data.root_files || [];
        _knowledgeTreeData = tree;
+        _knowledgeRootFiles = rootFiles;
        const stats = data.stats || {};
        const totalPages = stats.pages || 0;
        const sizeStr = stats.size < 1024 ? stats.size + ' B' : (stats.size / 1024).toFixed(1) + ' KB';
@@ -3496,14 +3853,17 @@ function loadKnowledgeView() {
        emptyEl.classList.add('hidden');
        docsPanel.classList.remove('hidden');

-        renderKnowledgeTree(tree);
+        renderKnowledgeTree(tree, rootFiles);

        // Auto-select the first file (desktop only)
        if (window.innerWidth >= 768) {
-            const firstGroup = tree.find(g => g.files && g.files.length > 0);
-            if (firstGroup) {
-                const firstFile = firstGroup.files[0];
-                openKnowledgeFile(firstGroup.dir + '/' + firstFile.name, firstFile.title);
+            const firstFile = rootFiles.length > 0 ? rootFiles[0] : null;
+            const firstGroup = !firstFile ? tree.find(g => g.files && g.files.length > 0) : null;
+            if (firstFile) {
+                openKnowledgeFile(firstFile.name, firstFile.title);
+            } else if (firstGroup) {
+                const gf = firstGroup.files[0];
+                openKnowledgeFile(firstGroup.dir + '/' + gf.name, gf.title);
            }
        } else {
            document.getElementById('knowledge-content-placeholder').classList.add('hidden');
@@ -3512,23 +3872,48 @@ function loadKnowledgeView() {
    }).catch(() => {});
 }

-function renderKnowledgeTree(tree, filter) {
+function renderKnowledgeTree(tree, rootFilesOrFilter, filter) {
    const container = document.getElementById('knowledge-tree');
    container.innerHTML = '';
-    const lowerFilter = (filter || '').toLowerCase();
+    let rootFiles, lowerFilter;
+    if (typeof rootFilesOrFilter === 'string') {
+        rootFiles = _knowledgeRootFiles;
+        lowerFilter = (rootFilesOrFilter || '').toLowerCase();
+    } else {
+        rootFiles = rootFilesOrFilter || _knowledgeRootFiles;
+        lowerFilter = (filter || '').toLowerCase();
+    }
+    (rootFiles || []).forEach(f => {
+        if (lowerFilter && !f.title.toLowerCase().includes(lowerFilter) && !f.name.toLowerCase().includes(lowerFilter)) return;
+        const fbtn = document.createElement('button');
+        fbtn.className = 'knowledge-tree-file' + (_knowledgeCurrentFile === f.name ? ' active' : '');
+        fbtn.dataset.path = f.name;
+        fbtn.innerHTML = `<i class="fas fa-file-lines text-[10px] text-slate-400"></i><span class="truncate">${escapeHtml(f.title)}</span>`;
+        fbtn.onclick = () => openKnowledgeFile(f.name, f.title);
+        container.appendChild(fbtn);
+    });
+    _renderKnowledgeGroups(container, tree, '', lowerFilter, 0);
+}

-    tree.forEach(group => {
-        const files = group.files.filter(f =>
+function _renderKnowledgeGroups(container, groups, parentPath, lowerFilter, depth) {
+    const indent = depth * 12;
+    groups.forEach(group => {
+        const groupPath = parentPath ? parentPath + '/' + group.dir : group.dir;
+        const files = (group.files || []).filter(f =>
            !lowerFilter || f.title.toLowerCase().includes(lowerFilter) || f.name.toLowerCase().includes(lowerFilter)
        );
-        if (files.length === 0 && lowerFilter) return;
+        const children = group.children || [];
+        const hasMatchingChildren = lowerFilter ? _hasFilterMatch(children, lowerFilter) : children.length > 0;
+        if (files.length === 0 && !hasMatchingChildren && lowerFilter) return;

        const div = document.createElement('div');
        div.className = 'knowledge-tree-group open';

+        const fileCount = _countFiles(group);
        const btn = document.createElement('button');
        btn.className = 'knowledge-tree-group-btn';
-        btn.innerHTML = `<i class="fas fa-chevron-right chevron"></i><i class="fas fa-folder text-amber-400 text-[11px]"></i><span>${escapeHtml(group.dir)}</span><span class="ml-auto text-[10px] text-slate-400">${files.length}</span>`;
+        btn.style.paddingLeft = (8 + indent) + 'px';
+        btn.innerHTML = `<i class="fas fa-chevron-right chevron"></i><i class="fas fa-folder text-amber-400 text-[11px]"></i><span>${escapeHtml(group.dir)}</span><span class="ml-auto text-[10px] text-slate-400">${fileCount}</span>`;
        btn.onclick = () => div.classList.toggle('open');
        div.appendChild(btn);

@@ -3536,20 +3921,42 @@ function renderKnowledgeTree(tree, filter) {
        items.className = 'knowledge-tree-group-items';
        files.forEach(f => {
            const fbtn = document.createElement('button');
-            const fpath = group.dir + '/' + f.name;
+            const fpath = groupPath + '/' + f.name;
            fbtn.className = 'knowledge-tree-file' + (_knowledgeCurrentFile === fpath ? ' active' : '');
            fbtn.dataset.path = fpath;
+            fbtn.style.paddingLeft = (24 + indent) + 'px';
            fbtn.innerHTML = `<i class="fas fa-file-lines text-[10px] text-slate-400"></i><span class="truncate">${escapeHtml(f.title)}</span>`;
            fbtn.onclick = () => openKnowledgeFile(fpath, f.title);
            items.appendChild(fbtn);
        });
+        if (children.length > 0) {
+            _renderKnowledgeGroups(items, children, groupPath, lowerFilter, depth + 1);
+        }
        div.appendChild(items);
        container.appendChild(div);
    });
 }

+function _hasFilterMatch(groups, lowerFilter) {
+    for (const g of groups) {
+        for (const f of (g.files || [])) {
+            if (f.title.toLowerCase().includes(lowerFilter) || f.name.toLowerCase().includes(lowerFilter)) return true;
+        }
+        if (_hasFilterMatch(g.children || [], lowerFilter)) return true;
+    }
+    return false;
+}
+
+function _countFiles(group) {
+    let count = (group.files || []).length;
+    for (const child of (group.children || [])) {
+        count += _countFiles(child);
+    }
+    return count;
+}
+
 function filterKnowledgeTree(query) {
-    renderKnowledgeTree(_knowledgeTreeData, query);
+    renderKnowledgeTree(_knowledgeTreeData, _knowledgeRootFiles, query);
 }

 function resolveKnowledgePath(currentFilePath, relativeHref) {
@@ -3628,12 +4035,22 @@ function bindChatKnowledgeLinks(container) {
 }

 function _findKnowledgeFileByName(filename) {
-    for (const group of _knowledgeTreeData) {
-        for (const f of group.files) {
+    for (const f of _knowledgeRootFiles) {
+        if (f.name === filename) return { path: f.name, title: f.title };
+    }
+    return _searchFileInGroups(_knowledgeTreeData, '', filename);
+}
+
+function _searchFileInGroups(groups, parentPath, filename) {
+    for (const group of groups) {
+        const groupPath = parentPath ? parentPath + '/' + group.dir : group.dir;
+        for (const f of (group.files || [])) {
            if (f.name === filename) {
-                return { path: group.dir + '/' + f.name, title: f.title };
+                return { path: groupPath + '/' + f.name, title: f.title };
            }
        }
+        const found = _searchFileInGroups(group.children || [], groupPath, filename);
+        if (found) return found;
    }
    return null;
 }
@@ -3957,7 +4374,10 @@ function initApp() {
    _restoreSessionPanel();

    fetch('/api/knowledge/list').then(r => r.json()).then(data => {
-        if (data.status === 'success') _knowledgeTreeData = data.tree || [];
+        if (data.status === 'success') {
+            _knowledgeTreeData = data.tree || [];
+            _knowledgeRootFiles = data.root_files || [];
+        }
    }).catch(() => {});

    fetch('/api/version').then(r => r.json()).then(data => {
--- a/channel/web/web_channel.py
+++ b/channel/web/web_channel.py
@@ -91,39 +91,9 @@ def _get_upload_dir() -> str:


 def _generate_session_title(user_message: str, assistant_reply: str = "") -> str:
-    """
-    Generate a short session title by calling the current bot's reply_text.
-    """
-    import re
-    fallback = user_message[:50].split("\n")[0].strip() or "New Chat"
-    try:
-        from bridge.bridge import Bridge
-        from models.session_manager import Session
-        bot = Bridge().get_bot("chat")
-
-        prompt_parts = [f"User: {user_message[:300]}"]
-        if assistant_reply:
-            prompt_parts.append(f"Assistant: {assistant_reply[:300]}")
-
-        session = Session("__title_gen__", system_prompt="")
-        session.messages = [
-            {"role": "user", "content": (
-                "Generate a very short title (max 15 characters for Chinese, max 6 words for English) "
-                "summarizing this conversation. Return ONLY the title text, nothing else.\n\n"
-                + "\n".join(prompt_parts)
-            )}
-        ]
-
-        result = bot.reply_text(session)
-        raw = (result.get("content") or "").strip()
-        # Strip <think>...</think> reasoning blocks
-        title = re.sub(r'<think>.*?</think>', '', raw, flags=re.DOTALL).strip().strip('"\'')
-        logger.info(f"[WebChannel] Title generation result: '{title}' (len={len(title)})")
-        if title and len(title) <= 50:
-            return title
-    except Exception as e:
-        logger.warning(f"[WebChannel] Title generation failed: {e}")
-    return fallback
+    """Delegate to the shared SessionService implementation."""
+    from agent.chat.session_service import generate_session_title
+    return generate_session_title(user_message, assistant_reply)


 class WebMessage(ChatMessage):
@@ -238,9 +208,24 @@ class WebChannel(ChatChannel):

            # Fallback: polling mode
            if session_id in self.session_queues:
+                content = reply.content if reply.content is not None else ""
+                # Skip file:// IMAGE_URL/FILE replies originating from an SSE-enabled
+                # request: they were already pushed via the `file_to_send` event during
+                # agent execution. By the time the chat_channel sends the IMAGE_URL reply,
+                # the SSE stream has typically closed (after the text "done") and the
+                # request_id is gone from sse_queues, so we'd otherwise duplicate the file
+                # as a polling bubble. Scheduler/push tasks have no on_event and must
+                # still go through polling normally.
+                if (
+                    reply.type in (ReplyType.IMAGE_URL, ReplyType.FILE)
+                    and content.startswith("file://")
+                    and context.get("on_event") is not None
+                ):
+                    logger.debug(f"Polling skipped duplicate file reply for session {session_id}")
+                    return
                response_data = {
                    "type": str(reply.type),
-                    "content": reply.content,
+                    "content": content,
                    "timestamp": time.time(),
                    "request_id": request_id
                }
@@ -255,6 +240,17 @@ class WebChannel(ChatChannel):
    def _make_sse_callback(self, request_id: str):
        """Build an on_event callback that pushes agent stream events into the SSE queue."""

+        # Cap the reasoning characters pushed to the frontend per request to avoid
+        # browser stalls / crashes on very long chains-of-thought. Anything
+        # beyond the cap is dropped from the stream (DB still persists a
+        # truncated copy via _truncate_reasoning_for_storage).
+        # Keep aligned with frontend REASONING_RENDER_CAP and backend
+        # MAX_STORED_REASONING_CHARS.
+        MAX_REASONING_STREAM_CHARS = 4 * 1024  # 4096 characters (len() of str, not bytes)
+        # Use a single-element list as a mutable counter accessible from closure.
+        reasoning_chars_sent = [0]
+        reasoning_capped_notified = [False]
+
        def on_event(event: dict):
            if request_id not in self.sse_queues:
                return
@@ -264,8 +260,21 @@ class WebChannel(ChatChannel):

            if event_type == "reasoning_update":
                delta = data.get("delta", "")
-                if delta:
-                    q.put({"type": "reasoning", "content": delta})
+                if not delta:
+                    return
+                remaining = MAX_REASONING_STREAM_CHARS - reasoning_chars_sent[0]
+                if remaining <= 0:
+                    if not reasoning_capped_notified[0]:
+                        reasoning_capped_notified[0] = True
+                        q.put({
+                            "type": "reasoning",
+                            "content": "\n\n... [reasoning truncated for display] ...",
+                        })
+                    return
+                if len(delta) > remaining:
+                    delta = delta[:remaining]
+                reasoning_chars_sent[0] += len(delta)
+                q.put({"type": "reasoning", "content": delta})

            elif event_type == "message_update":
                delta = data.get("delta", "")
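The capping logic in the `reasoning_update` branch above can be sketched in isolation as follows. This is a minimal illustration, not the project's code: `make_reasoning_capper` and `feed` are hypothetical names, and the sketch uses `nonlocal` where the diff uses single-element lists as mutable counters. Both approaches keep per-request state inside the closure.

```python
def make_reasoning_capper(max_chars, notice="\n\n... [reasoning truncated for display] ..."):
    """Return a function that maps each incoming delta to the chunks to forward."""
    sent = 0          # characters forwarded so far
    notified = False  # whether the truncation notice was already emitted

    def feed(delta):
        nonlocal sent, notified
        if not delta:
            return []
        remaining = max_chars - sent
        if remaining <= 0:
            # Cap reached: emit the notice once, then drop everything.
            if notified:
                return []
            notified = True
            return [notice]
        chunk = delta[:remaining]  # trim the delta that crosses the cap
        sent += len(chunk)
        return [chunk]

    return feed


feed = make_reasoning_capper(5)
out = feed("abc") + feed("defg") + feed("hij") + feed("klm")
```

With a cap of 5, the second delta is trimmed to its first two characters, the third delta triggers the one-time notice, and later deltas are dropped silently.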
@@ -299,6 +308,25 @@ class WebChannel(ChatChannel):
                if tool_calls:
                    q.put({"type": "message_end", "has_tool_calls": True})

+            elif event_type == "agent_end":
+                # Safety net: if the agent finishes with an empty final_response,
+                # chat_channel skips _send_reply (because reply.content is empty),
+                # which means no "done" event is ever emitted and the SSE stream
+                # would hang until the 10-min idle timeout. Push a fallback "done"
+                # here so the frontend always gets closure.
+                final_response = data.get("final_response", "")
+                if not final_response or not str(final_response).strip():
+                    logger.warning(
+                        f"[WebChannel] agent_end with empty final_response for "
+                        f"request {request_id}, sending fallback done"
+                    )
+                    q.put({
+                        "type": "done",
+                        "content": "(模型未返回任何内容，请重试或换一种方式描述你的需求)",
+                        "request_id": request_id,
+                        "timestamp": time.time(),
+                    })
+
            elif event_type == "file_to_send":
                file_path = data.get("path", "")
                file_name = data.get("file_name", os.path.basename(file_path))
@@ -742,65 +770,58 @@ class ChatHandler:
 class ConfigHandler:

    _RECOMMENDED_MODELS = [
-        const.MINIMAX_M2_7, const.MINIMAX_M2_5, const.MINIMAX_M2_1, const.MINIMAX_M2_1_LIGHTNING,
-        const.GLM_5_TURBO, const.GLM_5, const.GLM_4_7,
-        const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX,
-        const.KIMI_K2_5, const.KIMI_K2,
-        const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE,
-        const.CLAUDE_4_6_SONNET, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET,
+        const.DEEPSEEK_V4_FLASH, const.DEEPSEEK_V4_PRO, const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER,
+        const.MINIMAX_M2_7_HIGHSPEED, const.MINIMAX_M2_7, const.MINIMAX_M2_5, const.MINIMAX_M2_1, const.MINIMAX_M2_1_LIGHTNING,
+        const.CLAUDE_4_6_SONNET, const.CLAUDE_4_7_OPUS, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET,
        const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE,
        const.GPT_54, const.GPT_54_MINI, const.GPT_54_NANO, const.GPT_5, const.GPT_41, const.GPT_4o,
-        const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER,
+        const.GLM_5_1, const.GLM_5_TURBO, const.GLM_5, const.GLM_4_7,
+        const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX,
+        const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE,
+        const.KIMI_K2_6, const.KIMI_K2_5, const.KIMI_K2,
    ]

+    # Generic placeholder hints surfaced in the web console. We deliberately
+    # show the version-path tail (e.g. "/v1") so users are reminded to type
+    # the full base URL. The form is intentionally vague (`...../v1`) so it
+    # never looks like a real default a user might paste verbatim — and we
+    # never auto-rewrite anything on the server side.
+    _PLACEHOLDER_V1 = "https://...../v1"
+    _PLACEHOLDER_ZHIPU = "https://...../api/paas/v4"
+    _PLACEHOLDER_DOUBAO = "https://...../api/v3"
+    _PLACEHOLDER_GEMINI = "https://....."
+
    PROVIDER_MODELS = OrderedDict([
+        ("deepseek", {
+            "label": "DeepSeek",
+            "api_key_field": "deepseek_api_key",
+            "api_base_key": "deepseek_api_base",
+            "api_base_default": "https://api.deepseek.com/v1",
+            "api_base_placeholder": _PLACEHOLDER_V1,
+            "models": [const.DEEPSEEK_V4_FLASH, const.DEEPSEEK_V4_PRO, const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER],
+        }),
        ("minimax", {
            "label": "MiniMax",
            "api_key_field": "minimax_api_key",
            "api_base_key": None,
            "api_base_default": None,
-            "models": [const.MINIMAX_M2_7, const.MINIMAX_M2_5, const.MINIMAX_M2_1, const.MINIMAX_M2_1_LIGHTNING],
-        }),
-        ("zhipu", {
-            "label": "智谱AI",
-            "api_key_field": "zhipu_ai_api_key",
-            "api_base_key": "zhipu_ai_api_base",
-            "api_base_default": "https://open.bigmodel.cn/api/paas/v4",
-            "models": [const.GLM_5_TURBO, const.GLM_5, const.GLM_4_7],
-        }),
-        ("dashscope", {
-            "label": "通义千问",
-            "api_key_field": "dashscope_api_key",
-            "api_base_key": None,
-            "api_base_default": None,
-            "models": [const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX],
-        }),
-        ("moonshot", {
-            "label": "Kimi",
-            "api_key_field": "moonshot_api_key",
-            "api_base_key": "moonshot_base_url",
-            "api_base_default": "https://api.moonshot.cn/v1",
-            "models": [const.KIMI_K2_5, const.KIMI_K2],
-        }),
-        ("doubao", {
-            "label": "豆包",
-            "api_key_field": "ark_api_key",
-            "api_base_key": "ark_base_url",
-            "api_base_default": "https://ark.cn-beijing.volces.com/api/v3",
-            "models": [const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE],
+            "api_base_placeholder": "",
+            "models": [const.MINIMAX_M2_7, const.MINIMAX_M2_7_HIGHSPEED, const.MINIMAX_M2_5, const.MINIMAX_M2_1, const.MINIMAX_M2_1_LIGHTNING],
        }),
        ("claudeAPI", {
            "label": "Claude",
            "api_key_field": "claude_api_key",
            "api_base_key": "claude_api_base",
            "api_base_default": "https://api.anthropic.com/v1",
-            "models": [const.CLAUDE_4_6_SONNET, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET],
+            "api_base_placeholder": _PLACEHOLDER_V1,
+            "models": [const.CLAUDE_4_6_SONNET, const.CLAUDE_4_7_OPUS, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET],
        }),
        ("gemini", {
            "label": "Gemini",
            "api_key_field": "gemini_api_key",
            "api_base_key": "gemini_api_base",
            "api_base_default": "https://generativelanguage.googleapis.com",
+            "api_base_placeholder": _PLACEHOLDER_GEMINI,
            "models": [const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE],
        }),
        ("openai", {
@@ -808,20 +829,47 @@ class ConfigHandler:
            "api_key_field": "open_ai_api_key",
            "api_base_key": "open_ai_api_base",
            "api_base_default": "https://api.openai.com/v1",
+            "api_base_placeholder": _PLACEHOLDER_V1,
            "models": [const.GPT_54, const.GPT_54_MINI, const.GPT_54_NANO, const.GPT_5, const.GPT_41, const.GPT_4o],
        }),
-        ("deepseek", {
-            "label": "DeepSeek",
-            "api_key_field": "deepseek_api_key",
-            "api_base_key": "deepseek_api_base",
-            "api_base_default": "https://api.deepseek.com/v1",
-            "models": [const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER],
+        ("zhipu", {
+            "label": "智谱AI",
+            "api_key_field": "zhipu_ai_api_key",
+            "api_base_key": "zhipu_ai_api_base",
+            "api_base_default": "https://open.bigmodel.cn/api/paas/v4",
+            "api_base_placeholder": _PLACEHOLDER_ZHIPU,
+            "models": [const.GLM_5_1, const.GLM_5_TURBO, const.GLM_5, const.GLM_4_7],
+        }),
+        ("dashscope", {
+            "label": "通义千问",
+            "api_key_field": "dashscope_api_key",
+            "api_base_key": None,
+            "api_base_default": None,
+            "api_base_placeholder": "",
+            "models": [const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX],
+        }),
+        ("doubao", {
+            "label": "豆包",
+            "api_key_field": "ark_api_key",
+            "api_base_key": "ark_base_url",
+            "api_base_default": "https://ark.cn-beijing.volces.com/api/v3",
+            "api_base_placeholder": _PLACEHOLDER_DOUBAO,
+            "models": [const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE],
+        }),
+        ("moonshot", {
+            "label": "Kimi",
+            "api_key_field": "moonshot_api_key",
+            "api_base_key": "moonshot_base_url",
+            "api_base_default": "https://api.moonshot.cn/v1",
+            "api_base_placeholder": _PLACEHOLDER_V1,
+            "models": [const.KIMI_K2_6, const.KIMI_K2_5, const.KIMI_K2],
        }),
        ("modelscope", {
            "label": "ModelScope",
            "api_key_field": "modelscope_api_key",
            "api_base_key": None,
            "api_base_default": None,
+            "api_base_placeholder": "",
            "models": [const.QWEN3_5_27B, const.QWEN3_235B_A22B_INSTRUCT_2507],
        }),
        ("linkai", {
@@ -829,19 +877,28 @@ class ConfigHandler:
            "api_key_field": "linkai_api_key",
            "api_base_key": None,
            "api_base_default": None,
+            "api_base_placeholder": "",
            "models": _RECOMMENDED_MODELS,
        }),
+        ("custom", {
+            "label": "自定义",
+            "api_key_field": "custom_api_key",
+            "api_base_key": "custom_api_base",
+            "api_base_default": "",
+            "api_base_placeholder": _PLACEHOLDER_V1,
+            "models": [],
+        }),
    ])

    EDITABLE_KEYS = {
        "model", "bot_type", "use_linkai",
        "open_ai_api_base", "deepseek_api_base", "claude_api_base", "gemini_api_base",
-        "zhipu_ai_api_base", "moonshot_base_url", "ark_base_url",
+        "zhipu_ai_api_base", "moonshot_base_url", "ark_base_url", "custom_api_base",
        "open_ai_api_key", "deepseek_api_key", "claude_api_key", "gemini_api_key",
        "zhipu_ai_api_key", "dashscope_api_key", "moonshot_api_key",
-        "ark_api_key", "minimax_api_key", "linkai_api_key",
+        "ark_api_key", "minimax_api_key", "linkai_api_key", "custom_api_key",
        "agent_max_context_tokens", "agent_max_context_turns", "agent_max_steps",
-        "web_password",
+        "enable_thinking", "web_password",
    }

    @staticmethod
@@ -877,6 +934,7 @@ class ConfigHandler:
                    "models": p["models"],
                    "api_base_key": p["api_base_key"],
                    "api_base_default": p["api_base_default"],
+                    "api_base_placeholder": p.get("api_base_placeholder", ""),
                    "api_key_field": p.get("api_key_field"),
                }

@@ -894,6 +952,7 @@ class ConfigHandler:
                "agent_max_context_tokens": local_config.get("agent_max_context_tokens", 50000),
                "agent_max_context_turns": local_config.get("agent_max_context_turns", 20),
                "agent_max_steps": local_config.get("agent_max_steps", 20),
+                "enable_thinking": bool(local_config.get("enable_thinking", False)),
                "api_bases": api_bases,
                "api_keys": api_keys_masked,
                "providers": providers,
@@ -919,7 +978,7 @@ class ConfigHandler:
                    continue
                if key in ("agent_max_context_tokens", "agent_max_context_turns", "agent_max_steps"):
                    value = int(value)
-                if key == "use_linkai":
+                if key in ("use_linkai", "enable_thinking"):
                    value = bool(value)
                local_config[key] = value
                applied[key] = value
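A note on the coercion in this hunk: in Python, `bool(value)` is `True` for any non-empty string, so a payload carrying the string `"false"` would be stored as `True`. Whether the web console ever sends strings for `use_linkai` or `enable_thinking` is not visible in this diff. If it can, a stricter helper along these lines is one option. `to_bool` is a hypothetical name and the accepted string set is an assumption, not something the project defines.

```python
def to_bool(value):
    """Coerce a JSON-like value to bool, treating textual 'false' values as False."""
    if isinstance(value, str):
        # Assumed set of truthy spellings; anything else counts as False.
        return value.strip().lower() in ("1", "true", "yes", "on")
    return bool(value)


plain = bool("false")     # Python's built-in truthiness for a non-empty string
strict = to_bool("false")
```

If the frontend always sends JSON booleans, the existing `bool(value)` is sufficient and this helper is unnecessary.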
@@ -939,6 +998,19 @@ class ConfigHandler:
                json.dump(file_cfg, f, indent=4, ensure_ascii=False)

            logger.info(f"[WebChannel] Config updated: {list(applied.keys())}")
+
+            # Reset Bridge so that bot routing reflects the new config.
+            # Without this, Bridge keeps its cached bot instance (e.g. LinkAIBot)
+            # even after the user switches bot_type / use_linkai / model in UI.
+            bridge_routing_keys = {"bot_type", "use_linkai", "model"}
+            if any(k in applied for k in bridge_routing_keys):
+                try:
+                    from bridge.bridge import Bridge
+                    Bridge().reset_bot()
+                    logger.info("[WebChannel] Bridge bot routing reset due to config change")
+                except Exception as reset_err:
+                    logger.warning(f"[WebChannel] Failed to reset bridge: {reset_err}")
+
            return json.dumps({"status": "success", "applied": applied}, ensure_ascii=False)
        except Exception as e:
            logger.error(f"Error updating config: {e}")
@@ -1537,10 +1609,13 @@ class MemoryHandler:
        web.header('Content-Type', 'application/json; charset=utf-8')
        try:
            from agent.memory.service import MemoryService
-            params = web.input(page='1', page_size='20')
+            params = web.input(page='1', page_size='20', category='memory')
            workspace_root = _get_workspace_root()
            service = MemoryService(workspace_root)
-            result = service.list_files(page=int(params.page), page_size=int(params.page_size))
+            result = service.list_files(
+                page=int(params.page), page_size=int(params.page_size),
+                category=params.category,
+            )
            return json.dumps({"status": "success", **result}, ensure_ascii=False)
        except Exception as e:
            logger.error(f"[WebChannel] Memory API error: {e}")
@@ -1553,12 +1628,12 @@ class MemoryContentHandler:
        web.header('Content-Type', 'application/json; charset=utf-8')
        try:
            from agent.memory.service import MemoryService
-            params = web.input(filename='')
+            params = web.input(filename='', category='memory')
            if not params.filename:
                return json.dumps({"status": "error", "message": "filename required"})
            workspace_root = _get_workspace_root()
            service = MemoryService(workspace_root)
-            result = service.get_content(params.filename)
+            result = service.get_content(params.filename, category=params.category)
            return json.dumps({"status": "success", **result}, ensure_ascii=False)
        except ValueError:
            return json.dumps({"status": "error", "message": "invalid filename"})
--- a/cli/VERSION
+++ b/cli/VERSION
@@ -1 +1 @@
-2.0.5
+2.0.7
--- a/cli/commands/skill.py
+++ b/cli/commands/skill.py
@@ -644,32 +644,52 @@ def _list_local():
    skills_dir = get_skills_dir()
    builtin_dir = get_builtin_skills_dir()

+    # Merge builtin and custom skills that are on disk but missing from config
+    _merge_builtin_into_config(config, builtin_dir, skills_dir)
+
    if not config:
-        # Fallback: scan directories directly
-        entries = []
-        for d in [builtin_dir, skills_dir]:
-            if not os.path.isdir(d):
-                continue
-            source = "builtin" if d == builtin_dir else "custom"
-            for name in sorted(os.listdir(d)):
-                skill_path = os.path.join(d, name)
-                if os.path.isdir(skill_path) and not name.startswith("."):
-                    has_skill_md = os.path.exists(os.path.join(skill_path, "SKILL.md"))
-                    if has_skill_md:
-                        entries.append({"name": name, "source": source, "enabled": True, "description": ""})
-        if not entries:
-            click.echo("No skills installed.")
-            return
-        _print_skill_table(entries)
+        click.echo("No skills installed.")
        return

    entries = sorted(config.values(), key=lambda x: x.get("name", ""))
-    if not entries:
-        click.echo("No skills installed.")
-        return
    _print_skill_table(entries)


+def _merge_builtin_into_config(config: dict, builtin_dir: str, skills_dir: str):
+    """Scan builtin and custom dirs, add any new skills into config dict."""
+    dirty = False
+    for d, source in [(builtin_dir, "builtin"), (skills_dir, "custom")]:
+        if not os.path.isdir(d):
+            continue
+        for name in os.listdir(d):
+            if name.startswith(".") or name in ("skills_config.json",):
+                continue
+            skill_path = os.path.join(d, name)
+            if not os.path.isdir(skill_path):
+                continue
+            if not os.path.isfile(os.path.join(skill_path, "SKILL.md")):
+                continue
+            if name in config:
+                continue
+            desc = _read_skill_description(skill_path)
+            config[name] = {
+                "name": name,
+                "description": desc,
+                "source": source,
+                "enabled": True,
+                "category": "skill",
+            }
+            dirty = True
+    if dirty:
+        config_path = os.path.join(skills_dir, "skills_config.json")
+        try:
+            os.makedirs(skills_dir, exist_ok=True)
+            with open(config_path, "w", encoding="utf-8") as f:
+                json.dump(config, f, indent=4, ensure_ascii=False)
+        except Exception:
+            pass
+
+
 def _print_skill_table(entries):
    """Print skills as a formatted table."""
    def _display_label(e):
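The discovery rule used by `_merge_builtin_into_config` (a visible subdirectory that contains `SKILL.md` and is not yet in the config) can be exercised on a throwaway directory. This is a simplified sketch under stated assumptions: `discover_skills` is a hypothetical helper, the directory names are made up, and it omits the description lookup and the write-back to `skills_config.json`.

```python
import os
import tempfile


def discover_skills(root, known):
    """Return names of subdirectories of root that hold a SKILL.md and are not in known."""
    found = []
    if not os.path.isdir(root):
        return found
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if name.startswith(".") or not os.path.isdir(path):
            continue
        if not os.path.isfile(os.path.join(path, "SKILL.md")):
            continue
        if name not in known:
            found.append(name)
    return found


with tempfile.TemporaryDirectory() as root:
    # alpha: valid and new; beta: no SKILL.md; .hidden: skipped; gamma: already known
    for name, has_md in [("alpha", True), ("beta", False), (".hidden", True), ("gamma", True)]:
        os.makedirs(os.path.join(root, name))
        if has_md:
            open(os.path.join(root, name, "SKILL.md"), "w").close()
    new_skills = discover_skills(root, known={"gamma"})
```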
--- a/common/cloud_client.py
+++ b/common/cloud_client.py
@@ -56,6 +56,7 @@ class CloudClient(LinkAIClient):
        self._memory_service = None
        self._knowledge_service = None
        self._chat_service = None
+        self._session_service = None

    @property
    def skill_service(self):
@@ -118,6 +119,18 @@ class CloudClient(LinkAIClient):
                logger.error(f"[CloudClient] Failed to init ChatService: {e}")
        return self._chat_service

+    @property
+    def session_service(self):
+        """Lazy-init SessionService."""
+        if self._session_service is None:
+            try:
+                from agent.chat.session_service import SessionService
+                self._session_service = SessionService()
+                logger.debug("[CloudClient] SessionService initialised")
+            except Exception as e:
+                logger.error(f"[CloudClient] Failed to init SessionService: {e}")
+        return self._session_service
+
    # ------------------------------------------------------------------
    # message push callback
    # ------------------------------------------------------------------
@@ -546,12 +559,23 @@ class CloudClient(LinkAIClient):
    # ------------------------------------------------------------------
    # history callback
    # ------------------------------------------------------------------
+    # Session-related actions handled via the HISTORY channel
+    _SESSION_ACTIONS = {
+        "list_sessions", "delete_session", "rename_session",
+        "clear_context", "generate_title",
+    }
+
    def on_history(self, data: dict) -> dict:
        """
        Handle HISTORY messages from the cloud console.
-        Returns paginated conversation history for a session.

-        :param data: message data with 'action' and 'payload' (session_id, page, page_size)
+        Supports both history query and session management actions
+        through a unified HISTORY message channel:
+          - query: paginated conversation history
+          - list_sessions / delete_session / rename_session /
+            clear_context / generate_title: session lifecycle
+
+        :param data: message data with 'action' and 'payload'
        :return: response dict
        """
        action = data.get("action", "query")
@@ -561,8 +585,19 @@ class CloudClient(LinkAIClient):
        if action == "query":
            return self._query_history(payload)

+        if action in self._SESSION_ACTIONS:
+            return self._dispatch_session(action, payload)
+
        return {"action": action, "code": 404, "message": f"unknown action: {action}", "payload": None}

+    def _dispatch_session(self, action: str, payload: dict) -> dict:
+        """Delegate session actions to SessionService."""
+        svc = self.session_service
+        if svc is None:
+            return {"action": action, "code": 500,
+                    "message": "SessionService not available", "payload": None}
+        return svc.dispatch(action, payload)
+
    def _query_history(self, payload: dict) -> dict:
        """Query paginated conversation history using ConversationStore."""
        session_id = payload.get("session_id", "")
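The routing added to `on_history` follows a simple pattern: one channel, a fixed set of action names, and a structured error for anything else. A minimal standalone sketch of that pattern, assuming stand-in handlers (`route`, `query_handler`, and `session_handler` are hypothetical names, not the project's API):

```python
SESSION_ACTIONS = {
    "list_sessions", "delete_session", "rename_session",
    "clear_context", "generate_title",
}


def route(action, payload, query_handler, session_handler):
    """Dispatch a HISTORY-style message to the matching handler."""
    if action == "query":
        return query_handler(payload)
    if action in SESSION_ACTIONS:
        if session_handler is None:
            # Mirrors the 500 response when the service failed to initialise.
            return {"action": action, "code": 500,
                    "message": "SessionService not available", "payload": None}
        return session_handler(action, payload)
    return {"action": action, "code": 404,
            "message": f"unknown action: {action}", "payload": None}


def fake_query(payload):
    return {"action": "query", "code": 200, "payload": payload}


def fake_session(action, payload):
    return {"action": action, "code": 200, "payload": payload}


ok = route("rename_session", {"session_id": "s1"}, fake_query, fake_session)
unavailable = route("list_sessions", {}, fake_query, None)
unknown = route("bogus", {}, fake_query, fake_session)
```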
--- a/common/const.py
+++ b/common/const.py
@@ -14,6 +14,7 @@ ZHIPU_AI = "zhipu"
 MOONSHOT = "moonshot"
 MiniMax = "minimax"
 DEEPSEEK = "deepseek"
+CUSTOM = "custom"  # custom OpenAI-compatible API, bot_type won't auto-switch on model change
 MODELSCOPE = "modelscope"

 # Model list
@@ -27,6 +28,7 @@ CLAUDE_35_SONNET = "claude-3-5-sonnet-latest"  # 带 latest 标签的模型名
 CLAUDE_35_SONNET_1022 = "claude-3-5-sonnet-20241022"  # Model name with an explicit date; pinned to the model released on that date
 CLAUDE_35_SONNET_0620 = "claude-3-5-sonnet-20240620"
 CLAUDE_4_OPUS = "claude-opus-4-0"
+CLAUDE_4_7_OPUS = "claude-opus-4-7"      # Claude Opus 4.7
 CLAUDE_4_6_OPUS = "claude-opus-4-6"      # Claude Opus 4.6 - Agent recommended model
 CLAUDE_4_SONNET = "claude-sonnet-4-0"    # Claude Sonnet 4.0
 CLAUDE_4_5_SONNET = "claude-sonnet-4-5"  # Claude Sonnet 4.5 - Agent recommended model
@@ -80,6 +82,8 @@ TTS_1_HD = "tts-1-hd"
 # DeepSeek
 DEEPSEEK_CHAT = "deepseek-chat"  # DeepSeek-V3 chat model
 DEEPSEEK_REASONER = "deepseek-reasoner"  # DeepSeek-R1 model
+DEEPSEEK_V4_FLASH = "deepseek-v4-flash"  # DeepSeek V4 Flash - recommended default (thinking mode + tool calls)
+DEEPSEEK_V4_PRO = "deepseek-v4-pro"  # DeepSeek V4 Pro - stronger on complex tasks (thinking mode + tool calls)

 # Qwen (通义千问 - 阿里云 DashScope)
 QWEN_TURBO = "qwen-turbo"
@@ -101,7 +105,8 @@ MINIMAX_M2 = "MiniMax-M2"  # MiniMax M2
 MINIMAX_ABAB6_5 = "abab6.5-chat"  # MiniMax abab6.5

 # GLM (智谱AI)
-GLM_5_TURBO = "glm-5-turbo"  # 智谱 GLM-5-Turbo - Latest
+GLM_5_1 = "glm-5.1"  # 智谱 GLM-5.1 - Agent recommended model (default)
+GLM_5_TURBO = "glm-5-turbo"  # 智谱 GLM-5-Turbo
 GLM_5 = "glm-5"  # 智谱 GLM-5
 GLM_4 = "glm-4"
 GLM_4_PLUS = "glm-4-plus"
@@ -117,6 +122,7 @@ GLM_4_7 = "glm-4.7"  # 智谱 GLM-4.7 - Agent推荐模型
 MOONSHOT = "moonshot"
 KIMI_K2 = "kimi-k2"
 KIMI_K2_5 = "kimi-k2.5"
+KIMI_K2_6 = "kimi-k2.6"  # Kimi K2.6 - Agent recommended model (default)

 # Doubao (Volcengine Ark)
 DOUBAO = "doubao"
@@ -150,15 +156,21 @@ MODELSCOPE_MODEL_LIST = ["deepseek-ai/DeepSeek-R1-0528", "deepseek-ai/DeepSeek-R


 MODEL_LIST = [
+              # DeepSeek
+              DEEPSEEK_V4_FLASH, DEEPSEEK_V4_PRO, DEEPSEEK_CHAT, DEEPSEEK_REASONER,
+
+              # MiniMax
+              MiniMax, MINIMAX_M2_7, MINIMAX_M2_7_HIGHSPEED, MINIMAX_M2_5, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,
+
              # Claude
-              CLAUDE3, CLAUDE_4_6_SONNET, CLAUDE_4_6_OPUS, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229, 
-              CLAUDE_35_SONNET, CLAUDE_35_SONNET_1022, CLAUDE_35_SONNET_0620, CLAUDE_3_SONNET, CLAUDE_3_HAIKU, 
+              CLAUDE3, CLAUDE_4_6_SONNET, CLAUDE_4_7_OPUS, CLAUDE_4_6_OPUS, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229,
+              CLAUDE_35_SONNET, CLAUDE_35_SONNET_1022, CLAUDE_35_SONNET_0620, CLAUDE_3_SONNET, CLAUDE_3_HAIKU,
              "claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3.5-sonnet",
-              
+
              # Gemini
              GEMINI_31_FLASH_LITE_PRE, GEMINI_31_PRO_PRE, GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
              GEMINI_20_FLASH, GEMINI_20_flash_exp, GEMINI_15_PRO, GEMINI_15_flash, GEMINI_PRO, GEMINI,
-              
+
              # OpenAI
              GPT35, GPT35_0125, GPT35_1106, "gpt-3.5-turbo-16k",
              GPT4, GPT4_06_13, GPT4_32k, GPT4_32k_06_13,
@@ -168,31 +180,29 @@ MODEL_LIST = [
              GPT_5, GPT_5_MINI, GPT_5_NANO,
              GPT_54, GPT_54_MINI, GPT_54_NANO,
              O1, O1_MINI,
-              
-              # DeepSeek
-              DEEPSEEK_CHAT, DEEPSEEK_REASONER,
-              
-              # Qwen
-              QWEN36_PLUS, QWEN35_PLUS, QWEN3_MAX, QWEN_MAX, QWEN_PLUS, QWEN_TURBO, QWEN_LONG,
-              
-              # MiniMax
-              MiniMax, MINIMAX_M2_7, MINIMAX_M2_7_HIGHSPEED, MINIMAX_M2_5, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,

-              # GLM
-              ZHIPU_AI, GLM_5_TURBO, GLM_5, GLM_4, GLM_4_PLUS, GLM_4_flash, GLM_4_LONG, GLM_4_ALLTOOLS,
+              # GLM (智谱AI)
+              ZHIPU_AI, GLM_5_1, GLM_5_TURBO, GLM_5, GLM_4, GLM_4_PLUS, GLM_4_flash, GLM_4_LONG, GLM_4_ALLTOOLS,
              GLM_4_0520, GLM_4_AIR, GLM_4_AIRX, GLM_4_7,

-              # Kimi
-              MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
-              KIMI_K2, KIMI_K2_5,
+              # Qwen (通义千问)
+              QWEN36_PLUS, QWEN35_PLUS, QWEN3_MAX, QWEN_MAX, QWEN_PLUS, QWEN_TURBO, QWEN_LONG,

-              # Doubao
+              # Doubao (豆包)
              DOUBAO, DOUBAO_SEED_2_CODE, DOUBAO_SEED_2_PRO, DOUBAO_SEED_2_LITE, DOUBAO_SEED_2_MINI,

+              # Kimi (Moonshot)
+              MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
+              KIMI_K2_6, KIMI_K2_5, KIMI_K2,
+
+              # ModelScope
+              MODELSCOPE,
+
+              # LinkAI
+              LINKAI_35, LINKAI_4_TURBO, LINKAI_4o,
+
               # Other models
              WEN_XIN, WEN_XIN_4, XUNFEI,
-              LINKAI_35, LINKAI_4_TURBO, LINKAI_4o,
-              MODELSCOPE
            ]

 MODEL_LIST = MODEL_LIST + GITEE_AI_MODEL_LIST + MODELSCOPE_MODEL_LIST
--- a/config-template.json
+++ b/config-template.json
@@ -1,6 +1,8 @@
 {
  "channel_type": "weixin",
-  "model": "MiniMax-M2.7",
+  "model": "deepseek-v4-flash",
+  "deepseek_api_key": "",
+  "deepseek_api_base": "https://api.deepseek.com/v1",
  "minimax_api_key": "",
  "zhipu_ai_api_key": "",
  "ark_api_key": "",
@@ -31,5 +33,6 @@
  "agent_max_context_tokens": 50000,
  "agent_max_context_turns": 20,
  "agent_max_steps": 20,
+  "enable_thinking": false,
  "knowledge": true
 }
--- a/config.py
+++ b/config.py
@@ -17,10 +17,12 @@ available_setting = {
     "open_ai_api_base": "https://api.openai.com/v1",
     "claude_api_base": "https://api.anthropic.com/v1",  # claude api base
     "gemini_api_base": "https://generativelanguage.googleapis.com",  # gemini api base
+    "custom_api_key": "",  # custom OpenAI-compatible provider api key (used when bot_type is "custom")
+    "custom_api_base": "",  # custom OpenAI-compatible provider api base (used when bot_type is "custom")
     "proxy": "",  # proxy used for OpenAI requests
     # ChatGPT model; when use_azure_chatgpt is true, this is the name of the model deployment on Azure
     "model": "gpt-3.5-turbo",  # options include gpt-4o, gpt-4o-mini, gpt-4-turbo, claude-3-sonnet, wenxin, moonshot, qwen-turbo, xunfei, glm-4, minimax, gemini and others; see common/const.py for the full list of available models
-    "bot_type": "",  # Optional. When using a third-party service with an OpenAI-compatible format, set this to "openai" (the legacy value "chatGPT" is still accepted). See common/const.py for the bot names; if left empty, the bot is inferred from the model name
+    "bot_type": "",  # Optional. When using a third-party service with an OpenAI-compatible format, set this to "openai" or "custom" (in custom mode, switching the model does not auto-switch bot_type). See common/const.py for the bot names; if left empty, the bot is inferred from the model name
     "use_azure_chatgpt": False,  # whether to use Azure ChatGPT
     "azure_deployment_id": "",  # Azure model deployment name
     "azure_api_version": "",  # Azure API version
@@ -194,6 +196,8 @@ available_setting = {
    "minimax_api_key": "",
    "Minimax_group_id": "",
    "Minimax_base_url": "",
+    "deepseek_api_key": "",
+    "deepseek_api_base": "https://api.deepseek.com/v1",
    "web_port": 9899,
    "web_password": "",  # Web console password; empty means no authentication required
    "web_session_expire_days": 30,  # Auth session expiry in days
@@ -202,7 +206,12 @@ available_setting = {
     "agent_max_context_tokens": 50000,  # max context tokens in Agent mode
     "agent_max_context_turns": 20,  # max remembered context turns in Agent mode
     "agent_max_steps": 20,  # max decision steps per run in Agent mode
+    "enable_thinking": False,  # Enable deep-thinking mode for thinking-capable models
     "knowledge": True,  # whether to enable the knowledge base feature
+    # Per-skill runtime config. Nested keys are flattened to env vars at startup
+    # using the rule: skill[<name>][<key>] -> SKILL_<NAME>_<KEY>
+    # (e.g. skill["image-generation"].model -> SKILL_IMAGE_GENERATION_MODEL).
+    "skill": {},
 }
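The naming rule in the comment above (`skill[<name>][<key>]` becomes `SKILL_<NAME>_<KEY>`, with hyphens in the skill name turned into underscores) can be sketched as a pure function. This is an illustration only: `skill_env_key` and `flatten_skill_config` are hypothetical names, the sample values are made up, and converting values with `str()` is an assumption. The sketch returns a dict instead of writing to `os.environ` so it has no side effects.

```python
def skill_env_key(skill_name, key):
    """Build the env var name for one skill setting."""
    name_part = str(skill_name).replace("-", "_").upper()
    return "SKILL_{}_{}".format(name_part, str(key).upper())


def flatten_skill_config(skill_section, environ):
    """Return {env_key: value} for settings that would be injected.

    Empty values are skipped, and keys already present in `environ`
    are left alone so the real environment wins.
    """
    result = {}
    if not isinstance(skill_section, dict):
        return result
    for skill_name, skill_conf in skill_section.items():
        if not isinstance(skill_conf, dict):
            continue
        for key, val in skill_conf.items():
            if val is None or val == "":
                continue
            env_key = skill_env_key(skill_name, key)
            if env_key in environ:
                continue
            result[env_key] = str(val)  # assumption: values are stringified
    return result


example = flatten_skill_config(
    {"image-generation": {"model": "demo-model", "size": "", "timeout": 30}},
    environ={"SKILL_IMAGE_GENERATION_TIMEOUT": "60"},
)
```

In this example only the `model` setting is injected: `size` is empty and `timeout` is already defined in the supplied environment.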


@@ -375,12 +384,16 @@ def load_config():
        "gemini_api_base": "GEMINI_API_BASE",
        "minimax_api_key": "MINIMAX_API_KEY",
        "minimax_api_base": "MINIMAX_API_BASE",
+        "deepseek_api_key": "DEEPSEEK_API_KEY",
+        "deepseek_api_base": "DEEPSEEK_API_BASE",
        "zhipu_ai_api_key": "ZHIPU_AI_API_KEY",
        "zhipu_ai_api_base": "ZHIPU_AI_API_BASE",
        "moonshot_api_key": "MOONSHOT_API_KEY",
        "moonshot_api_base": "MOONSHOT_API_BASE",
        "ark_api_key": "ARK_API_KEY",
        "ark_api_base": "ARK_API_BASE",
+        "dashscope_api_key": "DASHSCOPE_API_KEY",
+        "dashscope_api_base": "DASHSCOPE_API_BASE",
        # Channel credentials (used by skills that check env vars)
        "feishu_app_id": "FEISHU_APP_ID",
        "feishu_app_secret": "FEISHU_APP_SECRET",
@@ -401,12 +414,45 @@ def load_config():
            if val:
                os.environ[env_key] = str(val)
                injected += 1
+
+    injected += _sync_skill_config_to_env(config.get("skill", {}))
+
    if injected:
        logger.info("[INIT] Synced {} config values to environment variables".format(injected))

    config.load_user_datas()


+def _sync_skill_config_to_env(skill_section) -> int:
+    """Flatten skill-namespaced config into environment variables.
+
+    Mapping rule: ``config["skill"][<name>][<key>]`` -> ``SKILL_<NAME>_<KEY>``
+    (e.g. ``skill["image-generation"].model`` -> ``SKILL_IMAGE_GENERATION_MODEL``).
+
+    This lets subprocess-based skill scripts read their own settings without
+    importing project code. Existing env vars are NOT overwritten so the
+    real environment always wins.
+
+    Returns the number of variables actually injected.
+    """
+    if not isinstance(skill_section, dict):
+        return 0
+    injected = 0
+    for skill_name, skill_conf in skill_section.items():
+        if not isinstance(skill_conf, dict):
+            continue
+        name_part = str(skill_name).replace("-", "_").upper()
+        for key, val in skill_conf.items():
+            if val is None or val == "":
+                continue
+            env_key = "SKILL_{}_{}".format(name_part, str(key).upper())
+            if env_key in os.environ:
+                continue
+            os.environ[env_key] = str(val)
+            injected += 1
+    return injected
+
+
 def get_root():
    return os.path.dirname(os.path.abspath(__file__))
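
The `SKILL_<NAME>_<KEY>` mapping introduced in `_sync_skill_config_to_env` can be tried out in isolation. The sketch below mirrors the rules in that function but returns the planned variables instead of writing to `os.environ`; the helper names `skill_env_name` and `flatten_skill_config` and the sample config values are illustrative assumptions, not part of this change.

```python
def skill_env_name(skill_name, key):
    """Build the env var name for config["skill"][skill_name][key]."""
    # Only the skill name has "-" replaced with "_"; the key is just upper-cased.
    name_part = str(skill_name).replace("-", "_").upper()
    return "SKILL_{}_{}".format(name_part, str(key).upper())


def flatten_skill_config(skill_section, existing_env):
    """Return the variables that would be injected, without mutating any environment."""
    planned = {}
    if not isinstance(skill_section, dict):
        return planned
    for skill_name, skill_conf in skill_section.items():
        if not isinstance(skill_conf, dict):
            continue
        for key, val in skill_conf.items():
            # Empty values are skipped, as in the diff.
            if val is None or val == "":
                continue
            env_key = skill_env_name(skill_name, key)
            # Variables already present in the environment are never overwritten.
            if env_key in existing_env:
                continue
            planned[env_key] = str(val)
    return planned


# Hypothetical sample data for illustration only.
example = {"image-generation": {"model": "example-model", "size": "", "steps": 4}}
planned = flatten_skill_config(example, {"SKILL_IMAGE_GENERATION_STEPS": "8"})
print(planned)
```

Because only the skill name is normalized, a key that itself contains a hyphen would keep it in the resulting variable name, which many shells cannot reference directly; underscore-separated keys avoid that.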

--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -9,7 +9,9 @@ services:
      - "9899:9899"
    environment:
      CHANNEL_TYPE: 'weixin'
-      MODEL: 'MiniMax-M2.7'
+      MODEL: 'deepseek-v4-flash'
+      DEEPSEEK_API_KEY: ''
+      DEEPSEEK_API_BASE: 'https://api.deepseek.com/v1'
      MINIMAX_API_KEY: ''
      ZHIPU_AI_API_KEY: ''
      ARK_API_KEY: ''
--- a/docs/channels/web.mdx
+++ b/docs/channels/web.mdx
@@ -10,7 +10,9 @@ Web 控制台是 CowAgent 的默认通道，启动后会自动运行，通过浏
 ```json
 {
  "channel_type": "web",
-  "web_port": 9899
+  "web_port": 9899,
+  "web_password": "",
+  "enable_thinking": false
 }
 ```

@@ -18,6 +20,11 @@ Web 控制台是 CowAgent 的默认通道，启动后会自动运行，通过浏
 | --- | --- | --- |
 | `channel_type` | 设为 `web` | `web` |
 | `web_port` | Web 服务监听端口 | `9899` |
+| `web_password` | 访问密码，留空表示不启用密码保护 | `""` |
+| `web_session_expire_days` | 登录会话有效天数 | `30` |
+| `enable_thinking` | 是否启用深度思考模式 | `false` |
+
+配置密码后，访问控制台时需先输入密码完成登录。登录状态默认保持 30 天，期间重启服务也无需重新登录。密码也支持在控制台的「配置」页面中在线修改。

 ## 访问地址

@@ -30,30 +37,11 @@ Web 控制台是 CowAgent 的默认通道，启动后会自动运行，通过浏
  请确保服务器防火墙和安全组已放行对应端口。
 </Note>

-## 密码保护
-
-Web 控制台默认无需密码即可访问。如果部署在公网环境，建议配置访问密码：
-
-```json
-{
-  "web_password": "your_password"
-}
-```
-
-| 参数 | 说明 | 默认值 |
-| --- | --- | --- |
-| `web_password` | 访问密码，留空表示不启用密码保护 | `""` |
-| `web_session_expire_days` | 登录会话有效天数 | `30` |
-
-配置密码后，访问控制台时需先输入密码完成登录。登录状态默认保持 30 天，期间重启服务也无需重新登录。修改密码后，所有已登录的会话将自动失效。
-
-密码也支持在控制台的「配置」页面中在线修改。
-
 ## 功能介绍

 ### 对话界面

-支持流式输出，可实时展示 Agent 的思考过程（Reasoning）和工具调用过程（Tool Calls），更直观地观察 Agent 的决策过程：
+支持流式输出，可实时展示 Agent 的思考过程（Reasoning）和工具调用过程（Tool Calls），更直观地观察 Agent 的决策过程。深度思考功能可通过配置或控制台的「Agent 配置」开关控制。

 <img width="850" src="https://cdn.link-ai.tech/doc/20260227180120.png" />

--- a/docs/cli/general.mdx
+++ b/docs/cli/general.mdx
@@ -58,17 +58,18 @@ Session: 12 messages | 8 skills loaded
 **修改配置项：**

 ```text
-/config model deepseek-chat
+/config model deepseek-v4-flash
 ```

 **支持修改的配置项：**

 | 配置项 | 说明 | 示例值 |
 | --- | --- | --- |
-| `model` | AI 模型名称 | `deepseek-chat` |
+| `model` | AI 模型名称 | `deepseek-v4-flash` |
 | `agent_max_context_tokens` | 最大上下文 tokens | `40000` |
 | `agent_max_context_turns` | 最大上下文记忆轮次 | `30` |
 | `agent_max_steps` | 单次任务最大决策步数 | `15` |
+| `enable_thinking` | 是否启用深度思考模式 | `true` / `false` |

 <Note>
  修改 `model` 时，系统会自动匹配对应的模型调用方式。配置会写入 `config.json` 并持久保存。
@@ -106,45 +107,6 @@ Session: 12 messages | 8 skills loaded
 /logs 50
 ```

-## knowledge
-
-查看和管理个人知识库。默认显示知识库统计信息。
-
-```text
-/knowledge
-```
-
-输出示例：
-
-```
-📚 知识库
-
- 状态：已开启
- 页面数：12
- 总大小：45.2 KB
- 分类明细：
-  - concepts/: 5 篇
-  - entities/: 4 篇
-  - sources/: 3 篇
-```
-
-**查看目录结构：**
-
-```text
-/knowledge list
-```
-
-**开启 / 关闭知识库：**
-
-```text
-/knowledge on
-/knowledge off
-```
-
-<Note>
-  终端 CLI 中 `cow knowledge` 和 `cow knowledge list` 可用，但 `on|off` 仅支持在对话中使用（需实时生效）。
-</Note>
-
 ## version

 显示当前 CowAgent 版本号。
--- a/docs/cli/index.mdx
+++ b/docs/cli/index.mdx
@@ -40,7 +40,8 @@ Service:
 Skills:
  skill     Manage skills (list / search / install / uninstall ...)

-Knowledge:
+Memory & Knowledge:
+  memory    Memory distillation (dream)
  knowledge View knowledge base stats and structure

 Others:
@@ -58,6 +59,7 @@ Others:
 | `/status` | 查看服务状态和配置 |
 | `/config` | 查看或修改运行时配置 |
 | `/skill` | 管理技能（安装、卸载、启用、禁用等） |
+| `/memory dream [N]` | 手动触发记忆蒸馏（默认 3 天，最大 30） |
 | `/knowledge` | 查看知识库统计信息 |
 | `/knowledge list` | 查看知识库目录结构 |
 | `/knowledge on\|off` | 开启或关闭知识库 |
@@ -82,6 +84,7 @@ Others:
 | logs | ✓ | ✓ |
 | config | ✗ | ✓ |
 | context | — | ✓ |
+| memory (子命令) | ✗ | ✓ |
 | knowledge (子命令) | ✓ | ✓ |
 | skill (子命令) | ✓ | ✓ |
 | start / stop / restart | ✓ | ✗ |
--- a/docs/cli/memory-knowledge.mdx
+++ b/docs/cli/memory-knowledge.mdx
@@ -0,0 +1,77 @@
+---
+title: 记忆与知识库
+description: 记忆蒸馏和知识库管理命令
+---
+
+## memory
+
+管理 Agent 的长期记忆系统。
+
+### memory dream
+
+手动触发记忆蒸馏（Deep Dream），整理近期的天级记忆，蒸馏合并到 MEMORY.md，并生成梦境日记。
+
+```text
+/memory dream [N]
+```
+
+- `N`：整理近 N 天的记忆，默认 3 天，最大 30 天
+- 蒸馏在后台异步执行，完成后会在对话中通知结果
+- 无需等待 Agent 初始化，首次对话前即可使用
+
+**示例：**
+
+```text
+/memory dream       # 整理近 3 天
+/memory dream 7     # 整理近 7 天
+/memory dream 30    # 整理近 30 天（全量）
+```
+
+蒸馏完成后，Web 端会收到带有跳转链接的通知，可直接查看更新后的 MEMORY.md 和梦境日记。
+
+<Tip>
+  系统每天 23:55 会自动执行一次蒸馏（lookback 1 天）。手动触发适用于首次部署后的历史整理，或需要立即更新记忆时使用。
+</Tip>
+
+## knowledge
+
+查看和管理个人知识库。默认显示知识库统计信息。
+
+```text
+/knowledge
+```
+
+输出示例：
+
+```
+📚 知识库
+
+- 状态：已开启
+- 页面数：12
+- 总大小：45.2 KB
+- 分类明细：
+  - concepts/: 5 篇
+  - entities/: 4 篇
+  - sources/: 3 篇
+```
+
+### knowledge list
+
+查看知识库目录树结构。
+
+```text
+/knowledge list
+```
+
+### knowledge on / off
+
+开启或关闭知识库。关闭后不再注入知识提示词和索引知识文件。
+
+```text
+/knowledge on
+/knowledge off
+```
+
+<Note>
+  终端 CLI 中 `cow knowledge` 和 `cow knowledge list` 可用，但 `on|off` 仅支持在对话中使用（需实时生效）。
+</Note>
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -72,17 +72,18 @@
                "group": "模型配置",
                "pages": [
                  "models/index",
+                  "models/deepseek",
                  "models/minimax",
-                  "models/glm",
-                  "models/qwen",
-                  "models/kimi",
-                  "models/doubao",
                  "models/claude",
                  "models/gemini",
                  "models/openai",
-                  "models/deepseek",
+                  "models/glm",
+                  "models/qwen",
+                  "models/doubao",
+                  "models/kimi",
                  "models/linkai",
-                  "models/coding-plan"
+                  "models/coding-plan",
+                  "models/custom"
                ]
              }
            ]
@@ -132,6 +133,14 @@
                  "skills/create",
                  "skills/hub"
                ]
+              },
+              {
+                "group": "内置技能",
+                "pages": [
+                  "skills/skill-creator",
+                  "skills/knowledge-wiki",
+                  "skills/image-generation"
+                ]
              }
            ]
          },
@@ -142,7 +151,8 @@
                "group": "记忆系统",
                "pages": [
                  "memory/index",
-                  "memory/context"
+                  "memory/context",
+                  "memory/deep-dream"
                ]
              }
            ]
@@ -185,6 +195,7 @@
                  "cli/index",
                  "cli/process",
                  "cli/skill",
+                  "cli/memory-knowledge",
                  "cli/general"
                ]
              }
@@ -197,6 +208,8 @@
                "group": "发布记录",
                "pages": [
                  "releases/overview",
+                  "releases/v2.0.7",
+                  "releases/v2.0.6",
                  "releases/v2.0.5",
                  "releases/v2.0.4",
                  "releases/v2.0.3",
@@ -244,17 +257,18 @@
                "group": "Model Configuration",
                "pages": [
                  "en/models/index",
+                  "en/models/deepseek",
                  "en/models/minimax",
-                  "en/models/glm",
-                  "en/models/qwen",
-                  "en/models/kimi",
-                  "en/models/doubao",
                  "en/models/claude",
                  "en/models/gemini",
                  "en/models/openai",
-                  "en/models/deepseek",
+                  "en/models/glm",
+                  "en/models/qwen",
+                  "en/models/doubao",
+                  "en/models/kimi",
                  "en/models/linkai",
-                  "en/models/coding-plan"
+                  "en/models/coding-plan",
+                  "en/models/custom"
                ]
              }
            ]
@@ -301,9 +315,16 @@
                "pages": [
                  "en/skills/index",
                  "en/skills/install",
-                  "en/skills/skill-creator",
                  "en/skills/hub"
                ]
+              },
+              {
+                "group": "Built-in Skills",
+                "pages": [
+                  "en/skills/skill-creator",
+                  "en/skills/knowledge-wiki",
+                  "en/skills/image-generation"
+                ]
              }
            ]
          },
@@ -314,7 +335,8 @@
                "group": "Memory System",
                "pages": [
                  "en/memory/index",
-                  "en/memory/context"
+                  "en/memory/context",
+                  "en/memory/deep-dream"
                ]
              }
            ]
@@ -357,6 +379,7 @@
                  "en/cli/index",
                  "en/cli/process",
                  "en/cli/skill",
+                  "en/cli/memory-knowledge",
                  "en/cli/chat"
                ]
              }
@@ -369,6 +392,8 @@
                "group": "Release Notes",
                "pages": [
                  "en/releases/overview",
+                  "en/releases/v2.0.7",
+                  "en/releases/v2.0.6",
                  "en/releases/v2.0.5",
                  "en/releases/v2.0.4",
                  "en/releases/v2.0.2",
@@ -416,17 +441,18 @@
                "group": "モデル設定",
                "pages": [
                  "ja/models/index",
+                  "ja/models/deepseek",
                  "ja/models/minimax",
-                  "ja/models/glm",
-                  "ja/models/qwen",
-                  "ja/models/kimi",
-                  "ja/models/doubao",
                  "ja/models/claude",
                  "ja/models/gemini",
                  "ja/models/openai",
-                  "ja/models/deepseek",
+                  "ja/models/glm",
+                  "ja/models/qwen",
+                  "ja/models/doubao",
+                  "ja/models/kimi",
                  "ja/models/linkai",
-                  "ja/models/coding-plan"
+                  "ja/models/coding-plan",
+                  "ja/models/custom"
                ]
              }
            ]
@@ -476,6 +502,14 @@
                  "ja/skills/create",
                  "ja/skills/hub"
                ]
+              },
+              {
+                "group": "内蔵スキル",
+                "pages": [
+                  "ja/skills/skill-creator",
+                  "ja/skills/knowledge-wiki",
+                  "ja/skills/image-generation"
+                ]
              }
            ]
          },
@@ -486,7 +520,8 @@
                "group": "メモリシステム",
                "pages": [
                  "ja/memory/index",
-                  "ja/memory/context"
+                  "ja/memory/context",
+                  "ja/memory/deep-dream"
                ]
              }
            ]
@@ -529,6 +564,7 @@
                  "ja/cli/index",
                  "ja/cli/process",
                  "ja/cli/skill",
+                  "ja/cli/memory-knowledge",
                  "ja/cli/general"
                ]
              }
@@ -541,6 +577,8 @@
                "group": "リリースノート",
                "pages": [
                  "ja/releases/overview",
+                  "ja/releases/v2.0.7",
+                  "ja/releases/v2.0.6",
                  "ja/releases/v2.0.5",
                  "ja/releases/v2.0.4",
                  "ja/releases/v2.0.3",
--- a/docs/en/README.md
+++ b/docs/en/README.md
@@ -22,13 +22,13 @@
 > CowAgent is both an out-of-the-box AI super assistant and a highly extensible Agent framework. You can extend it with new model interfaces, channels, built-in tools, and the Skills system to flexibly implement various customization needs.

 - ✅ **Autonomous Task Planning**: Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved.
- ✅ **Long-term Memory**: Automatically persists conversation memory to local files and databases, including core memory and daily memory, with keyword and vector retrieval support.
+- ✅ **Long-term Memory**: Automatically persists conversation memory to local files and databases, including core memory, daily memory, and Deep Dream distillation, with keyword and vector retrieval support.
 - ✅ **Personal Knowledge Base**: Automatically organizes structured knowledge with cross-references to build a knowledge graph, with web-based visualization and conversational management.
 - ✅ **Skills System**: Implements a Skills creation and execution engine, supports installing skills from [Skill Hub](https://skills.cowagent.ai), GitHub, etc., or creating custom Skills through conversation.
 - ✅ **Tool System**: Built-in tools for file I/O, terminal execution, browser automation, scheduled tasks, messaging, and more — autonomously invoked by the Agent.
 - ✅ **CLI System**: Provides terminal commands and in-chat commands for process management, skill installation, configuration, and more.
 - ✅ **Multimodal Messages**: Supports parsing, processing, generating, and sending text, images, voice, files, and other message types.
- ✅ **Multiple Model Support**: Supports OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao, and other mainstream model providers.
+- ✅ **Multiple Model Support**: Supports DeepSeek, MiniMax, Claude, Gemini, OpenAI, GLM, Qwen, Doubao, Kimi, and other mainstream model providers.
 - ✅ **Multi-platform Deployment**: Runs on local computers or servers, integrable into WeChat, Web, Feishu, DingTalk, WeChat Official Account, and WeCom applications.

 ## Disclaimer
@@ -43,6 +43,8 @@ Try online (no deployment needed): [CowAgent](https://link-ai.tech/cowagent/crea

 ## Changelog

+> **2026.04.14:** [v2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6) — Knowledge Base, Deep Dream Memory Distillation, Smart Context Compression, Web Console upgrades.
+
 > **2026.04.01:** [v2.0.5](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5) — Cow CLI, Skill Hub open source, Browser tool, WeCom Bot QR scan, and more.

 > **2026.02.27:** [v2.0.2](https://github.com/zhayujie/CowAgent/releases/tag/2.0.2) — Web console overhaul (streaming chat, model/skill/memory/channel/scheduler/log management), multi-channel concurrent running, session persistence, new models including Gemini 3.1 Pro / Claude 4.6 Sonnet / Qwen3.5 Plus.
@@ -162,15 +164,15 @@ Supports mainstream model providers. Recommended models for Agent mode:

 | Provider | Recommended Model |
 | --- | --- |
+| DeepSeek | `deepseek-v4-flash` |
 | MiniMax | `MiniMax-M2.7` |
-| GLM | `glm-5-turbo` |
-| Kimi | `kimi-k2.5` |
-| Doubao | `doubao-seed-2-0-code-preview-260215` |
-| Qwen | `qwen3.6-plus` |
 | Claude | `claude-sonnet-4-6` |
 | Gemini | `gemini-3.1-pro-preview` |
 | OpenAI | `gpt-5.4` |
-| DeepSeek | `deepseek-chat` |
+| GLM | `glm-5.1` |
+| Qwen | `qwen3.6-plus` |
+| Doubao | `doubao-seed-2-0-code-preview-260215` |
+| Kimi | `kimi-k2.6` |

 For detailed configuration of each model, see the [Models documentation](https://docs.cowagent.ai/en/models/index).

--- a/docs/en/cli/general.mdx
+++ b/docs/en/cli/general.mdx
@@ -44,17 +44,18 @@ View or modify runtime configuration. Changes take effect immediately without re
 **Modify a config item:**

 ```text
-/config model deepseek-chat
+/config model deepseek-v4-flash
 ```

 **Configurable items:**

 | Item | Description | Example |
 | --- | --- | --- |
-| `model` | AI model name | `deepseek-chat` |
+| `model` | AI model name | `deepseek-v4-flash` |
 | `agent_max_context_tokens` | Max context tokens | `40000` |
 | `agent_max_context_turns` | Max context memory turns | `30` |
 | `agent_max_steps` | Max decision steps per task | `15` |
+| `enable_thinking` | Enable deep thinking mode | `true` / `false` |

 <Note>
  When changing `model`, the system automatically matches the corresponding model API. Configuration is persisted to `config.json`.
@@ -92,31 +93,6 @@ View recent service logs. Shows the last 20 lines by default, up to 50.
 /logs 50
 ```

-## knowledge
-
-View and manage the personal knowledge base. Shows statistics by default.
-
-```text
-/knowledge
-```
-
-**View directory structure:**
-
-```text
-/knowledge list
-```
-
-**Enable / disable knowledge base:**
-
-```text
-/knowledge on
-/knowledge off
-```
-
-<Note>
-  In the terminal CLI, `cow knowledge` and `cow knowledge list` are available, but `on|off` is only supported in chat (requires runtime effect).
-</Note>
-
 ## version

 Show the current CowAgent version.
--- a/docs/en/cli/index.mdx
+++ b/docs/en/cli/index.mdx
@@ -40,7 +40,8 @@ Service:
 Skills:
  skill     Manage skills (list / search / install / uninstall ...)

-Knowledge:
+Memory & Knowledge:
+  memory    Memory distillation (dream)
  knowledge View knowledge base stats and structure

 Others:
@@ -58,6 +59,7 @@ In the Web console or any connected channel, type `/` to see command suggestions
 | `/status` | View service status and configuration |
 | `/config` | View or modify runtime configuration |
 | `/skill` | Manage skills (install, uninstall, enable, disable, etc.) |
+| `/memory dream [N]` | Manually trigger memory distillation (default 3 days, max 30) |
 | `/knowledge` | View knowledge base statistics |
 | `/knowledge list` | View knowledge base directory structure |
 | `/knowledge on\|off` | Enable or disable knowledge base |
@@ -80,6 +82,7 @@ In the Web console or any connected channel, type `/` to see command suggestions
 | logs | ✓ | ✓ |
 | config | ✗ | ✓ |
 | context | — | ✓ |
+| memory (subcommands) | ✗ | ✓ |
 | knowledge (subcommands) | ✓ | ✓ |
 | skill (subcommands) | ✓ | ✓ |
 | start / stop / restart | ✓ | ✗ |
--- a/docs/en/cli/memory-knowledge.mdx
+++ b/docs/en/cli/memory-knowledge.mdx
@@ -0,0 +1,63 @@
+---
+title: Memory & Knowledge
+description: Memory distillation and knowledge base management commands
+---
+
+## memory
+
+Manage the Agent's long-term memory system.
+
+### memory dream
+
+Manually trigger memory distillation (Deep Dream) — consolidate recent daily memories into MEMORY.md and generate a dream diary.
+
+```text
+/memory dream [N]
+```
+
+- `N`: Consolidate the last N days of memory (default 3, max 30)
+- Runs asynchronously in the background; you'll be notified in chat when complete
+- Works without Agent initialization — can be used before the first conversation
+
+**Examples:**
+
+```text
+/memory dream       # Consolidate last 3 days
+/memory dream 7     # Consolidate last 7 days
+/memory dream 30    # Consolidate last 30 days (full)
+```
+
+On the Web console, the completion notification includes clickable links to view the updated MEMORY.md and dream diary.
+
+<Tip>
+  The system automatically runs distillation daily at 23:55 (lookback 1 day). Triggering it manually is useful for consolidating historical memories after first deployment, or when you need an immediate memory update.
+</Tip>
+
+## knowledge
+
+View and manage the personal knowledge base. Shows statistics by default.
+
+```text
+/knowledge
+```
+
+### knowledge list
+
+View the knowledge base directory tree.
+
+```text
+/knowledge list
+```
+
+### knowledge on / off
+
+Enable or disable the knowledge base. When disabled, knowledge prompts are no longer injected and knowledge files are no longer indexed.
+
+```text
+/knowledge on
+/knowledge off
+```
+
+<Note>
+  In the terminal CLI, `cow knowledge` and `cow knowledge list` are available, but `on|off` is only supported in chat because the change must take effect at runtime.
+</Note>
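
The `[N]` argument documented for `/memory dream` (default 3, max 30) could be handled roughly as follows. This is a minimal sketch, not the project's parser: the function name `parse_dream_days` is hypothetical, and clamping out-of-range values (rather than rejecting them) and falling back to the default on non-numeric input are assumptions.

```python
DEFAULT_DAYS = 3   # documented default lookback
MAX_DAYS = 30      # documented maximum lookback


def parse_dream_days(command):
    """Extract the lookback window in days from a '/memory dream [N]' command."""
    parts = command.split()
    if parts[:2] != ["/memory", "dream"]:
        raise ValueError("not a /memory dream command")
    if len(parts) < 3:
        return DEFAULT_DAYS
    try:
        days = int(parts[2])
    except ValueError:
        # Assumption: non-numeric input falls back to the default.
        return DEFAULT_DAYS
    # Assumption: values outside 1..MAX_DAYS are clamped, not rejected.
    return max(1, min(days, MAX_DAYS))


print(parse_dream_days("/memory dream"))
print(parse_dream_days("/memory dream 7"))
print(parse_dream_days("/memory dream 90"))
```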
--- a/docs/en/guide/manual-install.mdx
+++ b/docs/en/guide/manual-install.mdx
@@ -121,7 +121,8 @@ sudo docker logs -f chatgpt-on-wechat
 ```json
 {
  "channel_type": "web",
-  "model": "MiniMax-M2.5",
+  "model": "deepseek-v4-flash",
+  "deepseek_api_key": "",
  "agent": true,
  "agent_workspace": "~/cow",
  "agent_max_context_tokens": 40000,
@@ -133,7 +134,7 @@ sudo docker logs -f chatgpt-on-wechat
 | Parameter | Description | Default |
 | --- | --- | --- |
 | `channel_type` | Channel type | `web` |
-| `model` | Model name | `MiniMax-M2.5` |
+| `model` | Model name | `deepseek-v4-flash` |
 | `agent` | Enable Agent mode | `true` |
 | `agent_workspace` | Agent workspace path | `~/cow` |
 | `agent_max_context_tokens` | Max context tokens | `40000` |
--- a/docs/en/intro/architecture.mdx
+++ b/docs/en/intro/architecture.mdx
@@ -9,7 +9,7 @@ CowAgent 2.0 has evolved from a simple chatbot into a super intelligent assistan

 CowAgent's architecture consists of the following core modules:

-<img src="https://cdn.link-ai.tech/doc/68ef7b212c6f791e0e74314b912149f9-sz_5847990.png" alt="CowAgent Architecture" />
+<img src="https://cdn.link-ai.tech/doc/cow-agent-arch-en.jpg.jpg" alt="CowAgent Architecture" />

 | Module | Description |
 | --- | --- |
--- a/docs/en/intro/features.mdx
+++ b/docs/en/intro/features.mdx
@@ -5,16 +5,18 @@ description: CowAgent long-term memory, task planning, skills system, CLI comman

 ## 1. Long-term Memory

-The memory system enables the Agent to remember important information over time. The Agent proactively stores information when users share preferences, decisions, or key facts, and automatically extracts summaries when conversations reach a certain length. Memory is divided into core memory and daily memory, with hybrid retrieval supporting both keyword search and vector search.
+The memory system enables the Agent to remember important information over time, using a three-tier memory flow: conversation context (short-term) → daily memory (mid-term) → MEMORY.md (long-term), forming a complete memory lifecycle.

 On first launch, the Agent proactively asks the user for key information and records it in the workspace (default `~/cow`) — including agent settings, user identity, and memory files.

-In subsequent long-term conversations, the Agent intelligently stores or retrieves memory as needed, continuously updating its own settings, user preferences, and memory files, summarizing experiences and lessons learned — truly achieving autonomous thinking and continuous growth.
+In subsequent long-term conversations, the Agent intelligently stores or retrieves memory as needed, continuously updating its own settings, user preferences, and memory files. **Deep Dream** distillation runs daily, consolidating scattered daily memories into refined long-term memory and generating a narrative-style dream diary.

 <Frame>
  <img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
 </Frame>

+See [Long-term Memory](/en/memory) and [Deep Dream](/en/memory/deep-dream) for details.
+
 ## 2. Personal Knowledge Base

 > The knowledge base system enables the Agent to continuously accumulate and organize structured knowledge. Unlike memory which records along a timeline, the knowledge base is organized by topics, transforming articles, conversation insights, and learning materials into interconnected Markdown pages that form a continuously growing knowledge network.
@@ -26,6 +28,10 @@ The Agent automatically organizes valuable information from conversations into k
 - **Chat integration**: Knowledge document links referenced in Agent replies can be clicked directly in the Web console for viewing
 - **CLI management**: Use `/knowledge` commands to view stats, browse directory, and toggle the feature with `/knowledge on|off`

+<Frame>
+  <img src="https://cdn.link-ai.tech/doc/20260413105435.png" width="800" />
+</Frame>
+
 See [Personal Knowledge Base](/en/knowledge) for details.

 ## 3. Task Planning and Tool Use
@@ -47,7 +53,7 @@ Access to the OS terminal and file system is the most fundamental and core capab
 Combining programming and system access, the Agent can execute the complete **Vibecoding workflow** — from information search, asset generation, coding, testing, deployment, Nginx configuration, to publishing — all triggered by a single command from your phone:

 <Frame>
-  <img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
+  <img src="https://cdn.link-ai.tech/doc/20260318211018.png" width="800" />
 </Frame>

 ### 3.3 Scheduled Tasks
--- a/docs/en/intro/index.mdx
+++ b/docs/en/intro/index.mdx
@@ -20,7 +20,7 @@ CowAgent can proactively think and plan tasks, operate computers and external re
    Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved. Supports accessing file systems, terminals, browsers, schedulers, and other system resources through tools.
  </Card>
  <Card title="Long-term Memory" icon="database" href="/en/memory">
-    Automatically persists conversation memory to local files and databases, including core memory and daily memory, with keyword and vector retrieval support.
+    Three-tier memory flow (context → daily memory → global memory) with daily Deep Dream distillation and keyword and vector retrieval support.
  </Card>
  <Card title="Knowledge Base" icon="book" href="/en/knowledge">
    Automatically organizes structured knowledge with knowledge graph visualization, building a continuously growing knowledge network through cross-references.
--- a/docs/en/memory/context.mdx
+++ b/docs/en/memory/context.mdx
@@ -39,14 +39,15 @@ When conversation turns exceed `agent_max_context_turns`:

 - The **oldest half** of complete turns is trimmed (preserving tool call chain integrity)
 - Trimmed messages are summarized by LLM and **written to the daily memory file**
- Remaining turns stay intact
+- Once the LLM summary is ready, it is also **injected into the first user message** of the retained context, helping the model maintain conversational continuity
+- Summary injection runs asynchronously in the background and takes effect from the next turn onward
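
The turn-trimming rule described in this list can be sketched as below. It assumes a turn is represented as a list of message dicts and that "oldest half" rounds down; both are illustrative assumptions, and `trim_oldest_half` is a hypothetical name rather than a function from the codebase.

```python
from typing import List, Tuple

Turn = List[dict]  # one complete turn: user message, tool calls, final reply


def trim_oldest_half(turns: List[Turn], max_turns: int) -> Tuple[List[Turn], List[Turn]]:
    """Split turns into (discarded, retained) once the turn limit is exceeded."""
    if len(turns) <= max_turns:
        return [], turns
    # Cutting on turn boundaries keeps each tool call chain intact.
    cut = len(turns) // 2
    return turns[:cut], turns[cut:]


turns = [[{"role": "user", "content": str(i)}] for i in range(5)]
discarded, retained = trim_oldest_half(turns, max_turns=4)
print(len(discarded), len(retained))
```

In this sketch the `discarded` turns are what would be handed to the LLM for summarization, while `retained` stays in context.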

 ### 3. Token Budget Trimming

 After turn trimming, if tokens still exceed the budget:

 - **Fewer than 5 turns**: All turns undergo **text compression** — each turn keeps only the first user text and last Agent reply, removing intermediate tool call chains
- **5 or more turns**: The **first half** of turns is trimmed again, with discarded content also written to memory
+- **5 or more turns**: The **first half** of turns is trimmed again, with discarded content written to memory and a context summary injected
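
The text compression applied when fewer than 5 turns remain might look like the following sketch. The message shape (dicts with `role` and `content`) and the name `compress_turn` are assumptions made for illustration; the actual message format is not shown in this diff.

```python
def compress_turn(turn):
    """Keep only the first user text and the last Agent reply of a turn."""
    first_user = next((m for m in turn if m.get("role") == "user"), None)
    last_reply = next(
        (m for m in reversed(turn)
         if m.get("role") == "assistant" and m.get("content")),
        None,
    )
    # Intermediate tool calls and tool results are dropped.
    return [m for m in (first_user, last_reply) if m is not None]


turn = [
    {"role": "user", "content": "list files"},
    {"role": "assistant", "content": "", "tool_calls": ["ls"]},
    {"role": "tool", "content": "a.txt"},
    {"role": "assistant", "content": "There is one file: a.txt"},
]
compressed = compress_turn(turn)
print(compressed)
```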

 ### 4. Overflow Emergency Handling

--- a/docs/en/memory/deep-dream.mdx
+++ b/docs/en/memory/deep-dream.mdx
@@ -0,0 +1,90 @@
+---
+title: Deep Dream
+description: Deep Dream — automatic distillation from conversations to permanent memory
+---
+
+Deep Dream is the core consolidation mechanism of CowAgent's memory system, responsible for distilling scattered daily memories into refined long-term memory and generating dream diaries.
+
+## Memory Flow
+
+CowAgent's memory progresses through three stages from short-term to long-term:
+
+```
+Conversation context (short-term) → Daily memory (mid-term) → MEMORY.md (long-term)
+```
+
+### 1. Conversation → Daily Memory
+
+When conversation context is trimmed or during the daily scheduled summary, the system uses an LLM to summarize the conversation into key events and writes them to the daily memory file `memory/YYYY-MM-DD.md`.
+
+Triggers:
+- **Context trimming** — Trimmed content is summarized when turn or token limits are exceeded
+- **Daily schedule** — Automatically triggered at 23:55
+- **API overflow** — Emergency save of current conversation summary
+
+### 2. Daily Memory → MEMORY.md (Distillation)
+
+After the daily summary completes, Deep Dream automatically runs distillation:
+
+1. **Read materials** — Current `MEMORY.md` + today's daily memory
+2. **LLM distillation** — Deduplicate, merge, prune, extract new information
+3. **Overwrite MEMORY.md** — Output the refined long-term memory
+4. **Generate dream diary** — Record discoveries and insights from the consolidation
+
+### 3. Role of MEMORY.md
+
+`MEMORY.md` is injected into the system prompt for every conversation, keeping the Agent aware of user preferences, decisions, and key facts. Therefore it must stay concise — Deep Dream targets approximately 30 entries or fewer.
+
+## Distillation Rules
+
+Deep Dream follows these consolidation rules:
+
+| Operation | Description |
+| --- | --- |
+| **Merge & refine** | Combine similar entries into single high-density statements |
+| **Extract new** | Pull preferences, decisions, people, experiences from daily memory |
+| **Conflict update** | When new info contradicts old entries, newer info takes precedence |
+| **Clean invalid** | Remove temporary records, blank entries, formatting artifacts |
+| **Remove redundancy** | Delete old entries already covered by more refined statements |
+
+## Dream Diary
+
+Each distillation generates a dream diary saved at `memory/dreams/YYYY-MM-DD.md`, written in a narrative style recording:
+
+- Duplications or contradictions found
+- New insights extracted from daily memory
+- Cleanups and optimizations performed
+- Overall observations
+
+Dream diaries can be viewed in the Web console under the "Memory → Dream Diary" tab.
+
+<Frame>
+  <img src="https://cdn.link-ai.tech/doc/20260414110032.png" width="800" />
+</Frame>
+
+## Manual Trigger
+
+In addition to the automatic daily run, you can manually trigger distillation in chat:
+
+```text
+/memory dream [N]
+```
+
+- `N`: Consolidate the last N days of memory (default 3, max 30)
+- Runs asynchronously in the background; you'll be notified in chat when complete
+- Web notifications include clickable links to view MEMORY.md and dream diary
+- Works without Agent initialization — can be used before the first conversation
+
+<Tip>
+  After first deployment, it's recommended to run `/memory dream 30` once to distill all historical daily memories into MEMORY.md.
+</Tip>
+
+## Safety Mechanisms
+
+| Mechanism | Description |
+| --- | --- |
+| **Skip on no content** | Distillation skipped when no daily memory exists, avoiding empty overwrites |
+| **Input dedup** | In scheduled tasks, automatically skipped when input materials haven't changed |
+| **Async execution** | Distillation runs in a background thread, never blocking conversation |
+| **Sequential guarantee** | In scheduled tasks, daily flush completes before distillation starts |
+| **No fabrication** | Prompt explicitly constrains consolidation to existing materials only |
--- a/docs/en/memory/index.mdx
+++ b/docs/en/memory/index.mdx
@@ -5,6 +5,8 @@ description: CowAgent long-term memory system — file persistence, automatic wr

 Long-term memory is stored in workspace files, persisting across sessions. The Agent loads historical memory on demand via retrieval tools during conversation, and automatically writes conversation summaries to long-term memory when context is trimmed.

+<img src="https://cdn.link-ai.tech/doc/memory-architecture-en.jpg" alt="Memory Architecture" />
+
 ## Memory Types

 ### Core Memory (MEMORY.md)
@@ -15,12 +17,17 @@ Stored in `~/cow/MEMORY.md`, containing long-term user preferences, important de

 Stored in `~/cow/memory/` directory, named by date (e.g., `2026-03-08.md`), recording daily conversation summaries and key events. Files are only created on first write to avoid generating empty files.

+### Dream Diary (memory/dreams/YYYY-MM-DD.md)
+
+A byproduct of the Deep Dream (memory distillation) process, recording discoveries, deduplication operations, and new insights from each consolidation. Stored in `~/cow/memory/dreams/` directory, named by date.
+
 ## Automatic Writing

 The Agent automatically persists conversation content to long-term memory through the following mechanisms:

- **On context trimming** — When conversation turns or tokens exceed the configured limit, the oldest half of the context is trimmed, and the discarded content is summarized by LLM into key information and written to the daily memory file
+- **On context trimming** — When conversation turns or tokens exceed the configured limit, the oldest half of the context is trimmed, and the discarded content is summarized by LLM into key information and written to the daily memory file. The summary is also asynchronously injected into the retained context for conversational continuity
 - **Daily scheduled summary** — A full summary is automatically triggered at 23:55 every day, ensuring memory is preserved even on low-activity days (skipped if content hasn't changed)
+- **[Deep Dream (memory distillation)](/en/memory/deep-dream)** — Runs automatically after the daily summary, distilling daily memories into MEMORY.md and generating a dream diary
 - **On API context overflow** — When the model API returns a context overflow error, the current conversation summary is saved as an emergency measure

 All memory writes run asynchronously in a background thread (LLM summarization + file writing), never blocking normal conversation replies.
@@ -34,19 +41,25 @@ The memory system supports hybrid retrieval modes:

 The Agent automatically triggers memory retrieval during conversation as needed, incorporating relevant historical information into context. Results are ranked by a combined score (default: 0.7 vector weight + 0.3 keyword weight). Daily memory scores decay over time (30-day half-life), while core memory does not decay.
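+
+As a rough illustration of the ranking described above (a sketch under assumptions, not the exact implementation): write `v` for the vector similarity and `k` for the keyword score of a result, both assumed to be normalized to the range 0 to 1, and `t` for the age of a daily memory entry in days. With the default weights, and assuming the 30-day half-life is applied as a multiplicative factor, the score of a daily memory entry would be:
+
+```latex
+s = (0.7\,v + 0.3\,k) \cdot 2^{-t/30}
+```
+
+Under these assumptions, a daily memory entry with `v = 0.8` and `k = 0.5` scores 0.71 when fresh and 0.355 after 30 days, while a core memory entry with the same `v` and `k` stays at 0.71 because it does not decay.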

-## First Launch
+## Related Files

-On first launch, the Agent will proactively ask the user for key information and save it to the workspace (default `~/cow`):
+Files related to memory in the workspace (default `~/cow`):

 | File | Description |
 | --- | --- |
-| `system.md` | Agent system prompt and behavior settings |
-| `user.md` | User identity information and preferences |
+| `AGENT.md` | Agent personality and behavior settings |
+| `USER.md` | User identity information and preferences |
+| `RULE.md` | Custom rules and constraints |
 | `MEMORY.md` | Core memory (long-term) |
 | `memory/YYYY-MM-DD.md` | Daily memory (created on demand) |
+| `memory/dreams/YYYY-MM-DD.md` | Dream diary (auto-generated by Deep Dream) |
+
+## Web Console
+
+The memory management page in the Web console lets you browse memory files and dream diaries, switching between them via tabs:

 <Frame>
-  <img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
+  <img src="https://cdn.link-ai.tech/doc/20260414171014.png" width="800" />
 </Frame>

 ## Configuration
--- a/docs/en/models/claude.mdx
+++ b/docs/en/models/claude.mdx
@@ -12,6 +12,6 @@ description: Claude model configuration

 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, etc. See [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
+| `model` | Options include `claude-sonnet-4-6`, `claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, etc. See [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
 | `claude_api_key` | Create at [Claude Console](https://console.anthropic.com/settings/keys) |
 | `claude_api_base` | Optional. Defaults to `https://api.anthropic.com/v1`. Change to use third-party proxy |
--- a/docs/en/models/coding-plan.mdx
+++ b/docs/en/models/coding-plan.mdx
@@ -102,18 +102,18 @@ Reference: [China Quick Start](https://docs.bigmodel.cn/cn/coding-plan/quick-sta

 ```json
 {
-  "bot_type": "openai",
+  "bot_type": "moonshot",
  "model": "kimi-for-coding",
-  "open_ai_api_base": "https://api.kimi.com/coding/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "moonshot_base_url": "https://api.kimi.com/coding/v1",
+  "moonshot_api_key": "YOUR_API_KEY"
 }
 ```

 | Parameter | Description |
 | --- | --- |
-| `model` | `kimi-for-coding` |
-| `open_ai_api_base` | `https://api.kimi.com/coding/v1` |
-| `open_ai_api_key` | Coding Plan specific key (not shared with pay-as-you-go) |
+| `model` | Use `kimi-for-coding` for the auto-updating model, or specify a model such as `kimi-k2.6` |
+| `moonshot_base_url` | `https://api.kimi.com/coding/v1` |
+| `moonshot_api_key` | Coding Plan specific key (not shared with pay-as-you-go) |

 Reference: [Key & Docs](https://www.kimi.com/code/docs/)

--- a/docs/en/models/custom.mdx
+++ b/docs/en/models/custom.mdx
@@ -0,0 +1,62 @@
+---
+title: Custom
+description: Custom provider for third-party APIs and local models
+---
+
+For models accessed via OpenAI-compatible APIs, such as:
+
+- **Third-party API proxies**: Use a unified API Base to call multiple models
+- **Local models**: Models deployed locally via Ollama, vLLM, LocalAI, etc.
+- **Private deployments**: Self-hosted model services within your organization
+
+<Note>
+  Unlike the `openai` provider, switching models under the Custom provider will not auto-switch the provider type. Your custom API address is always preserved.
+</Note>
+
+## Configuration
+
+### Third-party API Proxy
+
+```json
+{
+  "bot_type": "custom",
+  "model": "deepseek-v4-flash",
+  "custom_api_key": "YOUR_API_KEY",
+  "custom_api_base": "https://{your-proxy.com}/v1"
+}
+```
+
+| Parameter | Description |
+| --- | --- |
+| `bot_type` | Must be set to `custom` |
+| `model` | Model name, any model supported by your proxy service |
+| `custom_api_key` | API key provided by your proxy service |
+| `custom_api_base` | API base URL, must be OpenAI-compatible |
+
+### Local Models
+
+Local models typically don't require an API key — just set the API base:
+
+```json
+{
+  "bot_type": "custom",
+  "model": "qwen3.5:27b",
+  "custom_api_base": "http://localhost:11434/v1"
+}
+```
+
+Common local deployment tools and their default addresses:
+
+| Tool | Default API Base |
+| --- | --- |
+| [Ollama](https://ollama.com) | `http://localhost:11434/v1` |
+| [vLLM](https://docs.vllm.ai) | `http://localhost:8000/v1` |
+| [LocalAI](https://localai.io) | `http://localhost:8080/v1` |
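+
+As an illustration, a vLLM deployment would use the same shape with the vLLM default address from the table above. This is a sketch: the `model` value is a placeholder, not a name taken from this guide, and should be replaced with the model name your vLLM server actually serves.
+
+```json
+{
+  "bot_type": "custom",
+  "model": "YOUR_MODEL_NAME",
+  "custom_api_base": "http://localhost:8000/v1"
+}
+```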
+
+## Switching Models
+
+Under the Custom provider, switching models only changes `model` without affecting `bot_type` or the API address:
+
+```text
+/config model qwen3.5:27b
+```
--- a/docs/en/models/deepseek.mdx
+++ b/docs/en/models/deepseek.mdx
@@ -7,26 +7,57 @@ Option 1: Native integration (recommended):

 ```json
 {
-  "model": "deepseek-chat",
+  "model": "deepseek-v4-flash",
  "deepseek_api_key": "YOUR_API_KEY"
 }
 ```

 | Parameter | Description |
 | --- | --- |
-| `model` | `deepseek-chat` (DeepSeek-V3.2, non-thinking mode), `deepseek-reasoner` (DeepSeek-R1, thinking mode) |
+| `model` | Supports `deepseek-v4-flash` (default) and `deepseek-v4-pro` |
 | `deepseek_api_key` | Create at [DeepSeek Platform](https://platform.deepseek.com/api_keys) |
 | `deepseek_api_base` | Optional, defaults to `https://api.deepseek.com/v1`. Can be changed to a third-party proxy |

+## Model Selection
+
+| Model | Use Case |
+| --- | --- |
+| `deepseek-v4-flash` | Default: fast and cost-effective |
+| `deepseek-v4-pro` | Stronger on complex tasks |
+
+## Thinking Mode
+
+The V4 series (`deepseek-v4-flash` / `deepseek-v4-pro`) supports an explicit "thinking mode": the model emits a chain-of-thought (`reasoning_content`) before the final answer to improve answer quality.
+
+### Toggle
+
+Controlled by the global `enable_thinking` setting:
+
+```json
+{
+  "enable_thinking": true
+}
+```
+
+- `true`: thinking is on across all channels. The Web console renders the reasoning trace; IM channels (WeChat / WeCom / DingTalk / Feishu) don't render it but still benefit from higher answer quality.
+- `false`: thinking off, faster responses with lower first-token latency.
+
+### Notes
+
+- **Sampling parameters**: under thinking mode, `temperature`, `top_p`, `presence_penalty`, and `frequency_penalty` are silently ignored by the server (no error). CowAgent skips sending them automatically.
+- **Multi-turn tool calls**: once the history contains any tool-call turn, DeepSeek requires `reasoning_content` on every assistant message. CowAgent handles the round-trip automatically, including across mid-session toggles of the thinking switch.
+
+<Tip>
+  Start with `deepseek-v4-flash`; switch to `deepseek-v4-pro` for harder tasks; enable `enable_thinking` when you want deeper reasoning.
+</Tip>
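+
+Putting the tip together, a configuration for harder tasks with deeper reasoning might look like the following. This is a sketch that only combines fields documented on this page; it assumes `enable_thinking` sits at the top level of the same `config.json` as the model settings.
+
+```json
+{
+  "model": "deepseek-v4-pro",
+  "deepseek_api_key": "YOUR_API_KEY",
+  "enable_thinking": true
+}
+```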
+
 Option 2: OpenAI-compatible configuration:

 ```json
 {
-  "model": "deepseek-chat",
+  "model": "deepseek-v4-flash",
  "bot_type": "openai",
  "open_ai_api_key": "YOUR_API_KEY",
  "open_ai_api_base": "https://api.deepseek.com/v1"
 }
 ```
-
-
--- a/docs/en/models/glm.mdx
+++ b/docs/en/models/glm.mdx
@@ -5,14 +5,14 @@ description: Zhipu AI GLM model configuration

 ```json
 {
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "zhipu_ai_api_key": "YOUR_API_KEY"
 }
 ```

 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, etc. See [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
+| `model` | Options include `glm-5.1`, `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, etc. See [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
 | `zhipu_ai_api_key` | Create at [Zhipu AI Console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) |

 OpenAI-compatible configuration is also supported:
@@ -20,7 +20,7 @@ OpenAI-compatible configuration is also supported:
 ```json
 {
  "bot_type": "openai",
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
  "open_ai_api_key": "YOUR_API_KEY"
 }
--- a/docs/en/models/index.mdx
+++ b/docs/en/models/index.mdx
@@ -6,7 +6,7 @@ description: Supported models and recommended choices for CowAgent
 CowAgent supports mainstream LLMs from domestic and international providers. Model interfaces are implemented in the project's `models/` directory.

 <Note>
-  For Agent mode, the following models are recommended based on quality and cost: MiniMax-M2.7, glm-5-turbo, kimi-k2.5, qwen3.6-plus, claude-sonnet-4-6, gemini-3.1-pro-preview
+  For Agent mode, the following models are recommended based on quality and cost: deepseek-v4-flash, MiniMax-M2.7, claude-sonnet-4-6, gemini-3.1-pro-preview, glm-5.1, qwen3.6-plus, kimi-k2.6
 </Note>

 ## Configuration
@@ -18,21 +18,12 @@ You can also use the [LinkAI](https://link-ai.tech) platform interface to flexib
 ## Supported Models

 <CardGroup cols={2}>
+  <Card title="DeepSeek" href="/en/models/deepseek">
+    deepseek-v4-flash, deepseek-v4-pro, and more
+  </Card>
  <Card title="MiniMax" href="/en/models/minimax">
    MiniMax-M2.7 and other series models
  </Card>
-  <Card title="GLM (Zhipu AI)" href="/en/models/glm">
-    glm-5-turbo, glm-5 and other series models
-  </Card>
-  <Card title="Qwen (Tongyi Qianwen)" href="/en/models/qwen">
-    qwen3.6-plus, qwen3-max and more
-  </Card>
-  <Card title="Kimi" href="/en/models/kimi">
-    kimi-k2.5, kimi-k2 and more
-  </Card>
-  <Card title="Doubao (ByteDance)" href="/en/models/doubao">
-    doubao-seed series models
-  </Card>
  <Card title="Claude" href="/en/models/claude">
    claude-sonnet-4-6 and more
  </Card>
@@ -42,8 +33,17 @@ You can also use the [LinkAI](https://link-ai.tech) platform interface to flexib
  <Card title="OpenAI" href="/en/models/openai">
    gpt-5.4, gpt-4.1, o-series and more
  </Card>
-  <Card title="DeepSeek" href="/en/models/deepseek">
-    deepseek-chat, deepseek-reasoner
+  <Card title="GLM (Zhipu AI)" href="/en/models/glm">
+    glm-5.1, glm-5-turbo, glm-5 and other series models
+  </Card>
+  <Card title="Qwen (Tongyi Qianwen)" href="/en/models/qwen">
+    qwen3.6-plus, qwen3-max and more
+  </Card>
+  <Card title="Doubao (ByteDance)" href="/en/models/doubao">
+    doubao-seed series models
+  </Card>
+  <Card title="Kimi" href="/en/models/kimi">
+    kimi-k2.6, kimi-k2.5, kimi-k2 and more
  </Card>
  <Card title="LinkAI" href="/en/models/linkai">
    Unified multi-model interface + knowledge base
--- a/docs/en/models/kimi.mdx
+++ b/docs/en/models/kimi.mdx
@@ -5,14 +5,14 @@ description: Kimi (Moonshot) model configuration

 ```json
 {
-  "model": "kimi-k2.5",
+  "model": "kimi-k2.6",
  "moonshot_api_key": "YOUR_API_KEY"
 }
 ```

 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
+| `model` | Options include `kimi-k2.6`, `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
 | `moonshot_api_key` | Create at [Moonshot Console](https://platform.moonshot.cn/console/api-keys) |

 OpenAI-compatible configuration is also supported:
@@ -20,7 +20,7 @@ OpenAI-compatible configuration is also supported:
 ```json
 {
  "bot_type": "openai",
-  "model": "kimi-k2.5",
+  "model": "kimi-k2.6",
  "open_ai_api_base": "https://api.moonshot.cn/v1",
  "open_ai_api_key": "YOUR_API_KEY"
 }
--- a/docs/en/models/linkai.mdx
+++ b/docs/en/models/linkai.mdx
@@ -3,7 +3,7 @@ title: LinkAI
 description: Unified access to multiple models via LinkAI platform
 ---

-The [LinkAI](https://link-ai.tech) platform lets you flexibly switch between OpenAI, Claude, Gemini, DeepSeek, Qwen, Kimi, and other models, with support for knowledge base, workflows, plugins, and other Agent capabilities.
+The [LinkAI](https://link-ai.tech) platform lets you flexibly switch between OpenAI, Claude, Gemini, DeepSeek, MiniMax, Qwen, Kimi, and other models, with support for knowledge base, workflows, plugins, and other Agent capabilities.

 ```json
 {
--- a/docs/en/releases/overview.mdx
+++ b/docs/en/releases/overview.mdx
@@ -5,6 +5,8 @@ description: CowAgent version history

 | Version | Date | Description |
 | --- | --- | --- |
+| [2.0.7](/en/releases/v2.0.7) | 2026.04.22 | Image Generation Skill (6-provider auto-routing), new models (Kimi K2.6, Claude Opus 4.7, GLM 5.1), knowledge base and Web Console improvements |
+| [2.0.6](/en/releases/v2.0.6) | 2026.04.14 | Knowledge Base, Deep Dream Memory Distillation, Smart Context Compression, Web Console upgrades |
 | [2.0.5](/en/releases/v2.0.5) | 2026.04.01 | Cow CLI, Skill Hub open source, Browser tool, WeCom Bot QR scan, and more |
 | [2.0.4](/en/releases/v2.0.4) | 2026.03.22 | Personal WeChat channel, new model support, Japanese docs, script refactoring and bug fixes |
 | [2.0.2](/en/releases/v2.0.2) | 2026.02.27 | Web Console upgrade, multi-channel concurrency, session persistence |
--- a/docs/en/releases/v2.0.6.mdx
+++ b/docs/en/releases/v2.0.6.mdx
@@ -0,0 +1,83 @@
+---
+title: v2.0.6
+description: CowAgent 2.0.6 - Knowledge Base, Deep Dream Memory Distillation, Smart Context Compression, Web Console Multi-Session and More
+---
+
+## Project Renamed to CowAgent
+
+The repository has been officially renamed from `chatgpt-on-wechat` to **CowAgent**, evolving into a full-featured AI Agent assistant.
+
+- New URL: [github.com/zhayujie/CowAgent](https://github.com/zhayujie/CowAgent) — GitHub auto-redirects the old URL
+- CLI commands, config files, and documentation links remain compatible — no extra steps needed
+
+## 📚 Knowledge Base
+
+New personal knowledge base system — Agent can autonomously build and maintain structured knowledge, retrieving it on demand during conversations:
+
+- **Index-driven self-organizing structure**: Knowledge is stored in `knowledge/` directory, auto-organized by category, with each knowledge page as an independent Markdown file
+- **Auto-write**: Send files, links, or other knowledge to the Agent and it archives them as knowledge pages; it also creates or updates pages automatically when it identifies valuable information in conversation
+- **Hybrid retrieval**: Supports keyword full-text search and vector semantic retrieval, loading relevant knowledge on demand during conversations
+- **Visualization**: File tree browsing and knowledge graph visualization, with in-document links for direct navigation
+- **Command management**: `/knowledge` for stats, `/knowledge list` for directory structure, `/knowledge on|off` to toggle
+
+<img src="https://cdn.link-ai.tech/doc/20260413105435.png" width="750" />
+
+
+Docs: [Knowledge Base](https://docs.cowagent.ai/en/knowledge)
+
+## 🌙 Deep Dream Memory Distillation
+
+A new memory consolidation mechanism that automatically distills scattered conversation memories into refined long-term memory daily:
+
+- **Three-tier memory flow**: Conversation context (short-term) → Daily memory (mid-term) → MEMORY.md (long-term), forming a complete memory lifecycle
+- **Auto-distillation**: Runs daily at 23:55, reads the day's daily memory and MEMORY.md, performs deduplication, merging, and pruning via LLM, outputting a refined MEMORY.md
+- **Dream diary**: Each distillation generates a narrative-style dream diary recording discoveries and insights, stored in `memory/dreams/`
+- **Manual trigger**: `/memory dream [N]` to manually trigger with configurable lookback days (default 3, max 30), with chat notification on completion
+- **Web console**: Memory management page now includes a "Dream Diary" tab for browsing all dream diaries
+
+Docs: [Deep Dream](https://docs.cowagent.ai/en/memory/deep-dream)
+
+<img src="https://cdn.link-ai.tech/doc/20260414120158.png" width="750" />
+
+## 🧠 Smart Context Compression
+
+When context exceeds limits, trimmed portions are summarized by the LLM and asynchronously injected into the retained context to maintain conversation continuity:
+
+- **Async LLM summary**: Trimmed messages are summarized into key information by LLM, written to daily memory files and injected into retained context
+- **Multi-model compatible**: Uses the primary model for summarization, compatible with Claude, OpenAI, MiniMax and other model message format requirements
+
+Docs: [Short-term Memory](https://docs.cowagent.ai/en/memory/context)
+
+## 💬 Web Console Upgrades
+
+Multiple enhancements to the Web console:
+
+- **Multi-session management**: Create and switch between independent sessions, sidebar session list with auto-generated and manually editable titles
+- **Password protection**: Set a login password via `web_console_password` config option
+- **Deep thinking**: Display model thinking process in Web console, controlled by `enable_thinking` config option
+- **Scheduled push**: Scheduled task results can be pushed to Web console
+- **Message copy**: One-click copy of raw Markdown content from AI reply bubbles
+- **Language toggle**: Top language switch button now shows current language for more intuitive interaction
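+
+Both options named above are set in the configuration file. A minimal sketch, assuming they are top-level keys in `config.json` and using a placeholder password:
+
+```json
+{
+  "web_console_password": "YOUR_PASSWORD",
+  "enable_thinking": true
+}
+```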
+
+## 🤖 Model Updates
+
+- **Vision optimization**: Image recognition tool prefers the primary model with automatic multi-provider fallback. Docs: [Vision Tool](https://docs.cowagent.ai/en/tools/vision)
+- **MiniMax new model**: Added MiniMax-M2.7-highspeed model and MiniMax TTS voice synthesis support. Thanks @octo-patch
+- **Qwen**: Added qwen3.6-plus model support
+
+## 🐛 Other Improvements & Fixes
+
+- **Memory prompts**: `MEMORY.md` injected into system prompt by default, with refined memory retrieval and write trigger conditions for enhanced proactive writing
+- **System prompt**: Optimized system prompt style and tone guidance
+- **Browser tool**: Enhanced implicit interactive element detection
+- **File send**: Fixed common file types (tar.gz, zip, etc.) not being sent correctly. Thanks @6vision
+- **macOS compatibility**: Fixed network pre-check timeout compatibility issue. Thanks @Moliang Zhou
+- **Windows compatibility**: Fixed PowerShell compatibility, process updates, terminal encoding and other issues on Windows
+- **Python 3.13+**: Fixed missing `legacy-cgi` dependency for Python 3.13+
+- **WeChat channel**: Updated personal WeChat channel version
+
+## 📦 Upgrade
+
+Run `cow update` or `./run.sh update` to upgrade, or pull the latest code and restart. See [Upgrade Guide](https://docs.cowagent.ai/en/guide/upgrade).
+
+**Release Date**: 2026.04.14 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.5...master)
--- a/docs/en/releases/v2.0.7.mdx
+++ b/docs/en/releases/v2.0.7.mdx
@@ -0,0 +1,65 @@
+---
+title: v2.0.7
+description: CowAgent 2.0.7 - Image Generation Skill (6-provider auto-routing), new models, knowledge base enhancements, Web Console improvements and bug fixes
+---
+
+## 🎨 Image Generation Skill
+
+New built-in `image-generation` skill supporting text-to-image, image-to-image, and multi-image fusion across six major providers:
+
+- **6-provider auto-routing**: OpenAI (GPT-Image-2) → Gemini (Nano Banana) → Seedream (Volcengine Ark) → Qwen (DashScope) → MiniMax → LinkAI — automatically selects from configured providers in fixed priority order, with automatic fallback on failure
+- **Zero model selection**: Just configure an API key and it works — no need to manually specify a model. You can also name a specific model in conversation (e.g. "draw a cat with seedream")
+- **Flexible control**: Supports `quality`, `size` (512/1K–4K), and `aspect_ratio` parameters, with each provider automatically mapping to its supported values
+- **Image editing**: Pass existing images for editing, style transfer, or multi-image fusion (Seedream supports up to 14 reference images)
+- **Skill-level config**: Pin a default model via `skill.image-generation.model` in `config.json`
+- **Image lightbox**: All images in the Web console now support click-to-enlarge preview
+
+Docs: [Image Generation Skill](https://docs.cowagent.ai/en/skills/image-generation)
+
+## 🤖 New Model Support
+
+- **Kimi K2.6**: Added `kimi-k2.6` model support
+- **Claude Opus 4.7**: Added `claude-opus-4-7` model support
+- **GLM 5.1**: Added `glm-5.1` model support
+- **Kimi Coding Plan**: Support for Kimi Coding Plan mode
+- **Custom model providers**: New custom model provider configuration for easier integration with additional vendors
+
+## 💬 Web Console Improvements
+
+- **Smart auto-scroll**: Improved chat scroll behaviour — no longer forces scroll to bottom while the user is reading earlier messages
+- **Reasoning content cap**: Deep thinking content capped at 4 KB to prevent frontend lag
+- **Mobile optimisation**: Session sidebar hidden by default on mobile, with overlay dismiss support
+- **Session title fix**: Fixed title auto-generation fallback logic and Bridge reset on config change
+- **Image preview dedup**: Fixed duplicate image rendering within the same message
+
+## 📚 Knowledge Base Enhancements
+
+- **Nested directory support**: Knowledge base listing and display now support multi-level nested directories
+- **Root-level file display**: Show `index.md`, `log.md` and other root-level files in the knowledge tree
+- **Empty state stats fix**: Root-level files no longer interfere with empty-state detection
+
+## 🌙 Dream Memory Improvements
+
+- **Structured organisation**: Dream memory files are now auto-archived by date with a cleaner directory structure
+- **Schedule jitter**: Daily dream trigger includes random jitter to avoid concurrency conflicts in cluster deployments
+
+## 🛠 Skill System Improvements
+
+- **Skill manager refresh**: `/skill` commands now automatically refresh the skill manager to keep state in sync
+- **Installation sources**: Skill installation supports multiple source formats (URL, zip, local file, etc.) with automatic target directory handling
+
+## 🐛 Other Fixes
+
+- **Gemini fix**: Fixed Gemini tool calls not returning results
+- **Agent retry**: Empty-response retries no longer drop `tool_calls`
+- **Docker env sync**: Fixed environment variables not syncing after config update in Docker environments
+- **Python 3.7 compat**: Deferred `Literal` import for Python 3.7 compatibility
+- **Model switch notification**: Fixed bot_type change notification not showing after model switch. Thanks @6vision
+- **Config command**: `/config` now supports setting `enable_thinking`
+- **Thinking display**: Deep thinking display disabled by default
+
+## 📦 Upgrade
+
+Run `cow update` or `./run.sh update` to upgrade, or pull the latest code and restart. See [Upgrade Guide](https://docs.cowagent.ai/en/guide/upgrade).
+
+**Release Date**: 2026.04.22 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.6...master)
--- a/docs/en/skills/image-generation.mdx
+++ b/docs/en/skills/image-generation.mdx
@@ -0,0 +1,158 @@
+---
+title: image-generation - Image Generation
+description: Text-to-image / image-to-image / multi-image fusion with automatic multi-provider routing and fallback
+---
+
+A general-purpose image generation and editing skill supporting six providers: OpenAI, Gemini, Seedream (Volcengine Ark), Qwen (DashScope), MiniMax, and LinkAI. No need to choose a model manually — the script automatically selects a configured provider based on a fixed priority order.
+
+## Model Selection
+
+`image-generation` uses a "fixed priority + automatic fallback" strategy — just configure your keys and it works:
+
+1. **Priority order**: `OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI`
+2. **Unconfigured providers are skipped**: only providers with an API key participate
+3. **Automatic fallback on failure**: on errors like 401, model not enabled, or network issues, the next provider is tried
+4. **Specified model goes first**: if a specific model name is provided, its provider is promoted to the front
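+
+For example, with only the following two keys configured (placeholder values), the rules above mean that Gemini is tried first and Seedream serves as the fallback, while the other four providers are skipped:
+
+```json
+{
+  "gemini_api_key": "AIza-xxx",
+  "ark_api_key": "xxx"
+}
+```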
+
+### Supported Models
+
+| Provider | Models / Aliases | Notes |
+| --- | --- | --- |
+| OpenAI | `gpt-image-2`, `gpt-image-1` | General-purpose, high quality, supports `quality` parameter |
+| Gemini Nano Banana | `nano-banana-2`, `nano-banana-pro`, `nano-banana` | Corresponds to `gemini-3.1-flash`, `gemini-3-pro`, `gemini-2.5-flash` image variants |
+| Seedream (Volcengine Ark) | `seedream-5.0-lite`, `seedream-4.5` | Native 2K–4K, up to 14 reference images for fusion |
+| Qwen (DashScope) | `qwen-image-2.0`, `qwen-image-2.0-pro` | Strong with Chinese text rendering and text-image layouts |
+| MiniMax | `image-01` | Fast and simple image generation |
+| LinkAI | Any model | Universal proxy, used as fallback |
+
+<Note>
+By default, the Agent does not pick a model — it uses automatic routing. If you want a specific model, just say so in the conversation, e.g. "use seedream to draw a cat" or "generate a poster with gpt-image-2". You can also pin a default model via the "Custom Configuration" section below.
+</Note>
+
+## Custom Configuration
+
+### API Key Setup
+
+You need **at least one** provider key. Configuring multiple providers enables automatic fallback. There are three ways to set up keys:
+
+#### Option 1: Automatic Reuse of Existing Keys
+
+If you have already configured model keys in the web console or `config.json` (e.g. `openai_api_key`, `gemini_api_key`), these keys are **automatically synced** to the corresponding environment variables at startup. In other words, if your chat model works, image generation can use the same key with zero extra configuration.
+
+#### Option 2: Configure in config.json
+
+Add the key fields directly to `config.json`:
+
+```json
+{
+  "openai_api_key": "sk-xxx",
+  "openai_api_base": "https://api.openai.com/v1",
+  "gemini_api_key": "AIza-xxx",
+  "ark_api_key": "xxx",
+  "dashscope_api_key": "sk-xxx",
+  "minimax_api_key": "xxx",
+  "linkai_api_key": "xxx"
+}
+```
+
+A restart is required after changes. Each key also has a corresponding `*_api_base` field for custom endpoints.
+
+#### Option 3: Configure via Conversation
+
+Send an API key in the chat and the Agent will save it to `~/cow/.env` using the `env_config` tool — **no restart needed**. For example:
+
+```text
+Set OPENAI_API_KEY to sk-xxx
+```
+
+Or:
+
+```text
+Configure ARK_API_KEY as xxx
+```
+
+### API Key Reference
+
+| Environment Variable | config.json Field | Provider | Default Base URL |
+| --- | --- | --- | --- |
+| `OPENAI_API_KEY` | `openai_api_key` | OpenAI | `https://api.openai.com/v1` |
+| `GEMINI_API_KEY` | `gemini_api_key` | Gemini | `https://generativelanguage.googleapis.com` |
+| `ARK_API_KEY` | `ark_api_key` | Volcengine Ark (Seedream) | `https://ark.cn-beijing.volces.com/api/v3` |
+| `DASHSCOPE_API_KEY` | `dashscope_api_key` | Alibaba DashScope (Qwen) | `https://dashscope.aliyuncs.com` |
+| `MINIMAX_API_KEY` | `minimax_api_key` | MiniMax | `https://api.minimaxi.com` |
+| `LINKAI_API_KEY` | `linkai_api_key` | LinkAI | `https://api.link-ai.tech` |
+
+### Pinning a Default Model
+
+To force all image generation through a specific provider's model, add this to `config.json`:
+
+```json
+{
+  "skill": {
+    "image-generation": {
+      "model": "seedream-5.0-lite"
+    }
+  }
+}
+```
+
+At startup, this is automatically converted to the environment variable `SKILL_IMAGE_GENERATION_MODEL`, and the script will always use this model's provider for generation.
+
+## Enabling and Disabling
+
+`image-generation` is a built-in skill that **automatically adjusts its status based on API keys**:
+
+- **Key configured**: the skill is active — the Agent will invoke it when asked to draw
+- **Key not configured**: the skill still appears in context (marked as "needs configuration") — the Agent will guide the user to set up a key rather than failing silently
+
+To control it manually:
+
+```text
+/skill disable image-generation    # Disable (won't be invoked even if keys are present)
+/skill enable image-generation     # Re-enable
+```
+
+In the terminal: `cow skill disable image-generation` / `cow skill enable image-generation`.
+
+## Parameters
+
+| Parameter | Type | Required | Default | Description |
+| --- | --- | --- | --- | --- |
+| `prompt` | string | Yes | — | Image description |
+| `image_url` | string / list | No | null | Input image(s) for editing — local path or URL. Pass multiple for multi-image fusion |
+| `quality` | string | No | auto | `low` / `medium` / `high` — only some providers support this |
+| `size` | string | No | auto | `512` / `1K` / `2K` / `3K` / `4K`, or pixel value like `1024x1024` |
+| `aspect_ratio` | string | No | null | `1:1` / `3:2` / `2:3` / `16:9` / `9:16` / `21:9`; Gemini also supports `1:4` / `4:1` / `1:8` / `8:1` |
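+
+For illustration, the arguments for a quick landscape preview could look like the following. The JSON shape is an assumption made for readability; this page specifies only the parameter names and allowed values, not how the Agent passes them to the script.
+
+```json
+{
+  "prompt": "A watercolor painting of a lighthouse at sunset",
+  "quality": "low",
+  "size": "1K",
+  "aspect_ratio": "16:9"
+}
+```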
+
+<Warning>
+**Higher quality and larger size cost more and take longer.**
+
+- For everyday conversations and quick previews, use the defaults (`auto`) or `quality=low` + `size=1K` — roughly 20 seconds
+- For posters or when the user explicitly asks for high resolution, use `quality=high` + `size=2K/4K` — may take 1–5 minutes depending on the model
+</Warning>
+
+## Output
+
+On success:
+
+```json
+{
+  "model": "doubao-seedream-5-0-260128",
+  "images": [
+    {"url": "/path/to/output.png"}
+  ]
+}
+```
+
+On failure: `{ "error": "..." }`. **Do not retry immediately** after an error: the cause is almost always a configuration issue (wrong key, incorrect API base, or a model that is not enabled), so have the user fix the configuration first.
+
+## Common Use Cases
+
+- **Text-to-image**: generate illustrations, posters, icons, avatars, storyboards, etc. from a description
+- **Image-to-image**: change styles, swap elements, add decorations or text on an existing image
+- **Multi-image fusion**: combine multiple reference images into one (outfit swaps, character group photos, etc.)
+
+<Note>
+- Set the Bash timeout to 600 seconds: each provider has a 300-second HTTP timeout, and the script may try several providers in sequence
+- Input images are automatically compressed to ≤ 4 MB with the longest edge ≤ 4096 px
+- Gemini / Seedream / Qwen / MiniMax do not support the `quality` parameter — passing it has no effect
+- Seedream defaults to 2K; `seedream-5.0-lite` supports up to 3K; `seedream-4.5` supports up to 4K
+</Note>
--- a/docs/en/skills/knowledge-wiki.mdx
+++ b/docs/en/skills/knowledge-wiki.mdx
@@ -0,0 +1,112 @@
+---
+title: knowledge-wiki - Knowledge Base
+description: Maintain a local structured knowledge base with automatic archiving, categorisation, and cross-referencing
+---
+
+Organises notes, insights, and reference materials from your conversations into a structured local knowledge base, automatically maintaining an index and cross-references between pages.
+
+`knowledge-wiki` maintains a `knowledge/` directory in your workspace — essentially the Agent's "second brain". The skill is marked `always: true`, so it is **always loaded** and requires no external dependencies.
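+
+For reference, a minimal sketch of how the `always` flag might appear in the skill's `SKILL.md` frontmatter, assuming the `metadata.cowagent.always` field used by CowAgent skills. The `description` value is a placeholder, not the shipped text.
+
+```yaml
+---
+name: knowledge-wiki
+description: "<what the skill does and when to use it>"   # placeholder
+metadata:
+  cowagent:
+    always: true    # always load the skill; dependency checks are skipped
+---
+```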
+
+## When It Triggers
+
+- You share an article, document, or URL that you want to keep for future reference
+- A conversation produces conclusions worth retaining long-term
+- You want to look up something you accumulated earlier
+
+## Directory Structure
+
+```
+knowledge/
+├── index.md           # Global index (must be maintained)
+├── log.md             # Operation log (append-only)
+└── <category>/        # Category subdirectories (grouped by content)
+    └── <slug>.md      # Knowledge page (lowercase-hyphenated filename)
+```
+
+## Three Core Operations
+
+### 1. Ingest
+
+When you share some material, the Agent will:
+
+1. Read and understand the original content, extracting key information
+2. Decide which category it belongs to — check `index.md` first; create a new category if none fits
+3. Generate a knowledge page at `knowledge/<category>/<slug>.md`
+4. Update the index `index.md` and the log `log.md`
+
+### 2. Synthesise
+
+When a conversation produces new conclusions or insights:
+
+1. Create a new knowledge page under an appropriate category
+2. Add cross-links to and from related existing pages
+3. Update the index and log
+
+### 3. Query
+
+When you ask about previously accumulated knowledge:
+
+1. Search `index.md` for potentially relevant pages
+2. Open specific pages with the `read` tool
+3. Supplement with `memory_search` if needed
+4. Include links to knowledge pages in the answer so you can click through to the source
+
+## Page Format
+
+```markdown
+# Page Title
+
+> Source: <source URL or brief description>
+
+Body content. Link between pages using relative paths:
+[Related Page](../category/related-page.md)
+
+## Key Points
+
+- ...
+
+## Related Pages
+
+- [Page A](../category/page-a.md) — why it's related
+```
+
+<Note>
+- `> Source:` records where this knowledge came from. Always include it when there is a clear source
+- Cross-references are important: when creating or updating a page, remember to add back-links in the related pages too
+- **Only link to pages that already exist.** If a concept deserves its own page, create it first, then add the link
+</Note>
+
+## Index Format
+
+`knowledge/index.md` uses a flat list grouped by category, one knowledge page per line:
+
+```markdown
+# Knowledge Index
+
+## Category A
+- [Page Title](category-a/page-slug.md) — one-line summary
+
+## Category B
+- [Page Title](category-b/page-slug.md) — one-line summary
+```
+
+No tables, no emojis. Category names and organisation can be adjusted freely.
+
+## Log Format
+
+`knowledge/log.md` is append-only — newest entries go at the bottom:
+
+```markdown
+## [YYYY-MM-DD] ingest | Page Title
+## [YYYY-MM-DD] synthesize | Page Title
+```
+
+## Writing Guidelines
+
+- **Filenames**: lowercase with hyphens, e.g. `machine-learning.md`
+- **One topic per page** — link related content across pages
+- **Update, don't duplicate** — if a page already exists, update it rather than creating a new one
+- **Always update the index** `knowledge/index.md` after any change
+- **Distill, don't copy** — capture the key points, not the entire source
+- **Use workspace-relative paths when referencing knowledge pages in conversations**, e.g. `[Title](knowledge/<category>/<slug>.md)`. Use page-relative paths (such as `../category/page.md`) only for inter-page links
+- **Include links when answering questions based on knowledge pages** so users can dig deeper
--- a/docs/en/skills/skill-creator.mdx
+++ b/docs/en/skills/skill-creator.mdx
@@ -0,0 +1,180 @@
+---
+title: skill-creator - Skill Creator
+description: Create, install, and update skills — standardises SKILL.md format and directory structure
+---
+
+`skill-creator` is a "meta-skill" that helps the Agent create, install, and update other skills, ensuring every skill follows a consistent `SKILL.md` format and directory layout.
+
+## When It Triggers
+
+- The user wants to install a skill from a URL or remote repository
+- The user wants to create a brand-new skill from scratch
+- An existing skill needs upgrading or restructuring
+
+## What Is a Skill?
+
+A skill is a reusable instruction set plus optional scripts and assets. It injects domain expertise into the Agent so it can handle specific tasks like a specialist.
+
+A skill typically contains:
+
+1. **Specialised workflow** — step-by-step instructions for a category of tasks
+2. **Tool usage** — how to call a particular API or process a particular file format
+3. **Domain knowledge** — team conventions, business rules, data schemas, etc.
+4. **Attached resources** — scripts, reference docs, templates, etc.
+
+<Note>
+**Core principle: less is more.** Only write what the Agent wouldn't figure out on its own. For every line you add, ask yourself: is it worth the tokens?
+</Note>
+
+## Directory Structure
+
+```
+skill-name/
+├── SKILL.md            # Required: skill definition
+│   ├── YAML frontmatter (name / description are mandatory)
+│   └── Markdown body (instructions + examples)
+└── Optional resources
+    ├── scripts/        # Executable scripts (Python / Bash, etc.)
+    ├── references/     # Large reference docs the Agent reads on demand
+    └── assets/         # Templates, icons, etc. used directly in output
+```
+
+## SKILL.md Specification
+
+Frontmatter fields in the SKILL.md header:
+
+| Field | Description |
+| --- | --- |
+| `name` | Skill name — lowercase with hyphens, must match the directory name |
+| `description` | **The most important field.** Clearly state what the skill does and when to use it. The Agent reads this to decide whether to invoke it. All trigger-related descriptions go here, not in the body |
+| `metadata.cowagent.requires.bins` | System CLI tools that must be installed |
+| `metadata.cowagent.requires.env` | Required environment variables (all must be present) |
+| `metadata.cowagent.requires.anyEnv` | Multiple API keys — at least one must be set |
+| `metadata.cowagent.requires.anyBins` | Multiple tools — at least one must be installed |
+| `metadata.cowagent.always` | Set to `true` to always load, skipping dependency checks |
+| `metadata.cowagent.emoji` | Display emoji (optional) |
+| `metadata.cowagent.os` | OS restriction, e.g. `["darwin", "linux"]` |
+
+<Note>
+The `category` field does not need to be set manually — the system automatically sets it to `skill`.
+</Note>
+
+Two ways to declare API key dependencies:
+
+```yaml
+metadata:
+  cowagent:
+    requires:
+      env: ["MYAPI_KEY"]            # Must be present
+```
+
+```yaml
+metadata:
+  cowagent:
+    requires:
+      anyEnv: ["OPENAI_API_KEY", "LINKAI_API_KEY"]   # At least one
+```
+
+**Skills are auto-enabled/disabled based on dependencies**: a skill activates when its declared requirements are met (every variable listed under `env`, and at least one of those listed under `anyEnv`) and deactivates when they are not, so there is no need for a manual `/skill enable`.
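+
+Putting the fields together, a complete frontmatter block might look like the sketch below. The skill name `fetch-weather` and its description are hypothetical; only the field names come from the table above.
+
+```yaml
+---
+name: fetch-weather              # hypothetical skill; must match the directory name
+description: Fetch current weather and forecasts for a city. Use when the user asks about weather, temperature, or rain.
+metadata:
+  cowagent:
+    os: ["darwin", "linux"]      # restrict the skill to macOS and Linux
+    requires:
+      bins: ["curl"]             # CLI tool that must be installed
+      anyEnv: ["OPENAI_API_KEY", "LINKAI_API_KEY"]   # at least one must be set
+---
+```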
+
+## Resource Directories
+
+| Directory | What goes here | What does NOT go here |
+| --- | --- | --- |
+| `scripts/` | Code that needs to run repeatedly, or scripts that produce deterministic results | Demo-only code snippets |
+| `references/` | Documents **over 500 lines** that genuinely won't fit in SKILL.md (e.g. a full DB schema) | General API docs, tutorials, examples |
+| `assets/` | Files that appear in the final output (templates, icons, boilerplate, etc.) | Explanatory documentation |
+
+<Warning>
+**In principle, everything goes in `SKILL.md`** — only split into resource directories when it truly won't fit.
+
+Do not add `README.md`, `CHANGELOG.md`, or `INSTALLATION_GUIDE.md` to a skill — put everything in `SKILL.md`. Resource directories should only contain scripts that actually run or assets that are actually used.
+</Warning>
+
+## Installing External Skills
+
+After installation, the skill lands in `<workspace>/skills/<name>/`.
+
+| Source | How to install |
+| --- | --- |
+| URL (single file) | curl / web_fetch |
+| URL (zip archive) | Download and extract |
+| Local SKILL.md | Read directly |
+| Local zip archive | Extract |
+
+Installation steps:
+
+1. Locate the `SKILL.md` (may be at the root or in a subdirectory of the archive)
+2. Read the `name` from the frontmatter
+3. Copy the **entire skill directory** (including `SKILL.md`, `scripts/`, `assets/`, etc.) to `<workspace>/skills/<name>/`
+4. If the archive contains an `INSTALL.md` or a setup script, follow or run it, but the final result must still reside under `<workspace>/skills/<name>/`
+
+## Creating a Skill from Scratch
+
+Recommended order:
+
+1. **Clarify requirements** — ask the user for a few concrete use cases (don't ask too many at once)
+2. **Plan the structure** — does this skill need scripts? Reference docs? Template assets?
+3. **Scaffold** — use the init script:
+
+   ```bash
+   scripts/init_skill.py <skill-name> --path <workspace>/skills [--resources scripts,references,assets] [--examples]
+   ```
+
+4. **Fill in content** — write SKILL.md, add scripts and resources. Always test scripts after writing them
+5. **Validate** (optional):
+
+   ```bash
+   scripts/quick_validate.py <workspace>/skills/<skill-name>
+   ```
+
+6. **Iterate** — keep improving based on real-world usage feedback
+
+## Naming Conventions
+
+- Use only lowercase letters, digits, and hyphens. Normalise user-given names, e.g. `Plan Mode` → `plan-mode`
+- Maximum 64 characters
+- Keep it short, start with a verb, make it self-explanatory
+- Use tool names as prefixes when appropriate, e.g. `gh-address-comments`, `linear-address-issue`
+- The directory name and the `name` field must match exactly
+
+## Three-Level Loading
+
+Skills are not loaded into context all at once — they use a three-level progressive loading mechanism:
+
+1. **Metadata** (`name` + `description`) — always in context (~100 words). The Agent uses this to decide whether to invoke the skill
+2. **SKILL.md body** — loaded only when the skill is activated; keep it under 500 lines
+3. **Resource files** — read on demand by the Agent
+
+For skills with multiple variants (e.g. multi-cloud deployment), organise like this:
+
+```
+cloud-deploy/
+├── SKILL.md             # Main workflow and provider selection logic
+└── references/
+    ├── aws.md
+    ├── gcp.md
+    └── azure.md
+```
+
+When the user picks AWS, the Agent only reads `aws.md` — no need to load all three providers.
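+
+Because only the metadata is always in context, the `description` has to carry the trigger conditions for every variant. A possible frontmatter for this layout, with purely illustrative wording:
+
+```yaml
+---
+name: cloud-deploy
+description: Deploy an application to AWS, GCP, or Azure. Use when the user asks to deploy, release, or roll back a service on one of these clouds.
+---
+```
+
+The body of `SKILL.md` would then tell the Agent which file under `references/` to read for the chosen provider.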
+
+## Common Design Patterns
+
+**Step-by-step**: numbered steps with corresponding scripts.
+
+```markdown
+1. Analyse form structure (run analyze_form.py)
+2. Generate field mappings (edit fields.json)
+3. Auto-fill the form (run fill_form.py)
+```
+
+**Branching**: different flows based on user intent.
+
+```markdown
+1. Determine operation type:
+   **Creating new content?** → follow the "Create" workflow
+   **Editing existing content?** → follow the "Edit" workflow
+```
+
+**Template-based**: when output format has strict requirements, include a template in SKILL.md for the Agent to follow.
--- a/docs/en/tools/vision.mdx
+++ b/docs/en/tools/vision.mdx
@@ -27,7 +27,7 @@ If the current provider fails, the tool automatically tries the next one until i
 | Claude | Main model | Anthropic native image format |
 | Gemini | Main model | inlineData format |
 | Doubao | Main model | doubao-seed-2-0 series natively supported |
-| Kimi (Moonshot) | Main model | kimi-k2.5 natively supported |
+| Kimi (Moonshot) | Main model | kimi-k2.6, kimi-k2.5 natively supported |
 | ZhipuAI | glm-5v-turbo | Always uses dedicated vision model |
 | MiniMax | MiniMax-Text-01 | Always uses dedicated vision model |

--- a/docs/guide/manual-install.mdx
+++ b/docs/guide/manual-install.mdx
@@ -139,7 +139,8 @@ sudo docker logs -f chatgpt-on-wechat
    ```json
    {
      "channel_type": "web",
-      "model": "MiniMax-M2.7",
+      "model": "deepseek-v4-flash",
+      "deepseek_api_key": "",
      "agent": true,
      "agent_workspace": "~/cow",
      "agent_max_context_tokens": 40000,
@@ -152,8 +153,9 @@ sudo docker logs -f chatgpt-on-wechat
    ```yaml
    environment:
      CHANNEL_TYPE: 'web'
-      MODEL: 'MiniMax-M2.7'
-      MINIMAX_API_KEY: 'your-api-key'
+      MODEL: 'deepseek-v4-flash'
+      DEEPSEEK_API_KEY: 'your-api-key'
+      DEEPSEEK_API_BASE: 'https://api.deepseek.com/v1'
      AGENT: 'True'
      AGENT_MAX_CONTEXT_TOKENS: 40000
      AGENT_MAX_CONTEXT_TURNS: 30
@@ -165,7 +167,7 @@ sudo docker logs -f chatgpt-on-wechat
 | 参数 | 环境变量 | 说明 | 默认值 |
 | --- | --- | --- | --- |
 | `channel_type` | `CHANNEL_TYPE` | 接入渠道类型 | `web` |
-| `model` | `MODEL` | 模型名称 | `MiniMax-M2.5` |
+| `model` | `MODEL` | 模型名称 | `deepseek-v4-flash` |
 | `agent` | `AGENT` | 是否启用 Agent 模式 | `true` |
 | `agent_workspace` | - | Agent 工作空间路径 | `~/cow` |
 | `agent_max_context_tokens` | `AGENT_MAX_CONTEXT_TOKENS` | 最大上下文 tokens | `40000` |
--- a/docs/intro/architecture.mdx
+++ b/docs/intro/architecture.mdx
@@ -9,7 +9,7 @@ CowAgent 2.0 从简单的聊天机器人全面升级为超级智能助理，采

 CowAgent 的整体架构由以下核心模块组成：

-<img src="https://cdn.link-ai.tech/doc/68ef7b212c6f791e0e74314b912149f9-sz_5847990.png" alt="CowAgent Architecture" />
+<img src="https://cdn.link-ai.tech/doc/cow-agent-arch-zh.jpg" alt="CowAgent Architecture" />

 | 模块 | 说明 |
 | --- | --- |
@@ -69,7 +69,8 @@ Agent 的工作空间默认位于 `~/cow` 目录，用于存储系统提示词
  "agent_workspace": "~/cow",
  "agent_max_context_tokens": 40000,
  "agent_max_context_turns": 30,
-  "agent_max_steps": 15
+  "agent_max_steps": 15,
+  "enable_thinking": false
 }
 ```

@@ -80,4 +81,5 @@ Agent 的工作空间默认位于 `~/cow` 目录，用于存储系统提示词
 | `agent_max_context_tokens` | 最大上下文 token 数 | `50000` |
 | `agent_max_context_turns` | 最大上下文记忆轮次 | `20` |
 | `agent_max_steps` | 单次任务最大决策步数 | `20` |
+| `enable_thinking` | 是否启用深度思考模式 | `false` |
 | `knowledge` | 是否启用个人知识库 | `true` |
--- a/docs/intro/features.mdx
+++ b/docs/intro/features.mdx
@@ -5,16 +5,18 @@ description: CowAgent 长期记忆、个人知识库、任务规划、技能系

 ## 1. 长期记忆

-> 记忆系统让 Agent 能够长期记住重要信息。Agent 会在用户分享偏好、决策、事实等重要信息时主动存储，也会在对话达到一定长度时自动提取摘要。记忆分为核心记忆、天级记忆，支持语义搜索和向量检索的混合检索模式。
+> 记忆系统让 Agent 能够长期记住重要信息，采用三层记忆流转架构：对话上下文（短期）→ 天级记忆（中期）→ MEMORY.md（长期），形成完整的记忆生命周期。

 第一次启动 Agent 时，Agent 会主动询问关键信息，并记录至工作空间（默认 `~/cow`）中的智能体设定、用户身份、记忆文件中。

-在后续的长期对话中，Agent 会在需要时智能记录或检索记忆，并对自身设定、用户偏好、记忆文件等进行不断更新，总结和记录经验和教训，真正实现自主思考和不断成长。
+在后续的长期对话中，Agent 会在需要时智能记录或检索记忆，并对自身设定、用户偏好、记忆文件等进行不断更新。每日自动执行 **梦境蒸馏（Deep Dream）**，将分散的天级记忆整合为精炼的长期记忆，同时生成叙事风格的梦境日记。

 <Frame>
  <img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
 </Frame>

+详细说明请参考 [长期记忆](/memory) 和 [梦境蒸馏](/memory/deep-dream)。
+
 ## 2. 个人知识库

 > 知识库系统让 Agent 能够持续积累和组织结构化知识。与按时间线记录的记忆不同，知识库以主题为维度，将文章、对话洞察、学习材料等整理为互相关联的 Markdown 页面，形成持续增长的知识网络。
@@ -26,6 +28,10 @@ Agent 会在对话中自动将有价值的信息整理为知识页面，维护
 - **对话联动**：Agent 回复中引用的知识文档链接可在 Web 控制台中直接点击跳转查看
 - **CLI 管理**：通过 `/knowledge` 命令查看统计、浏览目录，通过 `/knowledge on|off` 开关功能

+<Frame>
+  <img src="https://cdn.link-ai.tech/doc/20260413105435.png" width="800" />
+</Frame>
+
 详细说明请参考 [个人知识库](/knowledge)。

 ## 3. 任务规划和工具调用
@@ -47,7 +53,7 @@ Agent 会在对话中自动将有价值的信息整理为知识页面，维护
 基于编程能力和系统访问能力，Agent 可以实现从信息搜索、图片等素材生成、编码、测试、部署、Nginx 配置修改、发布的 **Vibecoding 全流程**，通过手机端简单的一句命令完成应用的快速 demo：

 <Frame>
-  <img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
+  <img src="https://cdn.link-ai.tech/doc/20260318211018.png" width="800" />
 </Frame>

 ### 3.3 定时任务
--- a/docs/intro/index.mdx
+++ b/docs/intro/index.mdx
@@ -25,7 +25,7 @@ CowAgent 支持灵活切换多种模型，能处理文本、语音、图片、
    能够理解复杂任务并自主规划执行，持续思考和调用各类工具和技能直到完成目标。
  </Card>
  <Card title="长期记忆" icon="database" href="/memory">
-    自动将对话记忆持久化至本地文件和数据库中，包括全局记忆和天级记忆，支持关键词及向量检索。
+    三层记忆流转（上下文→天级记忆→全局记忆），每日梦境蒸馏整理，支持关键词及向量检索。
  </Card>
  <Card title="个人知识库" icon="book" href="/knowledge">
    自动整理结构化知识，支持知识图谱可视化，通过交叉引用构建持续增长的知识网络。
--- a/docs/ja/README.md
+++ b/docs/ja/README.md
@@ -22,13 +22,13 @@
 > CowAgentは、すぐに使えるAIスーパーアシスタントであると同時に、高い拡張性を持つAgentフレームワークでもあります。新しいモデルインターフェース、チャネル、組み込みツール、Skillシステムを拡張することで、さまざまなカスタマイズニーズに柔軟に対応できます。

 - ✅ **自律的タスク計画**: 複雑なタスクを理解し、自律的に実行計画を立て、目標達成までツールを呼び出しながら継続的に思考します。
- ✅ **長期記憶**: 会話の記憶をローカルファイルやデータベースに自動的に永続化します。コアメモリとデイリーメモリを含み、キーワード検索やベクトル検索に対応しています。
+- ✅ **長期記憶**: 会話の記憶をローカルファイルやデータベースに自動的に永続化します。コアメモリ、デイリーメモリ、Deep Dream 蒸留を含み、キーワード検索やベクトル検索に対応しています。
 - ✅ **パーソナルナレッジベース**: 構造化された知識を自動整理し、相互参照によるナレッジグラフを構築。Web での可視化ブラウジングと対話による管理をサポートします。
 - ✅ **Skillシステム**: Skillの作成・実行エンジンを実装。[Skill Hub](https://skills.cowagent.ai)、GitHubなどからSkillをインストールでき、会話を通じたカスタムSkill作成もサポートしています。
 - ✅ **ツールシステム**: ファイル読み書き、ターミナル実行、ブラウザ操作、スケジュールタスク、メッセージ送信などの組み込みツールを提供。Agentが自律的に呼び出して複雑なタスクを完了します。
 - ✅ **CLIシステム**: ターミナルコマンドとチャットコマンドを提供し、プロセス管理、Skillインストール、設定変更などの操作をサポートします。
 - ✅ **マルチモーダルメッセージ**: テキスト、画像、音声、ファイルなど、さまざまなメッセージタイプの解析・処理・生成・送信に対応しています。
- ✅ **複数モデル対応**: OpenAI、Claude、Gemini、DeepSeek、MiniMax、GLM、Qwen、Kimi、Doubaoなど、主要なモデルプロバイダーに対応しています。
+- ✅ **複数モデル対応**: DeepSeek、MiniMax、Claude、Gemini、OpenAI、GLM、Qwen、Doubao、Kimiなど、主要なモデルプロバイダーに対応しています。
 - ✅ **マルチプラットフォームデプロイ**: ローカルPCやサーバー上で実行でき、WeChat、Web、Feishu、DingTalk、WeChat公式アカウント、WeComアプリケーションに統合可能です。

 ## 免責事項
@@ -43,6 +43,8 @@

 ## 更新履歴

+> **2026.04.14:** [v2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6) — ナレッジベース、Deep Dream 記憶蒸留、スマートコンテキスト圧縮、Web コンソールアップグレード。
+
 > **2026.04.01:** [v2.0.5](https://github.com/zhayujie/CowAgent/releases/tag/2.0.5) — Cow CLI、Skill Hubオープンソース化、ブラウザツール、WeCom Botスキャン作成など。

 > **2026.02.27:** [v2.0.2](https://github.com/zhayujie/CowAgent/releases/tag/2.0.2) — Webコンソールの全面刷新（ストリーミングチャット、モデル/Skill/メモリ/チャネル/スケジューラ/ログ管理）、マルチチャネル同時実行、セッション永続化、Gemini 3.1 Pro / Claude 4.6 Sonnet / Qwen3.5 Plusなど新モデル追加。
@@ -162,15 +164,15 @@ sudo docker logs -f chatgpt-on-wechat

 | プロバイダー | 推奨モデル |
 | --- | --- |
+| DeepSeek | `deepseek-v4-flash` |
 | MiniMax | `MiniMax-M2.7` |
-| GLM | `glm-5-turbo` |
-| Kimi | `kimi-k2.5` |
-| Doubao | `doubao-seed-2-0-code-preview-260215` |
-| Qwen | `qwen3.6-plus` |
 | Claude | `claude-sonnet-4-6` |
 | Gemini | `gemini-3.1-pro-preview` |
 | OpenAI | `gpt-5.4` |
-| DeepSeek | `deepseek-chat` |
+| GLM | `glm-5.1` |
+| Qwen | `qwen3.6-plus` |
+| Doubao | `doubao-seed-2-0-code-preview-260215` |
+| Kimi | `kimi-k2.6` |

 各モデルの詳細設定については、[モデルドキュメント](https://docs.cowagent.ai/en/models/index)を参照してください。

--- a/docs/ja/cli/general.mdx
+++ b/docs/ja/cli/general.mdx
@@ -44,17 +44,18 @@ description: ステータスの確認、設定管理、コンテキスト制御
 **設定項目を変更：**

 ```text
-/config model deepseek-chat
+/config model deepseek-v4-flash
 ```

 **変更可能な設定項目：**

 | 項目 | 説明 | 例 |
 | --- | --- | --- |
-| `model` | AI モデル名 | `deepseek-chat` |
+| `model` | AI モデル名 | `deepseek-v4-flash` |
 | `agent_max_context_tokens` | 最大コンテキストトークン数 | `40000` |
 | `agent_max_context_turns` | 最大コンテキスト記憶ターン数 | `30` |
 | `agent_max_steps` | タスクごとの最大判断ステップ数 | `15` |
+| `enable_thinking` | ディープシンキングモードの有効化 | `true` / `false` |

 <Note>
  `model` を変更すると、システムが対応するモデル API を自動的にマッチングします。設定は `config.json` に永続的に保存されます。
@@ -92,31 +93,6 @@ description: ステータスの確認、設定管理、コンテキスト制御
 /logs 50
 ```

-## knowledge
-
-パーソナルナレッジベースの表示と管理を行います。デフォルトでは統計情報を表示します。
-
-```text
-/knowledge
-```
-
-**ディレクトリ構造を表示：**
-
-```text
-/knowledge list
-```
-
-**ナレッジベースの有効化・無効化：**
-
-```text
-/knowledge on
-/knowledge off
-```
-
-<Note>
-  ターミナル CLI では `cow knowledge` と `cow knowledge list` が利用可能ですが、`on|off` はチャットでのみサポートされます（実行時に即座に反映するため）。
-</Note>
-
 ## version

 現在の CowAgent のバージョンを表示します。
--- a/docs/ja/cli/index.mdx
+++ b/docs/ja/cli/index.mdx
@@ -40,7 +40,8 @@ Service:
 Skills:
  skill     Manage skills (list / search / install / uninstall ...)

-Knowledge:
+Memory & Knowledge:
+  memory    Memory distillation (dream)
  knowledge View knowledge base stats and structure

 Others:
@@ -58,6 +59,7 @@ Web コンソールや接続されたチャネルの会話で `/` を入力す
 | `/status` | サービスの状態と設定を表示 |
 | `/config` | 実行時設定の表示・変更 |
 | `/skill` | スキル管理（インストール、アンインストール、有効化、無効化など） |
+| `/memory dream [N]` | 記憶蒸留を手動トリガー（デフォルト 3 日、最大 30） |
 | `/knowledge` | ナレッジベースの統計情報を表示 |
 | `/knowledge list` | ナレッジベースのディレクトリ構造を表示 |
 | `/knowledge on\|off` | ナレッジベースの有効化・無効化 |
@@ -80,6 +82,7 @@ Web コンソールや接続されたチャネルの会話で `/` を入力す
 | logs | ✓ | ✓ |
 | config | ✗ | ✓ |
 | context | — | ✓ |
+| memory（サブコマンド） | ✗ | ✓ |
 | knowledge（サブコマンド） | ✓ | ✓ |
 | skill（サブコマンド） | ✓ | ✓ |
 | start / stop / restart | ✓ | ✗ |
--- a/docs/ja/cli/memory-knowledge.mdx
+++ b/docs/ja/cli/memory-knowledge.mdx
@@ -0,0 +1,63 @@
+---
+title: 記憶とナレッジベース
+description: 記憶蒸留とナレッジベース管理コマンド
+---
+
+## memory
+
+Agent の長期記憶システムを管理します。
+
+### memory dream
+
+記憶蒸留（Deep Dream）を手動でトリガーします — 最近の日次記憶を整理し、MEMORY.md に統合し、夢日記を生成します。
+
+```text
+/memory dream [N]
+```
+
+- `N`：直近 N 日間の記憶を整理（デフォルト 3 日、最大 30 日）
+- バックグラウンドで非同期に実行され、完了するとチャットで通知されます
+- Agent の初期化不要 — 最初の会話前でも使用可能
+
+**例：**
+
+```text
+/memory dream       # 直近 3 日間を整理
+/memory dream 7     # 直近 7 日間を整理
+/memory dream 30    # 直近 30 日間を整理（全量）
+```
+
+Web コンソールでは、完了通知にクリック可能なリンクが含まれ、更新された MEMORY.md と夢日記を直接確認できます。
+
+<Tip>
+  システムは毎日 23:55 に自動で蒸留を実行します（lookback 1 日）。手動トリガーは、初回デプロイ後の履歴整理や、即座に記憶を更新したい場合に使用します。
+</Tip>
+
+## knowledge
+
+パーソナルナレッジベースの表示と管理。デフォルトで統計情報を表示します。
+
+```text
+/knowledge
+```
+
+### knowledge list
+
+ナレッジベースのディレクトリツリーを表示します。
+
+```text
+/knowledge list
+```
+
+### knowledge on / off
+
+ナレッジベースの有効化・無効化。無効化すると、ナレッジプロンプトとファイルインデックスが注入されなくなります。
+
+```text
+/knowledge on
+/knowledge off
+```
+
+<Note>
+  ターミナル CLI では `cow knowledge` と `cow knowledge list` が利用可能ですが、`on|off` はチャットでのみサポートされます（ランタイム効果が必要なため）。
+</Note>
--- a/docs/ja/guide/manual-install.mdx
+++ b/docs/ja/guide/manual-install.mdx
@@ -121,7 +121,8 @@ sudo docker logs -f chatgpt-on-wechat
 ```json
 {
  "channel_type": "web",
-  "model": "MiniMax-M2.5",
+  "model": "deepseek-v4-flash",
+  "deepseek_api_key": "",
  "agent": true,
  "agent_workspace": "~/cow",
  "agent_max_context_tokens": 40000,
@@ -133,7 +134,7 @@ sudo docker logs -f chatgpt-on-wechat
 | パラメータ | 説明 | デフォルト値 |
 | --- | --- | --- |
 | `channel_type` | チャネルタイプ | `web` |
-| `model` | モデル名 | `MiniMax-M2.5` |
+| `model` | モデル名 | `deepseek-v4-flash` |
 | `agent` | Agent モードを有効化 | `true` |
 | `agent_workspace` | Agent のワークスペースパス | `~/cow` |
 | `agent_max_context_tokens` | 最大コンテキストトークン数 | `40000` |
--- a/docs/ja/intro/architecture.mdx
+++ b/docs/ja/intro/architecture.mdx
@@ -9,7 +9,7 @@ CowAgent 2.0 は、シンプルなチャットボットから、自律的な思

 CowAgent のアーキテクチャは以下のコアモジュールで構成されています：

-<img src="https://cdn.link-ai.tech/doc/68ef7b212c6f791e0e74314b912149f9-sz_5847990.png" alt="CowAgent Architecture" />
+<img src="https://cdn.link-ai.tech/doc/cow-agent-arch-en.jpg.jpg" alt="CowAgent Architecture" />

 | モジュール | 説明 |
 | --- | --- |
--- a/docs/ja/intro/features.mdx
+++ b/docs/ja/intro/features.mdx
@@ -5,16 +5,18 @@ description: CowAgent の長期記憶、タスク計画、Skill システム、C

 ## 1. 長期記憶

-記憶システムにより、Agent は重要な情報を長期にわたって記憶できます。ユーザーが好みや決定、重要な事実を共有すると、Agent は自発的に情報を保存し、会話が一定の長さに達すると自動的に要約を抽出します。記憶はコアメモリとデイリーメモリに分かれており、キーワード検索とベクトル検索の両方をサポートするハイブリッド検索が可能です。
+記憶システムにより、Agent は重要な情報を長期にわたって記憶できます。三層記憶フローを採用：会話コンテキスト（短期）→ デイリーメモリ（中期）→ MEMORY.md（長期）、完全な記憶ライフサイクルを形成します。

 初回起動時、Agent はユーザーに重要な情報を自発的に尋ね、ワークスペース（デフォルト `~/cow`）に記録します。これには Agent の設定、ユーザーの身元情報、記憶ファイルが含まれます。

-その後の長期的な会話において、Agent は必要に応じてインテリジェントに記憶を保存・取得し、自身の設定やユーザーの好み、記憶ファイルを継続的に更新し、経験と教訓を要約します。これにより、真に自律的な思考と継続的な成長を実現しています。
+その後の長期的な会話において、Agent は必要に応じてインテリジェントに記憶を保存・取得し、自身の設定やユーザーの好み、記憶ファイルを継続的に更新します。毎日 **Deep Dream（夢境蒸留）** が自動実行され、散在するデイリーメモリを精製された長期記憶に統合し、ナラティブスタイルの夢日記を生成します。

 <Frame>
  <img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
 </Frame>

+詳細は [長期記憶](/ja/memory) と [Deep Dream](/ja/memory/deep-dream) を参照してください。
+
 ## 2. パーソナルナレッジベース

 > ナレッジベースシステムにより、Agent は構造化された知識を継続的に蓄積・整理できます。時系列で記録されるメモリとは異なり、ナレッジベースはトピック別に整理され、記事、会話からの洞察、学習資料などを相互にリンクされた Markdown ページとして整理し、継続的に成長するナレッジネットワークを形成します。
@@ -26,6 +28,10 @@ Agent は会話中に価値ある情報を自動的にナレッジページと
 - **チャット連携**：Agent の回答で参照されるナレッジドキュメントのリンクを Web コンソールで直接クリックして閲覧可能
 - **CLI 管理**：`/knowledge` コマンドで統計表示、ディレクトリ閲覧、`/knowledge on|off` で機能の切り替えが可能

+<Frame>
+  <img src="https://cdn.link-ai.tech/doc/20260413105435.png" width="800" />
+</Frame>
+
 詳細は [パーソナルナレッジベース](/ja/knowledge) を参照してください。

 ## 3. タスク計画とツール活用
@@ -47,7 +53,7 @@ OS のターミナルとファイルシステムへのアクセスは、最も
 プログラミングとシステムアクセスを組み合わせることで、Agent は完全な **Vibecoding ワークフロー** を実行できます。情報検索、アセット生成、コーディング、テスト、デプロイ、Nginx 設定、公開まで、すべてスマートフォンからの一つのコマンドで実行可能です：

 <Frame>
-  <img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
+  <img src="https://cdn.link-ai.tech/doc/20260318211018.png" width="800" />
 </Frame>

 ### 3.3 スケジュールタスク
--- a/docs/ja/intro/index.mdx
+++ b/docs/ja/intro/index.mdx
@@ -20,7 +20,7 @@ CowAgent は自ら思考しタスクを計画し、コンピュータや外部
    複雑なタスクを理解し、自律的に実行計画を立て、目標が達成されるまで思考とツール呼び出しを続けます。ツールを通じてファイルシステム、ターミナル、ブラウザ、スケジューラなどのシステムリソースにアクセスできます。
  </Card>
  <Card title="長期記憶" icon="database" href="/ja/memory">
-    会話の記憶をローカルファイルやデータベースに自動的に永続化します。コアメモリとデイリーメモリを含み、キーワード検索とベクトル検索に対応しています。
+    三層記憶フロー（コンテキスト→デイリーメモリ→グローバルメモリ）、毎日 Deep Dream 蒸留で整理、キーワード検索とベクトル検索に対応。
  </Card>
  <Card title="ナレッジベース" icon="book" href="/ja/knowledge">
    構造化された知識を自動整理し、ナレッジグラフの可視化をサポート。相互参照により継続的に成長するナレッジネットワークを構築します。
--- a/docs/ja/memory/context.mdx
+++ b/docs/ja/memory/context.mdx
@@ -39,14 +39,15 @@ description: 会話コンテキスト — メッセージ管理、圧縮戦略

 - **最も古い半分** の完全なターンがトリミングされます（ツール呼び出しチェーンの完全性を保証）
 - トリミングされたメッセージは LLM によって要約され、**日次記憶ファイルに書き込まれます**
- 残りのターンはそのまま保持されます
+- LLM 要約が完了すると、保持されたコンテキストの最初のユーザーメッセージの先頭に要約が**注入**され、モデルが会話の文脈を維持できるようにします
+- 要約注入はバックグラウンドで非同期に実行され、次のターンから有効になります

 ### 3. トークン予算のトリミング

 ターンのトリミング後、トークン数がまだ予算を超えている場合：

 - **5 ターン未満の場合**：すべてのターンで**テキスト圧縮**を実行 — 各ターンは最初のユーザーテキストと最後の Agent 返信のみを保持し、中間のツール呼び出しチェーンを削除
- **5 ターン以上の場合**：**前半のターン**を再度トリミングし、破棄されたコンテンツも記憶に書き込まれます
+- **5 ターン以上の場合**：**前半のターン**を再度トリミングし、破棄されたコンテンツも記憶に書き込まれ、コンテキスト要約も注入されます

 ### 4. オーバーフロー緊急処理

--- a/docs/ja/memory/deep-dream.mdx
+++ b/docs/ja/memory/deep-dream.mdx
@@ -0,0 +1,90 @@
+---
+title: 夢境蒸留
+description: Deep Dream — 会話から永続記憶への自動蒸留メカニズム
+---
+
+夢境蒸留（Deep Dream）は CowAgent の記憶システムの中核的な整理メカニズムであり、散在する日次記憶を精錬された長期記憶に蒸留し、夢日記を生成します。
+
+## 記憶の流れ
+
+CowAgent の記憶は短期から長期まで 3 つの段階を経ます：
+
+```
+会話コンテキスト（短期）→ 日次記憶（中期）→ MEMORY.md（長期）
+```
+
+### 1. 会話 → 日次記憶
+
+会話コンテキストがトリミングされた時、または毎日のスケジュール要約時に、LLM が会話内容を重要イベントに要約し、日次記憶ファイル `memory/YYYY-MM-DD.md` に書き込みます。
+
+トリガー：
+- **コンテキストトリミング** — ターン数またはトークン制限を超えた時、トリミングされた内容が要約されます
+- **毎日のスケジュール** — 23:55 に自動トリガー
+- **API オーバーフロー** — 現在の会話要約の緊急保存
+
+### 2. 日次記憶 → MEMORY.md（蒸留）
+
+毎日の要約完了後、Deep Dream が自動的に蒸留を実行します：
+
+1. **材料の読み込み** — 現在の `MEMORY.md` + 当日の日次記憶
+2. **LLM 蒸留** — 重複排除、統合、剪定、新情報の抽出
+3. **MEMORY.md の上書き** — 精錬された長期記憶を出力
+4. **夢日記の生成** — 整理過程の発見と洞察を記録
+
+### 3. MEMORY.md の役割
+
+`MEMORY.md` は毎回の会話のシステムプロンプトに注入され、Agent がユーザーの好み、決定、重要な事実を常に把握できるようにします。そのため簡潔に保つ必要があり、Deep Dream は約 30 項目以内に制御します。
+
+## 蒸留ルール
+
+Deep Dream は以下の整理ルールに従います：
+
+| 操作 | 説明 |
+| --- | --- |
+| **統合・精錬** | 類似する複数の項目を 1 つの高密度な表現に統合 |
+| **新規抽出** | 日次記憶から好み、決定、人物、経験を抽出 |
+| **矛盾更新** | 新情報が古い項目と矛盾する場合、新情報を優先 |
+| **無効クリーン** | 一時的な記録、空白項目、フォーマット残留を削除 |
+| **冗長削除** | より精錬された表現でカバーされた古い項目を削除 |
+
+## 夢日記
+
+各蒸留で夢日記が生成され、`memory/dreams/YYYY-MM-DD.md` に保存されます。ナラティブスタイルで以下を記録します：
+
+- 発見された重複や矛盾
+- 日次記憶から抽出された新しい洞察
+- 実行されたクリーンアップと最適化
+- 全体的な観察
+
+夢日記は Web コンソールの「メモリ管理 → 夢日記」タブで確認できます。
+
+<Frame>
+  <img src="https://cdn.link-ai.tech/doc/20260414110032.png" width="800" />
+</Frame>
+
+## 手動トリガー
+
+毎日の自動実行に加えて、チャットで手動トリガーできます：
+
+```text
+/memory dream [N]
+```
+
+- `N`：直近 N 日間の記憶を整理（デフォルト 3 日、最大 30 日）
+- バックグラウンドで非同期に実行され、完了するとチャットで通知されます
+- Web 通知にはクリック可能なリンクが含まれ、MEMORY.md と夢日記を直接確認できます
+- Agent の初期化不要 — 最初の会話前でも使用可能
+
+<Tip>
+  初回デプロイ後は `/memory dream 30` を一度実行して、すべての履歴日次記憶を MEMORY.md に蒸留することをお勧めします。
+</Tip>
+
+## 安全メカニズム
+
+| メカニズム | 説明 |
+| --- | --- |
+| **コンテンツなしでスキップ** | 日次記憶がない場合、蒸留をスキップし空の上書きを回避 |
+| **入力重複排除** | スケジュールタスクで、入力材料が変更されていない場合自動スキップ |
+| **非同期実行** | 蒸留はバックグラウンドスレッドで実行、会話をブロックしない |
+| **順序保証** | スケジュールタスクで、日次フラッシュ完了後に蒸留を開始 |
+| **捏造禁止** | プロンプトで既存の材料のみに基づく整理を明示的に制約 |
--- a/docs/ja/memory/index.mdx
+++ b/docs/ja/memory/index.mdx
@@ -5,6 +5,8 @@ description: CowAgent の長期記憶システム — ファイル永続化、

 長期記憶はワークスペースのファイルに保存され、セッション間で永続化されます。Agent は会話中に検索ツールを通じて過去の記憶をオンデマンドで読み込み、コンテキストのトリミング時に会話の要約を自動的に長期記憶に書き込みます。

+<img src="https://cdn.link-ai.tech/doc/memory-architecture-en.jpg" alt="Memory Architecture" />
+
 ## 記憶の種類

 ### コア記憶（MEMORY.md）
@@ -15,29 +17,40 @@ description: CowAgent の長期記憶システム — ファイル永続化、

 `~/cow/memory/` ディレクトリに保存され、日付で命名されます（例：`2026-03-08.md`）。日々の会話の要約と主要なイベントを記録します。空ファイルの生成を避けるため、最初の書き込み時にのみファイルが作成されます。

+### 夢日記（memory/dreams/YYYY-MM-DD.md）
+
+Deep Dream（記憶蒸留）プロセスの副産物で、各整理で発見された重複、統合操作、新しい洞察を記録します。`~/cow/memory/dreams/` ディレクトリに日付で命名されて保存されます。
+
 ## 自動書き込み

 Agent は以下のメカニズムにより、会話内容を長期記憶に自動的に永続化します：

- **コンテキストトリミング時** — 会話ターン数またはトークン数が設定上限を超えた場合、最も古い半分のコンテキストがトリミングされ、LLM によって要約されて日次記憶ファイルに書き込まれます
+- **コンテキストトリミング時** — 会話ターン数またはトークン数が設定上限を超えた場合、最も古い半分のコンテキストがトリミングされ、LLM によって要約されて日次記憶ファイルに書き込まれます。要約は保持されたコンテキストにも非同期で注入され、会話の連続性を維持します
 - **毎日のスケジュール要約** — 毎日 23:55 に自動的にフル要約がトリガーされ、アクティビティが少ない日でも記憶が保存されます（内容が変更されていない場合はスキップ）
+- **[夢境蒸留（Deep Dream）](/ja/memory/deep-dream)** — 毎日の要約完了後に自動実行され、日次記憶を MEMORY.md に蒸留し、夢日記を生成します
 - **API コンテキストオーバーフロー時** — モデル API がコンテキストオーバーフローエラーを返した場合、緊急措置として現在の会話要約が保存されます

 すべての記憶書き込みはバックグラウンドスレッドで非同期に実行され（LLM の要約 + ファイル書き込み）、通常の会話応答をブロックしません。

-## 初回起動
+## 関連ファイル

-初回起動時に、Agent はユーザーに主要な情報を積極的に尋ね、ワークスペース（デフォルト `~/cow`）に保存します：
+ワークスペース（デフォルト `~/cow`）内の記憶関連ファイル：

 | ファイル | 説明 |
 | --- | --- |
-| `system.md` | Agent のシステムプロンプトと動作設定 |
-| `user.md` | ユーザーの身元情報と好み |
+| `AGENT.md` | Agent のパーソナリティと動作設定 |
+| `USER.md` | ユーザーの身元情報と好み |
+| `RULE.md` | カスタムルールと制約 |
 | `MEMORY.md` | コア記憶（長期） |
 | `memory/YYYY-MM-DD.md` | 日次記憶（オンデマンドで作成） |
+| `memory/dreams/YYYY-MM-DD.md` | 夢日記（Deep Dream で自動生成） |
+
+## Web コンソール
+
+Web コンソールの記憶管理ページで、記憶ファイルと夢日記を閲覧できます。タブ切り替えに対応：

 <Frame>
-  <img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
+  <img src="https://cdn.link-ai.tech/doc/20260414171014.png" width="800" />
 </Frame>

 ## 設定
--- a/docs/ja/models/claude.mdx
+++ b/docs/ja/models/claude.mdx
@@ -12,6 +12,6 @@ description: Claudeモデルの設定

 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `claude-sonnet-4-6`、`claude-opus-4-6`、`claude-sonnet-4-5`、`claude-sonnet-4-0`、`claude-3-5-sonnet-latest`などから選択可能。[公式モデル一覧](https://docs.anthropic.com/en/docs/about-claude/models/overview)を参照 |
+| `model` | `claude-sonnet-4-6`、`claude-opus-4-7`、`claude-opus-4-6`、`claude-sonnet-4-5`、`claude-sonnet-4-0`、`claude-3-5-sonnet-latest`などから選択可能。[公式モデル一覧](https://docs.anthropic.com/en/docs/about-claude/models/overview)を参照 |
 | `claude_api_key` | [Claude Console](https://console.anthropic.com/settings/keys)で作成 |
 | `claude_api_base` | 任意。デフォルトは`https://api.anthropic.com/v1`。サードパーティプロキシを使用する場合に変更 |
--- a/docs/ja/models/coding-plan.mdx
+++ b/docs/ja/models/coding-plan.mdx
@@ -102,18 +102,18 @@ description: Coding Planモデルの設定

 ```json
 {
-  "bot_type": "openai",
+  "bot_type": "moonshot",
  "model": "kimi-for-coding",
-  "open_ai_api_base": "https://api.kimi.com/coding/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
+  "moonshot_base_url": "https://api.kimi.com/coding/v1",
+  "moonshot_api_key": "YOUR_API_KEY"
 }
 ```

 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `kimi-for-coding` |
-| `open_ai_api_base` | `https://api.kimi.com/coding/v1` |
-| `open_ai_api_key` | Coding Plan専用キー（従量課金とは共有不可） |
+| `model` | `kimi-for-coding`で自動更新モデル、または`kimi-k2.6`などのモデルを指定 |
+| `moonshot_base_url` | `https://api.kimi.com/coding/v1` |
+| `moonshot_api_key` | Coding Plan専用キー（従量課金とは共有不可） |

 参考: [キー & ドキュメント](https://www.kimi.com/code/docs/)

--- a/docs/ja/models/custom.mdx
+++ b/docs/ja/models/custom.mdx
@@ -0,0 +1,62 @@
+---
+title: カスタム
+description: サードパーティAPIやローカルモデル向けのカスタムプロバイダー設定
+---
+
+OpenAI互換プロトコルでアクセスするモデルサービスに適用します：
+
+- **サードパーティAPIプロキシ**：統一APIベースで複数モデルを呼び出し
+- **ローカルモデル**：Ollama、vLLM、LocalAIなどでローカルにデプロイされたモデル
+- **プライベートデプロイ**：組織内でホストされたモデルサービス
+
+<Note>
+  `openai` プロバイダーとの違い：カスタムプロバイダーでは `/config model` でモデルを切り替えてもプロバイダータイプは自動切り替えされず、カスタムAPIアドレスが常に保持されます。
+</Note>
+
+## 設定方法
+
+### サードパーティAPIプロキシ
+
+```json
+{
+  "bot_type": "custom",
+  "model": "deepseek-v4-flash",
+  "custom_api_key": "YOUR_API_KEY",
+  "custom_api_base": "https://{your-proxy.com}/v1"
+}
+```
+
+| パラメータ | 説明 |
+| --- | --- |
+| `bot_type` | `custom` に設定必須 |
+| `model` | モデル名、プロキシサービスがサポートする任意のモデル名 |
+| `custom_api_key` | プロキシサービスが提供するAPIキー |
+| `custom_api_base` | APIアドレス、OpenAI互換プロトコルが必要 |
+
+### ローカルモデル
+
+ローカルモデルは通常APIキー不要で、APIベースのみ設定します：
+
+```json
+{
+  "bot_type": "custom",
+  "model": "qwen3.5:27b",
+  "custom_api_base": "http://localhost:11434/v1"
+}
+```
+
+一般的なローカルデプロイツールとデフォルトアドレス：
+
+| ツール | デフォルトAPIベース |
+| --- | --- |
+| [Ollama](https://ollama.com) | `http://localhost:11434/v1` |
+| [vLLM](https://docs.vllm.ai) | `http://localhost:8000/v1` |
+| [LocalAI](https://localai.io) | `http://localhost:8080/v1` |
+
+## モデル切り替え
+
+カスタムプロバイダーではモデル切り替え時に `model` のみ変更され、`bot_type` やAPIアドレスは変わりません：
+
+```
+/config model qwen3.5:27b
+```
--- a/docs/ja/models/deepseek.mdx
+++ b/docs/ja/models/deepseek.mdx
@@ -7,22 +7,55 @@ description: DeepSeekモデルの設定

 ```json
 {
-  "model": "deepseek-chat",
+  "model": "deepseek-v4-flash",
  "deepseek_api_key": "YOUR_API_KEY"
 }
 ```

 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `deepseek-chat`（DeepSeek-V3.2、非思考モード）、`deepseek-reasoner`（DeepSeek-R1、思考モード） |
-| `deepseek_api_key` | [DeepSeek Platform](https://platform.deepseek.com/api_keys)で作成 |
+| `model` | `deepseek-v4-flash`（デフォルト）、`deepseek-v4-pro` をサポート |
+| `deepseek_api_key` | [DeepSeek Platform](https://platform.deepseek.com/api_keys) で作成 |
 | `deepseek_api_base` | オプション、デフォルトは `https://api.deepseek.com/v1`。サードパーティプロキシに変更可能 |

+## モデルの選び方
+
+| モデル | 適用シーン |
+| --- | --- |
+| `deepseek-v4-flash` | デフォルト推奨、高速・低コスト |
+| `deepseek-v4-pro` | 複雑なタスクでより強力 |
+
+## 思考モード
+
+V4シリーズ（`deepseek-v4-flash` / `deepseek-v4-pro`）は明示的な「思考モード」をサポートします。最終回答の前に思考内容（`reasoning_content`）を出力することで、回答品質を高めます。
+
+### スイッチ
+
+グローバル設定 `enable_thinking` で制御します：
+
+```json
+{
+  "enable_thinking": true
+}
+```
+
+- `true`：すべてのチャネルで思考モードがオン。Webコンソールでは思考過程を表示し、IMチャネル（WeChat / WeCom / DingTalk / Feishu）では表示されないものの、回答品質の向上というメリットを得られます。
+- `false`：思考オフ、応答が速く、初回トークンの遅延も低くなります。
+
+### 注意事項
+
+- **サンプリングパラメータ**：思考モード時は `temperature`、`top_p`、`presence_penalty`、`frequency_penalty` がサーバ側で無視されます（エラーにはなりません）。CowAgentは自動的に送信をスキップします。
+- **マルチターンのツール呼び出し**：履歴にツール呼び出しが含まれる場合、DeepSeekはすべてのassistantメッセージに `reasoning_content` を返送するよう要求します。CowAgentが自動でラウンドトリップ処理を行うため、セッション途中で思考スイッチを切り替えてもエラーになりません。
+
+<Tip>
+  通常は `deepseek-v4-flash` を使い、難しいタスクでは `deepseek-v4-pro` に切り替え、深い思考が必要な時は `enable_thinking` を有効にしてください。
+</Tip>
+
 方法2：OpenAI互換方式：

 ```json
 {
-  "model": "deepseek-chat",
+  "model": "deepseek-v4-flash",
  "bot_type": "openai",
  "open_ai_api_key": "YOUR_API_KEY",
  "open_ai_api_base": "https://api.deepseek.com/v1"
--- a/docs/ja/models/glm.mdx
+++ b/docs/ja/models/glm.mdx
@@ -5,14 +5,14 @@ description: 智谱AI GLMモデルの設定

 ```json
 {
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "zhipu_ai_api_key": "YOUR_API_KEY"
 }
 ```

 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `glm-5-turbo`、`glm-5`、`glm-4.7`、`glm-4-plus`、`glm-4-flash`、`glm-4-air`などから選択可能。[モデルコード](https://bigmodel.cn/dev/api/normal-model/glm-4)を参照 |
+| `model` | `glm-5.1`、`glm-5-turbo`、`glm-5`、`glm-4.7`、`glm-4-plus`、`glm-4-flash`、`glm-4-air`などから選択可能。[モデルコード](https://bigmodel.cn/dev/api/normal-model/glm-4)を参照 |
 | `zhipu_ai_api_key` | [智谱AI Console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys)で作成 |

 OpenAI互換の設定もサポートしています:
@@ -20,7 +20,7 @@ OpenAI互換の設定もサポートしています:
 ```json
 {
  "bot_type": "openai",
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
  "open_ai_api_key": "YOUR_API_KEY"
 }
--- a/docs/ja/models/index.mdx
+++ b/docs/ja/models/index.mdx
@@ -6,7 +6,7 @@ description: CowAgentがサポートするモデルとおすすめの選択肢
 CowAgentは国内外の主要なLLMをサポートしています。モデルインターフェースはプロジェクトの`models/`ディレクトリに実装されています。

 <Note>
-  Agent モードでは、品質とコストのバランスから以下のモデルをおすすめします: MiniMax-M2.7、glm-5-turbo、kimi-k2.5、qwen3.6-plus、claude-sonnet-4-6、gemini-3.1-pro-preview
+  Agent モードでは、品質とコストのバランスから以下のモデルをおすすめします: deepseek-v4-flash、MiniMax-M2.7、claude-sonnet-4-6、gemini-3.1-pro-preview、glm-5.1、qwen3.6-plus、kimi-k2.6
 </Note>

 ## 設定
@@ -18,21 +18,12 @@ CowAgentは国内外の主要なLLMをサポートしています。モデルイ
 ## サポートモデル

 <CardGroup cols={2}>
+  <Card title="DeepSeek" href="/ja/models/deepseek">
+    deepseek-v4-flash、deepseek-v4-pro など
+  </Card>
  <Card title="MiniMax" href="/ja/models/minimax">
    MiniMax-M2.7およびその他のシリーズモデル
  </Card>
-  <Card title="GLM (智谱AI)" href="/ja/models/glm">
-    glm-5-turbo、glm-5およびその他のシリーズモデル
-  </Card>
-  <Card title="Qwen (通义千问)" href="/ja/models/qwen">
-    qwen3.6-plus、qwen3-maxなど
-  </Card>
-  <Card title="Kimi" href="/ja/models/kimi">
-    kimi-k2.5、kimi-k2など
-  </Card>
-  <Card title="Doubao (ByteDance)" href="/ja/models/doubao">
-    doubao-seedシリーズモデル
-  </Card>
  <Card title="Claude" href="/ja/models/claude">
    claude-sonnet-4-6など
  </Card>
@@ -42,8 +33,17 @@ CowAgentは国内外の主要なLLMをサポートしています。モデルイ
  <Card title="OpenAI" href="/ja/models/openai">
    gpt-5.4、gpt-4.1、oシリーズなど
  </Card>
-  <Card title="DeepSeek" href="/ja/models/deepseek">
-    deepseek-chat、deepseek-reasoner
+  <Card title="GLM (智谱AI)" href="/ja/models/glm">
+    glm-5.1、glm-5-turbo、glm-5およびその他のシリーズモデル
+  </Card>
+  <Card title="Qwen (通义千问)" href="/ja/models/qwen">
+    qwen3.6-plus、qwen3-maxなど
+  </Card>
+  <Card title="Doubao (ByteDance)" href="/ja/models/doubao">
+    doubao-seedシリーズモデル
+  </Card>
+  <Card title="Kimi" href="/ja/models/kimi">
+    kimi-k2.6、kimi-k2.5、kimi-k2など
  </Card>
  <Card title="LinkAI" href="/ja/models/linkai">
    統合マルチモデルインターフェース + ナレッジベース
--- a/docs/ja/models/kimi.mdx
+++ b/docs/ja/models/kimi.mdx
@@ -5,14 +5,14 @@ description: Kimi (Moonshot) モデルの設定

 ```json
 {
-  "model": "kimi-k2.5",
+  "model": "kimi-k2.6",
  "moonshot_api_key": "YOUR_API_KEY"
 }
 ```

 | パラメータ | 説明 |
 | --- | --- |
-| `model` | `kimi-k2.5`、`kimi-k2`、`moonshot-v1-8k`、`moonshot-v1-32k`、`moonshot-v1-128k`から選択可能 |
+| `model` | `kimi-k2.6`、`kimi-k2.5`、`kimi-k2`、`moonshot-v1-8k`、`moonshot-v1-32k`、`moonshot-v1-128k`から選択可能 |
 | `moonshot_api_key` | [Moonshot Console](https://platform.moonshot.cn/console/api-keys)で作成 |

 OpenAI互換の設定もサポートしています:
@@ -20,7 +20,7 @@ OpenAI互換の設定もサポートしています:
 ```json
 {
  "bot_type": "openai",
-  "model": "kimi-k2.5",
+  "model": "kimi-k2.6",
  "open_ai_api_base": "https://api.moonshot.cn/v1",
  "open_ai_api_key": "YOUR_API_KEY"
 }
--- a/docs/ja/models/linkai.mdx
+++ b/docs/ja/models/linkai.mdx
@@ -3,7 +3,7 @@ title: LinkAI
 description: LinkAIプラットフォームで複数モデルに統合アクセス
 ---

-[LinkAI](https://link-ai.tech)プラットフォームでは、OpenAI、Claude、Gemini、DeepSeek、Qwen、Kimiなどのモデルを柔軟に切り替えることができ、ナレッジベース、ワークフロー、プラグイン、その他のAgent機能をサポートしています。
+[LinkAI](https://link-ai.tech)プラットフォームでは、OpenAI、Claude、Gemini、DeepSeek、MiniMax、Qwen、Kimiなどのモデルを柔軟に切り替えることができ、ナレッジベース、ワークフロー、プラグイン、その他のAgent機能をサポートしています。

 ```json
 {
--- a/docs/ja/releases/overview.mdx
+++ b/docs/ja/releases/overview.mdx
@@ -5,6 +5,8 @@ description: CowAgent バージョン履歴

 | バージョン | 日付 | 説明 |
 | --- | --- | --- |
+| [2.0.7](/ja/releases/v2.0.7) | 2026.04.22 | 画像生成スキル（6プロバイダー自動ルーティング）、新モデル（Kimi K2.6、Claude Opus 4.7、GLM 5.1）、ナレッジベースと Web コンソールの改善 |
+| [2.0.6](/ja/releases/v2.0.6) | 2026.04.14 | ナレッジベース、Deep Dream 記憶蒸留、スマートコンテキスト圧縮、Web コンソールアップグレード |
 | [2.0.5](/ja/releases/v2.0.5) | 2026.04.01 | Cow CLI、Skill Hub オープンソース、ブラウザツール、企業微信スキャン作成、その他改善 |
 | [2.0.4](/ja/releases/v2.0.4) | 2026.03.22 | 個人WeChatチャネル追加、新モデルサポート、日本語ドキュメント、スクリプトリファクタリングおよび複数修正 |
 | [2.0.2](/ja/releases/v2.0.2) | 2026.02.27 | Web Console アップグレード、マルチチャネル同時実行、セッション永続化 |
--- a/docs/ja/releases/v2.0.6.mdx
+++ b/docs/ja/releases/v2.0.6.mdx
@@ -0,0 +1,83 @@
+---
+title: v2.0.6
+description: CowAgent 2.0.6 - ナレッジベース、Deep Dream 記憶蒸留、スマートコンテキスト圧縮、Web コンソールマルチセッションなど
+---
+
+## プロジェクト名を CowAgent に改称
+
+リポジトリが `chatgpt-on-wechat` から **CowAgent** に正式改称され、フル機能の AI Agent アシスタントへ進化しました。
+
+- 新アドレス：[github.com/zhayujie/CowAgent](https://github.com/zhayujie/CowAgent)、旧アドレスは GitHub が自動リダイレクト
+- CLI コマンド、設定ファイル、ドキュメントリンクはすべて互換性を維持、追加操作は不要
+
+## 📚 ナレッジベース
+
+新しいパーソナルナレッジベースシステム — Agent が構造化された知識を自律的に構築・維持し、会話中に必要に応じて検索・引用：
+
+- **インデックス駆動の自己組織構造**：ナレッジは `knowledge/` ディレクトリに保存、カテゴリ別に自動整理、各ナレッジページは独立した Markdown ファイル
+- **自動書き込み**：Agent にファイルやリンクなどの知識を送信、または会話で価値ある情報を識別した際にナレッジページを自動作成・更新
+- **ハイブリッド検索**：キーワード全文検索とベクトル意味検索をサポート、会話中に関連ナレッジをオンデマンドで読み込み
+- **ビジュアライゼーション**：ファイルツリー閲覧とナレッジグラフの可視化、ドキュメント内リンクで直接ナビゲーション
+- **コマンド管理**：`/knowledge` で統計表示、`/knowledge list` でディレクトリ構造、`/knowledge on|off` でオン・オフ
+
+<img src="https://cdn.link-ai.tech/doc/20260413105435.png" width="750" />
+
+
+ドキュメント：[ナレッジベース](https://docs.cowagent.ai/ja/knowledge)
+
+## 🌙 Deep Dream 記憶蒸留
+
+散在する会話記憶を毎日自動的に精製された長期記憶へ蒸留する新しい記憶整理メカニズム：
+
+- **三層記憶フロー**：会話コンテキスト（短期）→ デイリーメモリ（中期）→ MEMORY.md（長期）、完全な記憶ライフサイクルを形成
+- **自動蒸留**：毎日 23:55 に定期実行、当日のデイリーメモリと MEMORY.md を読み取り、LLM で重複排除・統合・剪定を行い、精製された新しい MEMORY.md を出力
+- **夢日記**：各蒸留でナラティブスタイルの夢日記を生成、整理過程の発見と洞察を記録、`memory/dreams/` ディレクトリに保存
+- **手動トリガー**：`/memory dream [N]` で手動トリガー、整理日数を指定可能（デフォルト 3 日、最大 30 日）、完了後にチャットで通知
+- **Web コンソール**：記憶管理ページに「夢日記」タブを追加、すべての夢日記を閲覧可能
+
+ドキュメント：[Deep Dream](https://docs.cowagent.ai/ja/memory/deep-dream)
+
+<img src="https://cdn.link-ai.tech/doc/20260414120158.png" width="750" />
+
+## 🧠 スマートコンテキスト圧縮
+
+コンテキストが上限を超えた場合、トリミング部分を LLM で要約し非同期で注入、会話の連続性を維持：
+
+- **LLM 非同期要約**：トリミングされたメッセージを LLM がキー情報に要約、デイリーメモリファイルへの書き込みと保持コンテキストへの注入を同時実行
+- **マルチモデル対応**：メインモデルを優先使用、Claude、OpenAI、MiniMax など異なるモデルのメッセージ形式要件に対応
+
+ドキュメント：[短期記憶](https://docs.cowagent.ai/ja/memory/context)
+
+## 💬 Web コンソールアップグレード
+
+Web コンソールの複数機能を強化：
+
+- **マルチセッション管理**：独立セッションの作成と切り替え、サイドバーにセッションリスト表示、タイトルの自動生成と手動編集をサポート
+- **パスワード保護**：`web_console_password` 設定でコンソールにログインパスワードを設定可能
+- **深い思考**：Web 端でモデルの思考プロセスを表示、`enable_thinking` 設定で制御
+- **定期プッシュ**：定期タスクの結果を Web コンソールにプッシュ
+- **メッセージコピー**：AI 回答バブルから元の Markdown コンテンツをワンクリックコピー
+- **言語切替**：上部の言語切替ボタンが現在の言語を表示するように改善、より直感的な操作
+
+## 🤖 モデル関連
+
+- **視覚認識の最適化**：画像認識ツールがメインモデルを優先使用、複数プロバイダーの自動フォールバック対応。ドキュメント：[ビジョンツール](https://docs.cowagent.ai/ja/tools/vision)
+- **MiniMax 新モデル**：MiniMax-M2.7-highspeed モデルと MiniMax TTS 音声合成サポートを追加。Thanks @octo-patch
+- **通義千問**：qwen3.6-plus モデルサポートを追加
+
+## 🐛 その他の改善と修正
+
+- **記憶プロンプト最適化**：`MEMORY.md` をシステムプロンプトにデフォルト注入、記憶検索と書き込みのトリガー条件を精緻化、主動的な書き込み能力を強化
+- **システムプロンプト**：システムプロンプトのスタイルとトーンガイダンスを最適化
+- **ブラウザツール**：暗黙的なインタラクティブ要素の検出を強化
+- **ファイル送信**：汎用ファイルタイプ（tar.gz、zip 等）が正しく送信されない問題を修正。Thanks @6vision
+- **macOS 互換性**：ネットワークプリチェックタイムアウトの互換性問題を修正。Thanks @Moliang Zhou
+- **Windows 互換性**：Windows での PowerShell 互換性、プロセス更新、ターミナルエンコーディングなどの問題を修正
+- **Python 3.13+**：Python 3.13 以降で `legacy-cgi` 依存関係が不足する問題を修正
+- **個人 WeChat**：個人 WeChat チャネルバージョンを更新
+
+## 📦 アップグレード
+
+`cow update` または `./run.sh update` でアップグレード、またはコードを手動で pull して再起動。詳細は[アップグレードガイド](https://docs.cowagent.ai/ja/guide/upgrade)を参照。
+
+**リリース日**：2026.04.14 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.5...master)
--- a/docs/ja/releases/v2.0.7.mdx
+++ b/docs/ja/releases/v2.0.7.mdx
@@ -0,0 +1,65 @@
+---
+title: v2.0.7
+description: CowAgent 2.0.7 - 画像生成スキル（6プロバイダー自動ルーティング）、新モデルサポート、ナレッジベース強化、Web コンソール改善およびバグ修正
+---
+
+## 🎨 画像生成スキル
+
+新しい内蔵スキル `image-generation` を追加。テキストから画像生成、画像編集、複数画像の融合に対応し、6 社の主要プロバイダーをカバー：
+
+- **6 プロバイダー自動ルーティング**：OpenAI (GPT-Image-2) → Gemini (Nano Banana) → Seedream (Volcengine Ark) → Qwen (DashScope) → MiniMax → LinkAI — 固定の優先順位で設定済みプロバイダーを自動選択、失敗時は次のプロバイダーへ自動フォールバック
+- **モデル選択不要**：API Key を設定するだけで使用可能、モデルを手動で指定する必要なし。会話で特定モデルを指名することも可能（例：「seedream で猫を描いて」）
+- **柔軟な制御**：`quality`（画質）、`size`（解像度、512/1K〜4K）、`aspect_ratio`（アスペクト比）パラメータ対応、各プロバイダーが自動的に有効な値にマッピング
+- **画像編集**：既存の画像を渡して編集・スタイル変換・複数画像融合が可能（Seedream は最大 14 枚の参照画像をサポート）
+- **スキルレベル設定**：`config.json` の `skill.image-generation.model` でデフォルトモデルを固定可能
+- **画像ライトボックス**：Web コンソールのすべての画像がクリックで拡大プレビュー対応
+
+ドキュメント：[画像生成スキル](https://docs.cowagent.ai/ja/skills/image-generation)
+
+## 🤖 新モデルサポート
+
+- **Kimi K2.6**：`kimi-k2.6` モデルサポートを追加
+- **Claude Opus 4.7**：`claude-opus-4-7` モデルサポートを追加
+- **GLM 5.1**：`glm-5.1` モデルサポートを追加
+- **Kimi Coding Plan**：Kimi Coding Plan モードをサポート
+- **カスタムモデルプロバイダー**：新しいカスタムモデルプロバイダー設定により、追加ベンダーとの統合が容易に
+
+## 💬 Web コンソール改善
+
+- **スマート自動スクロール**：チャットスクロールの動作を改善 — ユーザーが過去のメッセージを閲覧中に強制的に最下部にスクロールしなくなりました
+- **推論コンテンツ制限**：深い思考コンテンツを 4KB に制限し、フロントエンドのラグを防止
+- **モバイル最適化**：セッションサイドバーをモバイルではデフォルトで非表示、オーバーレイタップで閉じることが可能
+- **セッションタイトル修正**：タイトル自動生成のフォールバックロジックと設定変更時の Bridge リセットを修正
+- **画像プレビュー重複排除**：同一メッセージ内での画像の重複レンダリングを修正
+
+## 📚 ナレッジベース強化
+
+- **ネストディレクトリ対応**：ナレッジベースの一覧表示が多階層のネストディレクトリに対応
+- **ルートレベルファイル表示**：ナレッジツリーにルートディレクトリの `index.md`、`log.md` などを表示
+- **空状態統計の修正**：ルートレベルファイルが空状態検出に干渉しなくなりました
+
+## 🌙 夢の記憶改善
+
+- **構造化整理**：夢の記憶ファイルが日付別に自動アーカイブされ、ディレクトリ構造がより整理されました
+- **スケジュールジッター**：毎日の夢トリガーにランダムジッターを追加し、クラスター環境での同時実行の競合を回避
+
+## 🛠 スキルシステム改善
+
+- **スキルマネージャーの更新**：`/skill` コマンド実行後にスキルマネージャーを自動リフレッシュし、状態の同期を確保
+- **インストールソース拡張**：スキルインストールが複数のソース形式（URL、zip、ローカルファイルなど）に対応し、ターゲットディレクトリを自動的に確保
+
+## 🐛 その他の修正
+
+- **Gemini 修正**：Gemini の tool call が結果を返さない問題を修正
+- **Agent リトライ**：空レスポンスのリトライ時に `tool_calls` が破棄されなくなりました
+- **Docker 環境変数同期**：Docker 環境で設定更新後に環境変数が同期されない問題を修正
+- **Python 3.7 互換**：Python 3.7 互換性のために `Literal` のインポートを遅延
+- **モデル切替通知**：モデル切替後に bot_type 変更通知が表示されない問題を修正。Thanks @6vision
+- **設定コマンド**：`/config` で `enable_thinking` の設定が可能に
+- **思考表示**：深い思考の表示がデフォルトで無効に
+
+## 📦 アップグレード
+
+`cow update` または `./run.sh update` でアップグレード、またはコードを手動で pull して再起動。詳細は[アップグレードガイド](https://docs.cowagent.ai/ja/guide/upgrade)を参照。
+
+**リリース日**：2026.04.22 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.6...master)
--- a/docs/ja/skills/image-generation.mdx
+++ b/docs/ja/skills/image-generation.mdx
@@ -0,0 +1,158 @@
+---
+title: image-generation - 画像生成
+description: テキストから画像生成 / 画像編集 / 複数画像の融合、複数プロバイダーの自動ルーティングとフォールバック対応
+---
+
+汎用の画像生成・編集スキルです。OpenAI、Gemini、Seedream（Volcengine Ark）、Qwen（DashScope）、MiniMax、LinkAI の 6 社に対応。モデルを手動で選ぶ必要はなく、固定の優先順位に従って、設定済みのプロバイダーを自動的に選択します。
+
+## モデル選択
+
+`image-generation` は「固定優先度 + 自動フォールバック」のストラテジーを採用しています。API Key を設定するだけで使えます：
+
+1. **優先順位**: `OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI`
+2. **未設定のプロバイダーはスキップ**: API Key が設定されているプロバイダーのみが参加
+3. **失敗時は自動で次へ**: 401、モデル未開通、ネットワークエラーなどの場合、次のプロバイダーを試行
+4. **モデル指定時は前置**: 特定のモデル名を渡すと、そのプロバイダーが最前列に昇格
+
+### 対応モデル
+
+| プロバイダー | モデル / エイリアス | 特徴 |
+| --- | --- | --- |
+| OpenAI | `gpt-image-2`、`gpt-image-1` | 汎用テキスト→画像、高品質、`quality` パラメータ対応 |
+| Gemini Nano Banana | `nano-banana-2`、`nano-banana-pro`、`nano-banana` | `gemini-3.1-flash`、`gemini-3-pro`、`gemini-2.5-flash` の画像バージョン |
+| Seedream（Volcengine Ark） | `seedream-5.0-lite`、`seedream-4.5` | ネイティブ 2K–4K、最大 14 枚の参照画像を融合 |
+| Qwen（DashScope） | `qwen-image-2.0`、`qwen-image-2.0-pro` | 中国語テキスト描画やテキスト・画像レイアウトに強い |
+| MiniMax | `image-01` | シンプルで高速な画像生成 |
+| LinkAI | 任意のモデル | 汎用プロキシ、フォールバック用 |
+
+<Note>
+デフォルトでは Agent はモデルを選ばず、自動ルーティングを使用します。特定のモデルを使いたい場合は、会話で直接指定してください（例：「seedream で猫を描いて」「gpt-image-2 でポスターを作って」）。下記の「カスタム設定」でデフォルトモデルを固定することもできます。
+</Note>
+
+## カスタム設定
+
+### API Key の設定
+
+**少なくとも 1 つ**のプロバイダーの Key が必要です。複数設定すると自動フォールバックが有効になります。設定方法は 3 通り：
+
+#### 方法 1：既存のモデル Key を自動再利用
+
+Web コンソールや `config.json` で対話モデルの Key（`openai_api_key`、`gemini_api_key` など）を設定済みの場合、起動時にこれらの Key は対応する環境変数に**自動同期**されます。つまり、対話モデルが使えていれば、画像生成も同じ Key で追加設定なしに利用できます。
+
+#### 方法 2：config.json で設定
+
+`config.json` に Key フィールドを直接記述：
+
+```json
+{
+  "openai_api_key": "sk-xxx",
+  "openai_api_base": "https://api.openai.com/v1",
+  "gemini_api_key": "AIza-xxx",
+  "ark_api_key": "xxx",
+  "dashscope_api_key": "sk-xxx",
+  "minimax_api_key": "xxx",
+  "linkai_api_key": "xxx"
+}
+```
+
+変更後は再起動が必要です。各 Key には対応する `*_api_base` フィールドがあり、カスタムエンドポイントを指定できます。
+
+#### 方法 3：会話で直接設定
+
+チャットで API Key を送信すると、Agent が `env_config` ツールで `~/cow/.env` に保存します。**再起動不要**でただちに反映されます。例：
+
+```
+OPENAI_API_KEY を sk-xxx に設定して
+```
+
+または：
+
+```
+ARK_API_KEY を xxx に設定して
+```
+
+### API Key 一覧
+
+| 環境変数 | config.json フィールド | プロバイダー | デフォルト Base URL |
+| --- | --- | --- | --- |
+| `OPENAI_API_KEY` | `openai_api_key` | OpenAI | `https://api.openai.com/v1` |
+| `GEMINI_API_KEY` | `gemini_api_key` | Gemini | `https://generativelanguage.googleapis.com` |
+| `ARK_API_KEY` | `ark_api_key` | Volcengine Ark（Seedream） | `https://ark.cn-beijing.volces.com/api/v3` |
+| `DASHSCOPE_API_KEY` | `dashscope_api_key` | Alibaba DashScope（Qwen） | `https://dashscope.aliyuncs.com` |
+| `MINIMAX_API_KEY` | `minimax_api_key` | MiniMax | `https://api.minimaxi.com` |
+| `LINKAI_API_KEY` | `linkai_api_key` | LinkAI | `https://api.link-ai.tech` |
+
+### デフォルトモデルの固定
+
+すべての画像生成を特定のプロバイダーのモデルで固定したい場合、`config.json` に以下を追加：
+
+```json
+"skill": {
+  "image-generation": {
+    "model": "seedream-5.0-lite"
+  }
+}
+```
+
+起動時にこの設定は環境変数 `SKILL_IMAGE_GENERATION_MODEL` に自動変換され、スクリプトはこのモデルのプロバイダーを常に使用します。
+
+## 有効化と無効化
+
+`image-generation` は内蔵スキルで、**API Key に基づいてステータスが自動調整**されます：
+
+- **Key 設定済み**：スキルはアクティブ — Agent は画像生成リクエストを受けると呼び出す
+- **Key 未設定**：スキルはコンテキストに表示される（「設定が必要」とマーク）— Agent は呼び出し失敗の代わりに Key の設定を案内する
+
+手動で制御する場合：
+
+```text
+/skill disable image-generation    # 無効化（Key があっても呼び出されない）
+/skill enable image-generation     # 再有効化
+```
+
+ターミナルでは `cow skill disable image-generation` / `cow skill enable image-generation`。
+
+## パラメータ
+
+| パラメータ | 型 | 必須 | デフォルト | 説明 |
+| --- | --- | --- | --- | --- |
+| `prompt` | string | はい | — | 画像の説明 |
+| `image_url` | string / list | いいえ | null | 編集用の入力画像。ローカルパスまたは URL。複数指定で複数画像融合 |
+| `quality` | string | いいえ | auto | `low` / `medium` / `high` — 一部のプロバイダーのみ対応 |
+| `size` | string | いいえ | auto | `512` / `1K` / `2K` / `3K` / `4K`、またはピクセル値（例: `1024x1024`） |
+| `aspect_ratio` | string | いいえ | null | `1:1` / `3:2` / `2:3` / `16:9` / `9:16` / `21:9`；Gemini は `1:4` / `4:1` / `1:8` / `8:1` にも対応 |
+
+<Warning>
+**品質が高いほど・解像度が大きいほど、コストが高く、時間がかかります。**
+
+- 日常の会話やプレビューにはデフォルト（`auto`）、または `quality=low` + `size=1K` を使用 — 約 20 秒で生成
+- ポスターやユーザーが高解像度を明示的に要求した場合は `quality=high` + `size=2K/4K` — モデルによって 1〜5 分かかる場合があります
+</Warning>
+
+## 出力
+
+成功時：
+
+```json
+{
+  "model": "doubao-seedream-5-0-260128",
+  "images": [
+    {"url": "/path/to/output.png"}
+  ]
+}
+```
+
+失敗時：`{ "error": "..." }`。エラー後は**直接リトライしないでください** — ほぼ確実に設定の問題です（Key の誤り、API ベース URL の不一致、モデル未開通など）。まず設定を修正してから再試行してください。
+
+## よくある使い方
+
+- **テキスト→画像**：説明からイラスト、ポスター、アイコン、アバター、絵コンテなどを生成
+- **画像→画像**：既存の画像のスタイル変更、要素の入れ替え、装飾やテキストの追加
+- **複数画像の融合**：複数の参照画像を 1 枚に合成（着せ替え、キャラクター集合写真など）
+
+<Note>
+- bash タイムアウトは 600 秒に設定してください。各プロバイダーの HTTP タイムアウトは 300 秒ですが、スクリプトが複数のプロバイダーを順番に試行する場合があります
+- 入力画像は自動的に 4 MB 以下・最長辺 4096 px 以下に圧縮されます
+- Gemini / Seedream / Qwen / MiniMax は `quality` パラメータに対応していません（渡しても無視されます）
+- Seedream のデフォルトは 2K。`seedream-5.0-lite` は 3K まで、`seedream-4.5` は 4K まで対応
+</Note>
--- a/docs/ja/skills/knowledge-wiki.mdx
+++ b/docs/ja/skills/knowledge-wiki.mdx
@@ -0,0 +1,112 @@
+---
+title: knowledge-wiki - ナレッジベース
+description: ローカルの構造化ナレッジベースを管理し、自動でアーカイブ・分類・相互参照を行う
+---
+
+会話で生まれた資料、アイデア、メモをローカルの構造化ナレッジベースに整理し、インデックスとページ間の相互参照を自動で維持します。
+
+`knowledge-wiki` はワークスペース内の `knowledge/` ディレクトリを管理します。Agent の「外部メモリ」のようなものです。`always: true` が設定されているため**常にコンテキストにロード**され、外部依存は不要です。
+
+## いつ起動するか
+
+- 記事、ドキュメント、URL を共有して、後で参照できるように残したいとき
+- 会話の中で長期保存に値する結論が出たとき
+- 以前蓄積したナレッジを調べたいとき
+
+## ディレクトリ構成
+
+```
+knowledge/
+├── index.md           # グローバルインデックス（必ずメンテナンスする）
+├── log.md             # 操作ログ（追記のみ）
+└── <category>/        # カテゴリサブディレクトリ（内容ごとにグループ化）
+    └── <slug>.md      # ナレッジページ（小文字ハイフン区切りのファイル名）
+```
+
+## 3 つの基本操作
+
+### 1. 収録（Ingest）
+
+資料を共有すると、Agent は：
+
+1. 原文を読んで理解し、重要な情報を抽出
+2. どのカテゴリに属するか判断 — まず `index.md` をチェックし、適切なカテゴリがなければ新規作成
+3. `knowledge/<category>/<slug>.md` にナレッジページを生成
+4. インデックス `index.md` とログ `log.md` を更新
+
+### 2. 統合（Synthesize）
+
+会話の中で新しい結論やインサイトが生まれたとき：
+
+1. 適切なカテゴリの下に新しいナレッジページを作成
+2. 関連する既存ページに相互リンクを追加
+3. インデックスとログを更新
+
+### 3. 検索（Query）
+
+以前蓄積したナレッジについて質問されたとき：
+
+1. `index.md` から関連しそうなページを探す
+2. `read` ツールで具体的なページを開く
+3. 必要に応じて `memory_search` で補完検索
+4. 回答にナレッジページへのリンクを含め、ユーザーが原文を確認できるようにする
+
+## ページの書き方
+
+```markdown
+# ページタイトル
+
+> Source: <ソース URL または簡単な説明>
+
+本文。ページ間は相対パスでリンク：
+[関連ページ](../category/related-page.md)
+
+## 要点
+
+- ...
+
+## 関連ページ
+
+- [ページ A](../category/page-a.md) — 関連する理由
+```
+
+<Note>
+- `> Source:` はこのナレッジの出典を記録します。明確な出典がある場合は必ず記載してください
+- 相互参照は重要です：ページを作成・更新したら、関連ページにも逆リンクを追加してください
+- **既に存在するページにのみリンクしてください**。ある概念が独立ページに値する場合は、先にページを作成してからリンクを追加してください
+</Note>
+
+## インデックス形式
+
+`knowledge/index.md` はフラットリスト形式で、カテゴリごとにグループ化し、各ナレッジページを 1 行で表します：
+
+```markdown
+# Knowledge Index
+
+## カテゴリ A
+- [ページタイトル](category-a/page-slug.md) — 一行の要約
+
+## カテゴリ B
+- [ページタイトル](category-b/page-slug.md) — 一行の要約
+```
+
+テーブルや絵文字は使いません。カテゴリ名や構成は柔軟に調整できます。
+
+## ログ形式
+
+`knowledge/log.md` は追記のみ、最新のエントリが一番下：
+
+```markdown
+## [YYYY-MM-DD] ingest | ページタイトル
+## [YYYY-MM-DD] synthesize | ページタイトル
+```
+
+## 執筆ガイドライン
+
+- **ファイル名**は小文字＋ハイフン（例: `machine-learning.md`）
+- **1 ページ 1 トピック** — 関連コンテンツはリンクで繋ぐ
+- **重複ページを作らず、既存ページを更新する**
+- **変更のたびにインデックスを更新する**（`knowledge/index.md`）
+- **要点を抽出し、全文をコピーしない**
+- **会話中にナレッジページを参照する際はフルパスを使用**（例: `[タイトル](knowledge/<category>/<slug>.md)`）。ページ間の相互リンクのみ相対パスを使用
+- **ナレッジページに基づいて回答する際はリンクを含める** — ユーザーが詳細を確認できるように
--- a/docs/ja/skills/skill-creator.mdx
+++ b/docs/ja/skills/skill-creator.mdx
@@ -0,0 +1,180 @@
+---
+title: skill-creator - スキル作成
+description: スキルの作成・インストール・更新、SKILL.md の書き方とディレクトリ構成の標準化
+---
+
+`skill-creator` は「メタスキル」です。Agent が他のスキルを作成・インストール・更新する際に呼び出され、すべてのスキルの `SKILL.md` の書き方とディレクトリ構成を統一します。
+
+## いつ起動するか
+
+- ユーザーが URL やリモートリポジトリからスキルをインストールしたいとき
+- ユーザーが新しいスキルをゼロから作成したいとき
+- 既存のスキルをアップグレード・リファクタリングする必要があるとき
+
+## スキルとは
+
+スキルは「再利用可能な説明書」にオプションのスクリプトやリソースを加えたものです。特定のドメインの専門知識を Agent に注入し、該当タスクをスペシャリストのように処理できるようにします。
+
+スキルには通常、以下が含まれます：
+
+1. **専門ワークフロー** — ある種のタスクの完全な手順
+2. **ツールの使い方** — 特定の API やファイル形式の処理方法
+3. **ドメイン知識** — チームの規約、ビジネスルール、データ構造など
+4. **付属リソース** — スクリプト、参考ドキュメント、テンプレートなど
+
+<Note>
+**基本原則：省けるものは省く。** Agent が自力で推測できない内容だけを書きましょう。1 行追加するたびに「このトークンコストに見合うか？」と自問してください。
+</Note>
+
+## ディレクトリ構成
+
+```
+skill-name/
+├── SKILL.md            # 必須：スキル定義
+│   ├── YAML frontmatter（name / description は必須）
+│   └── Markdown 本文（説明 + 例）
+└── オプションリソース
+    ├── scripts/        # 実行可能スクリプト（Python / Bash など）
+    ├── references/     # 分量が多い参考ドキュメント（Agent が必要時に読む）
+    └── assets/         # テンプレート、アイコンなど（出力に直接使われるもの）
+```
+
+## SKILL.md 仕様
+
+SKILL.md ヘッダーの `frontmatter` フィールド：
+
+| フィールド | 説明 |
+| --- | --- |
+| `name` | スキル名。小文字＋ハイフン、ディレクトリ名と一致させる |
+| `description` | **最も重要なフィールド**。「このスキルが何をするか」「いつ使うべきか」を明記する。Agent はこれを見て呼び出すかどうかを判断する。トリガーに関する記述はすべてここに書き、本文には書かない |
+| `metadata.cowagent.requires.bins` | システムに必要な CLI ツール |
+| `metadata.cowagent.requires.env` | 必要な環境変数（すべて揃っている必要がある） |
+| `metadata.cowagent.requires.anyEnv` | 複数の API Key のうち 1 つあればよい |
+| `metadata.cowagent.requires.anyBins` | 複数のツールのうち 1 つあればよい |
+| `metadata.cowagent.always` | `true` にすると常にロードされ、依存チェックをスキップ |
+| `metadata.cowagent.emoji` | 表示用の絵文字（任意） |
+| `metadata.cowagent.os` | OS 制限、例: `["darwin", "linux"]` |
+
+<Note>
+`category` フィールドは手動で設定する必要はありません。システムが自動的に `skill` に設定します。
+</Note>
+
+API Key 依存の宣言方法は 2 通り：
+
+```yaml
+metadata:
+  cowagent:
+    requires:
+      env: ["MYAPI_KEY"]            # 必須
+```
+
+```yaml
+metadata:
+  cowagent:
+    requires:
+      anyEnv: ["OPENAI_API_KEY", "LINKAI_API_KEY"]   # いずれか 1 つ
+```
+
+**スキルは依存関係に基づいて自動的に有効/無効になります**：環境変数が揃えば自動有効、不足すれば自動無効。手動で `/skill enable` する必要はありません。
+
+## リソースディレクトリの使い方
+
+| ディレクトリ | 入れるもの | 入れないもの |
+| --- | --- | --- |
+| `scripts/` | 繰り返し実行するコード、確定的な結果が必要なスクリプト | デモ用のコード片 |
+| `references/` | **500 行超**で SKILL.md に収まらない大きなドキュメント（完全な DB スキーマなど） | 一般的な API ドキュメント、チュートリアル |
+| `assets/` | 最終出力に含まれるファイル（テンプレート、アイコン、ボイラープレートなど） | 説明用ドキュメント |
+
+<Warning>
+**原則としてすべての内容を `SKILL.md` に書きます** — リソースディレクトリに分割するのは本当に収まらない場合だけです。
+
+`README.md`、`CHANGELOG.md`、`INSTALLATION_GUIDE.md` などをスキルに追加しないでください。すべて `SKILL.md` に入れましょう。リソースディレクトリには実際に実行するスクリプトや実際に使う素材だけを配置してください。
+</Warning>
+
+## 外部スキルのインストール
+
+インストール後、スキルは `<workspace>/skills/<name>/` に配置されます。
+
+| ソース | インストール方法 |
+| --- | --- |
+| URL（単一ファイル） | curl / web_fetch |
+| URL（zip アーカイブ） | ダウンロードして展開 |
+| ローカル SKILL.md | 直接読み込み |
+| ローカル zip アーカイブ | 展開 |
+
+インストール手順：
+
+1. `SKILL.md` を見つける（アーカイブのルートまたはサブディレクトリにある場合がある）
+2. frontmatter から `name` を読み取る
+3. **スキルディレクトリ全体**（`SKILL.md`、`scripts/`、`assets/` など）を `<workspace>/skills/<name>/` にコピー
+4. アーカイブに `INSTALL.md` などのセットアップスクリプトがあれば実行するが、最終的に `<workspace>/skills/<name>/` に収まっている必要がある
+
+## スキルをゼロから作成
+
+推奨手順：
+
+1. **要件を明確にする** — ユーザーに具体的なユースケースをいくつか挙げてもらう（一度に多く聞きすぎない）
+2. **構成を計画する** — スクリプトは必要か？参考ドキュメントは？テンプレートは？
+3. **スキャフォールド** — 初期化スクリプトを使用：
+
+   ```bash
+   scripts/init_skill.py <skill-name> --path <workspace>/skills [--resources scripts,references,assets] [--examples]
+   ```
+
+4. **内容を埋める** — SKILL.md を書き、スクリプトとリソースを追加。スクリプトは必ず実行テストする
+5. **バリデーション**（任意）：
+
+   ```bash
+   scripts/quick_validate.py <workspace>/skills/<skill-name>
+   ```
+
+6. **イテレーション** — 実際の使用フィードバックに基づいて継続的に改善
+
+## 命名規則
+
+- 小文字、数字、ハイフンのみ使用。ユーザーの入力は正規化する（例: `Plan Mode` → `plan-mode`）
+- 64 文字以内
+- 短く、動詞で始め、一目で何をするか分かるように
+- 必要に応じてツール名をプレフィックスにする（例: `gh-address-comments`、`linear-address-issue`）
+- ディレクトリ名と `name` フィールドは完全に一致させる
+
+## 3 段階ローディング
+
+スキルは一度にすべてコンテキストに読み込まれるわけではなく、3 段階で必要に応じてロードされます：
+
+1. **メタ情報**（`name` + `description`） — 常にコンテキスト内（約 100 語）。Agent がスキルを使うかどうかの判断に使用
+2. **SKILL.md 本文** — スキルが有効化されたときだけロード。500 行以内を推奨
+3. **リソースファイル** — Agent が必要なときに読み込む
+
+複数のバリエーション（例: マルチクラウドデプロイ）を持つスキルは次のように整理：
+
+```
+cloud-deploy/
+├── SKILL.md             # メインワークフローとプロバイダー選択ロジック
+└── references/
+    ├── aws.md
+    ├── gcp.md
+    └── azure.md
+```
+
+ユーザーが AWS を選んだら、Agent は `aws.md` だけを読みます。3 社分のドキュメントをすべてロードする必要はありません。
+
+## よくあるデザインパターン
+
+**ステップ式**：番号付きの手順と対応スクリプト。
+
+```markdown
+1. フォーム構造を分析（analyze_form.py を実行）
+2. フィールドマッピングを生成（fields.json を編集）
+3. フォームを自動入力（fill_form.py を実行）
+```
+
+**分岐式**：ユーザーの意図に応じて異なるフローへ。
+
+```markdown
+1. 操作タイプを判定：
+   **新規作成？** → 「作成フロー」へ
+   **既存の編集？** → 「編集フロー」へ
+```
+
+**テンプレート式**：出力形式に厳密な要件がある場合、SKILL.md にテンプレートを含め、Agent にそれに従って出力させる。
--- a/docs/ja/tools/vision.mdx
+++ b/docs/ja/tools/vision.mdx
@@ -27,7 +27,7 @@ Vision ツールは多段階の自動選択＋自動フォールバック戦略
 | Claude | メインモデル | Anthropic ネイティブ画像形式 |
 | Gemini | メインモデル | inlineData 形式 |
 | 豆包 (Doubao) | メインモデル | doubao-seed-2-0 シリーズがネイティブ対応 |
-| Kimi (Moonshot) | メインモデル | kimi-k2.5 がネイティブ対応 |
+| Kimi (Moonshot) | メインモデル | kimi-k2.6、kimi-k2.5 がネイティブ対応 |
 | 智谱 AI | glm-5v-turbo | 常にビジョン専用モデルを使用 |
 | MiniMax | MiniMax-Text-01 | 常にビジョン専用モデルを使用 |

--- a/docs/memory/context.mdx
+++ b/docs/memory/context.mdx
@@ -39,14 +39,15 @@ description: 对话上下文 — 消息管理、压缩策略和上下文操作

 - 裁剪 **最早一半** 的完整轮次（保证工具调用链的完整性）
 - 被裁剪的消息会通过 LLM 总结后**写入当天的日级记忆文件**
-- 剩余轮次保持不变
+- LLM 摘要完成后，同时将摘要**注入到保留消息的第一条用户消息开头**，帮助模型在后续对话中保持上下文连贯性
+- 摘要注入在后台异步完成，不阻塞当前回复；注入的摘要在下一轮对话时生效

 ### 3. Token 预算裁剪

 裁剪轮次后，如果 token 数仍超出预算：

 - **轮次 < 5 时**：对所有轮次进行**文本压缩** — 每轮只保留第一条用户文本和最后一条 Agent 回复，去掉中间的工具调用链
-- **轮次 ≥ 5 时**：再次裁剪**前半轮次**，被丢弃内容同样写入记忆
+- **轮次 ≥ 5 时**：再次裁剪**前半轮次**，被丢弃内容同样写入记忆并注入上下文摘要

 ### 4. 溢出应急处理

--- a/docs/memory/deep-dream.mdx
+++ b/docs/memory/deep-dream.mdx
@@ -0,0 +1,94 @@
+---
+title: 梦境蒸馏
+description: Deep Dream — 从对话到永久记忆的自动蒸馏机制
+---
+
+梦境蒸馏（Deep Dream）是 CowAgent 记忆系统的核心整理机制，负责将分散的天级记忆蒸馏为精炼的长期记忆，并生成梦境日记。
+
+## 记忆流转
+
+CowAgent 的记忆从短期到长期经历三个阶段：
+
+```
+对话上下文（短期）→ 天级记忆（中期）→ MEMORY.md（长期）
+```
+
+### 1. 对话 → 天级记忆
+
+当对话上下文被裁剪或每日定时总结时，系统使用 LLM 将对话内容摘要为关键事件，写入当天的天级记忆文件 `memory/YYYY-MM-DD.md`。
+
+触发时机：
+- **上下文裁剪** — 轮次或 token 超限时，裁剪的内容被总结写入
+- **每日定时** — 23:55 自动触发全量总结
+- **API 溢出** — 紧急保存当前对话摘要
+
+### 2. 天级记忆 → MEMORY.md（蒸馏）
+
+每日总结完成后，Deep Dream 自动执行蒸馏：
+
+1. **读取材料** — 当前 `MEMORY.md` + 当天的天级记忆
+2. **LLM 蒸馏** — 去重、合并、修剪、提取新信息
+3. **覆写 MEMORY.md** — 输出精炼后的长期记忆
+4. **生成梦境日记** — 记录整理过程的发现和洞察
+
+### 3. MEMORY.md 的作用
+
+`MEMORY.md` 会被注入到每次对话的系统提示词中，让 Agent 始终了解用户的偏好、决策和关键事实。因此它必须保持精炼——Deep Dream 会控制在约 30 条以内。
+
+## 蒸馏规则
+
+Deep Dream 遵循以下整理规则：
+
+| 操作 | 说明 |
+| --- | --- |
+| **合并提炼** | 含义相近的多条合并为一条高密度表述 |
+| **新增萃取** | 从天级记忆中提取偏好、决策、人物、经验等 |
+| **冲突更新** | 新信息与旧条目矛盾时，以新信息为准 |
+| **清理无效** | 删除临时性记录、空白条目、格式残留 |
+| **删除冗余** | 已被更精炼表述涵盖的旧条目删除 |
+
+## 梦境日记
+
+每次蒸馏会生成一篇梦境日记，保存在 `memory/dreams/YYYY-MM-DD.md`，用叙事风格记录：
+
+- 发现了哪些重复或矛盾
+- 从天级记忆中提取了什么新洞察
+- 做了哪些清理和优化
+- 整体感受和观察
+
+梦境日记可在 Web 控制台的「记忆管理 → 梦境日记」tab 中查看。
+
+<Frame>
+  <img src="https://cdn.link-ai.tech/doc/20260414110032.png" width="800" />
+</Frame>
+
+## 手动触发
+
+除了每日自动执行外，也可以在对话中手动触发：
+
+```text
+/memory dream [N]
+```
+
+- `N`：整理近 N 天的记忆（默认 3 天，最大 30 天）
+- 蒸馏在后台异步执行，完成后在对话中通知结果
+- Web 端通知包含可点击链接，直接跳转查看 MEMORY.md 和梦境日记
+- 无需 Agent 初始化，首次对话前即可使用
+
+<Frame>
+  <img src="https://cdn.link-ai.tech/doc/20260414120158.png" width="800" />
+</Frame>
+
+<Tip>
+  首次部署后可以手动执行一次 `/memory dream 30`，将历史天级记忆全量蒸馏到 MEMORY.md。
+</Tip>
+
+## 安全机制
+
+| 机制 | 说明 |
+| --- | --- |
+| **无新内容跳过** | 没有天级记忆时不执行蒸馏，避免空覆写 |
+| **输入去重** | 定时任务中，输入材料未变化时自动跳过 |
+| **异步执行** | 蒸馏在后台线程运行，不阻塞对话 |
+| **顺序保证** | 定时任务中，天级 flush 全部完成后才启动蒸馏 |
+| **禁止编造** | 提示词明确约束只能基于已有材料整理，不得推测或添加 |
--- a/docs/memory/index.mdx
+++ b/docs/memory/index.mdx
@@ -5,6 +5,8 @@ description: CowAgent 的长期记忆系统 — 文件持久化、自动写入

 长期记忆保存在工作空间文件中，跨会话持久存在。Agent 在对话中通过检索工具按需加载历史记忆，也会在上下文裁剪时自动将对话摘要写入长期记忆。

+<img src="https://cdn.link-ai.tech/doc/memory-architecture-zh.jpeg" alt="Memory Architecture" />
+
 ## 记忆类型

 ### 核心记忆（MEMORY.md）
@@ -15,12 +17,17 @@ description: CowAgent 的长期记忆系统 — 文件持久化、自动写入

 存储在 `~/cow/memory/` 目录下，按日期命名（如 `2026-03-08.md`），记录每天的对话摘要和关键事件。仅在首次写入时创建，避免生成空文件。

+### 梦境日记（memory/dreams/YYYY-MM-DD.md）
+
+Deep Dream（记忆蒸馏）过程的副产物，记录每次整理的发现、去重合并操作和新洞察。存储在 `~/cow/memory/dreams/` 目录下，按日期命名。
+
 ## 自动写入

 Agent 通过以下机制自动将对话内容持久化为长期记忆：

-- **上下文裁剪时** — 当对话轮次或 token 超出配置上限时，裁剪最早一半的上下文，使用 LLM 将被裁剪的内容总结为关键信息写入当天记忆文件
+- **上下文裁剪时** — 当对话轮次或 token 超出配置上限时，裁剪最早一半的上下文，使用 LLM 将被裁剪的内容总结为关键信息写入当天记忆文件，并将摘要异步注入到保留的上下文中，帮助模型保持对话连贯性
 - **每日定时总结** — 每天 23:55 自动触发一次全量总结，防止低活跃日无记忆留存（内容无变化时自动跳过）
+- **[梦境蒸馏（Deep Dream）](/memory/deep-dream)** — 每日总结完成后自动执行，将天级记忆蒸馏合并到 MEMORY.md，并生成梦境日记
 - **API 上下文溢出时** — 当模型 API 返回上下文溢出错误时，紧急保存当前对话摘要

 所有记忆写入均在后台异步执行（LLM 总结 + 文件写入），不阻塞正常对话回复。
@@ -34,19 +41,25 @@ Agent 通过以下机制自动将对话内容持久化为长期记忆：

 Agent 会在对话中根据需要自动触发记忆检索，将相关历史信息纳入上下文。检索结果按混合评分排序（默认向量权重 0.7、关键词权重 0.3），日级记忆会随时间衰减（半衰期 30 天），核心记忆不衰减。

-## 首次启动
+## 相关文件

-首次启动 Agent 时，Agent 会主动向用户询问关键信息，并记录至工作空间（默认 `~/cow`）中：
+工作空间（默认 `~/cow`）中与记忆相关的文件：

 | 文件 | 说明 |
 | --- | --- |
-| `system.md` | Agent 的系统提示词和行为设定 |
-| `user.md` | 用户身份信息和偏好 |
+| `AGENT.md` | Agent 的人格和行为设定 |
+| `USER.md` | 用户身份信息和偏好 |
+| `RULE.md` | 自定义规则和约束 |
 | `MEMORY.md` | 核心记忆（长期） |
 | `memory/YYYY-MM-DD.md` | 日级记忆（按需创建） |
+| `memory/dreams/YYYY-MM-DD.md` | 梦境日记（Deep Dream 自动生成） |
+
+## Web 控制台
+
+在 Web 控制台的记忆管理页面中，可浏览记忆文件和梦境日记，支持通过 Tab 切换查看：

 <Frame>
-  <img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
+  <img src="https://cdn.link-ai.tech/doc/20260414171014.png" width="800" />
 </Frame>

 ## Related configuration
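
Note on the retrieval ranking described in this file: the documented defaults (vector weight 0.7, keyword weight 0.3, 30-day half-life for daily memories, no decay for core memory) can be written as a small formula. The sketch below is a minimal illustration; the function name is hypothetical, and applying the decay as an exponential multiplier on the combined score is an assumption, since the docs state only the weights and the half-life.

```python
def hybrid_score(vector_score, keyword_score, age_days=0.0, is_core=False,
                 vector_weight=0.7, keyword_weight=0.3, half_life_days=30.0):
    """Combine vector and keyword relevance, then decay daily memories by age."""
    base = vector_weight * vector_score + keyword_weight * keyword_score
    if is_core:
        # Core memory (MEMORY.md) does not decay.
        return base
    # Assumed form: the score halves every `half_life_days` days.
    return base * 0.5 ** (age_days / half_life_days)
```

Under these assumptions a daily memory that is 30 days old keeps half of its combined score, while an equally relevant core-memory entry keeps all of it.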
--- a/docs/models/claude.mdx
+++ b/docs/models/claude.mdx
@@ -12,6 +12,6 @@ description: Claude model configuration

 | Parameter | Description |
 | --- | --- |
-| `model` | Supports `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, and others; see [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
+| `model` | Supports `claude-sonnet-4-6`, `claude-opus-4-7`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, and others; see [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
 | `claude_api_key` | Create one in the [Claude console](https://console.anthropic.com/settings/keys) |
 | `claude_api_base` | Optional; defaults to `https://api.anthropic.com/v1`; change it to connect through a third-party proxy |
--- a/docs/models/coding-plan.mdx
+++ b/docs/models/coding-plan.mdx
@@ -99,27 +99,6 @@ description: Coding Plan mode model configuration

 ---

-## Kimi
-
-```json
-{
-  "bot_type": "openai",
-  "model": "kimi-for-coding",
-  "open_ai_api_base": "https://api.kimi.com/coding/v1",
-  "open_ai_api_key": "YOUR_API_KEY"
-}
-```
-
-| Parameter | Description |
-| --- | --- |
-| `model` | `kimi-for-coding` |
-| `open_ai_api_base` | `https://api.kimi.com/coding/v1` |
-| `open_ai_api_key` | Coding Plan-specific key (not interchangeable with the pay-as-you-go API) |
-
-Official docs: [Getting a key](https://www.kimi.com/code/docs/)
-
----
-
 ## Volcengine

 ```json
@@ -138,3 +117,24 @@ description: Coding Plan mode model configuration
 | `open_ai_api_key` | The API key is shared with the regular API |

 Official docs: [Quick start](https://www.volcengine.com/docs/82379/1928261?lang=zh)
+
+---
+
+## Kimi
+
+```json
+{
+  "bot_type": "moonshot",
+  "model": "kimi-for-coding",
+  "moonshot_base_url": "https://api.kimi.com/coding/v1",
+  "moonshot_api_key": "YOUR_API_KEY"
+}
+```
+
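+| Parameter | Description |
+| --- | --- |
+| `model` | Use `kimi-for-coding` to have the model updated automatically, or specify a model such as `kimi-k2.6` |
+| `moonshot_base_url` | `https://api.kimi.com/coding/v1` |
+| `moonshot_api_key` | Coding Plan-specific key (not interchangeable with the pay-as-you-go API) |
+
+Official docs: [Getting a key](https://www.kimi.com/code/docs/)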
--- a/docs/models/custom.mdx
+++ b/docs/models/custom.mdx
@@ -0,0 +1,62 @@
+---
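+title: Custom
+description: Custom provider configuration for third-party API proxies and local models
+---
+
+For third-party model services or locally deployed models accessed through the OpenAI-compatible protocol, for example:
+
+- **Third-party API proxy**: call multiple models through a single API Base
+- **Local models**: models deployed locally with tools such as Ollama, vLLM, or LocalAI
+- **Private deployment**: model services deployed inside an enterprise
+
+<Note>
+  Difference from the `openai` provider: with the custom provider selected, switching models via `/config model` does not switch the provider type automatically; the custom API address is always used.
+</Note>
+
+## Configuration
+
+### Third-party API proxy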
+
+```json
+{
+  "bot_type": "custom",
+  "model": "",
+  "custom_api_key": "YOUR_API_KEY",
+  "custom_api_base": "https://{your-proxy.com}/v1"
+}
+```
+
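+| Parameter | Description |
+| --- | --- |
+| `bot_type` | Must be set to `custom` |
+| `model` | Model name; any model name supported by the proxy service |
+| `custom_api_key` | API key, provided by the proxy service |
+| `custom_api_base` | API address, provided by the proxy service; must be compatible with the OpenAI protocol |
+
+### Local models
+
+Local models usually do not need an API key; filling in the API Base is enough: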
+
+```json
+{
+  "bot_type": "custom",
+  "model": "qwen3.5:27b",
+  "custom_api_base": "http://localhost:11434/v1"
+}
+```
+
+Common local deployment tools and their default addresses:
+
+| Tool | Default API Base |
+| --- | --- |
+| [Ollama](https://ollama.com) | `http://localhost:11434/v1` |
+| [vLLM](https://docs.vllm.ai) | `http://localhost:8000/v1` |
+| [LocalAI](https://localai.io) | `http://localhost:8080/v1` |
+
+## Switching models
+
+With the custom provider, switching models changes only `model`; `bot_type` and the API address stay unchanged:
+
+```
+/config model qwen3.5:27b
+```
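
Note on the switching behaviour documented in `custom.mdx`: the rule is that a model switch under the custom provider rewrites only `model`. The sketch below illustrates that rule in isolation; `switch_model` and the `infer_bot_type` callback are hypothetical names and do not reflect how CowAgent actually implements `/config model`.

```python
def switch_model(config, model, infer_bot_type=None):
    """Return a copy of config with the model switched (hypothetical helper).

    With bot_type "custom", only "model" changes. For other providers the
    optional infer_bot_type callback may pick a new bot_type for the model.
    """
    updated = dict(config)
    updated["model"] = model
    if config.get("bot_type") != "custom" and infer_bot_type is not None:
        updated["bot_type"] = infer_bot_type(model)
    return updated
```

For example, switching a custom configuration pointed at `http://localhost:11434/v1` to `qwen3.5:27b` leaves both `bot_type` and `custom_api_base` as they were.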
--- a/docs/models/deepseek.mdx
+++ b/docs/models/deepseek.mdx
@@ -7,25 +7,57 @@ description: DeepSeek model configuration

 ```json
 {
-  "model": "deepseek-chat",
+  "model": "deepseek-v4-flash",
  "deepseek_api_key": "YOUR_API_KEY"
 }
 ```

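 | Parameter | Description |
 | --- | --- |
-| `model` | `deepseek-chat` (DeepSeek-V3.2, non-thinking mode), `deepseek-reasoner` (DeepSeek-R1, thinking mode) |
+| `model` | Supports `deepseek-v4-flash` (default) and `deepseek-v4-pro` |
 | `deepseek_api_key` | Create one on the [DeepSeek platform](https://platform.deepseek.com/api_keys) |
 | `deepseek_api_base` | Optional; defaults to `https://api.deepseek.com/v1`; can be changed to a third-party proxy address |

+## Model selection
+
+| Model | Use case |
+| --- | --- |
+| `deepseek-v4-flash` | Recommended default; fast and low-cost |
+| `deepseek-v4-pro` | Smarter; stronger results on complex tasks |
+
+## Thinking mode
+
+The V4 series (`deepseek-v4-flash` / `deepseek-v4-pro`) supports an explicit "thinking mode": before producing the final answer, the model first outputs a chain of thought (`reasoning_content`), which improves answer quality.
+
+### Toggle
+
+Controlled by the global `enable_thinking` setting: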
+
+```json
+{
+  "enable_thinking": true
+}
+```
+
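+- `true`: the model thinks before answering on every channel. The Web console displays the thinking process; IM channels (WeChat / WeCom / DingTalk / Feishu) do not display it but still get better answers.
+- `false`: thinking is disabled; responses are faster and time to first token is lower.
+
+### Behavior notes
+
+- **Sampling parameters**: in thinking mode, `temperature`, `top_p`, `presence_penalty`, and `frequency_penalty` are ignored by the server (no error is raised), and CowAgent skips passing them automatically.
+- **Multi-turn tool calls**: when the history contains tool calls, DeepSeek requires every assistant message to send back `reasoning_content`. CowAgent handles the back-fill logic automatically, and toggling thinking between turns does not cause errors.
+
+<Tip>
+  `deepseek-v4-flash` is the default; use `deepseek-v4-pro` for complex tasks; enable `enable_thinking` when deep thinking is needed.
+</Tip>
+
 Option 2: access via the OpenAI-compatible interface: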

 ```json
 {
-  "model": "deepseek-chat",
+  "model": "deepseek-v4-flash",
  "bot_type": "openai",
  "open_ai_api_key": "YOUR_API_KEY",
  "open_ai_api_base": "https://api.deepseek.com/v1"
 }
 ```
-
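
Note on the two behaviours documented in `deepseek.mdx` (sampling parameters are not sent in thinking mode, and `reasoning_content` is back-filled on assistant turns once the history contains tool calls): the sketch below shows one possible shape for that logic. It is not CowAgent's code. `build_payload` and `SAMPLING_KEYS` are hypothetical names, back-filling with an empty string is an assumption, and how the thinking switch itself is sent to the API is deliberately left out.

```python
SAMPLING_KEYS = ("temperature", "top_p", "presence_penalty", "frequency_penalty")


def build_payload(model, messages, enable_thinking, sampling=None):
    """Build a chat request body following the documented rules (hypothetical helper)."""
    has_tool_calls = any(
        m.get("tool_calls") for m in messages if m.get("role") == "assistant"
    )
    payload = {"model": model, "messages": []}
    for message in messages:
        message = dict(message)  # copy so the caller's history is not mutated
        if has_tool_calls and message.get("role") == "assistant":
            # Assumption: an empty string stands in when no reasoning was stored.
            message.setdefault("reasoning_content", "")
        payload["messages"].append(message)
    if not enable_thinking:
        # Sampling parameters are forwarded only when thinking is off.
        for key in SAMPLING_KEYS:
            if sampling and key in sampling:
                payload[key] = sampling[key]
    return payload
```

Copying each message keeps the stored history untouched, so the same history can be replayed after the thinking switch changes between turns.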
--- a/docs/models/glm.mdx
+++ b/docs/models/glm.mdx
@@ -5,14 +5,14 @@ description: Zhipu AI GLM model configuration

 ```json
 {
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "zhipu_ai_api_key": "YOUR_API_KEY"
 }
 ```

 | Parameter | Description |
 | --- | --- |
-| `model` | Options include `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, and others; see [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
+| `model` | Options include `glm-5.1`, `glm-5-turbo`, `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, and others; see [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
 | `zhipu_ai_api_key` | Create one in the [Zhipu AI console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) |

 Access via the OpenAI-compatible interface is also supported:
@@ -20,7 +20,7 @@ description: Zhipu AI GLM model configuration
 ```json
 {
  "bot_type": "openai",
-  "model": "glm-5-turbo",
+  "model": "glm-5.1",
  "open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
  "open_ai_api_key": "YOUR_API_KEY"
 }
--- a/docs/models/index.mdx
+++ b/docs/models/index.mdx
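@@ -6,7 +6,7 @@ description: Models supported by CowAgent and recommended choices
 CowAgent supports large language models from mainstream vendors in China and abroad; the model interface implementations live in the project's `models/` directory.

 <Note>
-  The following models are recommended in Agent mode; choose based on quality and cost: MiniMax-M2.7, glm-5-turbo, kimi-k2.5, qwen3.6-plus, claude-sonnet-4-6, gemini-3.1-pro-preview
+  The following models are recommended in Agent mode; choose based on quality and cost: deepseek-v4-flash, MiniMax-M2.7, claude-sonnet-4-6, gemini-3.1-pro-preview, glm-5.1, qwen3.6-plus, kimi-k2.6

  The [LinkAI](https://link-ai.tech) platform API is also supported, which lets you switch flexibly among many models and supports Agent capabilities such as knowledge bases, workflows, and plugins.
 </Note>
@@ -23,21 +23,12 @@ CowAgent supports large language models from mainstream vendors in China and abroad; the model interface implementations live in
 ## Supported models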

 <CardGroup cols={2}>
+  <Card title="DeepSeek" href="/models/deepseek">
+    deepseek-v4-flash, deepseek-v4-pro, and others
+  </Card>
  <Card title="MiniMax" href="/models/minimax">
    MiniMax-M2.7 and other models in the series
  </Card>
-  <Card title="Zhipu GLM" href="/models/glm">
-    glm-5-turbo, glm-5, and other models in the series
-  </Card>
-  <Card title="Tongyi Qianwen (Qwen)" href="/models/qwen">
-    qwen3.6-plus, qwen3-max, and others
-  </Card>
-  <Card title="Kimi" href="/models/kimi">
-    kimi-k2.5, kimi-k2, and others
-  </Card>
-  <Card title="Doubao" href="/models/doubao">
-    doubao-seed series models
-  </Card>
  <Card title="Claude" href="/models/claude">
    claude-sonnet-4-6 and others
  </Card>
@@ -47,12 +38,24 @@ CowAgent supports large language models from mainstream vendors in China and abroad; the model interface implementations live in
  <Card title="OpenAI" href="/models/openai">
    gpt-5.4, gpt-4.1, the o series, and others
  </Card>
-  <Card title="DeepSeek" href="/models/deepseek">
-    deepseek-chat, deepseek-reasoner
+  <Card title="Zhipu GLM" href="/models/glm">
+    glm-5.1, glm-5-turbo, glm-5, and other models in the series
+  </Card>
+  <Card title="Tongyi Qianwen (Qwen)" href="/models/qwen">
+    qwen3.6-plus, qwen3-max, and others
+  </Card>
+  <Card title="Doubao" href="/models/doubao">
+    doubao-seed series models
+  </Card>
+  <Card title="Kimi" href="/models/kimi">
+    kimi-k2.6, kimi-k2.5, kimi-k2, and others
  </Card>
  <Card title="LinkAI" href="/models/linkai">
    Unified multi-model interface + knowledge base
  </Card>
+  <Card title="Custom" href="/models/custom">
+    Third-party proxies, local models, and more
+  </Card>
 </CardGroup>


--- a/docs/models/kimi.mdx
+++ b/docs/models/kimi.mdx
@@ -5,14 +5,14 @@ description: Kimi (Moonshot) model configuration

 ```json
 {
-  "model": "kimi-k2.5",
+  "model": "kimi-k2.6",
  "moonshot_api_key": "YOUR_API_KEY"
 }
 ```

 | Parameter | Description |
 | --- | --- |
-| `model` | Options: `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
+| `model` | Options: `kimi-k2.6`, `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
 | `moonshot_api_key` | Create one in the [Moonshot console](https://platform.moonshot.cn/console/api-keys) |

 Access via the OpenAI-compatible interface is also supported:
@@ -20,7 +20,7 @@ description: Kimi (Moonshot) model configuration
 ```json
 {
  "bot_type": "openai",
-  "model": "kimi-k2.5",
+  "model": "kimi-k2.6",
  "open_ai_api_base": "https://api.moonshot.cn/v1",
  "open_ai_api_key": "YOUR_API_KEY"
 }
Author	SHA1	Message	Date
zhayujie	5c65196e44	feat(web): hint API base version path in config placeholder	2026-04-26 17:10:24 +08:00
zhayujie	f5798bfe90	fix: remove unnecessary API Base URL in run scripts	2026-04-26 16:29:08 +08:00
zhayujie	0e556b3468	feat: switch default model to deepseek-v4-flash	2026-04-26 15:54:50 +08:00
zhayujie	31820f56e7	fix(deepseek): back-fill reasoning_content for all assistant turns	2026-04-24 16:39:48 +08:00
zhayujie	fd88828abd	fix(models): unify enable_thinking for deepseek-v4	2026-04-24 15:29:43 +08:00
zhayujie	ae11159918	feat(models): unify enable_thinking for deepseek-v4 and other thinking models	2026-04-24 15:22:45 +08:00
zhayujie	472a8605c0	feat(models): support deepseek-v4-pro and deepseek-v4-flash	2026-04-24 11:35:38 +08:00
zhayujie	e1760ba211	feat: release 2.0.7 version	2026-04-23 18:13:53 +08:00
zhayujie	ce4c0a0aa4	feat: release 2.0.7	2026-04-23 17:18:19 +08:00
zhayujie	64511593c4	feat: release 2.0.7	2026-04-23 17:16:17 +08:00
zhayujie	b0e00dfceb	feat: support glm-5.1	2026-04-23 16:43:05 +08:00
zhayujie	fc465b463d	feat: support kimi coding plan by temporary solution	2026-04-23 16:24:37 +08:00
zhayujie	68ce2e5232	feat(skill): multi-provider image generation with auto-fallback - Add Gemini, Seedream (Volcengine Ark), Qwen (DashScope), MiniMax providers to image-generation skill with universal sequential fallback: OpenAI → Gemini → Seedream → Qwen → MiniMax → LinkAI - Each provider filters unsupported size tiers to valid values (e.g. Seedream 1K→2K, Qwen 3K→2K, Gemini 3K→2K) - Pinned model only tries its native provider; auto-routing uses each provider's default model - Support skill-namespaced config (config.skill.image-generation.model → SKILL_IMAGE_GENERATION_MODEL env var) - Add image lightbox (click-to-enlarge) in web console - Add docs for built-in skills (skill-creator, knowledge-wiki, image-generation) under docs/skills/	2026-04-23 12:39:39 +08:00
zhayujie	81e8bb62ae	feat(skill): support gpt-image-2 in image generation skill	2026-04-22 20:39:49 +08:00
zhayujie	2c13e1b923	feat(models): support kimi-k2.6	2026-04-22 12:01:40 +08:00
zhayujie	a0748c2e3b	fix(web): cap reasoning content to 4KB across stream/storage/display	2026-04-21 20:31:38 +08:00
zhayujie	40599bb751	fix(web): smart auto-scroll for chat #2775	2026-04-20 21:43:21 +08:00
zhayujie	f3c64ceea7	fix: refresh skill manager on /skill	2026-04-19 19:50:16 +08:00
zhayujie	15c60de709	fix: improve skill installation to support multiple source formats and ensure target directory	2026-04-19 19:05:51 +08:00
zhayujie	6dd316547f	fix(web): fix session title generation fallback and reset Bridge on config change	2026-04-19 18:43:48 +08:00
zhayujie	54c7676a44	docs: update architecture diagram	2026-04-18 23:08:36 +08:00
zhayujie	d25b8966ce	fix(web): prevent duplicate image previews	2026-04-18 22:32:34 +08:00
zhayujie	14a119c48c	fix(gemini): solving the problem of tool call not returnings	2026-04-18 21:18:27 +08:00
zhayujie	c82515a927	fix(agent): don't drop tool_calls from empty-response retry	2026-04-18 20:50:40 +08:00
zhayujie	26e630c2dd	feat(cli): /config support set enable_thinking	2026-04-17 16:09:43 +08:00
zhayujie	13370d2056	fix: thinking display is disabled by default	2026-04-17 15:31:59 +08:00
zhayujie	35282db9e0	feat(models): support claude-opus-4-7	2026-04-16 23:24:16 +08:00
zhayujie	426fb88ce7	fix(knowledge): exclude root-level files from knowledge stats to preserve empty state	2026-04-16 22:55:46 +08:00
zhayujie	2384bd0e10	fix: update CI workflows for repo rename and add latest tag	2026-04-16 21:57:20 +08:00
zhayujie	ba3f66d3d1	feat: show root-level files (index.md, log.md) in knowledge tree	2026-04-16 21:47:44 +08:00
zhayujie	7293a0f670	fix: modify repo name in github workflow	2026-04-16 21:38:58 +08:00
zhayujie	9e86d46267	fix: sync env vars when updating config in docker env	2026-04-16 21:32:07 +08:00
zhayujie	848430f062	feat(knowledge): support nested directories in knowledge base listing and display	2026-04-16 12:28:18 +08:00
zhayujie	abd21335c4	Merge pull request #2772 from 6vision/master fix: bot_type change notification never shown after model switch	2026-04-16 10:43:41 +08:00
6vision	8fa95f058a	fix: bot_type change notification never shown after model switch Made-with: Cursor	2026-04-15 21:48:50 +08:00
zhayujie	d4e5ecd497	fix: compatible with Python 3.7 by deferring Literal import in truncate.py	2026-04-15 12:29:09 +08:00
zhayujie	3830f76729	feat: add custom model provider	2026-04-15 12:26:05 +08:00
zhayujie	83f778fec9	feat(dream): structured organization of dream memories	2026-04-15 11:27:46 +08:00
zhayujie	cabd24605f	fix: add random jitter to daily dream schedule	2026-04-15 00:33:33 +08:00
zhayujie	ae20ba1148	Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat	2026-04-14 22:58:59 +08:00
zhayujie	3a50b64977	feat: web multi session interface	2026-04-14 22:58:25 +08:00
zhayujie	8692e74536	fix(web): hide session panel by default on mobile and support overlay dismiss	2026-04-14 21:09:01 +08:00
zhayujie	1c18bd9889	docs(memory): update long-term memory docs	2026-04-14 17:14:28 +08:00
zhayujie	60e9d98d0a	feat: release 2.0.6	2026-04-14 12:37:53 +08:00
zhayujie	83f6625e0c	feat: release 2.0.6	2026-04-14 12:08:57 +08:00
zhayujie	acc09543b7	feat(dream): add memory dream cli and docs - New memory/deep-dream.mdx (zh/en/ja): memory flow, distillation rules, dream diary, manual trigger, safety mechanisms - Simplify long-term memory page, link to deep-dream for details - New cli/memory-knowledge.mdx (zh/en/ja): memory and knowledge commands - Move knowledge commands from general.mdx to memory-knowledge.mdx - Register new pages in docs.json navigation for all languages - Add /memory dream to cli/index.mdx command tables	2026-04-14 11:03:53 +08:00
zhayujie	94d8c7e366	feat(dream): add Dream Diary tab to memory management page - Backend: MemoryService supports category param (memory/dream), lists memory/dreams/*.md - Backend: MemoryContentHandler resolves dream files from memory/dreams/ directory - Frontend: add tab switcher (Memory Files / Dream Diary) matching knowledge tab style - Frontend: dream entries show purple "Dream" badge, empty state with moon icon - Cloud dispatch passes category param for consistency	2026-04-13 22:08:15 +08:00
zhayujie	ea1a0c8b3d	feat(memory): add Deep Dream module for daily memory distillation - Add Deep Dream: nightly distill daily memories → refined MEMORY.md + dream diary - Simplify flush prompt to daily-only, defer MEMORY.md maintenance to Deep Dream - Remove dead code (_append_to_main_memory) and fix fallback summary logic - Add shrinkage protection and input dedup for dream process - Ensure flush threads complete before dream starts - Update docs (zh/en/ja) with dream diary and distillation mechanism	2026-04-13 21:32:52 +08:00
zhayujie	7bc88c17e4	Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat	2026-04-13 20:13:30 +08:00
zhayujie	33cf1bc4c3	feat(memory): async LLM context summary injection on trim - Unified flush + context injection into a single async LLM call (flush_from_messages accepts context_summary_callback) - Fixed response parsing bug: handle generator returns and Claude-format dicts from bot.call_with_tools, which previously caused all LLM summaries to silently fail (falling back to rule-based extraction) - Removed standalone context summary prompts and methods; reuse the existing [DAILY]/[MEMORY] summarization pipeline - Updated docs (zh/en/ja) to reflect the new injection behavior	2026-04-13 20:13:05 +08:00
zhayujie	9402e63fe1	Merge pull request #2766 from zhayujie/feat-mulit-session feat(web): add multi-session management for web console	2026-04-13 18:51:07 +08:00
zhayujie	da97e948ca	feat: refine memory recall/write prompts for better precision and proactivity	2026-04-13 18:02:06 +08:00
zhayujie	89a07e8e74	feat: add enable_thinking config to control deep reasoning on web console	2026-04-13 16:06:28 +08:00
@@ -1 +1 @@
 .0.5
 .0.7