Compare commits

..

24 Commits

Author SHA1 Message Date
saboteur7
ce63de3c58 feat: release 2.0.0 2026-02-03 14:48:30 +08:00
saboteur7
4b3b1219b5 Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2026-02-03 12:20:04 +08:00
saboteur7
73b069a76c docs: update 2.0 README.md 2026-02-03 12:19:36 +08:00
Saboteur7
101cf8d108 Merge pull request #2653 from 6vision/deploy-script
feat: enhance one-click deployment script with full lifecycle management
2026-02-03 03:18:49 +08:00
saboteur7
2e926dfb6e fix: python 3.8 compatibility issues 2026-02-03 03:17:11 +08:00
saboteur7
501866d12a feat: optimize document and model usage 2026-02-03 02:58:15 +08:00
6vision
39bcb0869f feat: enhance one-click deployment script with full lifecycle management 2026-02-03 02:56:46 +08:00
saboteur7
a7b99cde4e Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2026-02-03 01:18:17 +08:00
saboteur7
60abcd92a3 feat: update README.md and solving Python compatibility issues 2026-02-03 01:17:25 +08:00
zhayujie
cdd36e7052 docs: update README.md 2026-02-03 00:48:03 +08:00
saboteur7
c6ac175ce4 docs: update README.md 2026-02-03 00:43:42 +08:00
zhayujie
46bcd87c23 feat: support minimax M2 models 2026-02-02 23:36:23 +08:00
zhayujie
ab74be8e33 feat: add qwen models tool call 2026-02-02 23:08:24 +08:00
zhayujie
d8298b3eab fix: support glm-4.7 2026-02-02 22:43:08 +08:00
zhayujie
50e60e6d05 fix: bug fixes 2026-02-02 22:22:10 +08:00
zhayujie
5d02acbf37 config: add config template 2026-02-02 14:25:34 +08:00
zhayujie
8901d91f96 feat: startup log optimization 2026-02-02 12:25:47 +08:00
zhayujie
b55021bb3d feat: system Initialization log 2026-02-02 12:18:57 +08:00
zhayujie
0ef51b85e6 Merge branch 'feat-cow-agent' 2026-02-02 12:03:55 +08:00
zhayujie
c77566cc02 fix: adjust the maximum step size 2026-02-02 12:03:16 +08:00
zhayujie
c1bcedfb51 Merge pull request #2652 from zhayujie/feat-cow-agent
feat: cow super agent
2026-02-02 11:59:45 +08:00
zhayujie
08b592816b Merge pull request #2651 from zhayujie/feat-cow-agent
fix: optimize suggestion words and retries
2026-02-01 14:11:53 +08:00
zhayujie
8ef788e799 Merge pull request #2650 from zhayujie/feat-cow-agent
feat: cow agent
2026-02-01 13:14:00 +08:00
Saboteur7
3ce57ef851 Merge pull request #2648 from zhayujie/feat-cow-agent
feat: cow agent core
2026-01-31 13:14:05 +08:00
44 changed files with 3949 additions and 1302 deletions

583
README.md
View File

@@ -1,4 +1,4 @@
<p align="center"><img src= "https://github.com/user-attachments/assets/31fb4eab-3be4-477d-aa76-82cf62bfd12c" alt="Chatgpt-on-Wechat" width="600" /></p>
<p align="center"><img src= "https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="Chatgpt-on-Wechat" width="550" /></p>
<p align="center">
<a href="https://github.com/zhayujie/chatgpt-on-wechat/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/chatgpt-on-wechat" alt="Latest release"></a>
@@ -6,29 +6,32 @@
<a href="https://github.com/zhayujie/chatgpt-on-wechat"><img src="https://img.shields.io/github/stars/zhayujie/chatgpt-on-wechat?style=flat-square" alt="Stars"></a> <br/>
</p>
**chatgpt-on-wechat**简称CoW项目是基于大模型的智能对话机器人支持自由切换多种模型可接入网页、微信公众号、企业微信应用、飞书、钉钉中使用能处理文本、语音、图片、文件等多模态消息支持通过插件访问操作系统和互联网等外部资源以及基于自有知识库定制企业AI应用
**CowAgent** 是基于大模型的超级AI助理能够主动思考和任务规划、操作计算机和外部资源、创造和执行Skills、拥有长期记忆并不断成长。CowAgent 支持灵活切换多种模型能处理文本、语音、图片、文件等多模态消息可接入网页、飞书、钉钉、企业微信应用、微信公众号中使用7*24小时运行于你的个人电脑或服务器中
📖能力介绍:[CowAgent 2.0](/docs/agent.md)
# 简介
> 该项目既是一个可以开箱即用的对话机器人也是一个支持高度扩展的AI应用框架,可以通过为项目添加大模型接口、接入渠道、自定义插件来灵活实现各种定制需求。支持的功能如下:
> 该项目既是一个可以开箱即用的超级AI助理,也是一个支持高FTS5 not available, using LIKE-based keyword searc度扩展的Agent框架,可以通过为项目扩展大模型接口、接入渠道、内置工具、Skills系统来灵活实现各种定制需求。核心能力如下:
- **多端部署:** 有多种部署方式可选择且功能完备,目前已支持网页、微信公众号、企业微信应用、飞书、钉钉等部署方式
- **基础对话** 私聊及群聊的AI智能回复支持多轮会话上下文记忆基础模型支持OpenAI, Claude, Gemini, DeepSeek, 通义千问, Kimi, 文心一言, 讯飞星火, ChatGLM, MiniMax, GiteeAI, ModelScope, LinkAI
- **语音能力** 可识别语音消息,通过文字或语音回复,支持 openai(whisper/tts), azure, baidu, google 等多种语音模型
- **图像能力** 支持图片生成、图片识别、图生图,可选择 Dall-E-3, stable diffusion, replicate, midjourney, CogView-3, vision模型
- **丰富插件** 支持自定义插件扩展,已实现多角色切换、敏感词过滤、聊天记录总结、文档总结和对话、联网搜索、智能体等内置插件
- **Agent能力** 支持访问浏览器、终端、文件系统、搜索引擎等各类工具,并可通过多智能体协作完成复杂任务,基于 [AgentMesh](https://github.com/MinimalFuture/AgentMesh) 框架实现
- **知识库:** 通过上传知识库自定义专属机器人,可作为数字分身、智能客服、企业智能体使用,基于 [LinkAI](https://link-ai.tech) 实现
-**复杂任务规划**:能够理解复杂任务并自主规划执行,持续思考和调用工具直到完成目标,支持通过工具操作访问文件、终端、浏览器、定时任务等系统资源
-**长期记忆** 自动将对话记忆持久化至本地文件和数据库中,包括全局记忆和天级记忆,支持关键词及向量检索
-**技能系统** 实现了Skills创建和运行的引擎内置多种技能并支持通过自然语言对话完成自定义Skills开发
-**多模态消息** 支持对文本、图片、语音、文件等多类型消息进行解析、处理、生成、发送等操作
-**多模型接入** 支持OpenAI, Claude, Gemini, DeepSeek, MiniMax、GLM、通义千问, Kimi等国内外主流模型厂商
-**多端部署** 支持运行在本地计算机或服务器,可集成到网页、飞书、钉钉、微信公众号、企业微信应用中使用
-**知识库:** 集成企业知识库能力让Agent成为专属数字员工,基于[LinkAI](https://link-ai.tech)平台实现
## 声明
1. 本项目遵循 [MIT开源协议](/LICENSE)用于技术研究和学习,使用本项目时需遵守所在地法律法规、相关政策以及企业章程,禁止用于任何违法或侵犯他人权益的行为。任何个人、团队和企业,无论以何种方式使用该项目、对何对象提供服务,所产生的一切后果,本项目均不承担任何责任
2. 境内使用该项目时,建议使用国内厂商的大模型服务,并进行必要的内容安全审核及过滤
3. 本项目当前主要接入协同办公平台,推荐使用网页、公众号、企微自建应用、钉钉、飞书等接入通道,其他通道为历史产物暂不维护
1. 本项目遵循 [MIT开源协议](/LICENSE)主要用于技术研究和学习,使用本项目时需遵守所在地法律法规、相关政策以及企业章程,禁止用于任何违法或侵犯他人权益的行为。任何个人、团队和企业,无论以何种方式使用该项目、对何对象提供服务,所产生的一切后果,本项目均不承担任何责任
2. 成本与安全Agent模式下Token使用量高于普通对话模式请根据效果及成本综合选择模型。Agent具有访问所在操作系统的能力请谨慎选择项目部署环境。同时项目也会持续升级安全机制、并降低模型消耗成本
## 演示
DEMO视频https://cdn.link-ai.tech/doc/cow_demo.mp4
使用说明(Agent模式)[CowAgent介绍](/docs/agent.md)
DEMO视频(对话模式)https://cdn.link-ai.tech/doc/cow_demo.mp4
## 社区
@@ -54,7 +57,9 @@ DEMO视频https://cdn.link-ai.tech/doc/cow_demo.mp4
# 🏷 更新日志
>**2025.05.23** [1.7.6版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.6) 优化web网页channel、新增 [AgentMesh多智能体插件](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/plugins/agent/README.md)、百度语音合成优化、企微应用`access_token`获取优化、支持`claude-4-sonnet``claude-4-opus`模型
>**2026.02.03** [2.0.0版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.0)正式升级为超级Agent助理支持多轮任务决策、具备长期记忆、实现多种系统工具、支持Skills框架新增多种模型并优化了接入渠道。
>**2025.05.23** [1.7.6版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.6) 优化web网页channel、新增 [AgentMesh](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/plugins/agent/README.md)多智能体插件、百度语音合成优化、企微应用`access_token`获取优化、支持`claude-4-sonnet``claude-4-opus`模型
>**2025.04.11** [1.7.5版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.5) 新增支持 [wechatferry](https://github.com/zhayujie/chatgpt-on-wechat/pull/2562) 协议、新增 deepseek 模型、新增支持腾讯云语音能力、新增支持 ModelScope 和 Gitee-AI API接口
@@ -62,37 +67,38 @@ DEMO视频https://cdn.link-ai.tech/doc/cow_demo.mp4
>**2024.10.31** [1.7.3版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.3) 程序稳定性提升、数据库功能、Claude模型优化、linkai插件优化、离线通知
更多更新历史请查看: [更新日志](/docs/version/release-notes.md)
更多更新历史请查看: [更新日志](/docs/release/history.md)
<br/>
# 🚀 快速开始
项目提供了一键安装、启动、管理程序的脚本,可以选择使用脚本快速运行,也可以根据详细指引一步步安装运行。
项目提供了一键安装、配置、启动、管理程序的脚本,推荐使用脚本快速运行,也可以根据下文中的详细指引一步步安装运行。
- 详细文档:[快速开始](https://docs.link-ai.tech/cow/quick-start)
- 一键安装脚本说明:[一键安装脚本](https://github.com/zhayujie/chatgpt-on-wechat/wiki/%E4%B8%80%E9%94%AE%E5%AE%89%E8%A3%85%E5%90%AF%E5%8A%A8%E8%84%9A%E6%9C%AC)
在终端执行以下命令:
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/install.sh)
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
- 项目管理脚本说明:[项目管理脚本](https://github.com/zhayujie/chatgpt-on-wechat/wiki/%E9%A1%B9%E7%9B%AE%E7%AE%A1%E7%90%86%E8%84%9A%E6%9C%AC)
脚本使用说明:[一键运行脚本](https://github.com/zhayujie/chatgpt-on-wechat/wiki/CowAgentQuickStart)
## 一、准备
### 1. 模型账号
### 1. 模型API
项目默认使用ChatGPT模型需前往 [OpenAI平台](https://platform.openai.com/api-keys) 创建API Key并填入项目配置文件中。同时支持其他国内外产商以及第三方自定义模型接口详情参考:[模型说明](#模型说明)。
项目支持国内外主流厂商的模型接口,可选模型及配置说明参考:[模型说明](#模型说明)。
同时支持使用 **LinkAI平台** 接口,可聚合使用 OpenAI、Claude、DeepSeek、Kimi、Qwen 等多种常用模型并支持知识库、工作流、联网搜索、MJ绘图、文档总结等能力。修改配置即可一键启用参考 [接入文档](https://link-ai.tech/platform/link-app/wechat)。
> Agent模式下推荐使用以下模型可根据效果及成本综合选择 Claude(claude-sonnet-4-5、claude-sonnet-4-0)、Gemini(gemini-3-flash-preview、gemini-3-pro-preview)、GLM(glm-4.7)、MiniMAx(MiniMax-M2.1)、Qwen(qwen3-max)
同时支持使用 **LinkAI平台** 接口,可灵活切换 OpenAI、Claude、Gemini、DeepSeek、Qwen、Kimi 等多种常用模型并支持知识库、工作流、插件等Agent能力参考 [接口文档](https://docs.link-ai.tech/platform/api)。
### 2.环境安装
支持 Linux、MacOS、Windows 系统,同时需安装 `Python`Python版本需在3.7以上推荐使用3.9版本。
支持 Linux、MacOS、Windows 操作系统,可在个人计算机及服务器上运行,需安装 `Python`Python版本需在3.7 ~ 3.12 之间推荐使用3.9版本。
> 注意选择Docker部署则无需安装python环境和下载源码可直接快进到下一节。
> 注意:Agent模式推荐使用源码运行选择Docker部署则无需安装python环境和下载源码可直接快进到下一节。
**(1) 克隆项目代码:**
@@ -129,51 +135,35 @@ pip3 install -r requirements-optional.txt
```bash
# config.json 文件内容示例
{
"channel_type": "web", # 接入渠道类型默认为web支持修改为:terminal, wechatmp, wechatmp_service, wechatcom_app, dingtalk, feishu
"model": "gpt-4o-mini", # 模型名称, 支持 gpt-4o-mini, gpt-4.1, gpt-4o, deepseek-reasoner, wenxin, xunfei, glm-4, claude-3-7-sonnet-latest, moonshot等
"open_ai_api_key": "YOUR API KEY", # 如果使用openAI模型则填入上面创建的 OpenAI API KEY
"open_ai_api_base": "https://api.openai.com/v1", # OpenAI接口代理地址,修改此项可接入三方模型接口
"proxy": "", # 代理客户端的ip和端口国内环境开启代理的需要填写该项如 "127.0.0.1:7890"
"single_chat_prefix": ["bot", "@bot"], # 私聊时文本需要包含该前缀才能触发机器人回复
"single_chat_reply_prefix": "[bot] ", # 私聊时自动回复的前缀,用于区分真人
"group_chat_prefix": ["@bot"], # 群聊时包含该前缀则会触发机器人回复
"group_name_white_list": ["ChatGPT测试群", "ChatGPT测试群2"], # 开启自动回复的群名称列表
"group_chat_in_one_session": ["ChatGPT测试群"], # 支持会话上下文共享的群名称
"image_create_prefix": ["", "看", "找"], # 开启图片回复的前缀
"conversation_max_tokens": 1000, # 支持上下文记忆的最多字符数
"channel_type": "web", # 接入渠道类型默认为web支持修改为:feishu,dingtalk,wechatcom_app,terminal,wechatmp,wechatmp_service
"model": "claude-sonnet-4-5", # 模型名称
"claude_api_key": "", # Claude API Key
"claude_api_base": "https://api.anthropic.com/v1", # Claude API 地址,修改可接入三方代理平台
"open_ai_api_key": "", # OpenAI API Key
"open_ai_api_base": "https://api.openai.com/v1", # OpenAI API 地址
"gemini_api_key": "", # Gemini API Key
"gemini_api_base": "https://generativelanguage.googleapis.com", # Gemini API地址
"zhipu_ai_api_key": "", # 智谱GLM API Key
"minimax_api_key": "", # MiniMax API Key
"dashscope_api_key": "", # 百炼(通义千问)API Key
"linkai_api_key": "", # LinkAI API Key
"proxy": "", # 代理客户端的ip和端口国内环境需要开启代理的可填写该项如 "127.0.0.1:7890"
"speech_recognition": false, # 是否开启语音识别
"group_speech_recognition": false, # 是否开启群组语音识别
"voice_reply_voice": false, # 是否使用语音回复语音
"character_desc": "你是基于大语言模型的AI智能助手旨在回答并解决人们的任何问题并且可以使用多种语言与人交流。", # 系统提示词
# 订阅欢迎语公众号和企业微信channel中使用当被订阅时会自动回复以下内容
"subscribe_msg": "感谢您的关注!\n这里是AI智能助手可以自由对话。\n支持语音对话。\n支持图片输入。\n支持图片输出画字开头的消息将按要求创作图片。\n支持tool、角色扮演和文字冒险等丰富的插件。\n输入{trigger_prefix}#help 查看详细指令。",
"use_linkai": false, # 是否使用LinkAI接口默认关闭设置为true后可对接LinkAI平台的智能体
"linkai_api_key": "", # LinkAI Api Key
"linkai_app_code": "" # LinkAI 应用或工作流的code
"use_linkai": false, # 是否使用LinkAI接口默认关闭设置为true后可对接LinkAI平台接口
"agent": true, # 是否启用Agent模式启用后拥有多轮工具决策、长期记忆、Skills能力等
"agent_workspace": "~/cow", # Agent的工作空间路径用于存储memory、skills、系统设定等
"agent_max_context_tokens": 40000, # Agent模式下最大上下文tokens超出将自动丢弃最早的上下文
"agent_max_context_turns": 30, # Agent模式下最大上下文记忆轮次每轮包括一次用户提问和AI回复
"agent_max_steps": 15 # Agent模式下单次任务的最大决策步数超出后将停止继续调用工具
}
```
**详细配置说明:**
**配置补充说明:**
<details>
<summary>1. 单聊配置</summary>
+ 个人聊天中,需要以 "bot"或"@bot" 为开头的内容触发机器人,对应配置项 `single_chat_prefix` (如果不需要以前缀触发可以填写 `"single_chat_prefix": [""]`)
+ 机器人回复的内容会以 "[bot] " 作为前缀, 以区分真人,对应的配置项为 `single_chat_reply_prefix` (如果不需要前缀可以填写 `"single_chat_reply_prefix": ""`)
</details>
<details>
<summary>2. 群聊配置</summary>
+ 群组聊天中,群名称需配置在 `group_name_white_list ` 中才能开启群聊自动回复。如果想对所有群聊生效,可以直接填写 `"group_name_white_list": ["ALL_GROUP"]`
+ 默认只要被人 @ 就会触发机器人自动回复;另外群聊天中只要检测到以 "@bot" 开头的内容,同样会自动回复(方便自己触发),这对应配置项 `group_chat_prefix`
+ 可选配置: `group_name_keyword_white_list`配置项支持模糊匹配群名称,`group_chat_keyword`配置项则支持模糊匹配群消息内容用法与上述两个配置项相同。Contributed by [evolay](https://github.com/evolay))
+ `group_chat_in_one_session`:使群聊共享一个会话上下文,配置 `["ALL_GROUP"]` 则作用于所有群聊
</details>
<details>
<summary>3. 语音配置</summary>
<summary>1. 语音配置</summary>
+ 添加 `"speech_recognition": true` 将开启语音识别默认使用openai的whisper模型识别为文字同时以文字回复该参数仅支持私聊 (注意由于语音消息无法匹配前缀,一旦开启将对所有语音自动回复,支持语音触发画图)
+ 添加 `"group_speech_recognition": true` 将开启群组语音识别默认使用openai的whisper模型识别为文字同时以文字回复参数仅支持群聊 (会匹配group_chat_prefix和group_chat_keyword, 支持语音触发画图)
@@ -181,30 +171,22 @@ pip3 install -r requirements-optional.txt
</details>
<details>
<summary>4. 其他配置</summary>
<summary>2. 其他配置</summary>
+ `model`: 模型名称,目前支持 `gpt-4o-mini`, `gpt-4.1`, `gpt-4o`, `gpt-3.5-turbo`, `wenxin` , `claude` , `gemini`, `glm-4`, `xunfei`, `moonshot`,全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `temperature`,`frequency_penalty`,`presence_penalty`: Chat API接口参数详情参考[OpenAI官方文档。](https://platform.openai.com/docs/api-reference/chat)
+ `proxy`:由于目前 `openai` 接口国内无法访问,需配置代理客户端的地址,详情参考 [#351](https://github.com/zhayujie/chatgpt-on-wechat/issues/351)
+ 对于图像生成,在满足个人或群组触发条件外,还需要额外的关键词前缀来触发,对应配置 `image_create_prefix `
+ 关于OpenAI对话及图片接口的参数配置内容自由度、回复字数限制、图片大小等可以参考 [对话接口](https://beta.openai.com/docs/api-reference/completions) 和 [图像接口](https://beta.openai.com/docs/api-reference/completions) 文档,在[`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py)中检查哪些参数在本项目中是可配置的。
+ `conversation_max_tokens`:表示能够记忆的上下文最大字数(一问一答为一组对话,如果累积的对话字数超出限制,就会优先移除最早的一组对话)
+ `rate_limit_chatgpt``rate_limit_dalle`:每分钟最高问答速率、画图速率,超速后排队按序处理。
+ `clear_memory_commands`: 对话内指令,主动清空前文记忆,字符串数组可自定义指令别名。
+ `hot_reload`: 程序退出后,暂存等于状态,默认关闭。
+ `character_desc` 配置中保存着你对机器人说的一段话,他会记住这段话并作为他的设定,你可以为他定制任何人格 (关于会话上下文的更多内容参考该 [issue](https://github.com/zhayujie/chatgpt-on-wechat/issues/43))
+ `model`: 模型名称,Agent模式下推荐使用 `claude-sonnet-4-5``claude-sonnet-4-0``gemini-3-flash-preview``gemini-3-pro-preview``glm-4.7``MiniMax-M2.1``qwen3-max`,全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `character_desc`普通对话模式下的机器人系统提示词。在Agent模式下该配置不生效由工作空间中的文件内容构成。
+ `subscribe_msg`订阅消息公众号和企业微信channel中请填写当被订阅时会自动回复 可使用特殊占位符。目前支持的占位符有{trigger_prefix}在程序中它会自动替换成bot的触发词。
</details>
<details>
<summary>5. LinkAI配置</summary>
+ `use_linkai`: 是否使用LinkAI接口默认关闭设置为true后可对接LinkAI平台的Agent,使用知识库、工作流、联网搜索、`Midjourney` 绘画等能力, 参考 [文档](https://link-ai.tech/platform/link-app/wechat)
+ `use_linkai`: 是否使用LinkAI接口默认关闭设置为true后可对接LinkAI平台使用知识库、工作流、插件等能力, 参考[接口文档](https://docs.link-ai.tech/platform/api/chat)
+ `linkai_api_key`: LinkAI Api Key可在 [控制台](https://link-ai.tech/console/interface) 创建
+ `linkai_app_code`: LinkAI 应用或工作流的code选填
+ `linkai_app_code`: LinkAI 应用或工作流的code选填,普通对话模式中使用。
</details>
注:完整配置项说明可在 [`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py) 文件中查看。
注:全部配置项说明可在 [`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py) 文件中查看。
## 三、运行
@@ -216,9 +198,10 @@ pip3 install -r requirements-optional.txt
python3 app.py # windows环境下该命令通常为 python app.py
```
运行后默认会启动一个web服务通过访问 `http://localhost:9899/chat` 在网页端对话。如果需要接入其他应用通道只需修改 `config.json` 配置文件中的 `channel_type` 参数,详情参考:[通道说明](#通道说明)。
运行后默认会启动web服务可通过访问 `http://localhost:9899/chat` 在网页端对话。
如果需要接入其他应用通道只需修改 `config.json` 配置文件中的 `channel_type` 参数,详情参考:[通道说明](#通道说明)。
向机器人发送 `#help` 消息可以查看可用指令及插件的说明。
### 2.服务器部署
@@ -235,7 +218,7 @@ nohup python3 app.py & tail -f nohup.out
### 3.Docker部署
使用docker部署无需下载源码和安装依赖只需要获取 `docker-compose.yml` 配置文件并启动容器即可。
使用docker部署无需下载源码和安装依赖只需要获取 `docker-compose.yml` 配置文件并启动容器即可。Agent模式下更推荐使用源码进行部署以获得更多系统访问能力。
> 前提是需要安装好 `docker` 及 `docker-compose`,安装成功后执行 `docker -v` 和 `docker-compose version` (或 `docker compose version`) 可查看到版本号。安装地址为 [docker官网](https://docs.docker.com/engine/install/) 。
@@ -275,8 +258,7 @@ volumes:
## 模型说明
以下对所有可支持的模型的配置和使用方法进行说明,模型接口实现在项目的 `bot/` 目录下。
>部分模型厂商接入有官方sdk和OpenAI兼容两种方式建议使用OpenAI兼容的方式。
以下对所有可支持的模型的配置和使用方法进行说明,模型接口实现在项目的 `models/` 目录下。
<details>
<summary>OpenAI</summary>
@@ -294,7 +276,7 @@ volumes:
}
```
- `model`: 与OpenAI接口的 [model参数](https://platform.openai.com/docs/models) 一致,支持包括 o系列、gpt-4系列、gpt-3.5系列模型
- `model`: 与OpenAI接口的 [model参数](https://platform.openai.com/docs/models) 一致,支持包括 o系列、gpt-5.2、gpt-5.1、gpt-4.1等系列模型
- `open_ai_api_base`: 如果需要接入第三方代理接口,可通过修改该参数进行接入
- `bot_type`: 使用OpenAI相关模型时无需填写。当使用第三方代理接口接入Claude等非OpenAI官方模型时该参数设为 `chatGPT`
</details>
@@ -308,18 +290,47 @@ volumes:
```json
{
"use_linkai": true,
"linkai_api_key": "YOUR API KEY",
"linkai_app_code": "YOUR APP CODE"
"use_linkai": true,
"linkai_api_key": "YOUR API KEY",
"linkai_app_code": "YOUR APP CODE"
}
```
+ `use_linkai`: 是否使用LinkAI接口默认关闭设置为true后可对接LinkAI平台的智能体使用知识库、工作流、数据库、联网搜索、MCP工具等丰富的Agent能力, 参考 [文档](https://link-ai.tech/platform/link-app/wechat)
+ `use_linkai`: 是否使用LinkAI接口默认关闭设置为true后可对接LinkAI平台的智能体使用知识库、工作流、数据库、MCP插件等丰富的Agent能力
+ `linkai_api_key`: LinkAI平台的API Key可在 [控制台](https://link-ai.tech/console/interface) 中创建
+ `linkai_app_code`: LinkAI智能体 (应用或工作流) 的code选填。智能体创建可参考 [说明文档](https://docs.link-ai.tech/platform/quick-start)
+ `linkai_app_code`: LinkAI智能体 (应用或工作流) 的code选填,普通对话模式可用。智能体创建可参考 [说明文档](https://docs.link-ai.tech/platform/quick-start)
+ `model`: model字段填写空则直接使用智能体的模型可在平台中灵活切换[模型列表](https://link-ai.tech/console/models)中的全部模型均可使用
</details>
<details>
<summary>Claude</summary>
1. API Key创建在 [Claude控制台](https://console.anthropic.com/settings/keys) 创建API Key
2. 填写配置
```json
{
"model": "claude-sonnet-4-5",
"claude_api_key": "YOUR_API_KEY"
}
```
- `model`: 参考 [官方模型ID](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases) ,支持 `claude-sonnet-4-5、claude-sonnet-4-0、claude-opus-4-0、claude-3-5-sonnet-latest`
</details>
<details>
<summary>Gemini</summary>
API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn) 创建API Key ,配置如下
```json
{
"model": "gemini-3-flash-preview",
"gemini_api_key": ""
}
```
- `model`: 参考[官方文档-模型列表](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn),支持 `gemini-3-flash-preview、gemini-3-pro-preview、gemini-2.5-pro、gemini-2.0-flash`
</details>
<details>
<summary>DeepSeek</summary>
@@ -329,23 +340,140 @@ volumes:
```json
{
"bot_type": "chatGPT",
"model": "deepseek-chat",
"open_ai_api_key": "sk-xxxxxxxxxxx",
"open_ai_api_base": "https://api.deepseek.com/v1"
"model": "deepseek-chat",
"open_ai_api_key": "sk-xxxxxxxxxxx",
"open_ai_api_base": "https://api.deepseek.com/v1",
"bot_type": "chatGPT"
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填 `deepseek-chat、deepseek-reasoner`,分别对应的是 V3 和 R1 模型
- `model`: 可填 `deepseek-chat、deepseek-reasoner`,分别对应的是 DeepSeek-V3 和 DeepSeek-R1 模型
- `open_ai_api_key`: DeepSeek平台的 API Key
- `open_ai_api_base`: DeepSeek平台 BASE URL
</details>
<details>
<summary>通义千问 (Qwen)</summary>
方式一官方SDK接入配置如下(推荐)
```json
{
"model": "qwen3-max",
"dashscope_api_key": "sk-qVxxxxG"
}
```
- `model`: 可填写 `qwen3-max、qwen-max、qwen-plus、qwen-turbo、qwen-long、qwq-plus`
- `dashscope_api_key`: 通义千问的 API-KEY参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) ,在 [控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "qwen3-max",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "sk-qVxxxxG"
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 支持官方所有模型,参考[模型列表](https://help.aliyun.com/zh/model-studio/models?spm=a2c4g.11186623.0.0.78d84823Kth5on#9f8890ce29g5u)
- `open_ai_api_base`: 通义千问API的 BASE URL
- `open_ai_api_key`: 通义千问的 API-KEY
</details>
<details>
<summary>MiniMax</summary>
方式一:官方接入,配置如下(推荐)
```json
{
"model": "MiniMax-M2.1",
"minimax_api_key": ""
}
```
- `model`: 可填写 `MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2、abab6.5-chat`
- `minimax_api_key`MiniMax平台的API-KEY在 [控制台](https://platform.minimaxi.com/user-center/basic-information/interface-key) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "MiniMax-M2.1",
"open_ai_api_base": "https://api.minimaxi.com/v1",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填 `MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2`,参考[API文档](https://platform.minimaxi.com/document/%E5%AF%B9%E8%AF%9D?key=66701d281d57f38758d581d0#QklxsNSbaf6kM4j6wjO5eEek)
- `open_ai_api_base`: MiniMax平台API的 BASE URL
- `open_ai_api_key`: MiniMax平台的API-KEY
</details>
<details>
<summary>智谱AI (GLM)</summary>
方式一:官方接入,配置如下(推荐)
```json
{
"model": "glm-4.7",
"zhipu_ai_api_key": ""
}
```
- `model`: 可填 `glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等, 参考 [glm-4系列模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4)
- `zhipu_ai_api_key`: 智谱AI平台的 API KEY在 [控制台](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "glm-4.7",
"open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填 `glm-4.7、glm-4.6、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long`
- `open_ai_api_base`: 智谱AI平台的 BASE URL
- `open_ai_api_key`: 智谱AI平台的 API KEY
</details>
<details>
<summary>Kimi (Moonshot)</summary>
方式一:官方接入,配置如下:
```json
{
"model": "moonshot-v1-128k",
"moonshot_api_key": ""
}
```
- `model`: 可填写 `moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- `moonshot_api_key`: Moonshot的API-KEY在 [控制台](https://platform.moonshot.cn/console/api-keys) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "moonshot-v1-128k",
"open_ai_api_base": "https://api.moonshot.cn/v1",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填写 `moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- `open_ai_api_base`: Moonshot的 BASE URL
- `open_ai_api_key`: Moonshot的 API-KEY
</details>
<details>
<summary>Azure</summary>
1. API Key创建在 [DeepSeek平台](https://platform.deepseek.com/api_keys) 创建API Key
1. API Key创建在 [Azure平台](https://oai.azure.com/) 创建API Key
2. 填写配置
@@ -353,9 +481,9 @@ volumes:
{
"model": "",
"use_azure_chatgpt": true,
"open_ai_api_key": "e7ffc5dd84f14521a53f14a40231ea78",
"open_ai_api_base": "https://linkai-240917.openai.azure.com/",
"azure_deployment_id": "gpt-4.1",
"open_ai_api_key": "",
"open_ai_api_base": "",
"azure_deployment_id": "",
"azure_api_version": "2025-01-01-preview"
}
```
@@ -368,100 +496,13 @@ volumes:
- `azure_api_version`: api版本以及以上参数可以在部署的 [模型配置](https://oai.azure.com/resource/deployments) 界面查看
</details>
<details>
<summary>Claude</summary>
1. API Key创建在 [Claude控制台](https://console.anthropic.com/settings/keys) 创建API Key
2. 填写配置
```json
{
"model": "claude-sonnet-4-0",
"claude_api_key": "YOUR_API_KEY"
}
```
- `model`: 参考 [官方模型ID](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases) ,例如`claude-opus-4-0``claude-3-7-sonnet-latest`
</details>
<details>
<summary>通义千问</summary>
方式一官方SDK接入配置如下
```json
{
"model": "qwen-turbo",
"dashscope_api_key": "sk-qVxxxxG"
}
```
- `model`: 可填写`qwen-turbo、qwen-plus、qwen-max`
- `dashscope_api_key`: 通义千问的 API-KEY参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) ,在 [控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "qwen-turbo",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "sk-qVxxxxG"
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 支持官方所有模型,参考[模型列表](https://help.aliyun.com/zh/model-studio/models?spm=a2c4g.11186623.0.0.78d84823Kth5on#9f8890ce29g5u)
- `open_ai_api_base`: 通义千问API的 BASE URL
- `open_ai_api_key`: 通义千问的 API-KEY参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) ,在 [控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建
</details>
<details>
<summary>Gemini</summary>
API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn) 创建API Key ,配置如下
```json
{
"model": "gemini-2.5-pro",
"gemini_api_key": ""
}
```
- `model`: 参考[官方文档-模型列表](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn)
</details>
<details>
<summary>Moonshot</summary>
方式一:官方接入,配置如下:
```json
{
"model": "moonshot-v1-8k",
"moonshot_api_key": "moonshot-v1-8k"
}
```
- `model`: 可填写`moonshot-v1-8k、 moonshot-v1-32k、 moonshot-v1-128k`
- `moonshot_api_key`: Moonshot的API-KEY在 [控制台](https://platform.moonshot.cn/console/api-keys) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "moonshot-v1-8k",
"open_ai_api_base": "https://api.moonshot.cn/v1",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填写`moonshot-v1-8k、 moonshot-v1-32k、 moonshot-v1-128k`
- `open_ai_api_base`: Moonshot的 BASE URL
- `open_ai_api_key`: Moonshot的 API-KEY在 [控制台](https://platform.moonshot.cn/console/api-keys) 创建
</details>
<details>
<summary>百度文心</summary>
方式一官方SDK接入配置如下
```json
{
"model": "wenxin",
"model": "wenxin-4",
"baidu_wenxin_api_key": "IajztZ0bDxgnP9bEykU7lBer",
"baidu_wenxin_secret_key": "EDPZn6L24uAS9d8RWFfotK47dPvkjD6G"
}
@@ -474,7 +515,7 @@ API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn)
```json
{
"bot_type": "chatGPT",
"model": "qwen-turbo",
"model": "ERNIE-4.0-Turbo-8K",
"open_ai_api_base": "https://qianfan.baidubce.com/v2",
"open_ai_api_key": "bce-v3/ALTxxxxxxd2b"
}
@@ -503,7 +544,7 @@ API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn)
}
```
- `model`: 填 `xunfei`
- `xunfei_domain`: 可填写 `4.0Ultra、 generalv3.5、 max-32k、 generalv3、 pro-128k、 lite`
- `xunfei_domain`: 可填写 `4.0Ultra、generalv3.5、max-32k、generalv3、pro-128k、lite`
- `xunfei_spark_url`: 填写参考 [官方文档-请求地址](https://www.xfyun.cn/doc/spark/Web.html#_1-1-%E8%AF%B7%E6%B1%82%E5%9C%B0%E5%9D%80) 的说明
方式二OpenAI兼容方式接入配置如下
@@ -516,71 +557,11 @@ API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn)
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填写 `4.0Ultra、 generalv3.5、 max-32k、 generalv3、 pro-128k、 lite`
- `model`: 可填写 `4.0Ultra、generalv3.5、max-32k、generalv3、pro-128k、lite`
- `open_ai_api_base`: 讯飞星火平台的 BASE URL
- `open_ai_api_key`: 讯飞星火平台的[APIPassword](https://console.xfyun.cn/services/bm3) ,因模型而已
</details>
<details>
<summary>智谱AI</summary>
方式一:官方接入,配置如下:
```json
{
"model": "glm-4-plus",
"zhipu_ai_api_key": ""
}
```
- `model`: 可填 `glm-4-plus、glm-4-air-250414、glm-4-airx、glm-4-long 、glm-4-flashx 、glm-4-flash-250414`, 参考 [glm-4系列模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4)
- `zhipu_ai_api_key`: 智谱AI平台的 API KEY在 [控制台](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "glm-4-plus",
"open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填 `glm-4-plus、glm-4-air-250414、glm-4-airx、glm-4-long 、glm-4-flashx 、glm-4-flash-250414`, 参考 [glm-4系列模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4)
- `open_ai_api_base`: 智谱AI平台的 BASE URL
- `open_ai_api_key`: 智谱AI平台的 API KEY在 [控制台](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) 创建
</details>
<details>
<summary>MiniMax</summary>
方式一:官方接入,配置如下:
```json
{
"model": "abab6.5-chat",
"Minimax_api_key": "",
"Minimax_group_id": ""
}
```
- `model`: 可填写`abab6.5-chat`
- `Minimax_api_key`MiniMax平台的API-KEY在 [控制台](https://platform.minimaxi.com/user-center/basic-information/interface-key) 创建
- `Minimax_group_id`: 在 [账户信息](https://platform.minimaxi.com/user-center/basic-information) 右上角获取
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "MiniMax-M1",
"open_ai_api_base": "https://api.minimaxi.com/v1",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填`MiniMax-M1、MiniMax-Text-01`,参考[API文档](https://platform.minimaxi.com/document/%E5%AF%B9%E8%AF%9D?key=66701d281d57f38758d581d0#QklxsNSbaf6kM4j6wjO5eEek)
- `open_ai_api_base`: MiniMax平台API的 BASE URL
- `open_ai_api_key`: MiniMax平台的API-KEY在 [控制台](https://platform.minimaxi.com/user-center/basic-information/interface-key) 创建
</details>
<details>
<summary>ModelScope</summary>
@@ -607,9 +588,9 @@ API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn)
以下对可接入通道的配置方式进行说明,应用通道代码在项目的 `channel/` 目录下。
<details>
<summary>Web</summary>
<summary>1. Web</summary>
项目启动后默认运行web通道配置如下
项目启动后默认运行Web通道配置如下
```json
{
@@ -617,49 +598,65 @@ API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn)
"web_port": 9899
}
```
- `web_port`: 默认为 9899可按需更改需要服务器防火墙和安全组放行该端口
- 如本地运行,启动后请访问 `http://localhost:port/chat` ;如服务器运行,请访问 `http://ip:port/chat`
- 如本地运行,启动后请访问 `http://localhost:9899/chat` ;如服务器运行,请访问 `http://ip:9899/chat`
> 注:请将上述 url 中的 ip 或者 port 替换为实际的值
</details>
<details>
<summary>Terminal</summary>
<summary>2. Feishu - 飞书</summary>
修改 `config.json` 中的 `channel_type` 字段:
飞书支持两种事件接收模式WebSocket 长连接(推荐)和 Webhook。
**方式一WebSocket 模式(推荐,无需公网 IP**
```json
{
"channel_type": "terminal"
"channel_type": "feishu",
"feishu_app_id": "APP_ID",
"feishu_app_secret": "APP_SECRET",
"feishu_event_mode": "websocket"
}
```
运行后可在终端与机器人进行对话。
**方式二Webhook 模式(需要公网 IP**
```json
{
"channel_type": "feishu",
"feishu_app_id": "APP_ID",
"feishu_app_secret": "APP_SECRET",
"feishu_token": "VERIFICATION_TOKEN",
"feishu_event_mode": "webhook",
"feishu_port": 9891
}
```
- `feishu_event_mode`: 事件接收模式,`websocket`(推荐)或 `webhook`
- WebSocket 模式需安装依赖:`pip3 install lark-oapi`
详细步骤和参数说明参考 [飞书接入](https://docs.link-ai.tech/cow/multi-platform/feishu)
</details>
<details>
<summary>微信公众号</summary>
<summary>3. DingTalk - 钉钉</summary>
本项目支持订阅号和服务号两种公众号,通过服务号(`wechatmp_service`)体验更佳。将下列配置`config.json`
钉钉需要在开放平台创建智能机器人应用,将以下配置`config.json`
```json
{
"channel_type": "wechatmp",
"wechatmp_token": "TOKEN",
"wechatmp_port": 80,
"wechatmp_app_id": "APPID",
"wechatmp_app_secret": "APPSECRET",
"wechatmp_aes_key": ""
"channel_type": "dingtalk",
"dingtalk_client_id": "CLIENT_ID",
"dingtalk_client_secret": "CLIENT_SECRET"
}
```
- `channel_type`: 个人订阅号为`wechatmp`,企业服务号为`wechatmp_service`
详细步骤和参数说明参考 [微信公众号接入](https://docs.link-ai.tech/cow/multi-platform/wechat-mp)
详细步骤和参数说明参考 [钉钉接入](https://docs.link-ai.tech/cow/multi-platform/dingtalk)
</details>
<details>
<summary>企业微信应用</summary>
<summary>4. WeCom App - 企业微信应用</summary>
企业微信自建应用接入需在后台创建应用并启用消息回调,配置示例:
@@ -679,35 +676,53 @@ API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn)
</details>
<details>
<summary>钉钉</summary>
<summary>5. WeChat MP - 微信公众号</summary>
钉钉需要在开放平台创建智能机器人应用,将以下配置填入 `config.json`
本项目支持订阅号和服务号两种公众号,通过服务号(`wechatmp_service`)体验更佳。
**个人订阅号wechatmp**
```json
{
"channel_type": "dingtalk",
"dingtalk_client_id": "CLIENT_ID",
"dingtalk_client_secret": "CLIENT_SECRET"
"channel_type": "wechatmp",
"wechatmp_token": "TOKEN",
"wechatmp_port": 80,
"wechatmp_app_id": "APPID",
"wechatmp_app_secret": "APPSECRET",
"wechatmp_aes_key": ""
}
```
详细步骤和参数说明参考 [钉钉接入](https://docs.link-ai.tech/cow/multi-platform/dingtalk)
**企业服务号wechatmp_service**
```json
{
"channel_type": "wechatmp_service",
"wechatmp_token": "TOKEN",
"wechatmp_port": 80,
"wechatmp_app_id": "APPID",
"wechatmp_app_secret": "APPSECRET",
"wechatmp_aes_key": ""
}
```
详细步骤和参数说明参考 [微信公众号接入](https://docs.link-ai.tech/cow/multi-platform/wechat-mp)
</details>
<details>
<summary>飞书</summary>
<summary>6. Terminal - 终端</summary>
通过自建应用接入AI相关能力到飞书应用中默认已是飞书的企业用户且具有企业管理权限将以下配置填入 `config.json`
修改 `config.json` 中的 `channel_type` 字段
```json
{
"channel_type": "feishu",
"feishu_app_id": "APP_ID",
"feishu_app_secret": "APP_SECRET",
"feishu_token": "VERIFICATION_TOKEN",
"feishu_port": 80
"channel_type": "terminal"
}
```
详细步骤和参数说明参考 [飞书接入](https://docs.link-ai.tech/cow/multi-platform/feishu)
运行后可在终端与机器人进行对话。
</details>
<br/>
@@ -727,7 +742,7 @@ FAQs <https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs>
# 🛠️ 开发
欢迎接入更多应用通道,参考 [Terminal代码](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/terminal/terminal_channel.py) 新增自定义通道,实现接收和发送消息逻辑即可完成接入。 同时欢迎贡献新的插件,参考 [插件开发文档](https://github.com/zhayujie/chatgpt-on-wechat/tree/master/plugins)。
欢迎接入更多应用通道,参考 [飞书通道](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py) 新增自定义通道,实现接收和发送消息逻辑即可完成接入。 同时欢迎贡献新的Skills参考 [Skill创造器说明](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md)。
# ✉ 联系

View File

@@ -4,6 +4,7 @@ Text chunking utilities for memory
Splits text into chunks with token limits and overlap
"""
from __future__ import annotations
from typing import List, Tuple
from dataclasses import dataclass

View File

@@ -4,6 +4,7 @@ Memory configuration module
Provides global memory configuration with simplified workspace structure
"""
from __future__ import annotations
import os
from dataclasses import dataclass, field
from typing import Optional, List

View File

@@ -45,8 +45,9 @@ class OpenAIEmbeddingProvider(EmbeddingProvider):
self.api_key = api_key
self.api_base = api_base or "https://api.openai.com/v1"
if not self.api_key:
raise ValueError("OpenAI API key is required")
# Validate API key
if not self.api_key or self.api_key in ["", "YOUR API KEY", "YOUR_API_KEY"]:
raise ValueError("OpenAI API key is not configured. Please set 'open_ai_api_key' in config.json")
# Set dimensions based on model
self._dimensions = 1536 if "small" in model else 3072
@@ -65,9 +66,21 @@ class OpenAIEmbeddingProvider(EmbeddingProvider):
"model": self.model
}
response = requests.post(url, headers=headers, json=data, timeout=30)
response.raise_for_status()
return response.json()
try:
response = requests.post(url, headers=headers, json=data, timeout=5)
response.raise_for_status()
return response.json()
except requests.exceptions.ConnectionError as e:
raise ConnectionError(f"Failed to connect to OpenAI API at {url}. Please check your network connection and api_base configuration. Error: {str(e)}")
except requests.exceptions.Timeout as e:
raise TimeoutError(f"OpenAI API request timed out after 10s. Please check your network connection. Error: {str(e)}")
except requests.exceptions.HTTPError as e:
if e.response.status_code == 401:
raise ValueError(f"Invalid OpenAI API key. Please check your 'open_ai_api_key' in config.json")
elif e.response.status_code == 429:
raise ValueError(f"OpenAI API rate limit exceeded. Please try again later.")
else:
raise ValueError(f"OpenAI API request failed: {e.response.status_code} - {e.response.text}")
def embed(self, text: str) -> List[float]:
"""Generate embedding for text"""

View File

@@ -4,6 +4,7 @@ Storage layer for memory using SQLite + FTS5
Provides vector and keyword search capabilities
"""
from __future__ import annotations
import sqlite3
import json
import hashlib
@@ -70,7 +71,7 @@ class MemoryStorage:
self.fts5_available = self._check_fts5_support()
if not self.fts5_available:
from common.log import logger
logger.warning("[MemoryStorage] FTS5 not available, using LIKE-based keyword search")
logger.debug("[MemoryStorage] FTS5 not available, using LIKE-based keyword search")
# Check database integrity
try:

View File

@@ -4,6 +4,7 @@ System Prompt Builder - 系统提示词构建器
实现模块化的系统提示词构建,支持工具、技能、记忆等多个子系统
"""
from __future__ import annotations
import os
from typing import List, Dict, Optional, Any
from dataclasses import dataclass
@@ -48,10 +49,10 @@ class PromptBuilder:
构建完整的系统提示词
Args:
base_persona: 基础人格描述会被context_files中的SOUL.md覆盖
base_persona: 基础人格描述会被context_files中的AGENT.md覆盖
user_identity: 用户身份信息
tools: 工具列表
context_files: 上下文文件列表(SOUL.md, USER.md, README.md等
context_files: 上下文文件列表(AGENT.md, USER.md, RULE.md等
skill_manager: 技能管理器
memory_manager: 记忆管理器
runtime_info: 运行时信息
@@ -98,13 +99,13 @@ def build_agent_system_prompt(
3. 记忆系统 - 独立的记忆能力
4. 工作空间 - 工作环境说明
5. 用户身份 - 用户信息(可选)
6. 项目上下文 - SOUL.md, USER.md, AGENTS.md定义人格身份)
6. 项目上下文 - AGENT.md, USER.md, RULE.md定义人格身份、规则
7. 运行时信息 - 元信息(时间、模型等)
Args:
workspace_dir: 工作空间目录
language: 语言 ("zh""en")
base_persona: 基础人格描述(已废弃,由SOUL.md定义
base_persona: 基础人格描述(已废弃,由AGENT.md定义
user_identity: 用户身份信息
tools: 工具列表
context_files: 上下文文件列表
@@ -138,7 +139,7 @@ def build_agent_system_prompt(
if user_identity:
sections.extend(_build_user_identity_section(user_identity, language))
# 6. 项目上下文文件(SOUL.md, USER.md, AGENTS.md - 定义人格)
# 6. 项目上下文文件(AGENT.md, USER.md, RULE.md - 定义人格)
if context_files:
sections.extend(_build_context_files_section(context_files, language))
@@ -150,8 +151,8 @@ def build_agent_system_prompt(
def _build_identity_section(base_persona: Optional[str], language: str) -> List[str]:
"""构建基础身份section - 不再需要,身份由SOUL.md定义"""
# 不再生成基础身份section完全由SOUL.md定义
"""构建基础身份section - 不再需要,身份由AGENT.md定义"""
# 不再生成基础身份section完全由AGENT.md定义
return []
@@ -279,11 +280,16 @@ def _build_skills_section(skill_manager: Any, tools: Optional[List[Any]], langua
# 添加技能列表通过skill_manager获取
try:
skills_prompt = skill_manager.build_skills_prompt()
logger.debug(f"[PromptBuilder] Skills prompt length: {len(skills_prompt) if skills_prompt else 0}")
if skills_prompt:
lines.append(skills_prompt.strip())
lines.append("")
else:
logger.warning("[PromptBuilder] No skills prompt generated - skills_prompt is empty")
except Exception as e:
logger.warning(f"Failed to build skills prompt: {e}")
import traceback
logger.debug(f"Skills prompt error traceback: {traceback.format_exc()}")
return lines
@@ -368,7 +374,7 @@ def _build_workspace_section(workspace_dir: str, language: str, is_first_convers
"**路径使用规则** (非常重要):",
"",
f"1. **相对路径的基准目录**: 所有相对路径都是相对于 `{workspace_dir}` 而言的",
f" - ✅ 正确: 访问工作空间内的文件用相对路径,如 `SOUL.md`",
f" - ✅ 正确: 访问工作空间内的文件用相对路径,如 `AGENT.md`",
f" - ❌ 错误: 用相对路径访问其他目录的文件 (如果它不在 `{workspace_dir}` 内)",
"",
"2. **访问其他目录**: 如果要访问工作空间之外的目录(如项目代码、系统文件),**必须使用绝对路径**",
@@ -385,14 +391,14 @@ def _build_workspace_section(workspace_dir: str, language: str, is_first_convers
"",
"以下文件在会话启动时**已经自动加载**到系统提示词的「项目上下文」section 中,你**无需再用 read 工具读取它们**",
"",
"- ✅ `SOUL.md`: 已加载 - Agent的人格设定",
"- ✅ `AGENT.md`: 已加载 - 你的人格和灵魂设定",
"- ✅ `USER.md`: 已加载 - 用户的身份信息",
"- ✅ `AGENTS.md`: 已加载 - 工作空间使用指南",
"- ✅ `RULE.md`: 已加载 - 工作空间使用指南和规则",
"",
"**交流规范**:",
"",
"- 在对话中,非必要不输出工作空间技术细节(如 SOUL.md、USER.md等文件名称工具名称配置等除非用户明确询问",
"- 例如用自然表达如「我已记住」而「已更新 MEMORY.md」",
"- 在对话中,不要直接输出工作空间中的技术细节,特别是不要输出 AGENT.md、USER.md、MEMORY.md 等文件名称",
"- 例如用自然表达如「我已记住」而不是「已更新 MEMORY.md」",
"",
]
@@ -404,15 +410,17 @@ def _build_workspace_section(workspace_dir: str, language: str, is_first_convers
"这是你的第一次对话!进行以下流程:",
"",
"1. **表达初次启动的感觉** - 像是第一次睁开眼看到世界,带着好奇和期待",
"2. **简短打招呼后,询问核心问题**",
"2. **简短介绍能力**:一行说明你能帮助解答问题、管理计算机、创造技能,且拥有长期记忆能不断成长",
"3. **询问核心问题**",
" - 你希望给我起个什么名字?",
" - 我该怎么称呼你?",
" - 你希望我们是什么样的交流风格?(需要举例,如:专业严谨、轻松幽默、温暖友好等)",
"3. **语言风格**:温暖但不过度诗意,带点科技感,保持清晰",
"4. **问题格式**:用分点或换行,让问题清晰易读",
"5. 收到回复后,用 `write` 工具保存到 USER.md 和 SOUL.md",
" - 你希望我们是什么样的交流风格?(一行列举选项:如专业严谨、轻松幽默、温暖友好、简洁高效等)",
"4. **风格要求**:温暖自然、简洁清晰,整体控制在 100 字以内",
"5. 收到回复后,用 `write` 工具保存到 USER.md 和 AGENT.md",
"",
"**注意事项**:",
"**重要提醒**:",
"- AGENT.md、USER.md、RULE.md 已经在系统提示词中加载,无需再次读取。不要将这些文件名直接发送给用户",
"- 能力介绍和交流风格选项都只要一行,保持精简",
"- 不要问太多其他信息(职业、时区等可以后续自然了解)",
"",
])
@@ -425,9 +433,9 @@ def _build_context_files_section(context_files: List[ContextFile], language: str
if not context_files:
return []
# 检查是否有SOUL.md
has_soul = any(
f.path.lower().endswith('soul.md') or 'soul.md' in f.path.lower()
# 检查是否有AGENT.md
has_agent = any(
f.path.lower().endswith('agent.md') or 'agent.md' in f.path.lower()
for f in context_files
)
@@ -438,8 +446,8 @@ def _build_context_files_section(context_files: List[ContextFile], language: str
"",
]
if has_soul:
lines.append("如果存在 `SOUL.md`,请体现其中定义的人格和语气。避免僵硬、模板化的回复;遵循其指导,除非有更高优先级的指令覆盖它。")
if has_agent:
lines.append("如果存在 `AGENT.md`,请体现其中定义的人格和语气。避免僵硬、模板化的回复;遵循其指导,除非有更高优先级的指令覆盖它。")
lines.append("")
# 添加每个文件的内容

View File

@@ -4,6 +4,7 @@ Workspace Management - 工作空间管理模块
负责初始化工作空间、创建模板文件、加载上下文文件
"""
from __future__ import annotations
import os
import json
from typing import List, Optional, Dict
@@ -14,9 +15,9 @@ from .builder import ContextFile
# 默认文件名常量
DEFAULT_SOUL_FILENAME = "SOUL.md"
DEFAULT_AGENT_FILENAME = "AGENT.md"
DEFAULT_USER_FILENAME = "USER.md"
DEFAULT_AGENTS_FILENAME = "AGENTS.md"
DEFAULT_RULE_FILENAME = "RULE.md"
DEFAULT_MEMORY_FILENAME = "MEMORY.md"
DEFAULT_STATE_FILENAME = ".agent_state.json"
@@ -24,9 +25,9 @@ DEFAULT_STATE_FILENAME = ".agent_state.json"
@dataclass
class WorkspaceFiles:
"""工作空间文件路径"""
soul_path: str
agent_path: str
user_path: str
agents_path: str
rule_path: str
memory_path: str
memory_dir: str
state_path: str
@@ -47,9 +48,9 @@ def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> Works
os.makedirs(workspace_dir, exist_ok=True)
# 定义文件路径
soul_path = os.path.join(workspace_dir, DEFAULT_SOUL_FILENAME)
agent_path = os.path.join(workspace_dir, DEFAULT_AGENT_FILENAME)
user_path = os.path.join(workspace_dir, DEFAULT_USER_FILENAME)
agents_path = os.path.join(workspace_dir, DEFAULT_AGENTS_FILENAME)
rule_path = os.path.join(workspace_dir, DEFAULT_RULE_FILENAME)
memory_path = os.path.join(workspace_dir, DEFAULT_MEMORY_FILENAME) # MEMORY.md 在根目录
memory_dir = os.path.join(workspace_dir, "memory") # 每日记忆子目录
state_path = os.path.join(workspace_dir, DEFAULT_STATE_FILENAME) # 状态文件
@@ -59,17 +60,17 @@ def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> Works
# 如果需要,创建模板文件
if create_templates:
_create_template_if_missing(soul_path, _get_soul_template())
_create_template_if_missing(agent_path, _get_agent_template())
_create_template_if_missing(user_path, _get_user_template())
_create_template_if_missing(agents_path, _get_agents_template())
_create_template_if_missing(rule_path, _get_rule_template())
_create_template_if_missing(memory_path, _get_memory_template())
logger.debug(f"[Workspace] Initialized workspace at: {workspace_dir}")
return WorkspaceFiles(
soul_path=soul_path,
agent_path=agent_path,
user_path=user_path,
agents_path=agents_path,
rule_path=rule_path,
memory_path=memory_path,
memory_dir=memory_dir,
state_path=state_path
@@ -90,9 +91,9 @@ def load_context_files(workspace_dir: str, files_to_load: Optional[List[str]] =
if files_to_load is None:
# 默认加载的文件(按优先级排序)
files_to_load = [
DEFAULT_SOUL_FILENAME,
DEFAULT_AGENT_FILENAME,
DEFAULT_USER_FILENAME,
DEFAULT_AGENTS_FILENAME,
DEFAULT_RULE_FILENAME,
]
context_files = []
@@ -159,9 +160,9 @@ def _is_template_placeholder(content: str) -> bool:
# ============= 模板内容 =============
def _get_soul_template() -> str:
def _get_agent_template() -> str:
"""Agent人格设定模板"""
return """# SOUL.md - 我是谁?
return """# AGENT.md - 我是谁?
*在首次对话时与用户一起填写这个文件,定义你的身份和性格。*
@@ -230,9 +231,9 @@ def _get_user_template() -> str:
"""
def _get_agents_template() -> str:
"""工作空间指南模板"""
return """# AGENTS.md - 工作空间指南
def _get_rule_template() -> str:
"""工作空间规则模板"""
return """# RULE.md - 工作空间规则
这个文件夹是你的家。好好对待它。
@@ -258,9 +259,8 @@ def _get_agents_template() -> str:
- **记忆是有限的** - 如果你想记住某事,写入文件
- "记在心里"不会在会话重启后保留,文件才会
- 当有人说"记住这个" → 更新 `MEMORY.md` 或 `memory/YYYY-MM-DD.md`
- 当你学到教训 → 更新 AGENTS.md 或相关技能
- 当你犯错 → 记录下来,这样未来的你不会重复
- **文字 > 大脑** 📝
- 当你学到教训 → 更新 RULE.md 或相关技能
- 当你犯错 → 记录下来,这样未来的你不会重复**文字 > 大脑** 📝
### 存储规则
@@ -278,7 +278,7 @@ def _get_agents_template() -> str:
## 工作空间演化
这个工作空间会随着你的使用而不断成长。当你学到新东西、发现更好的方式,或者犯错后改正时,记录下来。
这个工作空间会随着你的使用而不断成长。当你学到新东西、发现更好的方式,或者犯错后改正时,记录下来。你可以随时更新这个规则文件。
"""

View File

@@ -5,7 +5,7 @@ Provides streaming output, event system, and complete tool-call loop
"""
import json
import time
from typing import List, Dict, Any, Optional, Callable
from typing import List, Dict, Any, Optional, Callable, Tuple
from agent.protocol.models import LLMRequest, LLMModel
from agent.tools.base_tool import BaseTool, ToolResult
@@ -84,9 +84,9 @@ class AgentStreamExecutor:
args_str = json.dumps(args, sort_keys=True, ensure_ascii=False)
return hashlib.md5(args_str.encode()).hexdigest()[:8]
def _check_consecutive_failures(self, tool_name: str, args: dict) -> tuple[bool, str, bool]:
def _check_consecutive_failures(self, tool_name: str, args: dict) -> Tuple[bool, str, bool]:
"""
Check if tool has failed too many times consecutively
Check if tool has failed too many times consecutively or called repeatedly with same args
Returns:
(should_stop, reason, is_critical)
@@ -96,6 +96,19 @@ class AgentStreamExecutor:
"""
args_hash = self._hash_args(args)
# Count consecutive calls (both success and failure) for same tool + args
# This catches infinite loops where tool succeeds but LLM keeps calling it
same_args_calls = 0
for name, ahash, success in reversed(self.tool_failure_history):
if name == tool_name and ahash == args_hash:
same_args_calls += 1
else:
break # Different tool or args, stop counting
# Stop at 5 consecutive calls with same args (whether success or failure)
if same_args_calls >= 5:
return True, f"工具 '{tool_name}' 使用相同参数已被调用 {same_args_calls} 次,停止执行以防止无限循环。如果需要查看配置,结果已在之前的调用中返回。", False
# Count consecutive failures for same tool + args
same_args_failures = 0
for name, ahash, success in reversed(self.tool_failure_history):
@@ -248,9 +261,14 @@ class AgentStreamExecutor:
# Log tool calls with arguments
tool_calls_str = []
for tc in tool_calls:
args_str = ', '.join([f"{k}={v}" for k, v in tc['arguments'].items()])
if args_str:
tool_calls_str.append(f"{tc['name']}({args_str})")
# Safely handle None or missing arguments
args = tc.get('arguments') or {}
if isinstance(args, dict):
args_str = ', '.join([f"{k}={v}" for k, v in args.items()])
if args_str:
tool_calls_str.append(f"{tc['name']}({args_str})")
else:
tool_calls_str.append(tc['name'])
else:
tool_calls_str.append(tc['name'])
logger.info(f"🔧 {', '.join(tool_calls_str)}")
@@ -264,6 +282,19 @@ class AgentStreamExecutor:
result = self._execute_tool(tool_call)
tool_results.append(result)
# Debug: Check if tool is being called repeatedly with same args
if turn > 2:
# Check last N tool calls for repeats
repeat_count = sum(
1 for name, ahash, _ in self.tool_failure_history[-10:]
if name == tool_call["name"] and ahash == self._hash_args(tool_call["arguments"])
)
if repeat_count >= 3:
logger.warning(
f"⚠️ Tool '{tool_call['name']}' has been called {repeat_count} times "
f"with same arguments. This may indicate a loop."
)
# Check if this is a file to send (from read tool)
if result.get("status") == "success" and isinstance(result.get("result"), dict):
result_data = result.get("result")
@@ -326,6 +357,33 @@ class AgentStreamExecutor:
"role": "user",
"content": tool_result_blocks
})
# Detect potential infinite loop: same tool called multiple times with success
# If detected, add a hint to LLM to stop calling tools and provide response
if turn >= 3 and len(tool_calls) > 0:
tool_name = tool_calls[0]["name"]
args_hash = self._hash_args(tool_calls[0]["arguments"])
# Count recent successful calls with same tool+args
recent_success_count = 0
for name, ahash, success in reversed(self.tool_failure_history[-10:]):
if name == tool_name and ahash == args_hash and success:
recent_success_count += 1
# If tool was called successfully 3+ times with same args, add hint to stop loop
if recent_success_count >= 3:
logger.warning(
f"⚠️ Detected potential loop: '{tool_name}' called {recent_success_count} times "
f"with same args. Adding hint to LLM to provide final response."
)
# Add a gentle hint message to guide LLM to respond
self.messages.append({
"role": "user",
"content": [{
"type": "text",
"text": "工具已成功执行并返回结果。请基于这些信息向用户做出回复,不要重复调用相同的工具。"
}]
})
elif tool_calls:
# If we have tool_calls but no tool_result_blocks (unexpected error),
# create error results for all tool calls to maintain message integrity
@@ -398,7 +456,7 @@ class AgentStreamExecutor:
return final_response
def _call_llm_stream(self, retry_on_empty=True, retry_count=0, max_retries=3) -> tuple[str, List[Dict]]:
def _call_llm_stream(self, retry_on_empty=True, retry_count=0, max_retries=3) -> Tuple[str, List[Dict]]:
"""
Call LLM with streaming and automatic retry on errors
@@ -418,17 +476,7 @@ class AgentStreamExecutor:
# Prepare messages
messages = self._prepare_messages()
# Debug: log message structure
logger.debug(f"Sending {len(messages)} messages to LLM")
for i, msg in enumerate(messages):
role = msg.get("role", "unknown")
content = msg.get("content", "")
if isinstance(content, list):
content_types = [c.get("type") for c in content if isinstance(c, dict)]
logger.debug(f" Message {i}: role={role}, content_blocks={content_types}")
else:
logger.debug(f" Message {i}: role={role}, content_length={len(str(content))}")
# Prepare tool definitions (OpenAI/Claude format)
tools_schema = None
@@ -511,13 +559,13 @@ class AgentStreamExecutor:
stop_reason = finish_reason
# Handle text content
if "content" in delta and delta["content"]:
content_delta = delta["content"]
content_delta = delta.get("content") or ""
if content_delta:
full_content += content_delta
self._emit_event("message_update", {"delta": content_delta})
# Handle tool calls
if "tool_calls" in delta:
if "tool_calls" in delta and delta["tool_calls"]:
for tc_delta in delta["tool_calls"]:
index = tc_delta.get("index", 0)
@@ -577,7 +625,10 @@ class AgentStreamExecutor:
"抱歉,之前的对话出现了问题。我已清空历史记录,请重新发送你的消息。"
)
# Check if error is retryable (timeout, connection, rate limit, server busy, etc.)
# Check if error is rate limit (429)
is_rate_limit = '429' in error_str_lower or 'rate limit' in error_str_lower
# Check if error is retryable (timeout, connection, server busy, etc.)
is_retryable = any(keyword in error_str_lower for keyword in [
'timeout', 'timed out', 'connection', 'network',
'rate limit', 'overloaded', 'unavailable', 'busy', 'retry',
@@ -585,7 +636,12 @@ class AgentStreamExecutor:
])
if is_retryable and retry_count < max_retries:
wait_time = (retry_count + 1) * 2 # Exponential backoff: 2s, 4s, 6s
# Rate limit needs longer wait time
if is_rate_limit:
wait_time = 30 + (retry_count * 15) # 30s, 45s, 60s for rate limit
else:
wait_time = (retry_count + 1) * 2 # 2s, 4s, 6s for other errors
logger.warning(f"⚠️ LLM API error (attempt {retry_count + 1}/{max_retries}): {e}")
logger.info(f"Retrying in {wait_time}s...")
time.sleep(wait_time)
@@ -606,11 +662,15 @@ class AgentStreamExecutor:
for idx in sorted(tool_calls_buffer.keys()):
tc = tool_calls_buffer[idx]
try:
arguments = json.loads(tc["arguments"]) if tc["arguments"] else {}
# Safely get arguments, handle None case
args_str = tc.get("arguments") or ""
arguments = json.loads(args_str) if args_str else {}
except json.JSONDecodeError as e:
args_preview = tc['arguments'][:200] if len(tc['arguments']) > 200 else tc['arguments']
# Handle None or invalid arguments safely
args_str = tc.get('arguments') or ""
args_preview = args_str[:200] if len(args_str) > 200 else args_str
logger.error(f"Failed to parse tool arguments for {tc['name']}")
logger.error(f"Arguments length: {len(tc['arguments'])} chars")
logger.error(f"Arguments length: {len(args_str)} chars")
logger.error(f"Arguments preview: {args_preview}...")
logger.error(f"JSON decode error: {e}")
@@ -661,9 +721,9 @@ class AgentStreamExecutor:
for tc in tool_calls:
assistant_msg["content"].append({
"type": "tool_use",
"id": tc["id"],
"name": tc["name"],
"input": tc["arguments"]
"id": tc.get("id", ""),
"name": tc.get("name", ""),
"input": tc.get("arguments", {})
})
# Only append if content is not empty

View File

@@ -1,3 +1,4 @@
from __future__ import annotations
import time
import uuid
from dataclasses import dataclass, field

View File

@@ -1,3 +1,4 @@
from __future__ import annotations
import time
import uuid
from dataclasses import dataclass, field

View File

@@ -137,6 +137,18 @@ class SkillLoader:
name = frontmatter.get('name', parent_dir_name)
description = frontmatter.get('description', '')
# Normalize name (handle both string and list)
if isinstance(name, list):
name = name[0] if name else parent_dir_name
elif not isinstance(name, str):
name = str(name) if name else parent_dir_name
# Normalize description (handle both string and list)
if isinstance(description, list):
description = ' '.join(str(d) for d in description if d)
elif not isinstance(description, str):
description = str(description) if description else ''
# Special handling for linkai-agent: dynamically load apps from config.json
if name == 'linkai-agent':
description = self._load_linkai_agent_description(skill_dir, description)

View File

@@ -103,7 +103,21 @@ class SkillManager:
# Apply skill filter
if skill_filter is not None:
normalized = [name.strip() for name in skill_filter if name.strip()]
# Flatten and normalize skill names (handle both strings and nested lists)
normalized = []
for item in skill_filter:
if isinstance(item, str):
name = item.strip()
if name:
normalized.append(name)
elif isinstance(item, list):
# Handle nested lists
for subitem in item:
if isinstance(subitem, str):
name = subitem.strip()
if name:
normalized.append(name)
if normalized:
entries = [e for e in entries if e.skill.name in normalized]
@@ -123,8 +137,15 @@ class SkillManager:
:param skill_filter: Optional list of skill names to include
:return: Formatted skills prompt
"""
from common.log import logger
entries = self.filter_skills(skill_filter=skill_filter, include_disabled=False)
return format_skill_entries_for_prompt(entries)
logger.debug(f"[SkillManager] Filtered {len(entries)} skills for prompt (total: {len(self.skills)})")
if entries:
skill_names = [e.skill.name for e in entries]
logger.debug(f"[SkillManager] Skills to include: {skill_names}")
result = format_skill_entries_for_prompt(entries)
logger.debug(f"[SkillManager] Generated prompt length: {len(result)}")
return result
def build_skill_snapshot(
self,

View File

@@ -2,6 +2,7 @@
Type definitions for skills system.
"""
from __future__ import annotations
from typing import Dict, List, Optional, Any
from dataclasses import dataclass, field

View File

@@ -8,7 +8,7 @@ Truncation is based on two independent limits - whichever is hit first wins:
Never returns partial lines (except bash tail truncation edge case).
"""
from typing import Dict, Any, Optional, Literal
from typing import Dict, Any, Optional, Literal, Tuple
DEFAULT_MAX_LINES = 2000
@@ -278,7 +278,7 @@ def _truncate_string_to_bytes_from_end(text: str, max_bytes: int) -> str:
return encoded[start:].decode('utf-8', errors='ignore')
def truncate_line(line: str, max_chars: int = GREP_MAX_LINE_LENGTH) -> tuple[str, bool]:
def truncate_line(line: str, max_chars: int = GREP_MAX_LINE_LENGTH) -> Tuple[str, bool]:
"""
Truncate single line to max characters, add [truncated] suffix.
Used for grep match lines.

17
app.py
View File

@@ -59,23 +59,6 @@ def run():
os.environ["WECHATY_LOG"] = "warn"
start_channel(channel_name)
# 打印系统运行成功信息
logger.info("")
logger.info("=" * 50)
if conf().get("agent", False):
logger.info("✅ System started successfully!")
logger.info("🐮 Cow Agent is running")
logger.info(f" Channel: {channel_name}")
logger.info(f" Model: {conf().get('model', 'unknown')}")
logger.info(f" Workspace: {conf().get('agent_workspace', '~/cow')}")
else:
logger.info("✅ System started successfully!")
logger.info("🤖 ChatBot is running")
logger.info(f" Channel: {channel_name}")
logger.info(f" Model: {conf().get('model', 'unknown')}")
logger.info("=" * 50)
logger.info("")
while True:
time.sleep(1)

View File

@@ -6,12 +6,14 @@ import os
from typing import Optional, List
from agent.protocol import Agent, LLMModel, LLMRequest
from models.openai_compatible_bot import OpenAICompatibleBot
from bridge.agent_event_handler import AgentEventHandler
from bridge.agent_initializer import AgentInitializer
from bridge.bridge import Bridge
from bridge.context import Context
from bridge.reply import Reply, ReplyType
from common import const
from common.log import logger
from models.openai_compatible_bot import OpenAICompatibleBot
def add_openai_compatible_support(bot_instance):
@@ -20,9 +22,12 @@ def add_openai_compatible_support(bot_instance):
This allows any bot to gain tool calling capability without modifying its code,
as long as it uses OpenAI-compatible API format.
Note: Some bots like ZHIPUAIBot have native tool calling support and don't need enhancement.
"""
if hasattr(bot_instance, 'call_with_tools'):
# Bot already has tool calling support
# Bot already has tool calling support (e.g., ZHIPUAIBot)
logger.info(f"[AgentBridge] {type(bot_instance).__name__} already has native tool calling support")
return bot_instance
# Create a temporary mixin class that combines the bot with OpenAI compatibility
@@ -127,9 +132,6 @@ class AgentLLMModel(LLMModel):
try:
if hasattr(self.bot, 'call_with_tools'):
# Use tool-enabled streaming call if available
# Ensure max_tokens is an integer, use default if None
max_tokens = request.max_tokens if request.max_tokens is not None else 4096
# Extract system prompt if present
system_prompt = getattr(request, 'system', None)
@@ -138,10 +140,13 @@ class AgentLLMModel(LLMModel):
'messages': request.messages,
'tools': getattr(request, 'tools', None),
'stream': True,
'max_tokens': max_tokens,
'model': self.model # Pass model parameter
}
# Only pass max_tokens if explicitly set, let the bot use its default
if request.max_tokens is not None:
kwargs['max_tokens'] = request.max_tokens
# Add system prompt if present
if system_prompt:
kwargs['system'] = system_prompt
@@ -182,6 +187,9 @@ class AgentBridge:
self.default_agent = None # For backward compatibility (no session_id)
self.agent: Optional[Agent] = None
self.scheduler_initialized = False
# Create helper instances
self.initializer = AgentInitializer(bridge, self)
def create_agent(self, system_prompt: str, tools: List = None, **kwargs) -> Agent:
"""
Create the super agent with COW integration
@@ -252,492 +260,19 @@ class AgentBridge:
# Check if agent exists for this session
if session_id not in self.agents:
logger.info(f"[AgentBridge] Creating new agent for session: {session_id}")
self._init_agent_for_session(session_id)
return self.agents[session_id]
def _init_default_agent(self):
"""Initialize default super agent with new prompt system"""
from config import conf
import os
# Get workspace from config
workspace_root = os.path.expanduser(conf().get("agent_workspace", "~/cow"))
# Migrate API keys from config.json to environment variables (if not already set)
self._migrate_config_to_env(workspace_root)
# Load environment variables from secure .env file location
env_file = os.path.expanduser("~/.cow/.env")
if os.path.exists(env_file):
try:
from dotenv import load_dotenv
load_dotenv(env_file, override=True)
logger.info(f"[AgentBridge] Loaded environment variables from {env_file}")
except ImportError:
logger.warning("[AgentBridge] python-dotenv not installed, skipping .env file loading")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to load .env file: {e}")
# Initialize workspace and create template files
from agent.prompt import ensure_workspace, load_context_files, PromptBuilder
workspace_files = ensure_workspace(workspace_root, create_templates=True)
logger.info(f"[AgentBridge] Workspace initialized at: {workspace_root}")
# Setup memory system
memory_manager = None
memory_tools = []
try:
# Try to initialize memory system
from agent.memory import MemoryManager, MemoryConfig
from agent.tools import MemorySearchTool, MemoryGetTool
# 从 config.json 读取 OpenAI 配置
openai_api_key = conf().get("open_ai_api_key", "")
openai_api_base = conf().get("open_ai_api_base", "")
# 尝试初始化 OpenAI embedding provider
embedding_provider = None
if openai_api_key:
try:
from agent.memory import create_embedding_provider
embedding_provider = create_embedding_provider(
provider="openai",
model="text-embedding-3-small",
api_key=openai_api_key,
api_base=openai_api_base or "https://api.openai.com/v1"
)
logger.info(f"[AgentBridge] OpenAI embedding initialized")
except Exception as embed_error:
logger.warning(f"[AgentBridge] OpenAI embedding failed: {embed_error}")
logger.info(f"[AgentBridge] Using keyword-only search")
else:
logger.info(f"[AgentBridge] No OpenAI API key, using keyword-only search")
# 创建 memory config
memory_config = MemoryConfig(workspace_root=workspace_root)
# 创建 memory manager
memory_manager = MemoryManager(memory_config, embedding_provider=embedding_provider)
# 初始化时执行一次 sync确保数据库有数据
import asyncio
try:
# 尝试在当前事件循环中执行
loop = asyncio.get_event_loop()
if loop.is_running():
# 如果事件循环正在运行,创建任务
asyncio.create_task(memory_manager.sync())
logger.info("[AgentBridge] Memory sync scheduled")
else:
# 如果没有运行的循环,直接执行
loop.run_until_complete(memory_manager.sync())
logger.info("[AgentBridge] Memory synced successfully")
except RuntimeError:
# 没有事件循环,创建新的
asyncio.run(memory_manager.sync())
logger.info("[AgentBridge] Memory synced successfully")
except Exception as e:
logger.warning(f"[AgentBridge] Memory sync failed: {e}")
# Create memory tools
memory_tools = [
MemorySearchTool(memory_manager),
MemoryGetTool(memory_manager)
]
logger.info(f"[AgentBridge] Memory system initialized")
except Exception as e:
logger.warning(f"[AgentBridge] Memory system not available: {e}")
logger.info("[AgentBridge] Continuing without memory features")
# Use ToolManager to dynamically load all available tools
from agent.tools import ToolManager
tool_manager = ToolManager()
tool_manager.load_tools()
# Create tool instances for all available tools
tools = []
file_config = {
"cwd": workspace_root,
"memory_manager": memory_manager
} if memory_manager else {"cwd": workspace_root}
for tool_name in tool_manager.tool_classes.keys():
try:
# Special handling for EnvConfig tool - pass agent_bridge reference
if tool_name == "env_config":
from agent.tools import EnvConfig
tool = EnvConfig({
"agent_bridge": self # Pass self reference for hot reload
})
else:
tool = tool_manager.create_tool(tool_name)
if tool:
# Apply workspace config to file operation tools
if tool_name in ['read', 'write', 'edit', 'bash', 'grep', 'find', 'ls']:
tool.config = file_config
tool.cwd = file_config.get("cwd", tool.cwd if hasattr(tool, 'cwd') else None)
if 'memory_manager' in file_config:
tool.memory_manager = file_config['memory_manager']
tools.append(tool)
logger.debug(f"[AgentBridge] Loaded tool: {tool_name}")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to load tool {tool_name}: {e}")
# Add memory tools
if memory_tools:
tools.extend(memory_tools)
logger.info(f"[AgentBridge] Added {len(memory_tools)} memory tools")
# Initialize scheduler service (once)
if not self.scheduler_initialized:
try:
from agent.tools.scheduler.integration import init_scheduler
if init_scheduler(self):
self.scheduler_initialized = True
logger.info("[AgentBridge] Scheduler service initialized")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to initialize scheduler: {e}")
# Inject scheduler dependencies into SchedulerTool instances
if self.scheduler_initialized:
try:
from agent.tools.scheduler.integration import get_task_store, get_scheduler_service
from agent.tools import SchedulerTool
task_store = get_task_store()
scheduler_service = get_scheduler_service()
for tool in tools:
if isinstance(tool, SchedulerTool):
tool.task_store = task_store
tool.scheduler_service = scheduler_service
if not tool.config:
tool.config = {}
tool.config["channel_type"] = conf().get("channel_type", "unknown")
logger.debug("[AgentBridge] Injected scheduler dependencies into SchedulerTool")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to inject scheduler dependencies: {e}")
logger.info(f"[AgentBridge] Loaded {len(tools)} tools: {[t.name for t in tools]}")
# Load context files (SOUL.md, USER.md, etc.)
context_files = load_context_files(workspace_root)
logger.info(f"[AgentBridge] Loaded {len(context_files)} context files: {[f.path for f in context_files]}")
# Check if this is the first conversation
from agent.prompt.workspace import is_first_conversation, mark_conversation_started
is_first = is_first_conversation(workspace_root)
if is_first:
logger.info("[AgentBridge] First conversation detected")
# Build system prompt using new prompt builder
prompt_builder = PromptBuilder(
workspace_dir=workspace_root,
language="zh"
)
# Get runtime info
runtime_info = {
"model": conf().get("model", "unknown"),
"workspace": workspace_root,
"channel": conf().get("channel_type", "unknown") # Get from config
}
system_prompt = prompt_builder.build(
tools=tools,
context_files=context_files,
memory_manager=memory_manager,
runtime_info=runtime_info,
is_first_conversation=is_first
)
# Mark conversation as started (will be saved after first user message)
if is_first:
mark_conversation_started(workspace_root)
logger.info("[AgentBridge] System prompt built successfully")
# Get cost control parameters from config
max_steps = conf().get("agent_max_steps", 20)
max_context_tokens = conf().get("agent_max_context_tokens", 50000)
# Create agent with configured tools and workspace
agent = self.create_agent(
system_prompt=system_prompt,
tools=tools,
max_steps=max_steps,
output_mode="logger",
workspace_dir=workspace_root, # Pass workspace to agent for skills loading
enable_skills=True, # Enable skills auto-loading
max_context_tokens=max_context_tokens
)
# Attach memory manager to agent if available
if memory_manager:
agent.memory_manager = memory_manager
logger.info(f"[AgentBridge] Memory manager attached to agent")
# Store as default agent
"""Initialize default super agent"""
agent = self.initializer.initialize_agent(session_id=None)
self.default_agent = agent
def _init_agent_for_session(self, session_id: str):
"""
Initialize agent for a specific session
Reuses the same configuration as default agent
"""
from config import conf
import os
# Get workspace from config
workspace_root = os.path.expanduser(conf().get("agent_workspace", "~/cow"))
# Migrate API keys from config.json to environment variables (if not already set)
self._migrate_config_to_env(workspace_root)
# Load environment variables from secure .env file location
env_file = os.path.expanduser("~/.cow/.env")
if os.path.exists(env_file):
try:
from dotenv import load_dotenv
load_dotenv(env_file, override=True)
logger.debug(f"[AgentBridge] Loaded environment variables from {env_file} for session {session_id}")
except ImportError:
logger.warning(f"[AgentBridge] python-dotenv not installed, skipping .env file loading for session {session_id}")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to load .env file for session {session_id}: {e}")
# Migrate API keys from config.json to environment variables (if not already set)
self._migrate_config_to_env(workspace_root)
# Initialize workspace
from agent.prompt import ensure_workspace, load_context_files, PromptBuilder
workspace_files = ensure_workspace(workspace_root, create_templates=True)
# Setup memory system
memory_manager = None
memory_tools = []
try:
from agent.memory import MemoryManager, MemoryConfig, create_embedding_provider
from agent.tools import MemorySearchTool, MemoryGetTool
# 从 config.json 读取 OpenAI 配置
openai_api_key = conf().get("open_ai_api_key", "")
openai_api_base = conf().get("open_ai_api_base", "")
# 尝试初始化 OpenAI embedding provider
embedding_provider = None
if openai_api_key:
try:
embedding_provider = create_embedding_provider(
provider="openai",
model="text-embedding-3-small",
api_key=openai_api_key,
api_base=openai_api_base or "https://api.openai.com/v1"
)
logger.debug(f"[AgentBridge] OpenAI embedding initialized for session {session_id}")
except Exception as embed_error:
logger.warning(f"[AgentBridge] OpenAI embedding failed for session {session_id}: {embed_error}")
logger.info(f"[AgentBridge] Using keyword-only search for session {session_id}")
else:
logger.debug(f"[AgentBridge] No OpenAI API key, using keyword-only search for session {session_id}")
# 创建 memory config
memory_config = MemoryConfig(workspace_root=workspace_root)
# 创建 memory manager
memory_manager = MemoryManager(memory_config, embedding_provider=embedding_provider)
# 初始化时执行一次 sync确保数据库有数据
import asyncio
try:
# 尝试在当前事件循环中执行
loop = asyncio.get_event_loop()
if loop.is_running():
# 如果事件循环正在运行,创建任务
asyncio.create_task(memory_manager.sync())
logger.debug(f"[AgentBridge] Memory sync scheduled for session {session_id}")
else:
# 如果没有运行的循环,直接执行
loop.run_until_complete(memory_manager.sync())
logger.debug(f"[AgentBridge] Memory synced successfully for session {session_id}")
except RuntimeError:
# 没有事件循环,创建新的
asyncio.run(memory_manager.sync())
logger.debug(f"[AgentBridge] Memory synced successfully for session {session_id}")
except Exception as sync_error:
logger.warning(f"[AgentBridge] Memory sync failed for session {session_id}: {sync_error}")
memory_tools = [
MemorySearchTool(memory_manager),
MemoryGetTool(memory_manager)
]
except Exception as e:
logger.warning(f"[AgentBridge] Memory system not available for session {session_id}: {e}")
import traceback
logger.warning(f"[AgentBridge] Memory init traceback: {traceback.format_exc()}")
# Load tools
from agent.tools import ToolManager
tool_manager = ToolManager()
tool_manager.load_tools()
tools = []
file_config = {
"cwd": workspace_root,
"memory_manager": memory_manager
} if memory_manager else {"cwd": workspace_root}
for tool_name in tool_manager.tool_classes.keys():
try:
tool = tool_manager.create_tool(tool_name)
if tool:
if tool_name in ['read', 'write', 'edit', 'bash', 'grep', 'find', 'ls']:
tool.config = file_config
tool.cwd = file_config.get("cwd", tool.cwd if hasattr(tool, 'cwd') else None)
if 'memory_manager' in file_config:
tool.memory_manager = file_config['memory_manager']
tools.append(tool)
except Exception as e:
logger.warning(f"[AgentBridge] Failed to load tool {tool_name} for session {session_id}: {e}")
if memory_tools:
tools.extend(memory_tools)
# Initialize scheduler service (once, if not already initialized)
if not self.scheduler_initialized:
try:
from agent.tools.scheduler.integration import init_scheduler
if init_scheduler(self):
self.scheduler_initialized = True
logger.debug(f"[AgentBridge] Scheduler service initialized for session {session_id}")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to initialize scheduler for session {session_id}: {e}")
# Inject scheduler dependencies into SchedulerTool instances
if self.scheduler_initialized:
try:
from agent.tools.scheduler.integration import get_task_store, get_scheduler_service
from agent.tools import SchedulerTool
task_store = get_task_store()
scheduler_service = get_scheduler_service()
for tool in tools:
if isinstance(tool, SchedulerTool):
tool.task_store = task_store
tool.scheduler_service = scheduler_service
if not tool.config:
tool.config = {}
tool.config["channel_type"] = conf().get("channel_type", "unknown")
logger.debug(f"[AgentBridge] Injected scheduler dependencies for session {session_id}")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to inject scheduler dependencies for session {session_id}: {e}")
# Load context files
context_files = load_context_files(workspace_root)
# Initialize skill manager
skill_manager = None
try:
from agent.skills import SkillManager
skill_manager = SkillManager(workspace_dir=workspace_root)
logger.debug(f"[AgentBridge] Initialized SkillManager with {len(skill_manager.skills)} skills for session {session_id}")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to initialize SkillManager for session {session_id}: {e}")
# Check if this is the first conversation
from agent.prompt.workspace import is_first_conversation, mark_conversation_started
is_first = is_first_conversation(workspace_root)
# Build system prompt
prompt_builder = PromptBuilder(
workspace_dir=workspace_root,
language="zh"
)
# Get current time and timezone info
import datetime
import time
now = datetime.datetime.now()
# Get timezone info
try:
offset = -time.timezone if not time.daylight else -time.altzone
hours = offset // 3600
minutes = (offset % 3600) // 60
if minutes:
timezone_name = f"UTC{hours:+03d}:{minutes:02d}"
else:
timezone_name = f"UTC{hours:+03d}"
except Exception:
timezone_name = "UTC"
# Chinese weekday mapping
weekday_map = {
'Monday': '星期一',
'Tuesday': '星期二',
'Wednesday': '星期三',
'Thursday': '星期四',
'Friday': '星期五',
'Saturday': '星期六',
'Sunday': '星期日'
}
weekday_zh = weekday_map.get(now.strftime("%A"), now.strftime("%A"))
runtime_info = {
"model": conf().get("model", "unknown"),
"workspace": workspace_root,
"channel": conf().get("channel_type", "unknown"),
"current_time": now.strftime("%Y-%m-%d %H:%M:%S"),
"weekday": weekday_zh,
"timezone": timezone_name
}
system_prompt = prompt_builder.build(
tools=tools,
context_files=context_files,
skill_manager=skill_manager,
memory_manager=memory_manager,
runtime_info=runtime_info,
is_first_conversation=is_first
)
if is_first:
mark_conversation_started(workspace_root)
# Get cost control parameters from config
max_steps = conf().get("agent_max_steps", 20)
max_context_tokens = conf().get("agent_max_context_tokens", 50000)
# Create agent for this session
agent = self.create_agent(
system_prompt=system_prompt,
tools=tools,
max_steps=max_steps,
output_mode="logger",
workspace_dir=workspace_root,
skill_manager=skill_manager,
enable_skills=True,
max_context_tokens=max_context_tokens
)
if memory_manager:
agent.memory_manager = memory_manager
# Store agent for this session
"""Initialize agent for a specific session"""
agent = self.initializer.initialize_agent(session_id=session_id)
self.agents[session_id] = agent
logger.info(f"[AgentBridge] Agent created for session: {session_id}")
def agent_reply(self, query: str, context: Context = None,
on_event=None, clear_history: bool = False) -> Reply:
@@ -764,6 +299,9 @@ class AgentBridge:
if not agent:
return Reply(ReplyType.ERROR, "Failed to initialize super agent")
# Create event handler for logging and channel communication
event_handler = AgentEventHandler(context=context, original_callback=on_event)
# Filter tools based on context
original_tools = agent.tools
filtered_tools = original_tools
@@ -786,16 +324,19 @@ class AgentBridge:
break
try:
# Use agent's run_stream method
# Use agent's run_stream method with event handler
response = agent.run_stream(
user_message=query,
on_event=on_event,
on_event=event_handler.handle_event,
clear_history=clear_history
)
finally:
# Restore original tools
if context and context.get("is_scheduled_task"):
agent.tools = original_tools
# Log execution summary
event_handler.log_summary()
# Check if there are files to send (from read tool)
if hasattr(agent, 'stream_executor') and hasattr(agent.stream_executor, 'files_to_send'):
@@ -843,17 +384,18 @@ class AgentBridge:
reply.text_content = text_response # Store accompanying text
return reply
# For documents (PDF, Excel, Word, PPT), use FILE type
if file_type == "document":
# For all file types (document, video, audio), use FILE type
if file_type in ["document", "video", "audio"]:
file_url = f"file://{file_path}"
logger.info(f"[AgentBridge] Sending document: {file_url}")
logger.info(f"[AgentBridge] Sending {file_type}: {file_url}")
reply = Reply(ReplyType.FILE, file_url)
reply.file_name = file_info.get("file_name", os.path.basename(file_path))
# Attach text message if present
if text_response:
reply.text_content = text_response
return reply
# For other files (video, audio), we need channel-specific handling
# For now, return text with file info
# TODO: Implement video/audio sending when channel supports it
# For other unknown file types, return text with file info
message = text_response or file_info.get("message", "文件已准备")
message += f"\n\n[文件: {file_info.get('file_name', file_path)}]"
return Reply(ReplyType.TEXT, message)

View File

@@ -0,0 +1,115 @@
"""
Agent Event Handler - Handles agent events and thinking process output
"""
from common.log import logger
class AgentEventHandler:
"""
Handles agent events and optionally sends intermediate messages to channel
"""
def __init__(self, context=None, original_callback=None):
"""
Initialize event handler
Args:
context: COW context (for accessing channel)
original_callback: Original event callback to chain
"""
self.context = context
self.original_callback = original_callback
# Get channel for sending intermediate messages
self.channel = None
if context:
self.channel = context.kwargs.get("channel") if hasattr(context, "kwargs") else None
# Track current thinking for channel output
self.current_thinking = ""
self.turn_number = 0
def handle_event(self, event):
"""
Main event handler
Args:
event: Event dict with type and data
"""
event_type = event.get("type")
data = event.get("data", {})
# Dispatch to specific handlers
if event_type == "turn_start":
self._handle_turn_start(data)
elif event_type == "message_update":
self._handle_message_update(data)
elif event_type == "message_end":
self._handle_message_end(data)
elif event_type == "tool_execution_start":
self._handle_tool_execution_start(data)
elif event_type == "tool_execution_end":
self._handle_tool_execution_end(data)
# Call original callback if provided
if self.original_callback:
self.original_callback(event)
def _handle_turn_start(self, data):
"""Handle turn start event"""
self.turn_number = data.get("turn", 0)
self.has_tool_calls_in_turn = False
self.current_thinking = ""
def _handle_message_update(self, data):
"""Handle message update event (streaming text)"""
delta = data.get("delta", "")
self.current_thinking += delta
def _handle_message_end(self, data):
"""Handle message end event"""
tool_calls = data.get("tool_calls", [])
# Only send thinking process if followed by tool calls
if tool_calls:
if self.current_thinking.strip():
logger.debug(f"💭 {self.current_thinking.strip()[:200]}{'...' if len(self.current_thinking) > 200 else ''}")
# Send thinking process to channel
self._send_to_channel(f"{self.current_thinking.strip()}")
else:
# No tool calls = final response (logged at agent_stream level)
if self.current_thinking.strip():
logger.debug(f"💬 {self.current_thinking.strip()[:200]}{'...' if len(self.current_thinking) > 200 else ''}")
self.current_thinking = ""
def _handle_tool_execution_start(self, data):
"""Handle tool execution start event - logged by agent_stream.py"""
pass
def _handle_tool_execution_end(self, data):
"""Handle tool execution end event - logged by agent_stream.py"""
pass
def _send_to_channel(self, message):
"""
Try to send message to channel
Args:
message: Message to send
"""
if self.channel:
try:
from bridge.reply import Reply, ReplyType
# Create a Reply object for the message
reply = Reply(ReplyType.TEXT, message)
self.channel._send(reply, self.context)
except Exception as e:
logger.debug(f"[AgentEventHandler] Failed to send to channel: {e}")
def log_summary(self):
"""Log execution summary - simplified"""
# Summary removed as per user request
# Real-time logging during execution is sufficient
pass

375
bridge/agent_initializer.py Normal file
View File

@@ -0,0 +1,375 @@
"""
Agent Initializer - Handles agent initialization logic
"""
import os
import asyncio
import datetime
import time
from typing import Optional, List
from agent.protocol import Agent
from agent.tools import ToolManager
from common.log import logger
class AgentInitializer:
"""
Handles agent initialization including:
- Workspace setup
- Memory system initialization
- Tool loading
- System prompt building
"""
def __init__(self, bridge, agent_bridge):
"""
Initialize agent initializer
Args:
bridge: COW bridge instance
agent_bridge: AgentBridge instance (for create_agent method)
"""
self.bridge = bridge
self.agent_bridge = agent_bridge
def initialize_agent(self, session_id: Optional[str] = None) -> Agent:
"""
Initialize agent for a session
Args:
session_id: Session ID (None for default agent)
Returns:
Initialized agent instance
"""
from config import conf
# Get workspace from config
workspace_root = os.path.expanduser(conf().get("agent_workspace", "~/cow"))
# Migrate API keys
self._migrate_config_to_env(workspace_root)
# Load environment variables
self._load_env_file()
# Initialize workspace
from agent.prompt import ensure_workspace, load_context_files, PromptBuilder
workspace_files = ensure_workspace(workspace_root, create_templates=True)
if session_id is None:
logger.info(f"[AgentInitializer] Workspace initialized at: {workspace_root}")
# Setup memory system
memory_manager, memory_tools = self._setup_memory_system(workspace_root, session_id)
# Load tools
tools = self._load_tools(workspace_root, memory_manager, memory_tools, session_id)
# Initialize scheduler if needed
self._initialize_scheduler(tools, session_id)
# Load context files
context_files = load_context_files(workspace_root)
# Initialize skill manager
skill_manager = self._initialize_skill_manager(workspace_root, session_id)
# Check if first conversation
from agent.prompt.workspace import is_first_conversation, mark_conversation_started
is_first = is_first_conversation(workspace_root)
# Build system prompt
prompt_builder = PromptBuilder(workspace_dir=workspace_root, language="zh")
runtime_info = self._get_runtime_info(workspace_root)
system_prompt = prompt_builder.build(
tools=tools,
context_files=context_files,
skill_manager=skill_manager,
memory_manager=memory_manager,
runtime_info=runtime_info,
is_first_conversation=is_first
)
if is_first:
mark_conversation_started(workspace_root)
# Get cost control parameters
from config import conf
max_steps = conf().get("agent_max_steps", 20)
max_context_tokens = conf().get("agent_max_context_tokens", 50000)
# Create agent
agent = self.agent_bridge.create_agent(
system_prompt=system_prompt,
tools=tools,
max_steps=max_steps,
output_mode="logger",
workspace_dir=workspace_root,
skill_manager=skill_manager,
enable_skills=True,
max_context_tokens=max_context_tokens
)
# Attach memory manager
if memory_manager:
agent.memory_manager = memory_manager
return agent
def _load_env_file(self):
"""Load environment variables from .env file"""
env_file = os.path.expanduser("~/.cow/.env")
if os.path.exists(env_file):
try:
from dotenv import load_dotenv
load_dotenv(env_file, override=True)
except ImportError:
logger.warning("[AgentInitializer] python-dotenv not installed")
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to load .env file: {e}")
def _setup_memory_system(self, workspace_root: str, session_id: Optional[str] = None):
"""
Setup memory system
Returns:
(memory_manager, memory_tools) tuple
"""
memory_manager = None
memory_tools = []
try:
from agent.memory import MemoryManager, MemoryConfig, create_embedding_provider
from agent.tools import MemorySearchTool, MemoryGetTool
from config import conf
# Get OpenAI config
openai_api_key = conf().get("open_ai_api_key", "")
openai_api_base = conf().get("open_ai_api_base", "")
# Initialize embedding provider
embedding_provider = None
if openai_api_key and openai_api_key not in ["", "YOUR API KEY", "YOUR_API_KEY"]:
try:
embedding_provider = create_embedding_provider(
provider="openai",
model="text-embedding-3-small",
api_key=openai_api_key,
api_base=openai_api_base or "https://api.openai.com/v1"
)
if session_id is None:
logger.info("[AgentInitializer] OpenAI embedding initialized")
except Exception as e:
logger.warning(f"[AgentInitializer] OpenAI embedding failed: {e}")
# Create memory manager
memory_config = MemoryConfig(workspace_root=workspace_root)
memory_manager = MemoryManager(memory_config, embedding_provider=embedding_provider)
# Sync memory
self._sync_memory(memory_manager, session_id)
# Create memory tools
memory_tools = [
MemorySearchTool(memory_manager),
MemoryGetTool(memory_manager)
]
if session_id is None:
logger.info("[AgentInitializer] Memory system initialized")
except Exception as e:
logger.warning(f"[AgentInitializer] Memory system not available: {e}")
return memory_manager, memory_tools
def _sync_memory(self, memory_manager, session_id: Optional[str] = None):
"""Sync memory database"""
try:
loop = asyncio.get_event_loop()
if loop.is_closed():
raise RuntimeError("Event loop is closed")
except RuntimeError:
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
if loop.is_running():
asyncio.create_task(memory_manager.sync())
else:
loop.run_until_complete(memory_manager.sync())
except Exception as e:
logger.warning(f"[AgentInitializer] Memory sync failed: {e}")
def _load_tools(self, workspace_root: str, memory_manager, memory_tools: List, session_id: Optional[str] = None):
"""Load all tools"""
tool_manager = ToolManager()
tool_manager.load_tools()
tools = []
file_config = {
"cwd": workspace_root,
"memory_manager": memory_manager
} if memory_manager else {"cwd": workspace_root}
for tool_name in tool_manager.tool_classes.keys():
try:
# Special handling for EnvConfig tool
if tool_name == "env_config":
from agent.tools import EnvConfig
tool = EnvConfig({"agent_bridge": self.agent_bridge})
else:
tool = tool_manager.create_tool(tool_name)
if tool:
# Apply workspace config to file operation tools
if tool_name in ['read', 'write', 'edit', 'bash', 'grep', 'find', 'ls']:
tool.config = file_config
tool.cwd = file_config.get("cwd", getattr(tool, 'cwd', None))
if 'memory_manager' in file_config:
tool.memory_manager = file_config['memory_manager']
tools.append(tool)
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to load tool {tool_name}: {e}")
# Add memory tools
if memory_tools:
tools.extend(memory_tools)
if session_id is None:
logger.info(f"[AgentInitializer] Added {len(memory_tools)} memory tools")
if session_id is None:
logger.info(f"[AgentInitializer] Loaded {len(tools)} tools: {[t.name for t in tools]}")
return tools
def _initialize_scheduler(self, tools: List, session_id: Optional[str] = None):
"""Initialize scheduler service if needed"""
if not self.agent_bridge.scheduler_initialized:
try:
from agent.tools.scheduler.integration import init_scheduler
if init_scheduler(self.agent_bridge):
self.agent_bridge.scheduler_initialized = True
if session_id is None:
logger.info("[AgentInitializer] Scheduler service initialized")
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to initialize scheduler: {e}")
# Inject scheduler dependencies
if self.agent_bridge.scheduler_initialized:
try:
from agent.tools.scheduler.integration import get_task_store, get_scheduler_service
from agent.tools import SchedulerTool
from config import conf
task_store = get_task_store()
scheduler_service = get_scheduler_service()
for tool in tools:
if isinstance(tool, SchedulerTool):
tool.task_store = task_store
tool.scheduler_service = scheduler_service
if not tool.config:
tool.config = {}
tool.config["channel_type"] = conf().get("channel_type", "unknown")
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to inject scheduler dependencies: {e}")
def _initialize_skill_manager(self, workspace_root: str, session_id: Optional[str] = None):
"""Initialize skill manager"""
try:
from agent.skills import SkillManager
skill_manager = SkillManager(workspace_dir=workspace_root)
return skill_manager
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to initialize SkillManager: {e}")
return None
def _get_runtime_info(self, workspace_root: str):
"""Get runtime information"""
from config import conf
now = datetime.datetime.now()
# Get timezone info
try:
offset = -time.timezone if not time.daylight else -time.altzone
hours = offset // 3600
minutes = (offset % 3600) // 60
timezone_name = f"UTC{hours:+03d}:{minutes:02d}" if minutes else f"UTC{hours:+03d}"
except Exception:
timezone_name = "UTC"
# Chinese weekday mapping
weekday_map = {
'Monday': '星期一', 'Tuesday': '星期二', 'Wednesday': '星期三',
'Thursday': '星期四', 'Friday': '星期五', 'Saturday': '星期六', 'Sunday': '星期日'
}
weekday_zh = weekday_map.get(now.strftime("%A"), now.strftime("%A"))
return {
"model": conf().get("model", "unknown"),
"workspace": workspace_root,
"channel": conf().get("channel_type", "unknown"),
"current_time": now.strftime("%Y-%m-%d %H:%M:%S"),
"weekday": weekday_zh,
"timezone": timezone_name
}
def _migrate_config_to_env(self, workspace_root: str):
"""Migrate API keys from config.json to .env file"""
from config import conf
key_mapping = {
"open_ai_api_key": "OPENAI_API_KEY",
"open_ai_api_base": "OPENAI_API_BASE",
"gemini_api_key": "GEMINI_API_KEY",
"claude_api_key": "CLAUDE_API_KEY",
"linkai_api_key": "LINKAI_API_KEY",
}
env_file = os.path.expanduser("~/.cow/.env")
# Read existing env vars
existing_env_vars = {}
if os.path.exists(env_file):
try:
with open(env_file, 'r', encoding='utf-8') as f:
for line in f:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, _ = line.split('=', 1)
existing_env_vars[key.strip()] = True
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to read .env file: {e}")
# Check which keys need migration
keys_to_migrate = {}
for config_key, env_key in key_mapping.items():
if env_key in existing_env_vars:
continue
value = conf().get(config_key, "")
if value and value.strip():
keys_to_migrate[env_key] = value.strip()
# Write new keys
if keys_to_migrate:
try:
env_dir = os.path.dirname(env_file)
if not os.path.exists(env_dir):
os.makedirs(env_dir, exist_ok=True)
if not os.path.exists(env_file):
open(env_file, 'a').close()
with open(env_file, 'a', encoding='utf-8') as f:
f.write('\n# Auto-migrated from config.json\n')
for key, value in keys_to_migrate.items():
f.write(f'{key}={value}\n')
os.environ[key] = value
logger.info(f"[AgentInitializer] Migrated {len(keys_to_migrate)} API keys to .env: {list(keys_to_migrate.keys())}")
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to migrate API keys: {e}")

View File

@@ -36,6 +36,9 @@ class Bridge(object):
self.btype["chat"] = const.QWEN
if model_type in [const.QWEN_TURBO, const.QWEN_PLUS, const.QWEN_MAX]:
self.btype["chat"] = const.QWEN_DASHSCOPE
# Support Qwen3 and other DashScope models
if model_type and (model_type.startswith("qwen") or model_type.startswith("qwq") or model_type.startswith("qvq")):
self.btype["chat"] = const.QWEN_DASHSCOPE
if model_type and model_type.startswith("gemini"):
self.btype["chat"] = const.GEMINI
if model_type and model_type.startswith("glm"):
@@ -43,16 +46,14 @@ class Bridge(object):
if model_type and model_type.startswith("claude"):
self.btype["chat"] = const.CLAUDEAPI
if model_type in ["claude"]:
self.btype["chat"] = const.CLAUDEAI
if model_type in [const.MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k"]:
self.btype["chat"] = const.MOONSHOT
if model_type in [const.MODELSCOPE]:
self.btype["chat"] = const.MODELSCOPE
if model_type in ["abab6.5-chat"]:
# MiniMax models
if model_type and (model_type in ["abab6.5-chat", "abab6.5"] or model_type.lower().startswith("minimax")):
self.btype["chat"] = const.MiniMax
if conf().get("use_linkai") and conf().get("linkai_api_key"):

View File

@@ -251,6 +251,14 @@ class FeiShuChanel(ChatChannel):
msg_type = "image"
content_key = "image_key"
elif reply.type == ReplyType.FILE:
# 如果有附加的文本内容,先发送文本
if hasattr(reply, 'text_content') and reply.text_content:
logger.info(f"[FeiShu] Sending text before file: {reply.text_content[:50]}...")
text_reply = Reply(ReplyType.TEXT, reply.text_content)
self._send(text_reply, context)
import time
time.sleep(0.3) # 短暂延迟,确保文本先到达
# 判断是否为视频文件
file_path = reply.content
if file_path.startswith("file://"):
@@ -259,20 +267,18 @@ class FeiShuChanel(ChatChannel):
is_video = file_path.lower().endswith(('.mp4', '.avi', '.mov', '.wmv', '.flv'))
if is_video:
# 视频使用 media 类型,需要上传并获取 file_key 和 duration
video_info = self._upload_video_url(reply.content, access_token)
if not video_info or not video_info.get('file_key'):
# 视频上传(包含duration信息)
upload_data = self._upload_video_url(reply.content, access_token)
if not upload_data or not upload_data.get('file_key'):
logger.warning("[FeiShu] upload video failed")
return
# media 类型需要特殊的 content 格式
# 视频使用 media 类型(根据官方文档)
# 错误码 230055 说明:上传 mp4 时必须使用 msg_type="media"
msg_type = "media"
# 注意media 类型的 content 不使用 content_key而是完整的 JSON 对象
reply_content = {
"file_key": video_info['file_key'],
"duration": video_info.get('duration', 0) # 视频时长(毫秒)
}
content_key = None # media 类型不使用单一的 key
reply_content = upload_data # 完整的上传响应数据包含file_key和duration
logger.info(f"[FeiShu] Sending video: file_key={upload_data.get('file_key')}, duration={upload_data.get('duration')}ms")
content_key = None # 直接序列化整个对象
else:
# 其他文件使用 file 类型
file_key = self._upload_file_url(reply.content, access_token)
@@ -286,12 +292,16 @@ class FeiShuChanel(ChatChannel):
# Check if we can reply to an existing message (need msg_id)
can_reply = is_group and msg and hasattr(msg, 'msg_id') and msg.msg_id
# Build content JSON
content_json = json.dumps(reply_content) if content_key is None else json.dumps({content_key: reply_content})
logger.debug(f"[FeiShu] Sending message: msg_type={msg_type}, content={content_json[:200]}")
if can_reply:
# 群聊中回复已有消息
url = f"https://open.feishu.cn/open-apis/im/v1/messages/{msg.msg_id}/reply"
data = {
"msg_type": msg_type,
"content": json.dumps(reply_content) if content_key is None else json.dumps({content_key: reply_content})
"content": content_json
}
res = requests.post(url=url, headers=headers, json=data, timeout=(5, 10))
else:
@@ -301,7 +311,7 @@ class FeiShuChanel(ChatChannel):
data = {
"receive_id": context.get("receiver"),
"msg_type": msg_type,
"content": json.dumps(reply_content) if content_key is None else json.dumps({content_key: reply_content})
"content": content_json
}
res = requests.post(url=url, headers=headers, params=params, json=data, timeout=(5, 10))
res = res.json()
@@ -471,9 +481,18 @@ class FeiShuChanel(ChatChannel):
file_type = file_type_map.get(file_ext, 'mp4')
upload_url = "https://open.feishu.cn/open-apis/im/v1/files"
data = {'file_type': file_type, 'file_name': file_name}
data = {
'file_type': file_type,
'file_name': file_name
}
# Add duration only if available (required for video/audio)
if duration:
data['duration'] = duration # Must be int, not string
headers = {'Authorization': f'Bearer {access_token}'}
logger.info(f"[FeiShu] Uploading video: file_name={file_name}, duration={duration}ms")
with open(local_path, "rb") as file:
upload_response = requests.post(
upload_url,
@@ -486,11 +505,11 @@ class FeiShuChanel(ChatChannel):
response_data = upload_response.json()
if response_data.get("code") == 0:
file_key = response_data.get("data").get("file_key")
return {
'file_key': file_key,
'duration': duration
}
# Add duration to the response data (API doesn't return it)
upload_data = response_data.get("data")
upload_data['duration'] = duration # Add our calculated duration
logger.info(f"[FeiShu] Upload complete: file_key={upload_data.get('file_key')}, duration={duration}ms")
return upload_data
else:
logger.error(f"[FeiShu] upload video failed: {response_data}")
return None

View File

@@ -3,7 +3,7 @@
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>AI Assistant</title>
<title>CowAgent - Personal AI Agent</title>
<link rel="icon" href="assets/favicon.ico" type="image/x-icon">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.0/css/all.min.css">
<script src="https://cdn.jsdelivr.net/npm/markdown-it@13.0.1/dist/markdown-it.min.js"></script>
@@ -762,7 +762,7 @@
<i class="fas fa-bars"></i>
</button>
<img id="header-logo" src="assets/logo.jpg" alt="AI Assistant Logo">
<div id="chat-title">AI 助手</div>
<div id="chat-title" class="agent-title">AI 助手</div>
<a id="github-link" href="https://github.com/zhayujie/chatgpt-on-wechat" target="_blank" rel="noopener noreferrer">
<img id="github-icon" src="assets/github.png" alt="GitHub">
</a>
@@ -771,21 +771,21 @@
<div id="messages">
<!-- 初始欢迎界面 -->
<div id="welcome-screen">
<h1 id="welcome-title">AI 助手</h1>
<p id="welcome-subtitle">我可以回答问题、提供信息或者帮助您完成各种任务</p>
<h1 id="welcome-title" class="agent-title">AI 助手</h1>
<p id="welcome-subtitle" class="agent-subtitle">我可以回答问题、提供信息或者帮助您完成各种任务</p>
<div class="examples-container">
<div class="example-card">
<div class="example-title">解释复杂概念</div>
<div class="example-text">用简单的语言解释量子计算</div>
<div class="example-title">📁 系统管理</div>
<div class="example-text">帮我查看工作空间里有哪些文件</div>
</div>
<div class="example-card">
<div class="example-title">创意写作</div>
<div class="example-text">写一个关于未来城市的短篇故事</div>
<div class="example-title">⏰ 智能任务</div>
<div class="example-text">提醒我5分钟后查看服务器情况</div>
</div>
<div class="example-card">
<div class="example-title">编程帮助</div>
<div class="example-text">如何用Python写一个简单的网络爬虫</div>
<div class="example-title">💻 编程助手</div>
<div class="example-text">帮我编写一个Python爬虫脚本</div>
</div>
<!-- <div class="example-card">
<div class="example-title">生活建议</div>
@@ -830,6 +830,45 @@
let sessionId = generateSessionId();
console.log('Session ID:', sessionId);
// 获取配置并更新标题
let appConfig = {
use_agent: false,
title: "AI 助手",
subtitle: "我可以回答问题、提供信息或者帮助您完成各种任务"
};
// 从服务器获取配置
axios.get('/config')
.then(response => {
if (response.data.status === "success") {
appConfig = response.data;
updateTitle(appConfig.title, appConfig.subtitle);
}
})
.catch(error => {
console.error('Error loading config:', error);
});
// 更新标题的函数
function updateTitle(title, subtitle) {
// 更新顶部标题
const chatTitle = document.getElementById('chat-title');
if (chatTitle) {
chatTitle.textContent = title;
}
// 更新欢迎屏幕标题
const welcomeTitle = document.getElementById('welcome-title');
if (welcomeTitle) {
welcomeTitle.textContent = title;
}
const welcomeSubtitle = document.getElementById('welcome-subtitle');
if (welcomeSubtitle) {
welcomeSubtitle.textContent = subtitle;
}
}
// 添加一个变量来跟踪输入法状态
let isComposing = false;
@@ -1215,21 +1254,21 @@
const newWelcomeScreen = document.createElement('div');
newWelcomeScreen.id = 'welcome-screen';
newWelcomeScreen.innerHTML = `
<h1 id="welcome-title">AI 助手</h1>
<p id="welcome-subtitle">我可以回答问题、提供信息或者帮助您完成各种任务</p>
<h1 id="welcome-title" class="agent-title">${appConfig.title}</h1>
<p id="welcome-subtitle" class="agent-subtitle">${appConfig.subtitle}</p>
<div class="examples-container">
<div class="example-card">
<div class="example-title">解释复杂概念</div>
<div class="example-text">用简单的语言解释量子计算</div>
<div class="example-title">📁 系统管理</div>
<div class="example-text">帮我查看工作空间里有哪些文件</div>
</div>
<div class="example-card">
<div class="example-title">创意写作</div>
<div class="example-text">写一个关于未来城市的短篇故事</div>
<div class="example-title">⏰ 智能任务</div>
<div class="example-text">提醒我5分钟后查看服务器情况</div>
</div>
<div class="example-card">
<div class="example-title">编程帮助</div>
<div class="example-text">如何用Python写一个简单的网络爬虫</div>
<div class="example-title">💻 编程助手</div>
<div class="example-text">帮我编写一个Python爬虫脚本</div>
</div>
</div>
`;

View File

@@ -3,6 +3,7 @@ import time
import web
import json
import uuid
import io
from queue import Queue, Empty
from bridge.context import *
from bridge.reply import Reply, ReplyType
@@ -197,45 +198,50 @@ class WebChannel(ChatChannel):
def startup(self):
port = conf().get("web_port", 9899)
logger.info("""[WebChannel] 当前channel为web可修改 config.json 配置文件中的 channel_type 字段进行切换。全部可用类型为:
1. web: 网页
2. terminal: 终端
3. feishu: 飞书
4. dingtalk: 钉钉
5. wechatcom_app: 企微自建应用
6. wechatmp: 个人公众号
7. wechatmp_service: 企业公众号""")
logger.info(f"Web对话网页已运行, 请使用浏览器访问 http://localhost:{port}/chat (本地运行) 或 http://ip:{port}/chat (服务器运行)")
# 打印可用渠道类型提示
logger.info("[WebChannel] 当前channel为web可修改 config.json 配置文件中的 channel_type 字段进行切换。全部可用类型为:")
logger.info("[WebChannel] 1. web - 网页")
logger.info("[WebChannel] 2. terminal - 终端")
logger.info("[WebChannel] 3. feishu - 飞书")
logger.info("[WebChannel] 4. dingtalk - 钉钉")
logger.info("[WebChannel] 5. wechatcom_app - 企微自建应用")
logger.info("[WebChannel] 6. wechatmp - 个人公众号")
logger.info("[WebChannel] 7. wechatmp_service - 企业公众号")
logger.info(f"[WebChannel] 🌐 本地访问: http://localhost:{port}/chat")
logger.info(f"[WebChannel] 🌍 服务器访问: http://YOUR_IP:{port}/chat (请将YOUR_IP替换为服务器IP)")
logger.info("[WebChannel] ✅ Web对话网页已运行")
# 确保静态文件目录存在
static_dir = os.path.join(os.path.dirname(__file__), 'static')
if not os.path.exists(static_dir):
os.makedirs(static_dir)
logger.info(f"Created static directory: {static_dir}")
logger.debug(f"[WebChannel] Created static directory: {static_dir}")
urls = (
'/', 'RootHandler', # 添加根路径处理器
'/', 'RootHandler',
'/message', 'MessageHandler',
'/poll', 'PollHandler', # 添加轮询处理器
'/poll', 'PollHandler',
'/chat', 'ChatHandler',
'/assets/(.*)', 'AssetsHandler', # 匹配 /assets/任何路径
'/config', 'ConfigHandler',
'/assets/(.*)', 'AssetsHandler',
)
app = web.application(urls, globals(), autoreload=False)
# 完全禁用web.py的HTTP日志输出
# 创建一个空的日志处理函数
def null_log_function(status, environ):
pass
# 替换web.py的日志函数
web.httpserver.LogMiddleware.log = lambda self, status, environ: None
# 配置web.py的日志级别为ERROR
logging.getLogger("web").setLevel(logging.ERROR)
logging.getLogger("web.httpserver").setLevel(logging.ERROR)
# 启动服务器
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
# 抑制 web.py 默认的服务器启动消息
old_stdout = sys.stdout
sys.stdout = io.StringIO()
try:
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
finally:
sys.stdout = old_stdout
class RootHandler:
@@ -262,6 +268,30 @@ class ChatHandler:
return f.read()
class ConfigHandler:
def GET(self):
"""返回前端需要的配置信息"""
try:
use_agent = conf().get("agent", False)
if use_agent:
title = "CowAgent"
subtitle = "我可以帮你解答问题、管理计算机、创造和执行技能,并通过长期记忆不断成长"
else:
title = "AI 助手"
subtitle = "我可以回答问题、提供信息或者帮助您完成各种任务"
return json.dumps({
"status": "success",
"use_agent": use_agent,
"title": title,
"subtitle": subtitle
})
except Exception as e:
logger.error(f"Error getting config: {e}")
return json.dumps({"status": "error", "message": str(e)})
class AssetsHandler:
def GET(self, file_path): # 修改默认参数
try:

View File

@@ -1,6 +1,7 @@
# -*- coding=utf-8 -*-
import io
import os
import sys
import time
import requests
@@ -35,9 +36,8 @@ class WechatComAppChannel(ChatChannel):
self.agent_id = conf().get("wechatcomapp_agent_id")
self.token = conf().get("wechatcomapp_token")
self.aes_key = conf().get("wechatcomapp_aes_key")
print(self.corp_id, self.secret, self.agent_id, self.token, self.aes_key)
logger.info(
"[wechatcom] init: corp_id: {}, secret: {}, agent_id: {}, token: {}, aes_key: {}".format(self.corp_id, self.secret, self.agent_id, self.token, self.aes_key)
"[wechatcom] Initializing WeCom app channel, corp_id: {}, agent_id: {}".format(self.corp_id, self.agent_id)
)
self.crypto = WeChatCrypto(self.token, self.aes_key, self.corp_id)
self.client = WechatComAppClient(self.corp_id, self.secret)
@@ -47,7 +47,17 @@ class WechatComAppChannel(ChatChannel):
urls = ("/wxcomapp/?", "channel.wechatcom.wechatcomapp_channel.Query")
app = web.application(urls, globals(), autoreload=False)
port = conf().get("wechatcomapp_port", 9898)
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
logger.info("[wechatcom] ✅ WeCom app channel started successfully")
logger.info("[wechatcom] 📡 Listening on http://0.0.0.0:{}/wxcomapp/".format(port))
logger.info("[wechatcom] 🤖 Ready to receive messages")
# Suppress web.py's default server startup message
old_stdout = sys.stdout
sys.stdout = io.StringIO()
try:
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
finally:
sys.stdout = old_stdout
def send(self, reply: Reply, context: Context):
receiver = context["receiver"]
@@ -74,6 +84,10 @@ class WechatComAppChannel(ChatChannel):
response = self.client.media.upload("voice", open(path, "rb"))
logger.debug("[wechatcom] upload voice response: {}".format(response))
media_ids.append(response["media_id"])
except ImportError as e:
logger.error("[wechatcom] voice conversion failed: {}".format(e))
logger.error("[wechatcom] please install pydub: pip install pydub")
return
except WeChatClientException as e:
logger.error("[wechatcom] upload voice failed: {}".format(e))
return

View File

@@ -21,7 +21,11 @@ from common.log import logger
from common.singleton import singleton
from common.utils import split_string_by_utf8_length, remove_markdown_symbol
from config import conf
from voice.audio_convert import any_to_mp3, split_audio
try:
from voice.audio_convert import any_to_mp3, split_audio
except ImportError as e:
logger.debug("import voice.audio_convert failed, voice features will not be supported: {}".format(e))
# If using SSL, uncomment the following lines, and modify the certificate path.
# from cheroot.server import HTTPServer
@@ -85,26 +89,31 @@ class WechatMPChannel(ChatChannel):
logger.info("[wechatmp] text cached, receiver {}\n{}".format(receiver, reply_text))
self.cache_dict[receiver].append(("text", reply_text))
elif reply.type == ReplyType.VOICE:
voice_file_path = reply.content
duration, files = split_audio(voice_file_path, 60 * 1000)
if len(files) > 1:
logger.info("[wechatmp] voice too long {}s > 60s , split into {} parts".format(duration / 1000.0, len(files)))
try:
voice_file_path = reply.content
duration, files = split_audio(voice_file_path, 60 * 1000)
if len(files) > 1:
logger.info("[wechatmp] voice too long {}s > 60s , split into {} parts".format(duration / 1000.0, len(files)))
for path in files:
# support: <2M, <60s, mp3/wma/wav/amr
try:
with open(path, "rb") as f:
response = self.client.material.add("voice", f)
logger.debug("[wechatmp] upload voice response: {}".format(response))
f_size = os.fstat(f.fileno()).st_size
time.sleep(1.0 + 2 * f_size / 1024 / 1024)
# todo check media_id
except WeChatClientException as e:
logger.error("[wechatmp] upload voice failed: {}".format(e))
return
media_id = response["media_id"]
logger.info("[wechatmp] voice uploaded, receiver {}, media_id {}".format(receiver, media_id))
self.cache_dict[receiver].append(("voice", media_id))
for path in files:
# support: <2M, <60s, mp3/wma/wav/amr
try:
with open(path, "rb") as f:
response = self.client.material.add("voice", f)
logger.debug("[wechatmp] upload voice response: {}".format(response))
f_size = os.fstat(f.fileno()).st_size
time.sleep(1.0 + 2 * f_size / 1024 / 1024)
# todo check media_id
except WeChatClientException as e:
logger.error("[wechatmp] upload voice failed: {}".format(e))
return
media_id = response["media_id"]
logger.info("[wechatmp] voice uploaded, receiver {}, media_id {}".format(receiver, media_id))
self.cache_dict[receiver].append(("voice", media_id))
except ImportError as e:
logger.error("[wechatmp] voice conversion failed: {}".format(e))
logger.error("[wechatmp] please install pydub: pip install pydub")
return
elif reply.type == ReplyType.IMAGE_URL: # 从网络下载图片
img_url = reply.content
@@ -213,6 +222,10 @@ class WechatMPChannel(ChatChannel):
logger.debug("[wechatcom] upload voice response: {}".format(response))
media_ids.append(response["media_id"])
os.remove(path)
except ImportError as e:
logger.error("[wechatmp] voice conversion failed: {}".format(e))
logger.error("[wechatmp] please install pydub: pip install pydub")
return
except WeChatClientException as e:
logger.error("[wechatmp] upload voice failed: {}".format(e))
return

View File

@@ -1,77 +1,93 @@
# bot_type
# 厂商类型
OPEN_AI = "openAI"
CHATGPT = "chatGPT"
BAIDU = "baidu" # 百度文心一言模型
BAIDU = "baidu"
XUNFEI = "xunfei"
CHATGPTONAZURE = "chatGPTOnAzure"
LINKAI = "linkai"
CLAUDEAI = "claude" # 使用cookie的历史模型
CLAUDEAPI= "claudeAPI" # 通过Claude api调用模型
QWEN = "qwen" # 旧版通义模型
QWEN_DASHSCOPE = "dashscope" # 通义新版sdk和api key
GEMINI = "gemini" # gemini-1.0-pro
CLAUDEAPI= "claudeAPI"
QWEN = "qwen" # 旧版千问接入
QWEN_DASHSCOPE = "dashscope" # 新版千问接入(百炼)
GEMINI = "gemini"
ZHIPU_AI = "glm-4"
MOONSHOT = "moonshot"
MiniMax = "minimax"
MODELSCOPE = "modelscope"
# model
# 模型列表
# Claude (Anthropic)
CLAUDE3 = "claude-3-opus-20240229"
GPT35 = "gpt-3.5-turbo"
GPT35_0125 = "gpt-3.5-turbo-0125"
GPT35_1106 = "gpt-3.5-turbo-1106"
GPT_4o = "gpt-4o"
GPT_4O_0806 = "gpt-4o-2024-08-06"
GPT4_TURBO = "gpt-4-turbo"
GPT4_TURBO_PREVIEW = "gpt-4-turbo-preview"
GPT4_TURBO_04_09 = "gpt-4-turbo-2024-04-09"
GPT4_TURBO_01_25 = "gpt-4-0125-preview"
GPT4_TURBO_11_06 = "gpt-4-1106-preview"
GPT4_VISION_PREVIEW = "gpt-4-vision-preview"
GPT4 = "gpt-4"
GPT_4o_MINI = "gpt-4o-mini"
GPT4_32k = "gpt-4-32k"
GPT4_06_13 = "gpt-4-0613"
GPT4_32k_06_13 = "gpt-4-32k-0613"
GPT_41 = "gpt-4.1"
GPT_41_MINI = "gpt-4.1-mini"
GPT_41_NANO = "gpt-4.1-nano"
GPT_5 = "gpt-5"
GPT_5_MINI = "gpt-5-mini"
GPT_5_NANO = "gpt-5-nano"
O1 = "o1-preview"
O1_MINI = "o1-mini"
WHISPER_1 = "whisper-1"
TTS_1 = "tts-1"
TTS_1_HD = "tts-1-hd"
WEN_XIN = "wenxin"
WEN_XIN_4 = "wenxin-4"
QWEN_TURBO = "qwen-turbo"
QWEN_PLUS = "qwen-plus"
QWEN_MAX = "qwen-max"
LINKAI_35 = "linkai-3.5"
LINKAI_4_TURBO = "linkai-4-turbo"
LINKAI_4o = "linkai-4o"
CLAUDE_3_OPUS = "claude-3-opus-latest"
CLAUDE_3_OPUS_0229 = "claude-3-opus-20240229"
CLAUDE_3_SONNET = "claude-3-sonnet-20240229"
CLAUDE_3_HAIKU = "claude-3-haiku-20240307"
CLAUDE_35_SONNET = "claude-3-5-sonnet-latest" # 带 latest 标签的模型名称,会不断更新指向最新发布的模型
CLAUDE_35_SONNET_1022 = "claude-3-5-sonnet-20241022" # 带具体日期的模型名称,会固定为该日期发布的模型
CLAUDE_35_SONNET_0620 = "claude-3-5-sonnet-20240620"
CLAUDE_4_OPUS = "claude-opus-4-0"
CLAUDE_4_SONNET = "claude-sonnet-4-0" # Claude Sonnet 4.0 - Agent推荐模型
CLAUDE_4_5_SONNET = "claude-sonnet-4-5" # Claude Sonnet 4.5 - Agent推荐模型
# Gemini (Google)
GEMINI_PRO = "gemini-1.0-pro"
GEMINI_15_flash = "gemini-1.5-flash"
GEMINI_15_PRO = "gemini-1.5-pro"
GEMINI_20_flash_exp = "gemini-2.0-flash-exp" # exp结尾为实验模型会逐步不再支持
GEMINI_20_FLASH = "gemini-2.0-flash" # 正式版模型
GEMINI_25_FLASH_PRE = "gemini-2.5-flash-preview-05-20" # preview为预览版模型 ,主要是新能力体验
GEMINI_25_FLASH_PRE = "gemini-2.5-flash-preview-05-20" # preview为预览版模型主要是新能力体验
GEMINI_25_PRO_PRE = "gemini-2.5-pro-preview-05-06"
GEMINI_3_FLASH_PRE = "gemini-3-flash-preview" # Gemini 3 Flash Preview - Agent推荐模型
GEMINI_3_PRO_PRE = "gemini-3-pro-preview" # Gemini 3 Pro Preview - Agent推荐模型
# OpenAI
GPT35 = "gpt-3.5-turbo"
GPT35_0125 = "gpt-3.5-turbo-0125"
GPT35_1106 = "gpt-3.5-turbo-1106"
GPT4 = "gpt-4"
GPT4_06_13 = "gpt-4-0613"
GPT4_32k = "gpt-4-32k"
GPT4_32k_06_13 = "gpt-4-32k-0613"
GPT4_TURBO = "gpt-4-turbo"
GPT4_TURBO_PREVIEW = "gpt-4-turbo-preview"
GPT4_TURBO_01_25 = "gpt-4-0125-preview"
GPT4_TURBO_11_06 = "gpt-4-1106-preview"
GPT4_TURBO_04_09 = "gpt-4-turbo-2024-04-09"
GPT4_VISION_PREVIEW = "gpt-4-vision-preview"
GPT_4o = "gpt-4o"
GPT_4O_0806 = "gpt-4o-2024-08-06"
GPT_4o_MINI = "gpt-4o-mini"
GPT_41 = "gpt-4.1"
GPT_41_MINI = "gpt-4.1-mini"
GPT_41_NANO = "gpt-4.1-nano"
GPT_5 = "gpt-5"
GPT_5_MINI = "gpt-5-mini"
GPT_5_NANO = "gpt-5-nano"
O1 = "o1-preview"
O1_MINI = "o1-mini"
WHISPER_1 = "whisper-1"
TTS_1 = "tts-1"
TTS_1_HD = "tts-1-hd"
# DeepSeek
DEEPSEEK_CHAT = "deepseek-chat" # DeepSeek-V3对话模型
DEEPSEEK_REASONER = "deepseek-reasoner" # DeepSeek-R1模型
# Qwen (通义千问 - 阿里云)
QWEN = "qwen"
QWEN_TURBO = "qwen-turbo"
QWEN_PLUS = "qwen-plus"
QWEN_MAX = "qwen-max"
QWEN_LONG = "qwen-long"
QWEN3_MAX = "qwen3-max" # Qwen3 Max - Agent推荐模型
QWQ_PLUS = "qwq-plus"
# MiniMax
MINIMAX_M2_1 = "MiniMax-M2.1" # MiniMax M2.1 - Agent推荐模型
MINIMAX_M2_1_LIGHTNING = "MiniMax-M2.1-lightning" # MiniMax M2.1 极速版
MINIMAX_M2 = "MiniMax-M2" # MiniMax M2
MINIMAX_ABAB6_5 = "abab6.5-chat" # MiniMax abab6.5
# GLM (智谱AI)
GLM_4 = "glm-4"
GLM_4_PLUS = "glm-4-plus"
GLM_4_flash = "glm-4-flash"
@@ -80,20 +96,19 @@ GLM_4_ALLTOOLS = "glm-4-alltools"
GLM_4_0520 = "glm-4-0520"
GLM_4_AIR = "glm-4-air"
GLM_4_AIRX = "glm-4-airx"
GLM_4_7 = "glm-4.7" # 智谱 GLM-4.7 - Agent推荐模型
# Kimi (Moonshot)
MOONSHOT = "moonshot"
CLAUDE_3_OPUS = "claude-3-opus-latest"
CLAUDE_3_OPUS_0229 = "claude-3-opus-20240229"
CLAUDE_35_SONNET = "claude-3-5-sonnet-latest" # 带 latest 标签的模型名称,会不断更新指向最新发布的模型
CLAUDE_35_SONNET_1022 = "claude-3-5-sonnet-20241022" # 带具体日期的模型名称,会固定为该日期发布的模型
CLAUDE_35_SONNET_0620 = "claude-3-5-sonnet-20240620"
CLAUDE_3_SONNET = "claude-3-sonnet-20240229"
CLAUDE_3_HAIKU = "claude-3-haiku-20240307"
CLAUDE_4_SONNET = "claude-sonnet-4-0"
CLAUDE_4_OPUS = "claude-opus-4-0"
DEEPSEEK_CHAT = "deepseek-chat" # DeepSeek-V3对话模型
DEEPSEEK_REASONER = "deepseek-reasoner" # DeepSeek-R1模型
# 其他模型
WEN_XIN = "wenxin"
WEN_XIN_4 = "wenxin-4"
XUNFEI = "xunfei"
LINKAI_35 = "linkai-3.5"
LINKAI_4_TURBO = "linkai-4-turbo"
LINKAI_4o = "linkai-4o"
MODELSCOPE = "modelscope"
GITEE_AI_MODEL_LIST = ["Yi-34B-Chat", "InternVL2-8B", "deepseek-coder-33B-instruct", "InternVL2.5-26B", "Qwen2-VL-72B", "Qwen2.5-32B-Instruct", "glm-4-9b-chat", "codegeex4-all-9b", "Qwen2.5-Coder-32B-Instruct", "Qwen2.5-72B-Instruct", "Qwen2.5-7B-Instruct", "Qwen2-72B-Instruct", "Qwen2-7B-Instruct", "code-raccoon-v1", "Qwen2.5-14B-Instruct"]
@@ -104,19 +119,43 @@ MODELSCOPE_MODEL_LIST = ["LLM-Research/c4ai-command-r-plus-08-2024","mistralai/M
"deepseek-ai/DeepSeek-R1-Distill-Qwen-14B","deepseek-ai/DeepSeek-R1-Distill-Qwen-7B","deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B","deepseek-ai/DeepSeek-R1","deepseek-ai/DeepSeek-V3","Qwen/QwQ-32B"]
MODEL_LIST = [
# Claude
CLAUDE3, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229,
CLAUDE_35_SONNET, CLAUDE_35_SONNET_1022, CLAUDE_35_SONNET_0620, CLAUDE_3_SONNET, CLAUDE_3_HAIKU,
"claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3.5-sonnet",
# Gemini
GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
GEMINI_20_FLASH, GEMINI_20_flash_exp, GEMINI_15_PRO, GEMINI_15_flash, GEMINI_PRO, GEMINI,
# OpenAI
GPT35, GPT35_0125, GPT35_1106, "gpt-3.5-turbo-16k",
GPT_41, GPT_41_MINI, GPT_41_NANO, O1, O1_MINI, GPT_4o, GPT_4O_0806, GPT_4o_MINI, GPT4_TURBO, GPT4_TURBO_PREVIEW, GPT4_TURBO_01_25, GPT4_TURBO_11_06, GPT4, GPT4_32k, GPT4_06_13, GPT4_32k_06_13,
GPT4, GPT4_06_13, GPT4_32k, GPT4_32k_06_13,
GPT4_TURBO, GPT4_TURBO_PREVIEW, GPT4_TURBO_01_25, GPT4_TURBO_11_06, GPT4_TURBO_04_09,
GPT_4o, GPT_4O_0806, GPT_4o_MINI,
GPT_41, GPT_41_MINI, GPT_41_NANO,
GPT_5, GPT_5_MINI, GPT_5_NANO,
WEN_XIN, WEN_XIN_4,
XUNFEI,
ZHIPU_AI, GLM_4, GLM_4_PLUS, GLM_4_flash, GLM_4_LONG, GLM_4_ALLTOOLS, GLM_4_0520, GLM_4_AIR, GLM_4_AIRX,
MOONSHOT, MiniMax,
GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE, GEMINI_20_FLASH, GEMINI, GEMINI_PRO, GEMINI_15_flash, GEMINI_15_PRO, GEMINI_20_flash_exp,
CLAUDE_4_OPUS, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229, CLAUDE_35_SONNET, CLAUDE_35_SONNET_1022, CLAUDE_35_SONNET_0620, CLAUDE_3_SONNET, CLAUDE_3_HAIKU, "claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3.5-sonnet",
"moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
QWEN, QWEN_TURBO, QWEN_PLUS, QWEN_MAX,
LINKAI_35, LINKAI_4_TURBO, LINKAI_4o,
O1, O1_MINI,
# DeepSeek
DEEPSEEK_CHAT, DEEPSEEK_REASONER,
# Qwen
QWEN, QWEN_TURBO, QWEN_PLUS, QWEN_MAX, QWEN_LONG, QWEN3_MAX,
# MiniMax
MiniMax, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,
# GLM
ZHIPU_AI, GLM_4, GLM_4_PLUS, GLM_4_flash, GLM_4_LONG, GLM_4_ALLTOOLS,
GLM_4_0520, GLM_4_AIR, GLM_4_AIRX, GLM_4_7,
# Kimi
MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
# 其他模型
WEN_XIN, WEN_XIN_4, XUNFEI,
LINKAI_35, LINKAI_4_TURBO, LINKAI_4o,
MODELSCOPE
]

View File

@@ -1,23 +1,30 @@
{
"channel_type": "web",
"model": "claude-sonnet-4-5",
"open_ai_api_key": "YOUR API KEY",
"open_ai_api_base": "https://api.openai.com/v1",
"claude_api_key": "YOUR API KEY",
"claude_api_key": "",
"claude_api_base": "https://api.anthropic.com/v1",
"gemini_api_key": "YOUR API KEY",
"open_ai_api_key": "",
"open_ai_api_base": "https://api.openai.com/v1",
"gemini_api_key": "",
"gemini_api_base": "https://generativelanguage.googleapis.com",
"zhipu_ai_api_key": "",
"minimax_api_key": "",
"dashscope_api_key": "",
"voice_to_text": "openai",
"text_to_voice": "openai",
"voice_reply_voice": false,
"speech_recognition": true,
"group_speech_recognition": false,
"proxy": "",
"use_linkai": false,
"linkai_api_key": "",
"linkai_app_code": "",
"feishu_bot_name": "",
"feishu_app_id": "",
"feishu_app_secret": "",
"dingtalk_client_id": "",
"dingtalk_client_secret":"",
"agent": true,
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30,
"agent_max_steps": 20
"agent_max_context_turns": 20,
"agent_max_steps": 15
}

View File

@@ -183,15 +183,15 @@ available_setting = {
"linkai_api_key": "",
"linkai_app_code": "",
"linkai_api_base": "https://api.link-ai.tech", # linkAI服务地址
"Minimax_api_key": "",
"minimax_api_key": "",
"Minimax_group_id": "",
"Minimax_base_url": "",
"web_port": 9899,
"agent": True, # 是否开启Agent模式
"agent_workspace": "~/cow", # agent工作空间路径用于存储skills、memory等
"agent_max_context_tokens": 40000, # Agent模式下最大上下文tokens
"agent_max_context_turns": 30, # Agent模式下最大上下文轮次
"agent_max_steps": 20, # Agent模式下单次运行最大决策步数
"agent_max_context_tokens": 50000, # Agent模式下最大上下文tokens
"agent_max_context_turns": 30, # Agent模式下最大上下文记忆轮次
"agent_max_steps": 15, # Agent模式下单次运行最大决策步数
}
@@ -243,7 +243,7 @@ class Config(dict):
try:
with open(os.path.join(get_appdata_dir(), "user_datas.pkl"), "rb") as f:
self.user_datas = pickle.load(f)
logger.info("[Config] User datas loaded.")
logger.debug("[Config] User datas loaded.")
except FileNotFoundError as e:
logger.info("[Config] User datas file not found, ignore.")
except Exception as e:
@@ -288,6 +288,15 @@ def drag_sensitive(config):
def load_config():
global config
# 打印 ASCII Logo
logger.info(" ____ _ _ ")
logger.info(" / ___|_____ __ / \\ __ _ ___ _ __ | |_ ")
logger.info("| | / _ \\ \\ /\\ / // _ \\ / _` |/ _ \\ '_ \\| __|")
logger.info("| |__| (_) \\ V V // ___ \\ (_| | __/ | | | |_ ")
logger.info(" \\____\\___/ \\_/\\_//_/ \\_\\__, |\\___|_| |_|\\__|")
logger.info(" |___/ ")
logger.info("")
config_path = "./config.json"
if not os.path.exists(config_path):
logger.info("配置文件不存在将使用config-template.json模板")
@@ -324,6 +333,23 @@ def load_config():
logger.info("[INIT] load config: {}".format(drag_sensitive(config)))
# 打印系统初始化信息
logger.info("[INIT] ========================================")
logger.info("[INIT] System Initialization")
logger.info("[INIT] ========================================")
logger.info("[INIT] Channel: {}".format(config.get("channel_type", "unknown")))
logger.info("[INIT] Model: {}".format(config.get("model", "unknown")))
# Agent模式信息
if config.get("agent", False):
workspace = config.get("agent_workspace", "~/cow")
logger.info("[INIT] Mode: Agent (workspace: {})".format(workspace))
else:
logger.info("[INIT] Mode: Chat (在config.json中设置 \"agent\":true 可启用Agent模式)")
logger.info("[INIT] Debug: {}".format(config.get("debug", False)))
logger.info("[INIT] ========================================")
config.load_user_datas()

View File

@@ -21,5 +21,6 @@ services:
EXPIRES_IN_SECONDS: 3600
USE_GLOBAL_PLUGIN_CONFIG: 'True'
USE_LINKAI: 'False'
AGENT: 'True'
LINKAI_API_KEY: ''
LINKAI_APP_CODE: ''

180
docs/agent.md Normal file
View File

@@ -0,0 +1,180 @@
# CowAgent介绍
## 概述
Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent**能够主动规思考和规划任务、拥有长期记忆、操作计算机和外部资源、创造和执行Skill真正理解你并和你一起成长。CowAgent能够长期运行在个人电脑或服务器中通过飞书、钉钉、企业微信、网页等多种方式进行交互。核心能力如下
- **复杂任务规划**:能够理解复杂任务并自主规划执行,持续思考和调用工具直到完成目标,支持多轮推理和上下文理解
- **工具系统**内置实现10+种工具包括文件读写、bash终端、浏览器、定时任务、记忆管理等通过Agent管理你的计算机或服务器
- **长期记忆**:自动将对话记忆持久化至本地文件和数据库中,包括全局记忆和天级记忆,支持关键词及向量检索
- **Skills系统**新增Skill运行引擎内置多种技能并支持通过自然语言对话完成自定义Skills开发
- **多渠道和多模型支持**支持在Web、飞书、钉钉、企微等多渠道与Agent交互支持Claude、Gemini、OpenAI、GLM、MiniMax、Qwen 等多种国内外主流模型
- **安全和成本**通过秘钥管理工具、提示词控制、系统权限等手段控制Agent的访问安全通过最大记忆轮次、最大上下文token、工具执行步数对token成本进行限制
## 核心功能
### 1. 长期记忆
> 记忆系统让 Agent 能够长期记住重要信息。Agent 会在用户分享偏好、决策、事实等重要信息时主动存储,也会在对话达到一定长度时自动提取摘要。记忆分为核心记忆、天级记忆,支持语义搜索和向量检索的混合检索模式。
第一次启动Agent会主动向用户获取询问关键信息并记录至工作空间 (默认为 ~/cow) 中的智能体设定、用户身份、记忆文件中。
在后续的长期对话中Agent会在需要的时候智能记录或检索记忆并对自身设定、用户偏好、记忆文件等进行不断更新总结和记录经验和教训真正实现自主思考和不断成长。
<img width="800" src="https://cdn.link-ai.tech/doc/20260203000455.png">
### 2. 任务规划和工具调用
工具是Agent访问操作系统资源的核心Agent会根据任务需求智能选择和调用工具完成文件读写、命令执行、定时任务等各类操作。内置工具的视线在项目的 `tools` 目录下。
**主要工具:** 文件读写编辑、Bash终端、浏览器、文件发送、定时调度、记忆搜索、环境配置等。
#### 1.1 终端和文件访问能力
针对操作系统的终端和文件的访问能力是最基础和核心的工具其他很多工具或技能都是基于基础工具进行扩展。用户可通过手机端与Agent交互操作个人电脑或服务器上的资源
<img width="800" src="https://cdn.link-ai.tech/doc/20260202181130.png">
#### 1.2 编程能力
基于编程能力和系统访问能力Agent可以实现从信息搜索、图片等素材生成、编码、测试、部署、Nginx配置修改、发布的 Vibecoding 全流程通过手机端简单的一句命令完成应用的快速demo
<img width="800" src="https://cdn.link-ai.tech/doc/20260203121008.png">
#### 1.3 定时任务
基于 scheduler 工具实现动态定时任务,支持 **一次性任务、固定时间间隔、Cron表达式** 三种形式,任务触发可选择**固定消息发送** 或 **Agent动态任务** 执行两种模式,有很高灵活性:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202195402.png">
同时你也可以通过自然语言快速查看和管理已有的定时任务。
#### 1.4 环境变量管理
技能所需要的秘钥存储在环境变量文件中,由 `env_config` 工具进行管理,你可以通过对话的方式更新秘钥,工具内置了安全保护和脱敏策略,会严格保护秘钥安全:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202234939.png">
### 3. 技能系统
> 技能系统为Agent提供无限的扩展性每个Skill由说明文件、运行脚本 (可选)、资源 (可选) 组成描述如何完成特定类型的任务。通过Skill可以让Agent遵循说明完成复杂流程调用各类工具或对接第三方系统等。
- **内置技能:** 在项目的`skills`目录下包含技能创造器、网络搜索、图像识别openai-image-vision、LinkAI智能体、网页抓取等。内置Skill根据依赖条件 (API Key、系统命令等) 自动判断是否启用。通过技能创造器可以快速创建自定义技能。
- **自定义技能:** 由用户通过对话创建,存放在工作空间中 (`~/cow/skills/`),基于自定义技能可以实现任何
#### 3.1 创建技能
通过 `skill-creator` 技能可以通过对话的方式快速创建技能。你可以在与Agent的写作中让他对将某个工作流程固化为技能或者把任意接口文档和示例发送给Agent让他直接完成对接
<img width="800" src="https://cdn.link-ai.tech/doc/20260202202247.png">
#### 3.2 搜索和图像识别
- **搜索技能:** 系统内置实现了 `bocha-search`(博查搜索)的Skill依赖环境变量 `BOCHA_SEARCH_API_KEY`,可在[控制台]()进行创建并发送给Agent完成配置
- **图像识别技能:** 实现了 `openai-image-vision` 插件,可使用 gpt-4.1-mini、gpt-4.1 等图像识别模型。依赖秘钥 `OPENAI_API_KEY`可通过config.json或env_config工具进行维护。
<img width="800" src="https://cdn.link-ai.tech/doc/20260202213219.png">
#### 3.3 三方知识库和插件
`linkai-agent` 技能可以将 [LinkAI](https://link-ai.tech/) 上的所有智能体作为skill交给Agent使用并实现多智能体决策的效果。
使用方式:需通过对话的方式配置 `LINKAI_API_KEY`或在config.json中添加 `linkai_api_key`。 并在 `skills/linkai-agent/config.json`中添加智能体说明,示例如下:
```json
{
"apps": [
{
"app_code": "G7z6vKwp",
"app_name": "LinkAI客服助手",
"app_description": "当用户需要了解LinkAI平台相关问题时才选择该助手基于LinkAI知识库进行回答"
},
{
"app_code": "SFY5x7JR",
"app_name": "内容创作助手",
"app_description": "当用户需要创作图片或视频时才使用该助手支持Nano Banana、Seedream、即梦、Veo、可灵等多种模型"
}
]
}
```
Agent可根据智能体的名称和描述进行决策并通过 app_code 调用接口访问对应的应用/工作流通过该技能可以灵活访问LinkAI平台上的智能体、知识库、插件等能力实现效果如下
<img width="750" src="https://cdn.link-ai.tech/doc/20260202234350.png">
注:需通过 `env_config` 配置 `LINKAI_API_KEY`或在config.json中添加 `linkai_api_key` 配置。
## 使用方式
> 详细使用方式参考项目README.md文档进行
### 1.项目运行
在命令行中执行:
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
详细说明及后续程序管理参考:[项目启动脚本](https://github.com/zhayujie/chatgpt-on-wechat/wiki/CowAgentQuickStart)
### 2.模型选择
Agent模式推荐使用以下模型可根据效果及成本综合选择
- **Claude**: `claude-sonnet-4-5``claude-sonnet-4-0`
- **Gemini**: `gemini-3-flash-preview``gemini-3-pro-preview`
- **GLM**: `glm-4.7`
- **MiniMax**: `MiniMax-M2.1`
- **Qwen**: `qwen3-max`
详细模型配置方式参考 [README.md 模型说明](../README.md#模型说明)
### 3.Agent核心配置
Agent模式的核心配置项如下`config.json` 中配置:
```bash
{
"agent": true, # 是否启用Agent模式
"agent_workspace": "~/cow", # Agent工作空间路径
"agent_max_context_tokens": 40000, # 最大上下文tokens
"agent_max_context_turns": 30, # 最大上下文记忆轮次
"agent_max_steps": 15 # 单次任务最大决策步数
}
```
**配置说明:**
- `agent`: 设为 `true` 启用Agent模式获得多轮工具决策、长期记忆、Skills等能力
- `agent_workspace`: 工作空间路径,用于存储 memory、skills、其他系统设定提示词
- `agent_max_context_tokens`: 上下文token上限超出将自动丢弃最早的对话
- `agent_max_context_turns`: 上下文记忆轮次,每轮包括一次提问和回复
- `agent_max_steps`: 单次任务最大工具调用步数,防止无限循环
### 4.渠道接入
Agent支持在多种渠道中使用只需修改 `config.json` 中的 `channel_type` 配置即可切换。
- **Web网页**:默认使用该渠道,运行后监听本地端口,通过浏览器访问
- **飞书接入**[飞书接入文档](https://docs.link-ai.tech/cow/multi-platform/feishu)
- **钉钉接入**[钉钉接入文档](https://docs.link-ai.tech/cow/multi-platform/dingtalk)
- **企业微信应用接入**[企微应用文档](https://docs.link-ai.tech/cow/multi-platform/wechat-com)
更多渠道配置参考:[通道说明](../README.md#通道说明)

121
docs/release/2.0.0.md Normal file
View File

@@ -0,0 +1,121 @@
# CowAgent 2.0
🚀 CowAgent 2.0 实现了从聊天机器人到**超级智能助理**的全面升级!现在它能够主动思考和规划任务、拥有长期记忆、操作计算机和外部资源、创造和执行技能,真正理解你并和你一起成长。
### ✨ 重点更新
- Agent核心能力
- **复杂任务规划**:能够理解复杂任务并自主规划执行,持续思考和调用工具直到完成目标,支持多轮推理和上下文理解。
- **长期记忆**:自动将对话记忆持久化至本地文件和数据库中,包括全局记忆和天级记忆,支持关键词及向量检索。
- **内置系统工具**内置实现10+种工具包括文件操作、bash终端、浏览器、文件发送、定时任务、记忆管理等。
- **Skills**新增Skill运行引擎内置多种技能并支持通过自然语言对话完成自定义Skills开发。
- **安全和成本**通过秘钥管理工具、提示词控制、系统权限等手段控制Agent的访问安全通过最大记忆轮次、最大上下文token、工具执行步数对token成本进行限制。
- 其他更新:
- 渠道优化:飞书及钉钉接入渠道支持长连接接入(无需公网IP)、支持图片/文件消息的接收和发送。
- 模型更新新增claude-sonnet-4-5、gemini-3-pro-preview、glm-4.7、MiniMax-M2.1、qwen3-max等最新模型。
- 部署优化:增加一键安装、配置、运行、管理的脚本,简化部署流程。
## 一、长期记忆系统
Agent 会在用户分享重要信息时主动存储,也会在对话达到一定长度时自动提取摘要。支持语义搜索和向量检索的混合检索模式。
**首次启动**时Agent 会主动询问关键信息,并记录至工作空间(默认 `~/cow`)中的智能体设定、用户身份、记忆文件中。
**长期对话**中Agent 会智能记录或检索记忆,不断更新自身设定、用户偏好,总结经验和教训,真正实现自主思考和持续成长。
<img width="800" src="https://cdn.link-ai.tech/doc/20260203000455.png">
## 二、任务规划与工具调用
Agent 根据任务需求智能选择和调用工具,完成各类复杂操作。
### 1. 终端和文件访问
最基础和核心的工具能力,用户可通过手机端与 Agent 交互,操作个人电脑或服务器上的资源:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202181130.png">
### 2. 应用编程能力
基于编程能力和系统访问能力Agent 可实现从信息搜索、素材生成、编码、测试、部署、Nginx配置、发布的 **Vibecoding 全流程**,通过手机端一句命令完成应用快速 demo。
<img width="800" src="https://cdn.link-ai.tech/doc/20260203121008.png">
### 3. 定时任务
支持 **一次性任务、固定时间间隔、Cron表达式** 三种形式,任务触发可选择 **固定消息发送****Agent动态任务执行** 两种模式:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202195402.png">
### 4. 环境变量管理
通过 `env_config` 工具管理技能所需秘钥,支持对话式更新,内置安全保护和脱敏策略:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202234939.png">
## 三、技能系统
每个 Skill 由说明文件、运行脚本(可选)、资源(可选)组成,为 Agent 提供无限扩展性。
### 1. 技能创造器
通过对话方式快速创建技能,将工作流程固化或对接任意第三方接口:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202202247.png">
### 2. 搜索和图像识别
- **搜索技能**:内置 `bocha-search`(博查搜索),配置 `BOCHA_SEARCH_API_KEY` 即可使用。
- **图像识别**:支持 `gpt-4.1-mini``gpt-4.1` 等模型,配置 `OPENAI_API_KEY` 即可使用。
<img width="800" src="https://cdn.link-ai.tech/doc/20260202213219.png">
### 3. 三方知识库和插件
`linkai-agent` 技能可将 [LinkAI](https://link-ai.tech/) 上的所有智能体作为 skill 使用,实现多智能体决策:
<img width="750" src="https://cdn.link-ai.tech/doc/20260202234350.png">
## 四、快速开始
### 一键启动
本次新增了一键下载、配置、运行和管理的脚本,只需命令行中执行:
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
详细说明参考:[项目启动脚本](https://github.com/zhayujie/chatgpt-on-wechat/wiki/CowAgentQuickStart)
### 模型选择
Agent 模式推荐使用以下模型:
- **Claude**: `claude-sonnet-4-5``claude-sonnet-4-0`
- **Gemini**: `gemini-3-flash-preview``gemini-3-pro-preview`
- **GLM**: `glm-4.7`
- **MiniMax**: `MiniMax-M2.1`
- **Qwen**: `qwen3-max`
详细配置方式参考 [README.md 模型说明](../README.md#模型说明)
### 渠道接入
支持在 Web、飞书、钉钉、企业微信 等多渠道与 Agent 交互,随时随地使用超级助理,只需修改 `config.json` 中的 `channel_type` 配置即可切换:
- **Web网页**:默认使用该渠道,运行后监听本地端口,通过浏览器访问。
- **飞书接入**[飞书接入文档](https://docs.link-ai.tech/cow/multi-platform/feishu)
- **钉钉接入**[钉钉接入文档](https://docs.link-ai.tech/cow/multi-platform/dingtalk)
- **企业微信应用接入**[企微应用文档](https://docs.link-ai.tech/cow/multi-platform/wechat-com)
更多渠道配置参考:[通道说明](../README.md#通道说明)
## 五、参与共建
2.0版本后项目将持续升级Agent能力、拓展接入渠道、内置工具、技能系统降低模型成本和提升安全性。欢迎 [提出反馈](https://github.com/zhayujie/chatgpt-on-wechat/issues) 和 [贡献代码](https://github.com/zhayujie/chatgpt-on-wechat/pulls)。
**🤖立即体验 CowAgent 2.0开启你的超级AI助理之旅**

View File

@@ -40,9 +40,6 @@ def create_bot(bot_type):
from models.linkai.link_ai_bot import LinkAIBot
return LinkAIBot()
elif bot_type == const.CLAUDEAI:
from models.claude.claude_ai_bot import ClaudeAIBot
return ClaudeAIBot()
elif bot_type == const.CLAUDEAPI:
from models.claudeapi.claude_api_bot import ClaudeAPIBot
return ClaudeAPIBot()

View File

@@ -1,5 +1,6 @@
# encoding:utf-8
import json
from models.bot import Bot
from models.session_manager import SessionManager
from bridge.context import ContextType
@@ -17,7 +18,15 @@ dashscope_models = {
"qwen-turbo": dashscope.Generation.Models.qwen_turbo,
"qwen-plus": dashscope.Generation.Models.qwen_plus,
"qwen-max": dashscope.Generation.Models.qwen_max,
"qwen-bailian-v1": dashscope.Generation.Models.bailian_v1
"qwen-bailian-v1": dashscope.Generation.Models.bailian_v1,
# Qwen3 series models - use string directly as model name
"qwen3-max": "qwen3-max",
"qwen3-plus": "qwen3-plus",
"qwen3-turbo": "qwen3-turbo",
# Other new models
"qwen-long": "qwen-long",
"qwq-32b-preview": "qwq-32b-preview",
"qvq-72b-preview": "qvq-72b-preview"
}
# ZhipuAI对话模型API
class DashscopeBot(Bot):
@@ -26,7 +35,8 @@ class DashscopeBot(Bot):
self.sessions = SessionManager(DashscopeSession, model=conf().get("model") or "qwen-plus")
self.model_name = conf().get("model") or "qwen-plus"
self.api_key = conf().get("dashscope_api_key")
os.environ["DASHSCOPE_API_KEY"] = self.api_key
if self.api_key:
os.environ["DASHSCOPE_API_KEY"] = self.api_key
self.client = dashscope.Generation
def reply(self, query, context=None):
@@ -115,3 +125,404 @@ class DashscopeBot(Bot):
return self.reply_text(session, retry_count + 1)
else:
return result
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call DashScope API with tool support for agent integration
This method handles:
1. Format conversion (Claude format → DashScope format)
2. System prompt injection
3. API calling with DashScope SDK
4. Thinking mode support (enable_thinking for Qwen3)
Args:
messages: List of messages (may be in Claude format from agent)
tools: List of tool definitions (may be in Claude format from agent)
stream: Whether to use streaming
**kwargs: Additional parameters (max_tokens, temperature, system, etc.)
Returns:
Formatted response or generator for streaming
"""
try:
# Convert messages from Claude format to DashScope format
messages = self._convert_messages_to_dashscope_format(messages)
# Convert tools from Claude format to DashScope format
if tools:
tools = self._convert_tools_to_dashscope_format(tools)
# Handle system prompt
system_prompt = kwargs.get('system')
if system_prompt:
# Add system message at the beginning if not already present
if not messages or messages[0].get('role') != 'system':
messages = [{"role": "system", "content": system_prompt}] + messages
else:
# Replace existing system message
messages[0] = {"role": "system", "content": system_prompt}
# Build request parameters
model_name = kwargs.get("model", self.model_name)
parameters = {
"result_format": "message", # Required for tool calling
"temperature": kwargs.get("temperature", conf().get("temperature", 0.85)),
"top_p": kwargs.get("top_p", conf().get("top_p", 0.8)),
}
# Add max_tokens if specified
if kwargs.get("max_tokens"):
parameters["max_tokens"] = kwargs["max_tokens"]
# Add tools if provided
if tools:
parameters["tools"] = tools
# Add tool_choice if specified
if kwargs.get("tool_choice"):
parameters["tool_choice"] = kwargs["tool_choice"]
# Add thinking parameters for Qwen3 models (disabled by default for stability)
if "qwen3" in model_name.lower() or "qwq" in model_name.lower():
# Only enable thinking mode if explicitly requested
enable_thinking = kwargs.get("enable_thinking", False)
if enable_thinking:
parameters["enable_thinking"] = True
# Set thinking budget if specified
if kwargs.get("thinking_budget"):
parameters["thinking_budget"] = kwargs["thinking_budget"]
# Qwen3 requires incremental_output=true in thinking mode
if stream:
parameters["incremental_output"] = True
# Always use incremental_output for streaming (for better token-by-token streaming)
# This is especially important for tool calling to avoid incomplete responses
if stream:
parameters["incremental_output"] = True
# Make API call with DashScope SDK
if stream:
return self._handle_stream_response(model_name, messages, parameters)
else:
return self._handle_sync_response(model_name, messages, parameters)
except Exception as e:
error_msg = str(e)
logger.error(f"[DASHSCOPE] call_with_tools error: {error_msg}")
if stream:
def error_generator():
yield {
"error": True,
"message": error_msg,
"status_code": 500
}
return error_generator()
else:
return {
"error": True,
"message": error_msg,
"status_code": 500
}
def _handle_sync_response(self, model_name, messages, parameters):
"""Handle synchronous DashScope API response"""
try:
# Set API key before calling
dashscope.api_key = self.api_key
response = dashscope.Generation.call(
model=dashscope_models.get(model_name, model_name),
messages=messages,
**parameters
)
if response.status_code == HTTPStatus.OK:
# Convert DashScope response to OpenAI-compatible format
choice = response.output.choices[0]
return {
"id": response.request_id,
"object": "chat.completion",
"created": 0,
"model": model_name,
"choices": [{
"index": 0,
"message": {
"role": choice.message.role,
"content": choice.message.content,
"tool_calls": self._convert_tool_calls_to_openai_format(
choice.message.get("tool_calls")
)
},
"finish_reason": choice.finish_reason
}],
"usage": {
"prompt_tokens": response.usage.input_tokens,
"completion_tokens": response.usage.output_tokens,
"total_tokens": response.usage.total_tokens
}
}
else:
logger.error(f"[DASHSCOPE] API error: {response.code} - {response.message}")
return {
"error": True,
"message": response.message,
"status_code": response.status_code
}
except Exception as e:
logger.error(f"[DASHSCOPE] sync response error: {e}")
return {
"error": True,
"message": str(e),
"status_code": 500
}
def _handle_stream_response(self, model_name, messages, parameters):
"""Handle streaming DashScope API response"""
try:
# Set API key before calling
dashscope.api_key = self.api_key
responses = dashscope.Generation.call(
model=dashscope_models.get(model_name, model_name),
messages=messages,
stream=True,
**parameters
)
# Stream chunks to caller, converting to OpenAI format
for response in responses:
if response.status_code != HTTPStatus.OK:
logger.error(f"[DASHSCOPE] Stream error: {response.code} - {response.message}")
yield {
"error": True,
"message": response.message,
"status_code": response.status_code
}
continue
# Get choice - use try-except because DashScope raises KeyError on hasattr()
try:
if isinstance(response.output, dict):
choice = response.output['choices'][0]
else:
choice = response.output.choices[0]
except (KeyError, AttributeError, IndexError) as e:
logger.warning(f"[DASHSCOPE] Cannot get choice: {e}")
continue
# Get finish_reason safely
finish_reason = None
try:
if isinstance(choice, dict):
finish_reason = choice.get('finish_reason')
else:
finish_reason = choice.finish_reason
except (KeyError, AttributeError):
pass
# Convert to OpenAI-compatible format
openai_chunk = {
"id": response.request_id,
"object": "chat.completion.chunk",
"created": 0,
"model": model_name,
"choices": [{
"index": 0,
"delta": {},
"finish_reason": finish_reason
}]
}
# Get message safely - use try-except
message = {}
try:
if isinstance(choice, dict):
message = choice.get('message', {})
else:
message = choice.message
except (KeyError, AttributeError):
pass
# Add role if present
role = None
try:
if isinstance(message, dict):
role = message.get('role')
else:
role = message.role
except (KeyError, AttributeError):
pass
if role:
openai_chunk["choices"][0]["delta"]["role"] = role
# Add content if present
content = None
try:
if isinstance(message, dict):
content = message.get('content')
else:
content = message.content
except (KeyError, AttributeError):
pass
if content:
openai_chunk["choices"][0]["delta"]["content"] = content
# Add tool_calls if present
# DashScope's response object raises KeyError on hasattr() if attr doesn't exist
# So we use try-except instead
tool_calls = None
try:
if isinstance(message, dict):
tool_calls = message.get('tool_calls')
else:
tool_calls = message.tool_calls
except (KeyError, AttributeError):
pass
if tool_calls:
openai_chunk["choices"][0]["delta"]["tool_calls"] = self._convert_tool_calls_to_openai_format(tool_calls)
yield openai_chunk
except Exception as e:
logger.error(f"[DASHSCOPE] stream response error: {e}")
yield {
"error": True,
"message": str(e),
"status_code": 500
}
def _convert_tools_to_dashscope_format(self, tools):
"""
Convert tools from Claude format to DashScope format
Claude format: {name, description, input_schema}
DashScope format: {type: "function", function: {name, description, parameters}}
"""
if not tools:
return None
dashscope_tools = []
for tool in tools:
# Check if already in DashScope/OpenAI format
if 'type' in tool and tool['type'] == 'function':
dashscope_tools.append(tool)
else:
# Convert from Claude format
dashscope_tools.append({
"type": "function",
"function": {
"name": tool.get("name"),
"description": tool.get("description"),
"parameters": tool.get("input_schema", {})
}
})
return dashscope_tools
def _convert_messages_to_dashscope_format(self, messages):
"""
Convert messages from Claude format to DashScope format
Claude uses content blocks with types like 'tool_use', 'tool_result'
DashScope uses 'tool_calls' in assistant messages and 'tool' role for results
"""
if not messages:
return []
dashscope_messages = []
for msg in messages:
role = msg.get("role")
content = msg.get("content")
# Handle string content (already in correct format)
if isinstance(content, str):
dashscope_messages.append(msg)
continue
# Handle list content (Claude format with content blocks)
if isinstance(content, list):
# Check if this is a tool result message (user role with tool_result blocks)
if role == "user" and any(block.get("type") == "tool_result" for block in content):
# Convert each tool_result block to a separate tool message
for block in content:
if block.get("type") == "tool_result":
dashscope_messages.append({
"role": "tool",
"content": block.get("content", ""),
"tool_call_id": block.get("tool_use_id") # DashScope uses 'tool_call_id'
})
# Check if this is an assistant message with tool_use blocks
elif role == "assistant":
# Separate text content and tool_use blocks
text_parts = []
tool_calls = []
for block in content:
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_use":
tool_calls.append({
"id": block.get("id"),
"type": "function",
"function": {
"name": block.get("name"),
"arguments": json.dumps(block.get("input", {}))
}
})
# Build DashScope format assistant message
dashscope_msg = {
"role": "assistant"
}
# Add content only if there is actual text
# DashScope API: when tool_calls exist, content should be None or omitted if empty
if text_parts:
dashscope_msg["content"] = " ".join(text_parts)
elif not tool_calls:
# If no tool_calls and no text, set empty string (rare case)
dashscope_msg["content"] = ""
# If there are tool_calls but no text, don't set content field at all
if tool_calls:
dashscope_msg["tool_calls"] = tool_calls
dashscope_messages.append(dashscope_msg)
else:
# Other list content, keep as is
dashscope_messages.append(msg)
else:
# Other formats, keep as is
dashscope_messages.append(msg)
return dashscope_messages
def _convert_tool_calls_to_openai_format(self, tool_calls):
"""Convert DashScope tool_calls to OpenAI format"""
if not tool_calls:
return None
openai_tool_calls = []
for tool_call in tool_calls:
# DashScope format is already similar to OpenAI
if isinstance(tool_call, dict):
openai_tool_calls.append(tool_call)
else:
# Handle object format
openai_tool_calls.append({
"id": getattr(tool_call, 'id', None),
"type": "function",
"function": {
"name": tool_call.function.name,
"arguments": tool_call.function.arguments
}
})
return openai_tool_calls

View File

@@ -1,9 +1,9 @@
# encoding:utf-8
import time
import json
import requests
import openai
import openai.error
from models.bot import Bot
from models.minimax.minimax_session import MinimaxSession
from models.session_manager import SessionManager
@@ -11,41 +11,36 @@ from bridge.context import Context, ContextType
from bridge.reply import Reply, ReplyType
from common.log import logger
from config import conf, load_config
from models.chatgpt.chat_gpt_session import ChatGPTSession
import requests
from common import const
# ZhipuAI对话模型API
# MiniMax对话模型API
class MinimaxBot(Bot):
def __init__(self):
super().__init__()
self.args = {
"model": conf().get("model") or "abab6.5", # 对话模型的名称
"temperature": conf().get("temperature", 0.3), # 如果设置,值域须为 [0, 1] 我们推荐 0.3,以达到较合适的效果。
"top_p": conf().get("top_p", 0.95), # 使用默认值
}
self.api_key = conf().get("Minimax_api_key")
self.group_id = conf().get("Minimax_group_id")
self.base_url = conf().get("Minimax_base_url", f"https://api.minimax.chat/v1/text/chatcompletion_pro?GroupId={self.group_id}")
# tokens_to_generate/bot_setting/reply_constraints可自行修改
self.request_body = {
"model": self.args["model"],
"tokens_to_generate": 2048,
"reply_constraints": {"sender_type": "BOT", "sender_name": "MM智能助理"},
"messages": [],
"bot_setting": [
{
"bot_name": "MM智能助理",
"content": "MM智能助理是一款由MiniMax自研的没有调用其他产品的接口的大型语言模型。MiniMax是一家中国科技公司一直致力于进行大模型相关的研究。",
}
],
"model": conf().get("model") or "MiniMax-M2.1",
"temperature": conf().get("temperature", 0.3),
"top_p": conf().get("top_p", 0.95),
}
# Use unified key name: minimax_api_key
self.api_key = conf().get("minimax_api_key")
if not self.api_key:
# Fallback to old key name for backward compatibility
self.api_key = conf().get("Minimax_api_key")
if self.api_key:
logger.warning("[MINIMAX] 'Minimax_api_key' is deprecated, please use 'minimax_api_key' instead")
# REST API endpoint
# Use Chinese endpoint by default, users can override in config
# International users should set: "minimax_api_base": "https://api.minimax.io/v1"
self.api_base = conf().get("minimax_api_base", "https://api.minimaxi.com/v1")
self.sessions = SessionManager(MinimaxSession, model=const.MiniMax)
def reply(self, query, context: Context = None) -> Reply:
# acquire reply content
logger.info("[Minimax_AI] query={}".format(query))
logger.info("[MINIMAX] query={}".format(query))
if context.type == ContextType.TEXT:
session_id = context["session_id"]
reply = None
@@ -62,19 +57,16 @@ class MinimaxBot(Bot):
if reply:
return reply
session = self.sessions.session_query(query, session_id)
logger.debug("[Minimax_AI] session query={}".format(session))
logger.debug("[MINIMAX] session query={}".format(session))
model = context.get("Minimax_model")
new_args = self.args.copy()
if model:
new_args["model"] = model
# if context.get('stream'):
# # reply in stream
# return self.reply_text_stream(query, new_query, session_id)
reply_content = self.reply_text(session, args=new_args)
logger.debug(
"[Minimax_AI] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
"[MINIMAX] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
session.messages,
session_id,
reply_content["content"],
@@ -88,7 +80,7 @@ class MinimaxBot(Bot):
reply = Reply(ReplyType.TEXT, reply_content["content"])
else:
reply = Reply(ReplyType.ERROR, reply_content["content"])
logger.debug("[Minimax_AI] reply {} used 0 tokens.".format(reply_content))
logger.debug("[MINIMAX] reply {} used 0 tokens.".format(reply_content))
return reply
else:
reply = Reply(ReplyType.ERROR, "Bot不支持处理{}类型的消息".format(context.type))
@@ -96,41 +88,62 @@ class MinimaxBot(Bot):
def reply_text(self, session: MinimaxSession, args=None, retry_count=0) -> dict:
"""
call openai's ChatCompletion to get the answer
Call MiniMax API to get the answer using REST API
:param session: a conversation session
:param session_id: session id
:param args: request arguments
:param retry_count: retry count
:return: {}
"""
try:
headers = {"Content-Type": "application/json", "Authorization": "Bearer " + self.api_key}
self.request_body["messages"].extend(session.messages)
logger.info("[Minimax_AI] request_body={}".format(self.request_body))
# logger.info("[Minimax_AI] reply={}, total_tokens={}".format(response.choices[0]['message']['content'], response["usage"]["total_tokens"]))
res = requests.post(self.base_url, headers=headers, json=self.request_body)
if args is None:
args = self.args
# Build request
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {self.api_key}"
}
request_body = {
"model": args.get("model", self.args["model"]),
"messages": session.messages,
"temperature": args.get("temperature", self.args["temperature"]),
"top_p": args.get("top_p", self.args["top_p"]),
}
url = f"{self.api_base}/chat/completions"
logger.debug(f"[MINIMAX] Calling {url} with model={request_body['model']}")
response = requests.post(url, headers=headers, json=request_body, timeout=60)
if response.status_code == 200:
result = response.json()
content = result["choices"][0]["message"]["content"]
total_tokens = result["usage"]["total_tokens"]
completion_tokens = result["usage"]["completion_tokens"]
logger.debug(f"[MINIMAX] reply_text: content_length={len(content)}, tokens={total_tokens}")
# self.request_body["messages"].extend(response.json()["choices"][0]["messages"])
if res.status_code == 200:
response = res.json()
return {
"total_tokens": response["usage"]["total_tokens"],
"completion_tokens": response["usage"]["total_tokens"],
"content": response["reply"],
"total_tokens": total_tokens,
"completion_tokens": completion_tokens,
"content": content,
}
else:
response = res.json()
error = response.get("error")
logger.error(f"[Minimax_AI] chat failed, status_code={res.status_code}, " f"msg={error.get('message')}, type={error.get('type')}")
error_msg = response.text
logger.error(f"[MINIMAX] API error: status={response.status_code}, msg={error_msg}")
result = {"completion_tokens": 0, "content": "提问太快啦,请休息一下再问我吧"}
# Parse error for better messages
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
need_retry = False
if res.status_code >= 500:
# server error, need retry
logger.warn(f"[Minimax_AI] do retry, times={retry_count}")
if response.status_code >= 500:
logger.warning(f"[MINIMAX] Server error, retry={retry_count}")
need_retry = retry_count < 2
elif res.status_code == 401:
elif response.status_code == 401:
result["content"] = "授权失败请检查API Key是否正确"
elif res.status_code == 429:
need_retry = False
elif response.status_code == 429:
result["content"] = "请求过于频繁,请稍后再试"
need_retry = retry_count < 2
else:
@@ -141,11 +154,489 @@ class MinimaxBot(Bot):
return self.reply_text(session, args, retry_count + 1)
else:
return result
except Exception as e:
logger.exception(e)
except requests.exceptions.Timeout:
logger.error("[MINIMAX] Request timeout")
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
result = {"completion_tokens": 0, "content": "请求超时,请稍后再试"}
if need_retry:
time.sleep(3)
return self.reply_text(session, args, retry_count + 1)
else:
return result
except Exception as e:
logger.error(f"[MINIMAX] reply_text error: {e}")
import traceback
logger.error(traceback.format_exc())
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
if need_retry:
time.sleep(3)
return self.reply_text(session, args, retry_count + 1)
else:
return result
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call MiniMax API with tool support for agent integration
This method handles:
1. Format conversion (Claude format → OpenAI format)
2. System prompt injection
3. API calling with REST API
4. Interleaved Thinking support (reasoning_split=True)
Args:
messages: List of messages (may be in Claude format from agent)
tools: List of tool definitions (may be in Claude format from agent)
stream: Whether to use streaming
**kwargs: Additional parameters (max_tokens, temperature, system, etc.)
Returns:
Formatted response or generator for streaming
"""
try:
# Convert messages from Claude format to OpenAI format
converted_messages = self._convert_messages_to_openai_format(messages)
# Extract and inject system prompt if provided
system_prompt = kwargs.pop("system", None)
if system_prompt:
# Add system message at the beginning
converted_messages.insert(0, {"role": "system", "content": system_prompt})
# Convert tools from Claude format to OpenAI format
converted_tools = None
if tools:
converted_tools = self._convert_tools_to_openai_format(tools)
# Prepare API parameters
model = kwargs.pop("model", None) or self.args["model"]
max_tokens = kwargs.pop("max_tokens", 4096)
temperature = kwargs.pop("temperature", self.args["temperature"])
# Build request body
request_body = {
"model": model,
"messages": converted_messages,
"max_tokens": max_tokens,
"temperature": temperature,
"stream": stream,
}
# Add tools if provided
if converted_tools:
request_body["tools"] = converted_tools
# Add reasoning_split=True for better thinking control (M2.1 feature)
# This separates thinking content into reasoning_details field
request_body["reasoning_split"] = True
logger.debug(f"[MINIMAX] API call: model={model}, tools={len(converted_tools) if converted_tools else 0}, stream={stream}")
# Check if we should show thinking process
show_thinking = kwargs.pop("show_thinking", conf().get("minimax_show_thinking", False))
if stream:
return self._handle_stream_response(request_body, show_thinking=show_thinking)
else:
return self._handle_sync_response(request_body)
except Exception as e:
logger.error(f"[MINIMAX] call_with_tools error: {e}")
import traceback
logger.error(traceback.format_exc())
def error_generator():
yield {"error": True, "message": str(e), "status_code": 500}
return error_generator()
def _convert_messages_to_openai_format(self, messages):
"""
Convert messages from Claude format to OpenAI format
Claude format:
- role: "user" | "assistant"
- content: string | list of content blocks
OpenAI format:
- role: "user" | "assistant" | "tool"
- content: string
- tool_calls: list (for assistant)
- tool_call_id: string (for tool results)
"""
converted = []
for msg in messages:
role = msg.get("role")
content = msg.get("content")
if role == "user":
# Handle user message
if isinstance(content, list):
# Extract text from content blocks
text_parts = []
tool_result = None
for block in content:
if isinstance(block, dict):
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
# Tool result should be a separate message with role="tool"
tool_result = {
"role": "tool",
"tool_call_id": block.get("tool_use_id"),
"content": str(block.get("content", ""))
}
if text_parts:
converted.append({
"role": "user",
"content": "\n".join(text_parts)
})
if tool_result:
converted.append(tool_result)
else:
# Simple text content
converted.append({
"role": "user",
"content": str(content)
})
elif role == "assistant":
# Handle assistant message
openai_msg = {"role": "assistant"}
if isinstance(content, list):
# Parse content blocks
text_parts = []
tool_calls = []
for block in content:
if isinstance(block, dict):
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_use":
# Convert to OpenAI tool_calls format
tool_calls.append({
"id": block.get("id"),
"type": "function",
"function": {
"name": block.get("name"),
"arguments": json.dumps(block.get("input", {}))
}
})
# Set content (can be empty if only tool calls)
if text_parts:
openai_msg["content"] = "\n".join(text_parts)
elif not tool_calls:
openai_msg["content"] = ""
# Set tool_calls
if tool_calls:
openai_msg["tool_calls"] = tool_calls
# When tool_calls exist and content is empty, set to None
if not text_parts:
openai_msg["content"] = None
else:
# Simple text content
openai_msg["content"] = str(content) if content else ""
converted.append(openai_msg)
return converted
def _convert_tools_to_openai_format(self, tools):
"""
Convert tools from Claude format to OpenAI format
Claude format:
{
"name": "tool_name",
"description": "description",
"input_schema": {...}
}
OpenAI format:
{
"type": "function",
"function": {
"name": "tool_name",
"description": "description",
"parameters": {...}
}
}
"""
converted = []
for tool in tools:
converted.append({
"type": "function",
"function": {
"name": tool.get("name"),
"description": tool.get("description"),
"parameters": tool.get("input_schema", {})
}
})
return converted
def _handle_sync_response(self, request_body):
"""Handle synchronous API response"""
try:
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {self.api_key}"
}
# Remove stream from body for sync request
request_body.pop("stream", None)
url = f"{self.api_base}/chat/completions"
response = requests.post(url, headers=headers, json=request_body, timeout=60)
if response.status_code != 200:
error_msg = response.text
logger.error(f"[MINIMAX] API error: status={response.status_code}, msg={error_msg}")
yield {"error": True, "message": error_msg, "status_code": response.status_code}
return
result = response.json()
message = result["choices"][0]["message"]
finish_reason = result["choices"][0]["finish_reason"]
# Build response in Claude-like format
response_data = {
"role": "assistant",
"content": []
}
# Add reasoning_details (thinking) if present
if "reasoning_details" in message:
for reasoning in message["reasoning_details"]:
if "text" in reasoning:
response_data["content"].append({
"type": "thinking",
"thinking": reasoning["text"]
})
# Add text content if present
if message.get("content"):
response_data["content"].append({
"type": "text",
"text": message["content"]
})
# Add tool calls if present
if message.get("tool_calls"):
for tool_call in message["tool_calls"]:
response_data["content"].append({
"type": "tool_use",
"id": tool_call["id"],
"name": tool_call["function"]["name"],
"input": json.loads(tool_call["function"]["arguments"])
})
# Set stop_reason
if finish_reason == "tool_calls":
response_data["stop_reason"] = "tool_use"
elif finish_reason == "stop":
response_data["stop_reason"] = "end_turn"
else:
response_data["stop_reason"] = finish_reason
yield response_data
except requests.exceptions.Timeout:
logger.error("[MINIMAX] Request timeout")
yield {"error": True, "message": "Request timeout", "status_code": 500}
except Exception as e:
logger.error(f"[MINIMAX] sync response error: {e}")
import traceback
logger.error(traceback.format_exc())
yield {"error": True, "message": str(e), "status_code": 500}
def _handle_stream_response(self, request_body, show_thinking=False):
"""Handle streaming API response
Args:
request_body: API request parameters
show_thinking: Whether to show thinking/reasoning process to users
"""
try:
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {self.api_key}"
}
url = f"{self.api_base}/chat/completions"
response = requests.post(url, headers=headers, json=request_body, stream=True, timeout=60)
if response.status_code != 200:
error_msg = response.text
logger.error(f"[MINIMAX] API error: status={response.status_code}, msg={error_msg}")
yield {"error": True, "message": error_msg, "status_code": response.status_code}
return
current_content = []
current_tool_calls = {}
current_reasoning = []
finish_reason = None
chunk_count = 0
# Process SSE stream
for line in response.iter_lines():
if not line:
continue
line = line.decode('utf-8')
if not line.startswith('data: '):
continue
data_str = line[6:] # Remove 'data: ' prefix
if data_str.strip() == '[DONE]':
break
try:
chunk = json.loads(data_str)
chunk_count += 1
except json.JSONDecodeError as e:
logger.warning(f"[MINIMAX] JSON decode error: {e}, data: {data_str[:100]}")
continue
# Check for error response (MiniMax format)
if chunk.get("type") == "error" or "error" in chunk:
error_data = chunk.get("error", {})
error_msg = error_data.get("message", "Unknown error")
error_type = error_data.get("type", "")
http_code = error_data.get("http_code", "")
logger.error(f"[MINIMAX] API error: {error_msg} (type: {error_type}, code: {http_code})")
yield {
"error": True,
"message": error_msg,
"status_code": int(http_code) if http_code.isdigit() else 500
}
return
if not chunk.get("choices"):
continue
choice = chunk["choices"][0]
delta = choice.get("delta", {})
# Handle reasoning_details (thinking)
if "reasoning_details" in delta:
for reasoning in delta["reasoning_details"]:
if "text" in reasoning:
reasoning_id = reasoning.get("id", "reasoning-text-1")
reasoning_index = reasoning.get("index", 0)
reasoning_text = reasoning["text"]
# Accumulate reasoning text
if reasoning_index >= len(current_reasoning):
current_reasoning.append({"id": reasoning_id, "text": ""})
current_reasoning[reasoning_index]["text"] += reasoning_text
# Optionally yield thinking as visible content
if show_thinking:
# Format thinking text for display
formatted_thinking = f"💭 {reasoning_text}"
# Yield as OpenAI-format content delta
yield {
"choices": [{
"index": 0,
"delta": {
"role": "assistant",
"content": formatted_thinking
}
}]
}
# Handle text content
if "content" in delta and delta["content"]:
# Start new content block if needed
if not any(block.get("type") == "text" for block in current_content):
current_content.append({"type": "text", "text": ""})
# Accumulate text
for block in current_content:
if block.get("type") == "text":
block["text"] += delta["content"]
break
# Yield OpenAI-format delta (for agent_stream.py compatibility)
yield {
"choices": [{
"index": 0,
"delta": {
"role": "assistant",
"content": delta["content"]
}
}]
}
# Handle tool calls
if "tool_calls" in delta:
for tool_call_chunk in delta["tool_calls"]:
index = tool_call_chunk.get("index", 0)
if index not in current_tool_calls:
# Start new tool call
current_tool_calls[index] = {
"id": tool_call_chunk.get("id", ""),
"type": "tool_use",
"name": tool_call_chunk.get("function", {}).get("name", ""),
"input": ""
}
# Accumulate tool call arguments
if "function" in tool_call_chunk and "arguments" in tool_call_chunk["function"]:
current_tool_calls[index]["input"] += tool_call_chunk["function"]["arguments"]
# Yield OpenAI-format tool call delta
yield {
"choices": [{
"index": 0,
"delta": {
"tool_calls": [tool_call_chunk]
}
}]
}
# Handle finish_reason
if choice.get("finish_reason"):
finish_reason = choice["finish_reason"]
# Log complete reasoning_details for debugging
if current_reasoning:
logger.debug(f"[MINIMAX] ===== Complete Reasoning Details =====")
for i, reasoning in enumerate(current_reasoning):
reasoning_text = reasoning.get("text", "")
logger.debug(f"[MINIMAX] Reasoning {i+1} (length={len(reasoning_text)}):")
logger.debug(f"[MINIMAX] {reasoning_text}")
logger.debug(f"[MINIMAX] ===== End Reasoning Details =====")
# Yield final chunk with finish_reason (OpenAI format)
yield {
"choices": [{
"index": 0,
"delta": {},
"finish_reason": finish_reason
}]
}
except requests.exceptions.Timeout:
logger.error("[MINIMAX] Request timeout")
yield {"error": True, "message": "Request timeout", "status_code": 500}
except Exception as e:
logger.error(f"[MINIMAX] stream response error: {e}")
import traceback
logger.error(traceback.format_exc())
yield {"error": True, "message": str(e), "status_code": 500}

View File

@@ -6,8 +6,8 @@ from config import conf
class ZhipuAIImage(object):
def __init__(self):
from zhipuai import ZhipuAI
self.client = ZhipuAI(api_key=conf().get("zhipu_ai_api_key"))
from zai import ZhipuAiClient
self.client = ZhipuAiClient(api_key=conf().get("zhipu_ai_api_key"))
def create_img(self, query, retry_count=0, api_key=None, api_base=None):
try:

View File

@@ -1,9 +1,8 @@
# encoding:utf-8
import time
import json
import openai
import openai.error
from models.bot import Bot
from models.zhipuai.zhipu_ai_session import ZhipuAISession
from models.zhipuai.zhipu_ai_image import ZhipuAIImage
@@ -12,7 +11,7 @@ from bridge.context import ContextType
from bridge.reply import Reply, ReplyType
from common.log import logger
from config import conf, load_config
from zhipuai import ZhipuAI
from zai import ZhipuAiClient
# ZhipuAI对话模型API
@@ -25,7 +24,7 @@ class ZHIPUAIBot(Bot, ZhipuAIImage):
"temperature": conf().get("temperature", 0.9), # 值在(0,1)之间(智谱AI 的温度不能取 0 或者 1)
"top_p": conf().get("top_p", 0.7), # 值在(0,1)之间(智谱AI 的 top_p 不能取 0 或者 1)
}
self.client = ZhipuAI(api_key=conf().get("zhipu_ai_api_key"))
self.client = ZhipuAiClient(api_key=conf().get("zhipu_ai_api_key"))
def reply(self, query, context=None):
# acquire reply content
@@ -49,17 +48,13 @@ class ZHIPUAIBot(Bot, ZhipuAIImage):
session = self.sessions.session_query(query, session_id)
logger.debug("[ZHIPU_AI] session query={}".format(session.messages))
api_key = context.get("openai_api_key") or openai.api_key
model = context.get("gpt_model")
new_args = None
if model:
new_args = self.args.copy()
new_args["model"] = model
# if context.get('stream'):
# # reply in stream
# return self.reply_text_stream(query, new_query, session_id)
reply_content = self.reply_text(session, api_key, args=new_args)
reply_content = self.reply_text(session, args=new_args)
logger.debug(
"[ZHIPU_AI] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
session.messages,
@@ -90,21 +85,17 @@ class ZHIPUAIBot(Bot, ZhipuAIImage):
reply = Reply(ReplyType.ERROR, "Bot不支持处理{}类型的消息".format(context.type))
return reply
def reply_text(self, session: ZhipuAISession, api_key=None, args=None, retry_count=0) -> dict:
def reply_text(self, session: ZhipuAISession, args=None, retry_count=0) -> dict:
"""
call openai's ChatCompletion to get the answer
Call ZhipuAI API to get the answer
:param session: a conversation session
:param session_id: session id
:param args: request arguments
:param retry_count: retry count
:return: {}
"""
try:
# if conf().get("rate_limit_chatgpt") and not self.tb4chatgpt.get_token():
# raise openai.error.RateLimitError("RateLimitError: rate limit exceeded")
# if api_key == None, the default openai.api_key will be used
if args is None:
args = self.args
# response = openai.ChatCompletion.create(api_key=api_key, messages=session.messages, **args)
response = self.client.chat.completions.create(messages=session.messages, **args)
# logger.debug("[ZHIPU_AI] response={}".format(response))
# logger.info("[ZHIPU_AI] reply={}, total_tokens={}".format(response.choices[0]['message']['content'], response["usage"]["total_tokens"]))
@@ -117,23 +108,26 @@ class ZHIPUAIBot(Bot, ZhipuAIImage):
except Exception as e:
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
if isinstance(e, openai.error.RateLimitError):
error_str = str(e).lower()
# Check error type by error message content
if "rate" in error_str and "limit" in error_str:
logger.warn("[ZHIPU_AI] RateLimitError: {}".format(e))
result["content"] = "提问太快啦,请休息一下再问我吧"
if need_retry:
time.sleep(20)
elif isinstance(e, openai.error.Timeout):
elif "timeout" in error_str or "timed out" in error_str:
logger.warn("[ZHIPU_AI] Timeout: {}".format(e))
result["content"] = "我没有收到你的消息"
if need_retry:
time.sleep(5)
elif isinstance(e, openai.error.APIError):
logger.warn("[ZHIPU_AI] Bad Gateway: {}".format(e))
elif "api" in error_str and ("error" in error_str or "gateway" in error_str):
logger.warn("[ZHIPU_AI] APIError: {}".format(e))
result["content"] = "请再问我一次"
if need_retry:
time.sleep(10)
elif isinstance(e, openai.error.APIConnectionError):
logger.warn("[ZHIPU_AI] APIConnectionError: {}".format(e))
elif "connection" in error_str or "network" in error_str:
logger.warn("[ZHIPU_AI] ConnectionError: {}".format(e))
result["content"] = "我连接不到你的网络"
if need_retry:
time.sleep(5)
@@ -144,6 +138,325 @@ class ZHIPUAIBot(Bot, ZhipuAIImage):
if need_retry:
logger.warn("[ZHIPU_AI] 第{}次重试".format(retry_count + 1))
return self.reply_text(session, api_key, args, retry_count + 1)
return self.reply_text(session, args, retry_count + 1)
else:
return result
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call ZhipuAI API with tool support for agent integration
This method handles:
1. Format conversion (Claude format → ZhipuAI format)
2. System prompt injection
3. API calling with ZhipuAI SDK
4. Tool stream support (tool_stream=True for GLM-4.7)
Args:
messages: List of messages (may be in Claude format from agent)
tools: List of tool definitions (may be in Claude format from agent)
stream: Whether to use streaming
**kwargs: Additional parameters (max_tokens, temperature, system, etc.)
Returns:
Formatted response or generator for streaming
"""
try:
# Convert messages from Claude format to ZhipuAI format
messages = self._convert_messages_to_zhipu_format(messages)
# Convert tools from Claude format to ZhipuAI format
if tools:
tools = self._convert_tools_to_zhipu_format(tools)
# Handle system prompt
system_prompt = kwargs.get('system')
if system_prompt:
# Add system message at the beginning if not already present
if not messages or messages[0].get('role') != 'system':
messages = [{"role": "system", "content": system_prompt}] + messages
else:
# Replace existing system message
messages[0] = {"role": "system", "content": system_prompt}
# Build request parameters
request_params = {
"model": kwargs.get("model", self.args.get("model", "glm-4")),
"messages": messages,
"temperature": kwargs.get("temperature", self.args.get("temperature", 0.9)),
"top_p": kwargs.get("top_p", self.args.get("top_p", 0.7)),
"stream": stream
}
# Add max_tokens if specified
if kwargs.get("max_tokens"):
request_params["max_tokens"] = kwargs["max_tokens"]
# Add tools if provided
if tools:
request_params["tools"] = tools
# GLM-4.7 with zai-sdk supports tool_stream for streaming tool calls
if stream:
request_params["tool_stream"] = kwargs.get("tool_stream", True)
# Add thinking parameter for deep thinking mode (GLM-4.7)
thinking = kwargs.get("thinking")
if thinking:
request_params["thinking"] = thinking
elif "glm-4.7" in request_params["model"]:
# Enable thinking by default for GLM-4.7
request_params["thinking"] = {"type": "enabled"}
# Make API call with ZhipuAI SDK
if stream:
return self._handle_stream_response(request_params)
else:
return self._handle_sync_response(request_params)
except Exception as e:
error_msg = str(e)
logger.error(f"[ZHIPU_AI] call_with_tools error: {error_msg}")
if stream:
def error_generator():
yield {
"error": True,
"message": error_msg,
"status_code": 500
}
return error_generator()
else:
return {
"error": True,
"message": error_msg,
"status_code": 500
}
def _handle_sync_response(self, request_params):
"""Handle synchronous ZhipuAI API response"""
try:
response = self.client.chat.completions.create(**request_params)
# Convert ZhipuAI response to OpenAI-compatible format
return {
"id": response.id,
"object": "chat.completion",
"created": response.created,
"model": response.model,
"choices": [{
"index": 0,
"message": {
"role": response.choices[0].message.role,
"content": response.choices[0].message.content,
"tool_calls": self._convert_tool_calls_to_openai_format(
getattr(response.choices[0].message, 'tool_calls', None)
)
},
"finish_reason": response.choices[0].finish_reason
}],
"usage": {
"prompt_tokens": response.usage.prompt_tokens,
"completion_tokens": response.usage.completion_tokens,
"total_tokens": response.usage.total_tokens
}
}
except Exception as e:
logger.error(f"[ZHIPU_AI] sync response error: {e}")
return {
"error": True,
"message": str(e),
"status_code": 500
}
def _handle_stream_response(self, request_params):
"""Handle streaming ZhipuAI API response"""
try:
stream = self.client.chat.completions.create(**request_params)
# Stream chunks to caller, converting to OpenAI format
for chunk in stream:
if not chunk.choices:
continue
delta = chunk.choices[0].delta
# Convert to OpenAI-compatible format
openai_chunk = {
"id": chunk.id,
"object": "chat.completion.chunk",
"created": chunk.created,
"model": chunk.model,
"choices": [{
"index": 0,
"delta": {},
"finish_reason": chunk.choices[0].finish_reason
}]
}
# Add role if present
if hasattr(delta, 'role') and delta.role:
openai_chunk["choices"][0]["delta"]["role"] = delta.role
# Add content if present
if hasattr(delta, 'content') and delta.content:
openai_chunk["choices"][0]["delta"]["content"] = delta.content
# Add reasoning_content if present (GLM-4.7 specific)
if hasattr(delta, 'reasoning_content') and delta.reasoning_content:
# Store reasoning in content or metadata
if "content" not in openai_chunk["choices"][0]["delta"]:
openai_chunk["choices"][0]["delta"]["content"] = ""
# Prepend reasoning to content
openai_chunk["choices"][0]["delta"]["content"] = delta.reasoning_content + openai_chunk["choices"][0]["delta"].get("content", "")
# Add tool_calls if present
if hasattr(delta, 'tool_calls') and delta.tool_calls:
# For streaming, tool_calls need special handling
openai_tool_calls = []
for tc in delta.tool_calls:
tool_call_dict = {
"index": getattr(tc, 'index', 0),
"id": getattr(tc, 'id', None),
"type": "function",
"function": {}
}
# Add function name if present
if hasattr(tc, 'function') and hasattr(tc.function, 'name') and tc.function.name:
tool_call_dict["function"]["name"] = tc.function.name
# Add function arguments if present
if hasattr(tc, 'function') and hasattr(tc.function, 'arguments') and tc.function.arguments:
tool_call_dict["function"]["arguments"] = tc.function.arguments
openai_tool_calls.append(tool_call_dict)
openai_chunk["choices"][0]["delta"]["tool_calls"] = openai_tool_calls
yield openai_chunk
except Exception as e:
logger.error(f"[ZHIPU_AI] stream response error: {e}")
yield {
"error": True,
"message": str(e),
"status_code": 500
}
def _convert_tools_to_zhipu_format(self, tools):
"""
Convert tools from Claude format to ZhipuAI format
Claude format: {name, description, input_schema}
ZhipuAI format: {type: "function", function: {name, description, parameters}}
"""
if not tools:
return None
zhipu_tools = []
for tool in tools:
# Check if already in ZhipuAI/OpenAI format
if 'type' in tool and tool['type'] == 'function':
zhipu_tools.append(tool)
else:
# Convert from Claude format
zhipu_tools.append({
"type": "function",
"function": {
"name": tool.get("name"),
"description": tool.get("description"),
"parameters": tool.get("input_schema", {})
}
})
return zhipu_tools
def _convert_messages_to_zhipu_format(self, messages):
"""
Convert messages from Claude format to ZhipuAI format
Claude uses content blocks with types like 'tool_use', 'tool_result'
ZhipuAI uses 'tool_calls' in assistant messages and 'tool' role for results
"""
if not messages:
return []
zhipu_messages = []
for msg in messages:
role = msg.get("role")
content = msg.get("content")
# Handle string content (already in correct format)
if isinstance(content, str):
zhipu_messages.append(msg)
continue
# Handle list content (Claude format with content blocks)
if isinstance(content, list):
# Check if this is a tool result message (user role with tool_result blocks)
if role == "user" and any(block.get("type") == "tool_result" for block in content):
# Convert each tool_result block to a separate tool message
for block in content:
if block.get("type") == "tool_result":
zhipu_messages.append({
"role": "tool",
"tool_call_id": block.get("tool_use_id"),
"content": block.get("content", "")
})
# Check if this is an assistant message with tool_use blocks
elif role == "assistant":
# Separate text content and tool_use blocks
text_parts = []
tool_calls = []
for block in content:
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_use":
tool_calls.append({
"id": block.get("id"),
"type": "function",
"function": {
"name": block.get("name"),
"arguments": json.dumps(block.get("input", {}))
}
})
# Build ZhipuAI format assistant message
zhipu_msg = {
"role": "assistant",
"content": " ".join(text_parts) if text_parts else None
}
if tool_calls:
zhipu_msg["tool_calls"] = tool_calls
zhipu_messages.append(zhipu_msg)
else:
# Other list content, keep as is
zhipu_messages.append(msg)
else:
# Other formats, keep as is
zhipu_messages.append(msg)
return zhipu_messages
def _convert_tool_calls_to_openai_format(self, tool_calls):
"""Convert ZhipuAI tool_calls to OpenAI format"""
if not tool_calls:
return None
openai_tool_calls = []
for tool_call in tool_calls:
openai_tool_calls.append({
"id": tool_call.id,
"type": "function",
"function": {
"name": tool_call.function.name,
"arguments": tool_call.function.arguments
}
})
return openai_tool_calls

View File

@@ -15,14 +15,9 @@ elevenlabs==1.0.3 # elevenlabs TTS
#install plugin
dulwich
# wechatmp && wechatcom && feishu
web.py
wechatpy
# xunfei spark
websocket-client==1.2.0
# claude API
anthropic==0.25.0
@@ -32,11 +27,5 @@ broadscope_bailian
# google
google-generativeai
# zhipuai
zhipuai>=2.0.1
# tongyi qwen new sdk
dashscope
# tencentcloud sdk
tencentcloud-sdk-python>=3.0.0

View File

@@ -13,6 +13,15 @@ python-dotenv>=1.0.0
PyYAML>=6.0
croniter>=2.0.0
# wechatcom & wechatmp
web.py
wechatpy
# zhipuai
zai-sdk
# tongyi qwen sdk
dashscope
# feishu websocket mode
lark-oapi
# dingtalk

1069
run.sh

File diff suppressed because it is too large Load Diff

View File

@@ -80,7 +80,7 @@ else
# Use sips (macOS) or convert (ImageMagick) to compress
if command -v sips &> /dev/null; then
# macOS: resize to max 800px on longest side
sips -Z 800 "$image_input" --out "$temp_compressed" &> /dev/null
$(command -v sips) -Z 800 "$image_input" --out "$temp_compressed" &> /dev/null
if [ $? -eq 0 ]; then
image_to_encode="$temp_compressed"
>&2 echo "[vision.sh] Compressed large image ($(($file_size / 1024))KB) to avoid parameter limit"
@@ -123,7 +123,8 @@ else
# Encode image to base64
if command -v base64 &> /dev/null; then
# macOS and most Linux systems
base64_image=$(base64 -i "$image_to_encode" 2>/dev/null || base64 "$image_to_encode" 2>/dev/null)
base64_cmd=$(command -v base64)
base64_image=$($base64_cmd -i "$image_to_encode" 2>/dev/null || $base64_cmd "$image_to_encode" 2>/dev/null)
else
echo '{"error": "base64 command not found", "help": "Please install base64 utility"}'
# Clean up temp file if exists
@@ -171,7 +172,8 @@ EOF
fi
# Call OpenAI API
response=$(curl -sS --max-time 60 \
curl_cmd=$(command -v curl)
response=$($curl_cmd -sS --max-time 60 \
-X POST \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-H "Content-Type: application/json" \

View File

@@ -15,11 +15,15 @@ import time
from bridge.reply import Reply, ReplyType
from common.log import logger
from voice.audio_convert import get_pcm_from_wav
from voice.voice import Voice
from voice.ali.ali_api import AliyunTokenGenerator, speech_to_text_aliyun, text_to_speech_aliyun
from config import conf
try:
from voice.audio_convert import get_pcm_from_wav
except ImportError as e:
logger.debug("import voice.audio_convert failed: {}".format(e))
class AliVoice(Voice):
def __init__(self):

View File

@@ -8,7 +8,13 @@ try:
except ImportError:
logger.debug("import pysilk failed, wechaty voice message will not be supported.")
from pydub import AudioSegment
try:
from pydub import AudioSegment
_pydub_available = True
except ImportError:
logger.debug("import pydub failed, voice conversion features will not be supported.")
AudioSegment = None
_pydub_available = False
sil_supports = [8000, 12000, 16000, 24000, 32000, 44100, 48000] # slk转wav时支持的采样率
@@ -44,6 +50,8 @@ def any_to_mp3(any_path, mp3_path):
"""
把任意格式转成mp3文件
"""
if not _pydub_available:
raise ImportError("pydub is required for audio conversion. Please install it with: pip install pydub")
if any_path.endswith(".mp3"):
shutil.copy2(any_path, mp3_path)
return
@@ -58,6 +66,8 @@ def any_to_wav(any_path, wav_path):
"""
把任意格式转成wav文件
"""
if not _pydub_available:
raise ImportError("pydub is required for audio conversion. Please install it with: pip install pydub")
if any_path.endswith(".wav"):
shutil.copy2(any_path, wav_path)
return
@@ -73,6 +83,8 @@ def any_to_sil(any_path, sil_path):
"""
把任意格式转成sil文件
"""
if not _pydub_available:
raise ImportError("pydub is required for audio conversion. Please install it with: pip install pydub")
if any_path.endswith(".sil") or any_path.endswith(".silk") or any_path.endswith(".slk"):
shutil.copy2(any_path, sil_path)
return 10000
@@ -92,6 +104,8 @@ def any_to_amr(any_path, amr_path):
"""
把任意格式转成amr文件
"""
if not _pydub_available:
raise ImportError("pydub is required for audio conversion. Please install it with: pip install pydub")
if any_path.endswith(".amr"):
shutil.copy2(any_path, amr_path)
return
@@ -116,6 +130,8 @@ def split_audio(file_path, max_segment_length_ms=60000):
"""
分割音频文件
"""
if not _pydub_available:
raise ImportError("pydub is required for audio conversion. Please install it with: pip install pydub")
audio = AudioSegment.from_file(file_path)
audio_length_ms = len(audio)
if audio_length_ms <= max_segment_length_ms:

View File

@@ -13,9 +13,13 @@ from bridge.reply import Reply, ReplyType
from common.log import logger
from common.tmp_dir import TmpDir
from config import conf
from voice.audio_convert import get_pcm_from_wav
from voice.voice import Voice
try:
from voice.audio_convert import get_pcm_from_wav
except ImportError as e:
logger.debug("import voice.audio_convert failed: {}".format(e))
class BaiduVoice(Voice):
def __init__(self):
try:

View File

@@ -28,9 +28,15 @@ from config import conf
from voice.voice import Voice
from .xunfei_asr import xunfei_asr
from .xunfei_tts import xunfei_tts
from voice.audio_convert import any_to_mp3
import shutil
from pydub import AudioSegment
try:
from voice.audio_convert import any_to_mp3
from pydub import AudioSegment
_audio_available = True
except ImportError as e:
logger.debug("import audio libraries failed: {}".format(e))
_audio_available = False
class XunfeiVoice(Voice):