Compare commits

98 Commits

Author SHA1 Message Date
zhayujie
d4bdd9b1b7 docs: update README.md for wecom_bot channel 2026-03-16 19:07:08 +08:00
zhayujie
8b45d6c750 docs: wecom_bot integration docs 2026-03-16 19:03:18 +08:00
zhayujie
4ecd4df2d4 feat: web console support wecom_bot config 2026-03-16 17:56:59 +08:00
zhayujie
a42f31fe52 feat: support wecom_bot stream card 2026-03-16 17:46:05 +08:00
zhayujie
d4480b695e feat(channel): add wecom_bot channel 2026-03-16 14:39:15 +08:00
zhayujie
c4b5f7fbae refactor: remove unavailable channels 2026-03-16 11:05:45 +08:00
zhayujie
ba915f2cc0 feat: add gemini-3.1-flash-lite-preview and gpt-5.4 2026-03-15 22:06:12 +08:00
zhayujie
4b91140f31 fix: optimize msg receive 2026-03-12 20:49:36 +08:00
zhayujie
9879878dd0 fix: concurrency issue in session 2026-03-12 17:08:09 +08:00
zhayujie
d78105d57c fix: tool call match 2026-03-12 17:05:27 +08:00
zhayujie
153c9e3565 fix(memory): remove useless prompt 2026-03-12 15:29:58 +08:00
zhayujie
c11623596d fix(memory): prevent context memory loss by improving trim strategy 2026-03-12 15:25:46 +08:00
zhayujie
e791a77f77 fix: strengthen bootstrap flow 2026-03-12 12:13:05 +08:00
zhayujie
b641bffb2c fix(feishu): remove bot_name dependency for group chat 2026-03-12 11:30:42 +08:00
zhayujie
ee0c47ac1e feat: file send prompt 2026-03-12 00:11:34 +08:00
zhayujie
eba90e9343 fix: workspace bootstrap 2026-03-11 23:35:42 +08:00
zhayujie
d8374d0fa5 fix: web_fetch encoding 2026-03-11 19:42:37 +08:00
zhayujie
fa61744c6d feat(web_fetch): support downloading and parsing remote document files (PDF, Word, Excel, PPT) 2026-03-11 17:47:15 +08:00
zhayujie
4fec55cc01 feat: web_fetch tool support remote file url 2026-03-11 17:16:39 +08:00
zhayujie
1767413712 fix: increase minimax max_tokens 2026-03-11 15:31:35 +08:00
zhayujie
734c8fa84f fix: optimize skill prompt 2026-03-11 12:40:37 +08:00
zhayujie
9a8d422554 feat: package skill install 2026-03-11 12:18:36 +08:00
zhayujie
b21e945c76 feat: optimize bootstrap flow 2026-03-11 11:27:08 +08:00
zhayujie
a02bf1ea09 Merge pull request #2693 from 6vision/fix/bot-type-and-web-config
fix: rename zhipu bot_type, persist bot_type in web config, fix re.sub escape error
2026-03-11 10:24:19 +08:00
zhayujie
eda82bac92 fix: gemini tool call bug 2026-03-11 02:04:09 +08:00
zhayujie
e8d4f7dc4f fix: remove useless file 2026-03-10 22:56:00 +08:00
6vision
c4a93b7789 fix: rename zhipu bot_type, persist bot_type in web config, fix re.sub escape error
- Rename ZHIPU_AI bot type from glm-4 to zhipu to avoid confusion with model names

- Add bot_type persistence in web config to fix provider dropdown resetting on refresh

- Change OpenAI provider key to chatGPT to match bot_factory routing

- Add DEEPSEEK constant and route it to ChatGPTBot (OpenAI-compatible API)

- Keep backward compatibility for legacy bot_type glm-4 in bot_factory

- Fix re.sub bad escape error on Windows paths by using lambda replacement

- Remove unused pydantic import in minimax_bot.py

Made-with: Cursor
2026-03-10 21:34:24 +08:00
zhayujie
c3f9925097 fix: remove injected max-steps prompt from persisted conversation history 2026-03-10 20:08:59 +08:00
zhayujie
2a0cf7511a Merge pull request #2692 from 6vision/master
update:Adjust bot_type resolution priority in Agent mode
2026-03-10 15:17:22 +08:00
6vision
d0a70d3339 update:Adjust bot_type resolution priority in Agent mode 2026-03-10 15:14:01 +08:00
zhayujie
f37e4675dd Merge pull request #2691 from Weikjssss/fix-bot-type-conf
fix: pass bot_type in agent mode
2026-03-10 15:00:04 +08:00
zhayujie
4e32f67eeb fix: validate tool_call_id pairing #2690 2026-03-10 14:52:07 +08:00
Weikjssss
36d54cab52 fix: pass bot_type in agent mode 2026-03-10 14:28:39 +08:00
zhayujie
9d8df10dcf feat: clarify send tool is local-only 2026-03-10 12:10:10 +08:00
zhayujie
45ea88e070 Merge pull request #2689 from cowagent/fix/openai-compat-complete
fix: complete openai_compat migration across all model bots (openai>=1.0 compatibility)
2026-03-10 10:10:58 +08:00
cowagent
d5d0b947f5 fix: complete openai_compat migration across all model bots
Replace all direct openai.error.* usages with the openai_compat
compatibility layer to support openai>=1.0.

Affected files:
- models/chatgpt/chat_gpt_bot.py: fix isinstance checks (RateLimitError, Timeout, APIError, APIConnectionError)
- models/openai/open_ai_bot.py: replace import + fix isinstance checks
- models/ali/ali_qwen_bot.py: replace import + fix isinstance checks
- models/modelscope/modelscope_bot.py: remove unused openai.error import

The openai_compat layer (models/openai/openai_compat.py) already
handles both openai<1.0 and openai>=1.0 gracefully. This completes
the migration started in the existing PR #2688.
2026-03-10 10:06:04 +08:00
zhayujie
f775f1f11e Merge pull request #2688 from JasonOA888/fix/openai-compat
fix: use openai_compat layer for error handling (openai>=1.0 compatibility)
2026-03-10 10:02:41 +08:00
JasonOA888
f1e888f3de fix: use openai_compat layer for error handling
The code was directly importing openai.error which fails with openai>=1.0.
The project already has an openai_compat.py compatibility layer that handles
both old (<1.0) and new (>=1.0) OpenAI SDK versions.

This commit updates chat_gpt_bot.py to use the compatibility layer.

Related: #2687
2026-03-10 00:33:45 +08:00
zhayujie
71c8436e90 fix: skill download to temp dir 2026-03-09 18:43:28 +08:00
zhayujie
08c69f5e9b fix: clean existing skill directory before remote install to ensure full overwrite 2026-03-09 17:23:09 +08:00
zhayujie
a50fafaca2 refactor: convert image vision from skill to native tool 2026-03-09 16:01:56 +08:00
zhayujie
3c6781d240 refactor: inline skill-creator reference files into SKILL.md 2026-03-09 12:02:52 +08:00
zhayujie
3b8b5625f8 feat: add image vision provider 2026-03-09 11:37:45 +08:00
zhayujie
6be2034110 feat: add fallback embedding provider 2026-03-09 11:03:31 +08:00
zhayujie
924dc79f00 perf: lazy import to avoid 4-10s startup delay 2026-03-09 10:21:58 +08:00
zhayujie
ccb9030d3c refactor: convert web-fetch from skill to native tool 2026-03-09 10:13:48 +08:00
zhayujie
8623287ac1 docs: update memory system docs 2026-03-08 22:06:28 +08:00
zhayujie
022c13f3a4 feat: upgrade memory flush system
- Use LLM to summarize discarded context into concise daily memory entries
- Batch trim to half when exceeding max_turns/max_tokens, reducing flush frequency
- Run summarization asynchronously in background thread, no blocking on replies
- Add daily scheduled flush (23:55) as fallback for low-activity days
- Sync trimmed messages back to agent to keep context state consistent
2026-03-08 21:56:12 +08:00
zhayujie
0687916e7f fix: Safari IME enter key triggering message send
Made-with: Cursor
2026-03-08 13:21:31 +08:00
zhayujie
bb868b83ba feat: add chat history query 2026-03-08 13:03:27 +08:00
zhayujie
24298130b9 fix: minimax tool_id missing 2026-03-06 18:42:03 +08:00
zhayujie
6e5ee92ebd docs: add gpt-5.4 2026-03-06 12:25:50 +08:00
zhayujie
5b91fe04aa fix: send tool process url 2026-03-06 12:22:22 +08:00
zhayujie
1623deb3ee feat: support gpt-5.4 2026-03-06 12:04:40 +08:00
zhayujie
4a16e05b7a fix: rebuild skills when installing 2026-03-05 21:11:34 +08:00
zhayujie
f1c04bc60d feat: improve channel connection stability 2026-03-05 15:55:16 +08:00
zhayujie
84c6f31c76 fix: update agent skill metadata 2026-03-03 18:16:42 +08:00
zhayujie
9d528190bf feat: add skill category 2026-03-03 16:06:37 +08:00
zhayujie
0f23b209ad fix: adjust the context of restart loading 2026-03-03 11:38:14 +08:00
zhayujie
63d9325900 Merge pull request #2683 from pelioo/master
Update .gitignore to add an ignore rule for the python directory
2026-03-01 19:41:27 +08:00
peli
f342097f81 Merge remote-tracking branch 'upstream/master' 2026-03-01 00:24:14 +08:00
zhayujie
b4806c4366 fix: model provider config 2026-02-28 18:35:04 +08:00
zhayujie
ff37d8a577 Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2026-02-28 18:10:55 +08:00
zhayujie
a773eb7893 fix: filter history to one user and one assistant per turn 2026-02-28 18:09:02 +08:00
zhayujie
7c67513d24 fix: convert bash-style $VAR to %VAR% on Windows 2026-02-28 18:02:06 +08:00
zhayujie
6ed85029c5 fix: agent skills 2026-02-28 16:46:49 +08:00
zhayujie
e9c57ddf4d fix: adjust default turns 2026-02-28 15:25:20 +08:00
zhayujie
a33ce97ed9 fix: restore only user/assistant text from history, strip tool calls
Made-with: Cursor
2026-02-28 15:14:56 +08:00
zhayujie
b788a3dd4e fix: incomplete historical session messages 2026-02-28 15:03:33 +08:00
zhayujie
fccfa92d7e docs: update channel docs 2026-02-28 14:50:55 +08:00
zhayujie
8705bf0a70 feat: update docs 2026-02-28 10:53:16 +08:00
peli
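9318138af7 build(env): update .gitignore to add an ignore rule for the python directory
Added an ignore entry for the python directory to .gitignore
so that Python environment files are not committed to version control.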
2026-02-27 23:49:35 +08:00
zhayujie
269fa7d2d5 feat: 2.0.2 en docs 2026-02-27 18:37:22 +08:00
zhayujie
e99837a8b9 feat: release 2.0.2 2026-02-27 18:04:00 +08:00
zhayujie
553861a2c4 docs: update README.md 2026-02-27 16:57:18 +08:00
zhayujie
628a85d1be docs: update README.md 2026-02-27 16:48:23 +08:00
zhayujie
2cb54514a4 Merge pull request #2681 from zhayujie/feat-docs
feat: docs update
2026-02-27 16:04:17 +08:00
zhayujie
6db22827f2 feat: docs update 2026-02-27 16:03:47 +08:00
zhayujie
4cc6d5426b Merge pull request #2680 from zhayujie/feat-web-config
feat: web console config
2026-02-27 14:40:44 +08:00
zhayujie
7d258b5202 feat(channels): add multi-channel management UI with real-time connect/disconnect
- Web console Channels page: display active channels as config cards, support
  save/connect/disconnect with real-time start/stop of channel processes
- Custom dropdown for channel selection (consistent with model selector style),
  custom confirmation dialog for disconnect
- Fix channel stop: use sys.modules['__main__'] to access live ChannelManager
- Fix web request pending: move stop logic outside lock, set daemon_threads=True
- Fix reconnect: new asyncio event loop per startup, ctypes thread interrupt,
  5s grace period before re-establishing remote connection
- Filter stale offline messages (>60s) pushed after reconnect
2026-02-27 14:39:40 +08:00
zhayujie
c8d19ee0bc Merge pull request #2679 from zhayujie/feat-docs
docs: init docs
2026-02-27 12:14:37 +08:00
zhayujie
d891312032 docs: init docs 2026-02-27 12:10:16 +08:00
zhayujie
5edbf4ce32 feat: model and agent config in web console 2026-02-26 21:01:37 +08:00
zhayujie
3ddbdd713d Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2026-02-26 18:57:43 +08:00
zhayujie
9ba107b511 Merge branch 'feat-multi-channel' 2026-02-26 18:57:19 +08:00
zhayujie
c9adddb76a fix: pass channel_type correctly in multi-channel mode 2026-02-26 18:57:08 +08:00
zhayujie
f0a12d5ff5 Merge pull request #2678 from zhayujie/feat-multi-channel
feat: support multi-channel
2026-02-26 18:34:48 +08:00
zhayujie
7cce224499 feat: support multi-channel 2026-02-26 18:34:08 +08:00
zhayujie
97397ca585 Merge pull request #2674 from haosenwang1018/fix/bare-excepts
fix: replace 29 bare except clauses with except Exception
2026-02-26 12:11:49 +08:00
zhayujie
f2fbc602a8 Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2026-02-26 10:45:01 +08:00
zhayujie
925d728a86 fix: replace upsert syntax to support SQLite lower version 2026-02-26 10:44:04 +08:00
zhayujie
f5f229871b Merge pull request #2676 from zhayujie/feat-multi-channel
feat: improve web console and conversation store
2026-02-26 10:37:03 +08:00
zhayujie
9917552b4b fix: improve web UI stability and conversation history restore
- Fix dark mode FOUC: apply theme in <head> before first paint, defer
  transition-colors to post-init to avoid animated flash on load
- Fix Safari IME Enter bug: defer compositionend reset via setTimeout(0)
- Fix history scroll: use requestAnimationFrame before scrollChatToBottom
- Limit restore turns to min(6, max_turns//3) on restart
- Fix load_messages cutoff to start at turn boundary, preventing orphaned
  tool_use/tool_result pairs from being sent to the LLM
- Merge all assistant messages within one user turn into a single bubble;
  render tool_calls in history using same CSS as live SSE view
- Handle empty choices list in stream chunks
2026-02-26 10:35:20 +08:00
haosenwang1018
adca89b973 fix: replace bare except clauses with except Exception
Bare `except:` catches BaseException including KeyboardInterrupt and
SystemExit. Replaced 29 instances with `except Exception:`.
2026-02-25 11:49:19 +00:00
zhayujie
29bfbecdc9 feat: persistent storage of conversation history 2026-02-25 18:01:39 +08:00
zhayujie
1a7a8c98d9 docs: add scam warning disclaimer 2026-02-25 01:34:16 +08:00
zhayujie
cddb38ac3d Merge pull request #2673 from zhayujie/feat-web-console
feat: web console
2026-02-24 00:06:29 +08:00
zhayujie
d610608391 feat: add cloud host config 2026-02-23 15:06:31 +08:00
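Commit c4a93b7789 in the list above mentions fixing a `re.sub` "bad escape" error on Windows paths by using a lambda replacement. The following is a minimal sketch of that technique, not the project's code: the `{workspace}` placeholder and the `fill_workspace` helper are hypothetical names chosen for illustration. When the replacement is a plain string, `re.sub` parses backslash escapes in it, so a path like `C:\Users\me` fails on `\U`; a callable's return value is inserted verbatim.

```python
import re


def fill_workspace(template: str, workspace: str) -> str:
    """Replace the hypothetical {workspace} placeholder with a literal path."""
    # The return value of a callable replacement is inserted as-is, so
    # backslashes in Windows paths are not parsed as escape sequences.
    return re.sub(r"\{workspace\}", lambda _match: workspace, template)


windows_path = r"C:\Users\me\cow"
prompt = fill_workspace("Files live in {workspace}.", windows_path)

# Passing the same path directly as the replacement string fails instead.
try:
    re.sub(r"\{workspace\}", windows_path, "Files live in {workspace}.")
    string_replacement_failed = False
except re.error:
    string_replacement_failed = True
```

An alternative with the same effect is to escape the backslashes first (`workspace.replace("\\", "\\\\")`), but the callable form avoids having to remember which characters are special in replacement strings.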
216 changed files with 11408 additions and 8932 deletions
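
Commit adca89b973 in the list above replaces bare `except:` clauses with `except Exception:` because a bare clause also catches `KeyboardInterrupt` and `SystemExit`. A small sketch of the difference, using hypothetical helper names (`run_guarded`, `failing_task`, `interrupted_task`) that do not come from the project:

```python
def run_guarded(task):
    """Run task, handling ordinary errors but letting interrupts propagate."""
    try:
        return task()
    except Exception as exc:
        # KeyboardInterrupt and SystemExit derive from BaseException,
        # not Exception, so they are not caught by this clause.
        return f"handled: {type(exc).__name__}"


def failing_task():
    raise ValueError("bad input")


def interrupted_task():
    raise KeyboardInterrupt


handled = run_guarded(failing_task)

try:
    run_guarded(interrupted_task)
    interrupt_propagated = False
except KeyboardInterrupt:
    interrupt_propagated = True
```

With a bare `except:` in `run_guarded`, the second call would return a "handled" string and Ctrl+C would no longer stop the process.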

View File

@@ -79,8 +79,6 @@ body:
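description: |
Make sure you have correctly set the configuration items required by this `channel`. All optional configuration items are listed in [this file](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py); fill in the ones you need in the `config.json` file in the project root.
options:
- wx (personal WeChat, itchat)
- wxy (personal WeChat, wechaty)
- wechatmp (Official Account, subscription account)
- wechatmp_service (Official Account, service account)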
- terminal

3
.gitignore vendored
View File

@@ -3,16 +3,15 @@
.vscode
.venv
.vs
.wechaty/
__pycache__/
venv*
*.pyc
python
config.json
QR.png
nohup.out
tmp
plugins.json
itchat.pkl
*.log
logs/
workspace

View File

@@ -1,14 +1,21 @@
<p align="center"><img src= "https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="Chatgpt-on-Wechat" width="550" /></p>
<p align="center">
<a href="https://github.com/zhayujie/chatgpt-on-wechat/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/chatgpt-on-wechat" alt="Latest release"></a>
<a href="https://github.com/zhayujie/chatgpt-on-wechat/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/chatgpt-on-wechat" alt="Latest release"></a>
<a href="https://github.com/zhayujie/chatgpt-on-wechat/blob/master/LICENSE"><img src="https://img.shields.io/github/license/zhayujie/chatgpt-on-wechat" alt="License: MIT"></a>
<a href="https://github.com/zhayujie/chatgpt-on-wechat"><img src="https://img.shields.io/github/stars/zhayujie/chatgpt-on-wechat?style=flat-square" alt="Stars"></a> <br/>
[中文] | [<a href="docs/en/README.md">English</a>]
</p>
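**CowAgent** is a super AI assistant built on large language models. It can think proactively and plan tasks, operate the computer and external resources, create and execute Skills, and keep long-term memory so that it continues to improve. CowAgent supports flexible switching among multiple models, handles multimodal messages such as text, voice, images, and files, and can be connected to the web, Feishu, DingTalk, WeCom smart bots, WeCom apps, and WeChat Official Accounts, running 24/7 on your personal computer or server.
<p align="center">
<a href="https://cowagent.ai/">🌐 Website</a> &nbsp;·&nbsp;
<a href="https://docs.cowagent.ai/">📖 Documentation</a> &nbsp;·&nbsp;
<a href="https://docs.cowagent.ai/guide/quick-start">🚀 Quick Start</a>
</p>
**CowAgent** is a super AI assistant built on large language models. It can think proactively and plan tasks, operate the computer and external resources, create and execute Skills, and keep long-term memory so that it continues to improve. CowAgent supports flexible switching among multiple models, handles multimodal messages such as text, voice, images, and files, and can be connected to the web, Feishu, DingTalk, WeCom apps, and WeChat Official Accounts, running 24/7 on your personal computer or server.
📖 Feature overview: [CowAgent 2.0](/docs/agent.md)
# Introduction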
@@ -24,12 +31,13 @@
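## Disclaimer
1. This project is released under the [MIT License](/LICENSE) and is intended mainly for technical research and learning. When using it you must comply with local laws and regulations, relevant policies, and corporate bylaws; any use that is illegal or infringes on the rights of others is prohibited. Whichever individual, team, or enterprise uses this project, in whatever way and for whatever audience, the project bears no responsibility for any consequences.
2. Cost and security: in Agent mode, token usage is higher than in ordinary chat mode, so choose a model by weighing quality against cost. The Agent can access the operating system it runs on, so choose the deployment environment with care. The project will keep improving its security mechanisms and reducing model cost.
1. This project is released under the [MIT License](/LICENSE) and is intended mainly for technical research and learning. When using it you must comply with local laws and regulations, relevant policies, and corporate bylaws; any use that is illegal or infringes on the rights of others is prohibited. Whichever individual, team, or enterprise uses this project, in whatever way and for whatever audience, the project bears no responsibility for any consequences.
2. Cost and security: in Agent mode, token usage is higher than in ordinary chat mode, so choose a model by weighing quality against cost. The Agent can access the operating system it runs on, so choose the deployment environment with care. The project will keep improving its security mechanisms and reducing model cost.
3. The CowAgent project focuses on open-source technology development and will not participate in, authorize, or issue any cryptocurrency.
## Demo
Usage guide (Agent mode): [CowAgent introduction](/docs/agent.md)
Usage guide (Agent mode): [CowAgent introduction](https://docs.cowagent.ai/intro/features)
Demo video (chat mode): https://cdn.link-ai.tech/doc/cow_demo.mp4
@@ -57,17 +65,17 @@ Demo video (chat mode): https://cdn.link-ai.tech/doc/cow_demo.mp4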
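# 🏷 Changelog
>**2026.02.27** [Version 2.0.2](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.2): full upgrade of the Web console (streaming chat; management of models, skills, memory, channels, scheduled tasks, and logs), support for running multiple channels at the same time, persistent conversation storage, and several new models.
>**2026.02.13** [Version 2.0.1](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.1): built-in Web Search tool, smart context trimming strategy, dynamic updates of runtime information, Windows compatibility, and fixes for scheduled-task memory loss, Feishu connection, and other issues.
>**2026.02.03** [Version 2.0.0](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.0): officially upgraded to a super Agent assistant with multi-turn task decision-making, long-term memory, multiple system tools, and the Skills framework; added several models and improved the integration channels.
>**2025.05.23** [Version 1.7.6](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.6): improved the web channel, added the [AgentMesh](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/plugins/agent/README.md) multi-agent plugin, improved Baidu speech synthesis, improved `access_token` retrieval for WeCom apps, and added support for the `claude-4-sonnet` and `claude-4-opus` models
>**2025.04.11** [Version 1.7.5](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.5): added support for the [wechatferry](https://github.com/zhayujie/chatgpt-on-wechat/pull/2562) protocol, the deepseek model, Tencent Cloud speech, and the ModelScope and Gitee-AI APIs
>**2024.12.13** [Version 1.7.4](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.4): added the Gemini 2.0 model and the web channel, fixed a memory leak, and fixed the `#reloadp` command not taking effect
>**2024.10.31** [Version 1.7.3](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.3): improved stability, database features, Claude model improvements, linkai plugin improvements, and offline notifications
For the full history, see the [changelog](/docs/release/history.md)
For the full history, see the [changelog](https://docs.cowagent.ai/releases)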
<br/>
@@ -81,7 +89,7 @@ Demo video (chat mode): https://cdn.link-ai.tech/doc/cow_demo.mp4
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
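Script usage: [one-click run script](https://github.com/zhayujie/chatgpt-on-wechat/wiki/CowAgentQuickStart)
Script usage: [one-click run script](https://docs.cowagent.ai/guide/quick-start)
## 1. Preparation
@@ -90,7 +98,7 @@ bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
The project supports model APIs from major vendors in China and abroad. For the available models and their configuration, see [Model notes](#模型说明).
> The following models are recommended in Agent mode; choose by weighing quality against cost: MiniMax-M2.5, glm-5, kimi-k2.5, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview
> The following models are recommended in Agent mode; choose by weighing quality against cost: MiniMax-M2.5, glm-5, kimi-k2.5, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview, gpt-5.4
The **LinkAI platform** API is also supported. It allows flexible switching among common models such as OpenAI, Claude, Gemini, DeepSeek, Qwen, and Kimi, and supports Agent capabilities such as knowledge bases, workflows, and plugins. See the [API documentation](https://docs.link-ai.tech/platform/api).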
@@ -135,7 +143,7 @@ pip3 install -r requirements-optional.txt
```bash
# Example config.json contents
{
"channel_type": "web", # Channel type, defaults to web; can be changed to: feishu,dingtalk,wechatcom_app,terminal,wechatmp,wechatmp_service
"channel_type": "web", # Channel type, defaults to web; can be changed to: feishu,dingtalk,wecom_bot,wechatcom_app,wechatmp_service,wechatmp,terminal
"model": "MiniMax-M2.5", # Model name
"minimax_api_key": "", # MiniMax API Key
"zhipu_ai_api_key": "", # Zhipu GLM API Key
@@ -271,14 +279,14 @@ volumes:
```json
{
"model": "gpt-4.1-mini",
"model": "gpt-5.4",
"open_ai_api_key": "YOUR_API_KEY",
"open_ai_api_base": "https://api.openai.com/v1",
"bot_type": "chatGPT"
}
```
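- `model`: same as the [model parameter](https://platform.openai.com/docs/models) of the OpenAI API; supports the o-series, gpt-5.2, gpt-5.1, gpt-4.1, and other model families
- `model`: same as the [model parameter](https://platform.openai.com/docs/models) of the OpenAI API; supports gpt-5.4, the o-series, gpt-4.1, and other models. `gpt-5.4` is recommended for Agent mode
- `open_ai_api_base`: change this parameter to connect through a third-party proxy API
- `bot_type`: not required when using OpenAI models. Set it to `chatGPT` when accessing non-OpenAI models such as Claude through a third-party proxy API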
</details>
@@ -460,11 +468,11 @@ volumes:
API key: create an API key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn), then configure as follows:
```json
{
"model": "gemini-3.1-pro-preview",
"model": "gemini-3.1-flash-lite-preview",
"gemini_api_key": ""
}
```
- `model`: see the [official model list](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn); supports `gemini-3.1-pro-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, `gemini-2.5-pro`, and `gemini-2.0-flash`
- `model`: see the [official model list](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn); supports `gemini-3.1-flash-lite-preview`, `gemini-3.1-pro-preview`, `gemini-3-flash-preview`, and `gemini-3-pro-preview`
</details>
<details>
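@@ -607,10 +615,12 @@ API key: create an API key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn)
The following describes how to configure each supported channel. The channel code lives in the project's `channel/` directory.
Multiple channels can be connected at the same time; separate them with commas in the configuration, for example `"channel_type": "feishu,dingtalk"`
<details>
<summary>1. Web</summary>
The Web channel runs by default after the project starts. Configuration:
The Web console runs by default after the project starts. Configuration: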
```json
{
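@@ -656,7 +666,7 @@ API key: create an API key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn)
- `feishu_event_mode`: event receiving mode, `websocket` (recommended) or `webhook`
- WebSocket mode requires an extra dependency: `pip3 install lark-oapi`
For detailed steps and parameter descriptions, see [Feishu integration](https://docs.link-ai.tech/cow/multi-platform/feishu)
For detailed steps and parameter descriptions, see [Feishu integration](https://docs.cowagent.ai/channels/feishu)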
</details>
@@ -672,11 +682,27 @@ API key: create an API key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn)
"dingtalk_client_secret": "CLIENT_SECRET"
}
```
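For detailed steps and parameter descriptions, see [DingTalk integration](https://docs.link-ai.tech/cow/multi-platform/dingtalk)
For detailed steps and parameter descriptions, see [DingTalk integration](https://docs.cowagent.ai/channels/dingtalk)
</details>
<details>
<summary>4. WeCom App - WeCom application</summary>
<summary>4. WeCom Bot - WeCom smart bot</summary>
The WeCom smart bot uses a long-lived WebSocket connection, so no public IP or domain name is needed and configuration is simple: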
```json
{
"channel_type": "wecom_bot",
"wecom_bot_id": "YOUR_BOT_ID",
"wecom_bot_secret": "YOUR_SECRET"
}
```
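For detailed steps and parameter descriptions, see [WeCom smart bot integration](https://docs.cowagent.ai/channels/wecom-bot)
</details>
<details>
<summary>5. WeCom App - WeCom application</summary>
To connect a WeCom custom app, create the app in the admin console and enable message callbacks. Example configuration:
@@ -691,12 +717,12 @@ API key: create an API key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn)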
"wechatcomapp_aes_key": "AESKEY"
}
```
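For detailed steps and parameter descriptions, see [WeCom custom app integration](https://docs.link-ai.tech/cow/multi-platform/wechat-com)
For detailed steps and parameter descriptions, see [WeCom custom app integration](https://docs.cowagent.ai/channels/wecom)
</details>
<details>
<summary>5. WeChat MP - WeChat Official Account</summary>
<summary>6. WeChat MP - WeChat Official Account</summary>
The project supports both subscription accounts and service accounts; the service account (`wechatmp_service`) gives a better experience.
@@ -726,12 +752,12 @@ API key: create an API key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn)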
}
```
For detailed steps and parameter descriptions, see [WeChat Official Account integration](https://docs.link-ai.tech/cow/multi-platform/wechat-mp)
For detailed steps and parameter descriptions, see [WeChat Official Account integration](https://docs.cowagent.ai/channels/wechatmp)
</details>
<details>
<summary>6. Terminal</summary>
<summary>7. Terminal</summary>
Edit the `channel_type` field in `config.json`:

View File

@@ -27,7 +27,8 @@ class ChatService:
"""
        self.agent_bridge = agent_bridge

    def run(self, query: str, session_id: str, send_chunk_fn: Callable[[dict], None]):
    def run(self, query: str, session_id: str, send_chunk_fn: Callable[[dict], None],
            channel_type: str = ""):
        """
        Run the agent for *query* and stream results back via *send_chunk_fn*.
@@ -37,6 +38,7 @@ class ChatService:
        :param query: user query text
        :param session_id: session identifier for agent isolation
        :param send_chunk_fn: callable(chunk_data: dict) to send a streaming chunk
        :param channel_type: source channel (e.g. "web", "feishu") for persistence
        """
        agent = self.agent_bridge.get_agent(session_id=session_id)
        if agent is None:
@@ -68,9 +70,24 @@ class ChatService:
                # a new segment; collect tool results until turn_end.
                state.pending_tool_results = []

            elif event_type == "tool_execution_end":
            elif event_type == "tool_execution_start":
                # Notify the client that a tool is about to run (with its input args)
                tool_name = data.get("tool_name", "")
                arguments = data.get("arguments", {})
                # Cache arguments keyed by tool_call_id so tool_execution_end can include them
                tool_call_id = data.get("tool_call_id", tool_name)
                state.pending_tool_arguments[tool_call_id] = arguments
                send_chunk_fn({
                    "chunk_type": "tool_start",
                    "tool": tool_name,
                    "arguments": arguments,
                })

            elif event_type == "tool_execution_end":
                tool_name = data.get("tool_name", "")
                tool_call_id = data.get("tool_call_id", tool_name)
                # Retrieve cached arguments from the matching tool_execution_start event
                arguments = state.pending_tool_arguments.pop(tool_call_id, data.get("arguments", {}))
                result = data.get("result", "")
                status = data.get("status", "unknown")
                execution_time = data.get("execution_time", 0)
@@ -111,7 +128,7 @@ class ChatService:
        logger.info(f"[ChatService] Starting agent run: session={session_id}, query={query[:80]}")

        from config import conf
        max_context_turns = conf().get("agent_max_context_turns", 30)
        max_context_turns = conf().get("agent_max_context_turns", 20)

        # Get full system prompt with skills
        full_system_prompt = agent.get_full_system_prompt()
@@ -149,6 +166,11 @@ class ChatService:
        new_messages = executor.messages[original_length:]
        agent.messages.extend(new_messages)

        # Persist new messages to SQLite so they survive restarts and
        # can be queried via the HISTORY interface.
        if new_messages:
            self._persist_messages(session_id, list(new_messages), channel_type)

        # Store executor reference for files_to_send access
        agent.stream_executor = executor
@@ -158,6 +180,26 @@ class ChatService:
        logger.info(f"[ChatService] Agent run completed: session={session_id}")

    @staticmethod
    def _persist_messages(session_id: str, new_messages: list, channel_type: str = ""):
        try:
            from config import conf
            if not conf().get("conversation_persistence", True):
                return
        except Exception:
            pass
        try:
            from agent.memory import get_conversation_store
            get_conversation_store().append_messages(
                session_id, new_messages, channel_type=channel_type
            )
        except Exception as e:
            logger.warning(
                f"[ChatService] Failed to persist messages for session={session_id}: {e}"
            )


class _StreamState:
    """Mutable state shared between the event callback and the run method."""
@@ -166,3 +208,6 @@ class _StreamState:
        # None means we are not accumulating tool results right now.
        # A list means we are in the middle of a tool-execution phase.
        self.pending_tool_results: Optional[list] = None
        # Maps tool_call_id -> arguments captured from tool_execution_start,
        # so that tool_execution_end can attach the correct input args.
        self.pending_tool_arguments: dict = {}
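
The diff above caches tool arguments at `tool_execution_start` and pops them at `tool_execution_end`, keyed by `tool_call_id`. The sketch below isolates that pattern so it can be run on its own. It is an illustration under assumptions, not the project's implementation: `StreamState` and `make_event_handler` are hypothetical stand-ins, and the `"tool_end"` chunk type is assumed because the end-of-tool chunk is not shown in the diff.

```python
from typing import Any, Callable, Dict, List


class StreamState:
    """Hypothetical stand-in for _StreamState; holds only the argument cache."""

    def __init__(self) -> None:
        self.pending_tool_arguments: Dict[str, Any] = {}


def make_event_handler(
    state: StreamState, send_chunk_fn: Callable[[dict], None]
) -> Callable[[str, dict], None]:
    def on_event(event_type: str, data: dict) -> None:
        tool_name = data.get("tool_name", "")
        # Fall back to the tool name when the event carries no tool_call_id.
        tool_call_id = data.get("tool_call_id", tool_name)
        if event_type == "tool_execution_start":
            arguments = data.get("arguments", {})
            state.pending_tool_arguments[tool_call_id] = arguments
            send_chunk_fn(
                {"chunk_type": "tool_start", "tool": tool_name, "arguments": arguments}
            )
        elif event_type == "tool_execution_end":
            # pop() returns the cached arguments and frees the cache entry.
            arguments = state.pending_tool_arguments.pop(
                tool_call_id, data.get("arguments", {})
            )
            send_chunk_fn(
                {
                    "chunk_type": "tool_end",  # assumed name, not shown in the diff
                    "tool": tool_name,
                    "arguments": arguments,
                    "result": data.get("result", ""),
                }
            )

    return on_event


chunks: List[dict] = []
state = StreamState()
handler = make_event_handler(state, chunks.append)
handler(
    "tool_execution_start",
    {"tool_name": "bash", "tool_call_id": "call_1", "arguments": {"command": "ls"}},
)
handler(
    "tool_execution_end",
    {"tool_name": "bash", "tool_call_id": "call_1", "result": "README.md"},
)
```

Using `pop` with a default keeps the cache from growing across a long session and still works when an end event arrives without a matching start event.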

View File

@@ -1,11 +1,23 @@
"""
Memory module for AgentMesh
Provides long-term memory capabilities with hybrid search (vector + keyword)
Provides both long-term memory (vector/keyword search) and short-term
conversation history persistence (SQLite).
"""
from agent.memory.manager import MemoryManager
from agent.memory.config import MemoryConfig, get_default_memory_config, set_global_memory_config
from agent.memory.embedding import create_embedding_provider
from agent.memory.conversation_store import ConversationStore, get_conversation_store
from agent.memory.summarizer import ensure_daily_memory_file
__all__ = ['MemoryManager', 'MemoryConfig', 'get_default_memory_config', 'set_global_memory_config', 'create_embedding_provider']
__all__ = [
    'MemoryManager',
    'MemoryConfig',
    'get_default_memory_config',
    'set_global_memory_config',
    'create_embedding_provider',
    'ConversationStore',
    'get_conversation_store',
    'ensure_daily_memory_file',
]
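
The module docstring above mentions short-term conversation history persisted in SQLite, and the commit list mentions replacing upsert syntax to support older SQLite versions. A minimal sketch of append-only message storage that avoids `ON CONFLICT ... DO UPDATE` (available from SQLite 3.24.0) follows. The schema is simplified and the free function `append_messages` is a hypothetical illustration; the project's `ConversationStore` may be implemented differently.

```python
import json
import sqlite3
import time

# Simplified, illustrative schema (not the project's exact DDL).
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    session_id  TEXT PRIMARY KEY,
    last_active INTEGER NOT NULL,
    msg_count   INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS messages (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    seq        INTEGER NOT NULL,
    role       TEXT NOT NULL,
    content    TEXT NOT NULL,
    UNIQUE (session_id, seq)
);
"""


def append_messages(conn: sqlite3.Connection, session_id: str, messages: list) -> None:
    """Append messages to a session and refresh its metadata."""
    now = int(time.time())
    with conn:  # commits on success, rolls back on error
        # INSERT OR IGNORE followed by UPDATE has the effect of an upsert
        # without relying on the newer ON CONFLICT ... DO UPDATE syntax.
        conn.execute(
            "INSERT OR IGNORE INTO sessions (session_id, last_active) VALUES (?, ?)",
            (session_id, now),
        )
        (next_seq,) = conn.execute(
            "SELECT COALESCE(MAX(seq), -1) + 1 FROM messages WHERE session_id = ?",
            (session_id,),
        ).fetchone()
        for offset, msg in enumerate(messages):
            conn.execute(
                "INSERT INTO messages (session_id, seq, role, content) VALUES (?, ?, ?, ?)",
                (session_id, next_seq + offset, msg["role"], json.dumps(msg["content"])),
            )
        conn.execute(
            "UPDATE sessions SET last_active = ?, msg_count = msg_count + ? "
            "WHERE session_id = ?",
            (now, len(messages), session_id),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
append_messages(conn, "user_123", [{"role": "user", "content": "hi"}])
append_messages(conn, "user_123", [{"role": "assistant", "content": "hello"}])
rows = conn.execute(
    "SELECT seq, role FROM messages WHERE session_id = ? ORDER BY seq", ("user_123",)
).fetchall()
msg_count = conn.execute(
    "SELECT msg_count FROM sessions WHERE session_id = ?", ("user_123",)
).fetchone()[0]
```

A store shared between threads would additionally need a lock around `append_messages`, since the sequence number is read and then written in separate statements.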

View File

@@ -48,9 +48,6 @@ class MemoryConfig:
    enable_auto_sync: bool = True
    sync_on_search: bool = True

    # Memory flush config (independent of the model context window)
    flush_token_threshold: int = 50000  # flush is triggered at 50K tokens
    flush_turn_threshold: int = 20  # flush is triggered after 20 turns (one user message plus one AI reply is one turn)

    def get_workspace(self) -> Path:
        """Get workspace root directory"""

View File

@@ -0,0 +1,618 @@
"""
Conversation history persistence using SQLite.

Design:
- sessions table: per-session metadata (channel_type, last_active, msg_count)
- messages table: individual messages stored as JSON, append-only
- Pruning: age-based only (sessions not updated within N days are deleted)
- Thread-safe via a single in-process lock

Storage path: ~/cow/sessions/conversations.db
"""

from __future__ import annotations

import json
import sqlite3
import threading
import time
from pathlib import Path
from typing import Any, Dict, List, Optional

from common.log import logger

# ---------------------------------------------------------------------------
# Schema
# ---------------------------------------------------------------------------

_DDL = """
CREATE TABLE IF NOT EXISTS sessions (
    session_id TEXT PRIMARY KEY,
    channel_type TEXT NOT NULL DEFAULT '',
    created_at INTEGER NOT NULL,
    last_active INTEGER NOT NULL,
    msg_count INTEGER NOT NULL DEFAULT 0
);

CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    seq INTEGER NOT NULL,
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    UNIQUE (session_id, seq)
);

CREATE INDEX IF NOT EXISTS idx_messages_session
    ON messages (session_id, seq);

CREATE INDEX IF NOT EXISTS idx_sessions_last_active
    ON sessions (last_active);
"""

# Migration: add channel_type column to existing databases that predate it.
_MIGRATION_ADD_CHANNEL_TYPE = """
ALTER TABLE sessions ADD COLUMN channel_type TEXT NOT NULL DEFAULT '';
"""

DEFAULT_MAX_AGE_DAYS: int = 30


def _is_visible_user_message(content: Any) -> bool:
    """
    Return True when a user-role message represents actual user input
    (not an internal tool_result injected by the agent loop).
    """
    if isinstance(content, str):
        return bool(content.strip())
    if isinstance(content, list):
        return any(
            isinstance(b, dict) and b.get("type") == "text"
            for b in content
        )
    return False


def _extract_display_text(content: Any) -> str:
    """
    Extract the human-readable text portion from a message content value.
    Returns an empty string for tool_use / tool_result blocks.
    """
    if isinstance(content, str):
        return content.strip()
    if isinstance(content, list):
        parts = [
            b.get("text", "")
            for b in content
            if isinstance(b, dict) and b.get("type") == "text"
        ]
        return "\n".join(p for p in parts if p).strip()
    return ""


def _extract_tool_calls(content: Any) -> List[Dict[str, Any]]:
    """
    Extract tool_use blocks from an assistant message content.
    Returns a list of {id, name, arguments} dicts (result filled in later).
    """
    if not isinstance(content, list):
        return []
    return [
        {"id": b.get("id", ""), "name": b.get("name", ""), "arguments": b.get("input", {})}
        for b in content
        if isinstance(b, dict) and b.get("type") == "tool_use"
    ]


def _extract_tool_results(content: Any) -> Dict[str, str]:
    """
    Extract tool_result blocks from a user message, keyed by tool_use_id.
    """
    if not isinstance(content, list):
        return {}
    results = {}
    for b in content:
        if not isinstance(b, dict) or b.get("type") != "tool_result":
            continue
        tool_id = b.get("tool_use_id", "")
        result_content = b.get("content", "")
        if isinstance(result_content, list):
            result_content = "\n".join(
                rb.get("text", "") for rb in result_content
                if isinstance(rb, dict) and rb.get("type") == "text"
            )
        results[tool_id] = str(result_content)
    return results

def _group_into_display_turns(
rows: List[tuple],
) -> List[Dict[str, Any]]:
"""
Convert raw (role, content_json, created_at) DB rows into display turns.
One display turn = one visible user message + one merged assistant reply.
All intermediate assistant messages (those carrying tool_use) and the final
assistant text reply produced for the same user query are collapsed into a
single assistant turn, exactly matching the live SSE rendering where tools
and the final answer appear inside the same bubble.
Grouping rules:
- A visible user message starts a new group.
- tool_result user messages are internal; their content is attached to the
matching tool_use entry via tool_use_id and they never become own turns.
- All assistant messages within a group are merged:
* tool_use blocks → tool_calls list (result filled from tool_results)
* text blocks → last non-empty text becomes the display content
"""
# ------------------------------------------------------------------ #
# Pass 1: split rows into groups, each starting with a visible user msg
# ------------------------------------------------------------------ #
# group = (user_row | None, [subsequent_rows])
# user_row: (content, created_at)
groups: List[tuple] = []
cur_user: Optional[tuple] = None
cur_rest: List[tuple] = []
started = False
for role, raw_content, created_at in rows:
try:
content = json.loads(raw_content)
except Exception:
content = raw_content
if role == "user" and _is_visible_user_message(content):
if started:
groups.append((cur_user, cur_rest))
cur_user = (content, created_at)
cur_rest = []
started = True
else:
cur_rest.append((role, content, created_at))
if started:
groups.append((cur_user, cur_rest))
# ------------------------------------------------------------------ #
# Pass 2: build display turns from each group
# ------------------------------------------------------------------ #
turns: List[Dict[str, Any]] = []
for user_row, rest in groups:
# User turn
if user_row:
content, created_at = user_row
text = _extract_display_text(content)
if text:
turns.append({"role": "user", "content": text, "created_at": created_at})
# Collect all tool_calls and tool_results from the rest of the group
all_tool_calls: List[Dict[str, Any]] = []
tool_results: Dict[str, str] = {}
final_text = ""
final_ts: Optional[int] = None
for role, content, created_at in rest:
if role == "user":
tool_results.update(_extract_tool_results(content))
elif role == "assistant":
tcs = _extract_tool_calls(content)
all_tool_calls.extend(tcs)
t = _extract_display_text(content)
if t:
final_text = t
final_ts = created_at
# Attach tool results to their matching tool_call entries
for tc in all_tool_calls:
tc["result"] = tool_results.get(tc.get("id", ""), "")
if final_text or all_tool_calls:
turns.append({
"role": "assistant",
"content": final_text,
"tool_calls": all_tool_calls,
"created_at": final_ts or (user_row[1] if user_row else 0),
})
return turns
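The grouping rules in the docstring above can be tried on a small in-memory history. The following is a simplified single-pass sketch, not the function from this diff: it assumes a visible user message is any user message whose content is a plain string, and it inlines the work of the `_extract_*` helpers, which are not shown here. The name `group_turns` and the sample history are illustrative only.

```python
from typing import Any, Dict, List


def group_turns(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collapse each user query and the messages that follow it into display turns."""
    turns: List[Dict[str, Any]] = []
    calls: List[Dict[str, Any]] = []
    results: Dict[str, str] = {}
    final_text = ""
    started = False  # rows before the first visible user message are dropped

    def close_group() -> None:
        nonlocal calls, results, final_text
        # Attach each tool result to its matching tool call via the id.
        for call in calls:
            call["result"] = results.get(call["id"], "")
        if final_text or calls:
            turns.append(
                {"role": "assistant", "content": final_text, "tool_calls": calls}
            )
        calls, results, final_text = [], {}, ""

    for msg in messages:
        role, content = msg["role"], msg["content"]
        if role == "user" and isinstance(content, str):
            close_group()  # a visible user message starts a new group
            turns.append({"role": "user", "content": content})
            started = True
        elif not started:
            continue
        elif role == "user":
            # Internal tool_result message: collect results, emit no turn.
            for block in content:
                if block.get("type") == "tool_result":
                    results[block["tool_use_id"]] = block.get("content", "")
        elif role == "assistant":
            blocks = (
                [{"type": "text", "text": content}]
                if isinstance(content, str)
                else content
            )
            for block in blocks:
                if block.get("type") == "tool_use":
                    calls.append(
                        {
                            "id": block["id"],
                            "name": block["name"],
                            "arguments": block.get("input", {}),
                        }
                    )
                elif block.get("type") == "text" and block.get("text"):
                    final_text = block["text"]  # last non-empty text wins
    close_group()
    return turns


history = [
    {"role": "user", "content": "weather?"},
    {
        "role": "assistant",
        "content": [
            {"type": "text", "text": "Checking."},
            {"type": "tool_use", "id": "t1", "name": "get_weather", "input": {"city": "Paris"}},
        ],
    },
    {
        "role": "user",
        "content": [{"type": "tool_result", "tool_use_id": "t1", "content": "sunny"}],
    },
    {"role": "assistant", "content": "It is sunny."},
]
turns = group_turns(history)
```

The four stored messages collapse into two display turns: the user question, and one assistant turn that carries both the tool call (with its result filled in) and the final text.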
class ConversationStore:
"""
SQLite-backed store for per-session conversation history.
Usage:
store = ConversationStore(db_path)
store.append_messages("user_123", new_messages, channel_type="feishu")
msgs = store.load_messages("user_123", max_turns=30)
"""
def __init__(self, db_path: Path):
self._db_path = db_path
self._lock = threading.Lock()
self._init_db()
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def load_messages(
self,
session_id: str,
max_turns: int = 30,
) -> List[Dict[str, Any]]:
"""
Load the most recent messages for a session, for injection into the LLM.
ALL message types (user text, assistant tool_use, tool_result) are returned
in their original JSON form so the LLM can reconstruct the full context.
max_turns is a *visible-turn* count: we count only user messages whose
content is actual user text (not tool_result blocks). This prevents
tool-heavy sessions from exhausting the turn budget prematurely.
Args:
session_id: Unique session identifier.
max_turns: Maximum number of visible user-assistant turns to keep.
Returns:
Chronologically ordered list of message dicts (role, content).
"""
with self._lock:
conn = self._connect()
try:
rows = conn.execute(
"""
SELECT seq, role, content
FROM messages
WHERE session_id = ?
ORDER BY seq DESC
""",
(session_id,),
).fetchall()
finally:
conn.close()
if not rows:
return []
# Walk newest-to-oldest counting *visible* user turns (actual user text,
# not tool_result injections). Record the seq of every visible user
# message so we can find a clean cut point later.
visible_turn_seqs: List[int] = [] # newest first
for seq, role, raw_content in rows:
if role != "user":
continue
try:
content = json.loads(raw_content)
except Exception:
content = raw_content
if _is_visible_user_message(content):
visible_turn_seqs.append(seq)
# Determine the seq of the oldest visible user message we want to keep.
# If the total turns fit within max_turns, keep everything.
if len(visible_turn_seqs) <= max_turns:
cutoff_seq = None # keep all
else:
# The Nth visible user message (0-indexed) is the oldest we keep.
cutoff_seq = visible_turn_seqs[max_turns - 1]
# Build result in chronological order, starting from cutoff.
# IMPORTANT: we start exactly at cutoff_seq (the visible user message),
# never mid-group, so tool_use / tool_result pairs are always complete.
result = []
for seq, role, raw_content in reversed(rows):
if cutoff_seq is not None and seq < cutoff_seq:
continue
try:
content = json.loads(raw_content)
except Exception:
content = raw_content
result.append({"role": role, "content": content})
return result
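The cut-point logic in `load_messages` is easier to follow on a tiny example. The sketch below reproduces the same idea outside the class, under two assumptions: `is_visible_user_message` is a hypothetical stand-in for the store's `_is_visible_user_message` helper (not shown in this diff), and rows are `(seq, role, raw_json)` tuples ordered newest first, as the `ORDER BY seq DESC` query returns them.

```python
import json
from typing import Any, Dict, List, Optional, Tuple

Row = Tuple[int, str, str]  # (seq, role, JSON-encoded content)


def _decode(raw: str) -> Any:
    """Parse JSON content, falling back to the raw string."""
    try:
        return json.loads(raw)
    except ValueError:
        return raw


def is_visible_user_message(content: Any) -> bool:
    # Hypothetical stand-in: plain text counts as visible; a block list
    # that contains a tool_result block does not.
    if isinstance(content, str):
        return bool(content.strip())
    if isinstance(content, list):
        return not any(
            isinstance(b, dict) and b.get("type") == "tool_result" for b in content
        )
    return False


def select_recent(rows_desc: List[Row], max_turns: int) -> List[Dict[str, Any]]:
    """Keep the last max_turns visible turns, cutting only at a visible user message."""
    visible_seqs = [
        seq
        for seq, role, raw in rows_desc
        if role == "user" and is_visible_user_message(_decode(raw))
    ]
    cutoff: Optional[int] = None
    if len(visible_seqs) > max_turns:
        cutoff = visible_seqs[max_turns - 1]  # oldest visible message we keep
    return [
        {"role": role, "content": _decode(raw)}
        for seq, role, raw in reversed(rows_desc)
        if cutoff is None or seq >= cutoff
    ]


chronological = [
    (0, "user", json.dumps("hi")),
    (1, "assistant", json.dumps("hello")),
    (2, "user", json.dumps("weather?")),
    (3, "assistant", json.dumps([{"type": "tool_use", "id": "t1", "name": "get_weather", "input": {}}])),
    (4, "user", json.dumps([{"type": "tool_result", "tool_use_id": "t1", "content": "sunny"}])),
    (5, "assistant", json.dumps("It is sunny.")),
]
rows_desc = list(reversed(chronological))
recent = select_recent(rows_desc, max_turns=1)
```

With `max_turns=1` the result starts at seq 2 and keeps the whole tool_use / tool_result pair. One edge case appears to follow from the indexing: with `max_turns=0`, `visible_seqs[-1]` selects the oldest visible message, so everything from that message onward is kept, not nothing.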
def append_messages(
self,
session_id: str,
messages: List[Dict[str, Any]],
channel_type: str = "",
) -> None:
"""
Append new messages to a session's history.
Seq numbers continue from the session's current maximum, so
concurrent callers on distinct sessions never collide.
Args:
session_id: Unique session identifier.
messages: List of message dicts to append.
channel_type: Source channel (e.g. "feishu", "web", "wechat").
Only written on session creation; ignored on update.
"""
if not messages:
return
now = int(time.time())
with self._lock:
conn = self._connect()
try:
with conn:
# INSERT OR IGNORE creates the row on first visit;
# the UPDATE always refreshes last_active.
# Avoids ON CONFLICT...DO UPDATE (requires SQLite >= 3.24).
conn.execute(
"""
INSERT OR IGNORE INTO sessions
(session_id, channel_type, created_at, last_active, msg_count)
VALUES (?, ?, ?, ?, 0)
""",
(session_id, channel_type, now, now),
)
conn.execute(
"UPDATE sessions SET last_active = ? WHERE session_id = ?",
(now, session_id),
)
# Determine starting seq for the new batch.
row = conn.execute(
"SELECT COALESCE(MAX(seq), -1) FROM messages WHERE session_id = ?",
(session_id,),
).fetchone()
next_seq = row[0] + 1
for msg in messages:
role = msg.get("role", "")
content = json.dumps(
msg.get("content", ""), ensure_ascii=False
)
conn.execute(
"""
INSERT OR IGNORE INTO messages
(session_id, seq, role, content, created_at)
VALUES (?, ?, ?, ?, ?)
""",
(session_id, next_seq, role, content, now),
)
next_seq += 1
conn.execute(
"""
UPDATE sessions
SET msg_count = (
SELECT COUNT(*) FROM messages WHERE session_id = ?
)
WHERE session_id = ?
""",
(session_id, session_id),
)
finally:
conn.close()
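The `INSERT OR IGNORE` plus `UPDATE` pair used above can be checked against an in-memory database. This is a minimal sketch: the table definition is an assumption (the real `_DDL` is not part of this diff), and `touch_session` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema for illustration; only the columns this example needs.
conn.execute(
    """
    CREATE TABLE sessions (
        session_id   TEXT PRIMARY KEY,
        channel_type TEXT,
        created_at   INTEGER,
        last_active  INTEGER
    )
    """
)


def touch_session(
    conn: sqlite3.Connection, session_id: str, channel_type: str, now: int
) -> None:
    """Create the session row on first visit, then always refresh last_active."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            """
            INSERT OR IGNORE INTO sessions
                (session_id, channel_type, created_at, last_active)
            VALUES (?, ?, ?, ?)
            """,
            (session_id, channel_type, now, now),
        )
        conn.execute(
            "UPDATE sessions SET last_active = ? WHERE session_id = ?",
            (now, session_id),
        )


touch_session(conn, "user_123", "feishu", 100)
touch_session(conn, "user_123", "web", 200)  # channel_type is ignored here
row = conn.execute(
    "SELECT channel_type, created_at, last_active FROM sessions WHERE session_id = ?",
    ("user_123",),
).fetchone()
```

After the second call the row still has `channel_type` `"feishu"` and `created_at` 100, while `last_active` has moved to 200. This matches the docstring's note that `channel_type` is only written on session creation.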
def clear_session(self, session_id: str) -> None:
"""Delete all messages and the session record for a given session_id."""
with self._lock:
conn = self._connect()
try:
with conn:
conn.execute(
"DELETE FROM messages WHERE session_id = ?", (session_id,)
)
conn.execute(
"DELETE FROM sessions WHERE session_id = ?", (session_id,)
)
finally:
conn.close()
def cleanup_old_sessions(self, max_age_days: Optional[int] = None) -> int:
"""
Delete sessions that have not been active within max_age_days.
Args:
max_age_days: Override the default retention period.
Returns:
Number of sessions deleted.
"""
try:
from config import conf
max_age = max_age_days or conf().get(
"conversation_max_age_days", DEFAULT_MAX_AGE_DAYS
)
except Exception:
max_age = max_age_days or DEFAULT_MAX_AGE_DAYS
cutoff = int(time.time()) - max_age * 86400
deleted = 0
with self._lock:
conn = self._connect()
try:
with conn:
stale = conn.execute(
"SELECT session_id FROM sessions WHERE last_active < ?",
(cutoff,),
).fetchall()
for (sid,) in stale:
conn.execute(
"DELETE FROM messages WHERE session_id = ?", (sid,)
)
conn.execute(
"DELETE FROM sessions WHERE session_id = ?", (sid,)
)
deleted += 1
finally:
conn.close()
if deleted:
logger.info(f"[ConversationStore] Pruned {deleted} expired sessions")
return deleted
def load_history_page(
self,
session_id: str,
page: int = 1,
page_size: int = 20,
) -> Dict[str, Any]:
"""
Load a page of conversation history for UI display, grouped into turns.
Each "turn" maps to one of:
- A user message (role="user", content=str)
- An assistant message (role="assistant", content=str,
tool_calls=[{name, arguments, result}] when tools were used)
Internal tool_result user messages are merged into the preceding
assistant entry's tool_calls list and never appear as standalone items.
Pages are numbered from 1 (most recent). Messages within a page are
returned in chronological order.
Returns:
{
"messages": [
{
"role": "user" | "assistant",
"content": str,
"tool_calls": [...], # assistant only, may be []
"created_at": int,
},
...
],
"total": <visible turn count>,
"page": <current page>,
"page_size": <page_size>,
"has_more": bool,
}
"""
page = max(1, page)
with self._lock:
conn = self._connect()
try:
rows = conn.execute(
"""
SELECT role, content, created_at
FROM messages
WHERE session_id = ?
ORDER BY seq ASC
""",
(session_id,),
).fetchall()
finally:
conn.close()
visible = _group_into_display_turns(rows)
total = len(visible)
offset = (page - 1) * page_size
page_items = list(reversed(visible))[offset: offset + page_size]
page_items = list(reversed(page_items))
return {
"messages": page_items,
"total": total,
"page": page,
"page_size": page_size,
"has_more": offset + page_size < total,
}
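The double reversal in `load_history_page` numbers pages from the newest end but returns each page oldest first. A standalone sketch of that slicing, using a hypothetical `paginate_newest_first` helper over plain integers instead of turn dicts:

```python
from typing import Any, Dict, List


def paginate_newest_first(
    items: List[Any], page: int = 1, page_size: int = 20
) -> Dict[str, Any]:
    """Page 1 holds the newest items; each page is returned in chronological order."""
    page = max(1, page)
    offset = (page - 1) * page_size
    newest_first = list(reversed(items))[offset: offset + page_size]
    return {
        "messages": list(reversed(newest_first)),
        "total": len(items),
        "page": page,
        "page_size": page_size,
        "has_more": offset + page_size < len(items),
    }


turn_ids = [1, 2, 3, 4, 5, 6, 7]  # chronological order, 7 is the newest
first = paginate_newest_first(turn_ids, page=1, page_size=3)
second = paginate_newest_first(turn_ids, page=2, page_size=3)
last = paginate_newest_first(turn_ids, page=3, page_size=3)
```

Here `first["messages"]` is `[5, 6, 7]`, `second["messages"]` is `[2, 3, 4]`, and `last["messages"]` is `[1]` with `has_more` set to `False`.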
def get_stats(self) -> Dict[str, Any]:
"""Return basic stats keyed by channel_type, for monitoring."""
with self._lock:
conn = self._connect()
try:
total_sessions = conn.execute(
"SELECT COUNT(*) FROM sessions"
).fetchone()[0]
total_messages = conn.execute(
"SELECT COUNT(*) FROM messages"
).fetchone()[0]
by_channel = conn.execute(
"""
SELECT channel_type, COUNT(*) as cnt
FROM sessions
GROUP BY channel_type
ORDER BY cnt DESC
"""
).fetchall()
return {
"total_sessions": total_sessions,
"total_messages": total_messages,
"by_channel": {row[0] or "unknown": row[1] for row in by_channel},
}
finally:
conn.close()
# ------------------------------------------------------------------
# Internal helpers
# ------------------------------------------------------------------
def _init_db(self) -> None:
self._db_path.parent.mkdir(parents=True, exist_ok=True)
conn = self._connect()
try:
conn.executescript(_DDL)
conn.commit()
self._migrate(conn)
finally:
conn.close()
def _migrate(self, conn: sqlite3.Connection) -> None:
"""Apply incremental schema migrations on existing databases."""
cols = {
row[1]
for row in conn.execute("PRAGMA table_info(sessions)").fetchall()
}
if "channel_type" not in cols:
try:
conn.execute(_MIGRATION_ADD_CHANNEL_TYPE)
conn.commit()
logger.info("[ConversationStore] Migrated: added channel_type column")
except Exception as e:
logger.warning(f"[ConversationStore] Migration failed: {e}")
def _connect(self) -> sqlite3.Connection:
conn = sqlite3.connect(str(self._db_path), timeout=10)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA synchronous=NORMAL")
return conn
# ---------------------------------------------------------------------------
# Singleton
# ---------------------------------------------------------------------------
_store_instance: Optional[ConversationStore] = None
_store_lock = threading.Lock()
def get_conversation_store() -> ConversationStore:
"""
Return the process-wide ConversationStore singleton.
Reuses the long-term memory database so the project keeps a single
SQLite file: ~/cow/memory/long-term/index.db
The conversation tables (sessions / messages) are separate from the
memory tables (memory_chunks / file_metadata) — no conflicts.
"""
global _store_instance
if _store_instance is not None:
return _store_instance
with _store_lock:
if _store_instance is not None:
return _store_instance
try:
from agent.memory.config import get_default_memory_config
db_path = get_default_memory_config().get_db_path()
except Exception:
from common.utils import expand_path
db_path = Path(expand_path("~/cow")) / "memory" / "long-term" / "index.db"
_store_instance = ConversationStore(db_path)
logger.debug(f"[ConversationStore] Using shared DB at: {db_path}")
return _store_instance


@@ -138,24 +138,24 @@ def create_embedding_provider(
) -> EmbeddingProvider:
"""
Factory function to create embedding provider
Only supports OpenAI embedding via REST API.
Supports "openai" and "linkai" providers (both use OpenAI-compatible REST API).
If initialization fails, caller should fall back to keyword-only search.
Args:
provider: Provider name (only "openai" is supported)
provider: Provider name ("openai" or "linkai")
model: Model name (default: text-embedding-3-small)
api_key: OpenAI API key (required)
api_base: API base URL (default: https://api.openai.com/v1)
api_key: API key (required)
api_base: API base URL
Returns:
EmbeddingProvider instance
Raises:
ValueError: If provider is not "openai" or api_key is missing
ValueError: If provider is unsupported or api_key is missing
"""
if provider != "openai":
raise ValueError(f"Only 'openai' provider is supported, got: {provider}")
if provider not in ("openai", "linkai"):
raise ValueError(f"Unsupported embedding provider: {provider}. Use 'openai' or 'linkai'.")
model = model or "text-embedding-3-small"
return OpenAIEmbeddingProvider(model=model, api_key=api_key, api_base=api_base)


@@ -50,28 +50,44 @@ class MemoryManager:
overlap_tokens=self.config.chunk_overlap_tokens
)
# Initialize embedding provider (optional)
# Initialize embedding provider (optional, prefer OpenAI, fallback to LinkAI)
self.embedding_provider = None
if embedding_provider:
self.embedding_provider = embedding_provider
else:
# Try to create embedding provider, but allow failure
# Try OpenAI first
try:
# Get API key from environment or config
api_key = os.environ.get('OPENAI_API_KEY')
api_base = os.environ.get('OPENAI_API_BASE')
self.embedding_provider = create_embedding_provider(
provider=self.config.embedding_provider,
model=self.config.embedding_model,
api_key=api_key,
api_base=api_base
)
if api_key:
self.embedding_provider = create_embedding_provider(
provider="openai",
model=self.config.embedding_model,
api_key=api_key,
api_base=api_base
)
except Exception as e:
# Embedding provider failed, but that's OK
# We can still use keyword search and file operations
from common.log import logger
logger.warning(f"[MemoryManager] Embedding provider initialization failed: {e}")
logger.warning(f"[MemoryManager] OpenAI embedding failed: {e}")
# Fallback to LinkAI
if self.embedding_provider is None:
try:
linkai_key = os.environ.get('LINKAI_API_KEY')
linkai_base = os.environ.get('LINKAI_API_BASE', 'https://api.link-ai.tech')
if linkai_key:
self.embedding_provider = create_embedding_provider(
provider="linkai",
model=self.config.embedding_model,
api_key=linkai_key,
api_base=f"{linkai_base}/v1"
)
except Exception as e:
from common.log import logger
logger.warning(f"[MemoryManager] LinkAI embedding failed: {e}")
if self.embedding_provider is None:
from common.log import logger
logger.info("[MemoryManager] Memory will work with keyword search only (no vector search)")
# Initialize memory flush manager
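The provider selection order in the hunk above (OpenAI first, then LinkAI, otherwise keyword-only search) can be isolated as a pure function. This is a sketch under assumptions: `pick_embedding_credentials` is a hypothetical helper, it takes the environment as a mapping so it can be exercised without touching `os.environ`, and it covers only the key-presence checks, not the case where creating the OpenAI provider raises and LinkAI is tried afterwards.

```python
from typing import Mapping, Optional, Tuple

Credentials = Tuple[str, str, Optional[str]]  # (provider, api_key, api_base)


def pick_embedding_credentials(env: Mapping[str, str]) -> Optional[Credentials]:
    """Return the first usable provider, or None for keyword-only search."""
    openai_key = env.get("OPENAI_API_KEY")
    if openai_key:
        return ("openai", openai_key, env.get("OPENAI_API_BASE"))

    linkai_key = env.get("LINKAI_API_KEY")
    if linkai_key:
        # Default base URL taken from the diff; "/v1" is appended as in the hunk.
        base = env.get("LINKAI_API_BASE", "https://api.link-ai.tech")
        return ("linkai", linkai_key, f"{base}/v1")

    return None


both = pick_embedding_credentials({"OPENAI_API_KEY": "sk-a", "LINKAI_API_KEY": "lk-b"})
linkai_only = pick_embedding_credentials({"LINKAI_API_KEY": "lk-b"})
neither = pick_embedding_credentials({})
```

In application code the mapping would typically be `os.environ`. Returning `None` corresponds to the "keyword search only" log message in the hunk.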
@@ -363,182 +379,35 @@ class MemoryManager:
size=stat.st_size
)
def should_flush_memory(
def flush_memory(
self,
current_tokens: int = 0
) -> bool:
"""
Check if memory flush should be triggered
Standalone flush trigger that does not depend on the model's context window.
Uses the configured thresholds: flush_token_threshold and flush_turn_threshold
Args:
current_tokens: Current session token count
Returns:
True if memory flush should run
"""
return self.flush_manager.should_flush(
current_tokens=current_tokens,
token_threshold=self.config.flush_token_threshold,
turn_threshold=self.config.flush_turn_threshold
)
def increment_turn(self):
"""Increment the conversation turn counter (one user message plus one AI reply counts as one turn)."""
self.flush_manager.increment_turn()
async def execute_memory_flush(
self,
agent_executor,
current_tokens: int,
messages: list,
user_id: Optional[str] = None,
**executor_kwargs
reason: str = "threshold",
max_messages: int = 10,
) -> bool:
"""
Execute memory flush before compaction
This runs a silent agent turn to write durable memories to disk.
Similar to clawdbot's pre-compaction memory flush.
Flush conversation summary to daily memory file.
Args:
agent_executor: Async function to execute agent with prompt
current_tokens: Current session token count
messages: Conversation message list
user_id: Optional user ID
**executor_kwargs: Additional kwargs for agent executor
reason: "threshold" | "overflow" | "daily_summary"
max_messages: Max recent messages to include (0 = all)
Returns:
True if flush completed successfully
Example:
>>> async def run_agent(prompt, system_prompt, silent=False):
... # Your agent execution logic
... pass
>>>
>>> if manager.should_flush_memory(current_tokens=100000):
... await manager.execute_memory_flush(
... agent_executor=run_agent,
... current_tokens=100000
... )
True if content was written
"""
success = await self.flush_manager.execute_flush(
agent_executor=agent_executor,
current_tokens=current_tokens,
success = self.flush_manager.flush_from_messages(
messages=messages,
user_id=user_id,
**executor_kwargs
reason=reason,
max_messages=max_messages,
)
if success:
# Mark dirty so next search will sync the new memories
self._dirty = True
return success
def build_memory_guidance(self, lang: str = "zh", include_context: bool = True) -> str:
"""
Build natural memory guidance for agent system prompt
Following clawdbot's approach:
1. Load MEMORY.md as bootstrap context (blends into background)
2. Load daily files on-demand via memory_search tool
3. Agent should NOT proactively mention memories unless user asks
Args:
lang: Language for guidance ("en" or "zh")
include_context: Whether to include bootstrap memory context (default: True)
MEMORY.md is loaded as background context (like clawdbot)
Daily files are accessed via memory_search tool
Returns:
Memory guidance text (and optionally context) for system prompt
"""
today_file = self.flush_manager.get_today_memory_file().name
if lang == "zh":
guidance = f"""## 记忆系统
**背景知识**: 下方包含核心长期记忆,可直接使用。需要查找历史时,用 memory_search 搜索(搜索一次即可,不要重复)。
**存储记忆**: 当用户分享重要信息时(偏好、决策、事实等),主动用 write 工具存储:
- 长期信息 → MEMORY.md
- 当天笔记 → memory/{today_file}
- 静默存储,仅在明确要求时确认
**使用原则**: 自然使用记忆,就像你本来就知道。不需要生硬地提起或列举记忆,除非用户提到。"""
else:
guidance = f"""## Memory System
**Background Knowledge**: Core long-term memories below - use directly. For history, use memory_search once (don't repeat).
**Store Memories**: When user shares important info (preferences, decisions, facts), proactively write:
- Durable info → MEMORY.md
- Daily notes → memory/{today_file}
- Store silently; confirm only when explicitly requested
**Usage**: Use memories naturally as if you always knew. Don't mention or list unless user explicitly asks."""
if include_context:
# Load bootstrap context (MEMORY.md only, like clawdbot)
bootstrap_context = self.load_bootstrap_memories()
if bootstrap_context:
guidance += f"\n\n## Background Context\n\n{bootstrap_context}"
return guidance
def load_bootstrap_memories(self, user_id: Optional[str] = None) -> str:
"""
Load bootstrap memory files for session start
Following clawdbot's design:
- Only loads MEMORY.md from workspace root (long-term curated memory)
- Daily files (memory/YYYY-MM-DD.md) are accessed via memory_search tool, not bootstrap
- User-specific MEMORY.md is also loaded if user_id provided
Returns memory content WITHOUT obvious headers so it blends naturally
into the context as background knowledge.
Args:
user_id: Optional user ID for user-specific memories
Returns:
Memory content to inject into system prompt (blends naturally as background context)
"""
workspace_dir = self.config.get_workspace()
memory_dir = self.config.get_memory_dir()
sections = []
# 1. Load MEMORY.md from workspace root (long-term curated memory)
# Following clawdbot: only MEMORY.md is bootstrap, daily files use memory_search
memory_file = Path(workspace_dir) / "MEMORY.md"
if memory_file.exists():
try:
content = memory_file.read_text(encoding='utf-8').strip()
if content:
sections.append(content)
except Exception as e:
print(f"Warning: Failed to read MEMORY.md: {e}")
# 2. Load user-specific MEMORY.md if user_id provided
if user_id:
user_memory_dir = memory_dir / "users" / user_id
user_memory_file = user_memory_dir / "MEMORY.md"
if user_memory_file.exists():
try:
content = user_memory_file.read_text(encoding='utf-8').strip()
if content:
sections.append(content)
except Exception as e:
print(f"Warning: Failed to read user memory: {e}")
if not sections:
return ""
# Join sections without obvious headers - let memories blend naturally
# This makes the agent feel like it "just knows" rather than "checking memory files"
return "\n\n".join(sections)
def get_status(self) -> Dict[str, Any]:
"""Get memory status"""
stats = self.storage.get_stats()
@@ -568,6 +437,37 @@ class MemoryManager:
content = f"{path}:{start_line}:{end_line}"
return hashlib.md5(content.encode('utf-8')).hexdigest()
@staticmethod
def _compute_temporal_decay(path: str, half_life_days: float = 30.0) -> float:
"""
Compute temporal decay multiplier for dated memory files.
Inspired by OpenClaw's temporal-decay: exponential decay based on file date.
MEMORY.md and non-dated files are "evergreen" (no decay, multiplier=1.0).
Daily files like memory/2025-03-01.md decay based on age.
Formula: multiplier = exp(-ln2/half_life * age_in_days)
"""
import re
import math
match = re.search(r'(\d{4})-(\d{2})-(\d{2})\.md$', path)
if not match:
return 1.0 # evergreen: MEMORY.md, non-dated files
try:
file_date = datetime(
int(match.group(1)), int(match.group(2)), int(match.group(3))
)
age_days = (datetime.now() - file_date).days
if age_days <= 0:
return 1.0
decay_lambda = math.log(2) / half_life_days
return math.exp(-decay_lambda * age_days)
except (ValueError, OverflowError):
return 1.0
def _merge_results(
self,
vector_results: List[SearchResult],
@@ -575,8 +475,7 @@ class MemoryManager:
vector_weight: float,
keyword_weight: float
) -> List[SearchResult]:
"""Merge vector and keyword search results"""
# Create a map by (path, start_line, end_line)
"""Merge vector and keyword search results with temporal decay for dated files"""
merged_map = {}
for result in vector_results:
@@ -598,7 +497,6 @@ class MemoryManager:
'keyword_score': result.score
}
# Calculate combined scores
merged_results = []
for entry in merged_map.values():
combined_score = (
@@ -606,7 +504,11 @@ class MemoryManager:
keyword_weight * entry['keyword_score']
)
# Apply temporal decay for dated memory files
result = entry['result']
decay = self._compute_temporal_decay(result.path)
combined_score *= decay
merged_results.append(SearchResult(
path=result.path,
start_line=result.start_line,
@@ -617,6 +519,5 @@ class MemoryManager:
user_id=result.user_id
))
# Sort by score
merged_results.sort(key=lambda r: r.score, reverse=True)
return merged_results
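To make the decay formula in `_compute_temporal_decay` concrete: with the default 30-day half-life, a daily file that is 30 days old keeps half of its combined score and a 60-day-old file keeps a quarter. The sketch below is a standalone version of the same computation. The `now` parameter is an addition for this example so that the result does not depend on the current date.

```python
import math
import re
from datetime import datetime
from typing import Optional


def temporal_decay(
    path: str, half_life_days: float = 30.0, now: Optional[datetime] = None
) -> float:
    """multiplier = exp(-ln2 / half_life * age_in_days); undated files return 1.0."""
    match = re.search(r"(\d{4})-(\d{2})-(\d{2})\.md$", path)
    if not match:
        return 1.0  # evergreen: MEMORY.md and other non-dated files
    try:
        file_date = datetime(
            int(match.group(1)), int(match.group(2)), int(match.group(3))
        )
    except ValueError:
        return 1.0  # looks like a date but is not a valid one
    age_days = ((now or datetime.now()) - file_date).days
    if age_days <= 0:
        return 1.0
    return math.exp(-math.log(2) / half_life_days * age_days)


today = datetime(2025, 3, 31)
one_half_life = temporal_decay("memory/2025-03-01.md", now=today)   # 30 days old
two_half_lives = temporal_decay("memory/2025-01-30.md", now=today)  # 60 days old
evergreen = temporal_decay("MEMORY.md", now=today)
```

Because the multiplier is applied to the combined score before sorting, an older daily note has to match noticeably better than a recent one to outrank it, while `MEMORY.md` is never penalized.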


@@ -509,7 +509,7 @@ class MemoryStorage:
"""Destructor to ensure connection is closed"""
try:
self.close()
except:
except Exception:
pass # Ignore errors during cleanup
# Helper methods
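The change from a bare `except:` to `except Exception:` matters because a bare clause also catches `KeyboardInterrupt` and `SystemExit`, which derive from `BaseException` and not from `Exception`. A small illustration, with hypothetical helper names:

```python
def run_with_bare_except(fn) -> str:
    try:
        fn()
    except:  # noqa: E722 - also traps KeyboardInterrupt and SystemExit
        return "swallowed"
    return "ok"


def run_with_except_exception(fn) -> str:
    try:
        fn()
    except Exception:  # ordinary errors only
        return "swallowed"
    return "ok"


def interrupt() -> None:
    raise KeyboardInterrupt


def fail() -> None:
    raise RuntimeError("cleanup failed")


bare_result = run_with_bare_except(interrupt)

try:
    run_with_except_exception(interrupt)
    interrupt_propagated = False
except KeyboardInterrupt:
    interrupt_propagated = True

ordinary_result = run_with_except_exception(fail)
```

The bare clause swallows the interrupt, while the narrower clause lets it propagate and still suppresses ordinary errors such as `RuntimeError`.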


@@ -1,225 +1,324 @@
"""
Memory flush manager
Triggers memory flush before context compaction (similar to clawdbot)
Handles memory persistence when conversation context is trimmed or overflows:
- Uses LLM to summarize discarded messages into concise key-information entries
- Writes to daily memory files (lazy creation)
- Deduplicates trim flushes to avoid repeated writes
- Runs summarization asynchronously to avoid blocking normal replies
- Provides daily summary interface for scheduler
"""
from typing import Optional, Callable, Any
import threading
from typing import Optional, Callable, Any, List, Dict
from pathlib import Path
from datetime import datetime
from common.log import logger
SUMMARIZE_SYSTEM_PROMPT = """你是一个记忆提取助手。你的任务是从对话记录中提取值得记住的信息,生成简洁的记忆摘要。
输出要求:
1. 以事件/关键信息为维度记录,每条一行,用 "- " 开头
2. 记录有价值的关键信息,例如用户提出的要求及助手的解决方案,对话中涉及的事实信息,用户的偏好、决策或重要结论
3. 每条摘要需要简明扼要,只保留关键信息
4. 直接输出摘要内容,不要加任何前缀说明
5. 当对话没有任何记录价值例如只是简单问候,可回复"\""""
SUMMARIZE_USER_PROMPT = """请从以下对话记录中提取关键信息,生成记忆摘要:
{conversation}"""
class MemoryFlushManager:
"""
Manages memory flush operations before context compaction
Manages memory flush operations.
Similar to clawdbot's memory flush mechanism:
- Triggers when context approaches token limit
- Runs a silent agent turn to write memories to disk
- Uses memory/YYYY-MM-DD.md for daily notes
- Uses MEMORY.md (workspace root) for long-term curated memories
Flush is triggered by agent_stream in two scenarios:
1. Context trim: _trim_messages discards old turns → flush discarded content
2. Context overflow: API rejects request → emergency flush before clearing
Additionally, create_daily_summary() can be called by scheduler for end-of-day summaries.
"""
def __init__(
self,
workspace_dir: Path,
llm_model: Optional[Any] = None
llm_model: Optional[Any] = None,
):
"""
Initialize memory flush manager
Args:
workspace_dir: Workspace directory
llm_model: LLM model for agent execution (optional)
"""
self.workspace_dir = workspace_dir
self.llm_model = llm_model
self.memory_dir = workspace_dir / "memory"
self.memory_dir.mkdir(parents=True, exist_ok=True)
# Tracking
self.last_flush_token_count: Optional[int] = None
self.last_flush_timestamp: Optional[datetime] = None
self.turn_count: int = 0 # Conversation turn counter
self._trim_flushed_hashes: set = set() # Content hashes of already-flushed messages
self._last_flushed_content_hash: str = "" # Content hash at last flush, for daily dedup
def should_flush(
self,
current_tokens: int = 0,
token_threshold: int = 50000,
turn_threshold: int = 20
) -> bool:
"""
Determine if memory flush should be triggered
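Standalone flush trigger that does not depend on the model's context window:
- Token threshold: triggers once 50K tokens are reached
- Turn threshold: triggers once 20 conversation turns are reached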
Args:
current_tokens: Current session token count
token_threshold: Token threshold to trigger flush (default: 50K)
turn_threshold: Turn threshold to trigger flush (default: 20)
Returns:
True if flush should run
"""
# Check the token threshold
if current_tokens > 0 and current_tokens >= token_threshold:
# Avoid repeated flushes
if self.last_flush_token_count is not None:
if current_tokens <= self.last_flush_token_count + 5000:
return False
return True
# Check the turn threshold
if self.turn_count >= turn_threshold:
return True
return False
def get_today_memory_file(self, user_id: Optional[str] = None) -> Path:
"""
Get today's memory file path: memory/YYYY-MM-DD.md
Args:
user_id: Optional user ID for user-specific memory
Returns:
Path to today's memory file
"""
def get_today_memory_file(self, user_id: Optional[str] = None, ensure_exists: bool = False) -> Path:
"""Get today's memory file path: memory/YYYY-MM-DD.md"""
today = datetime.now().strftime("%Y-%m-%d")
if user_id:
user_dir = self.memory_dir / "users" / user_id
user_dir.mkdir(parents=True, exist_ok=True)
return user_dir / f"{today}.md"
if ensure_exists:
user_dir.mkdir(parents=True, exist_ok=True)
today_file = user_dir / f"{today}.md"
else:
return self.memory_dir / f"{today}.md"
today_file = self.memory_dir / f"{today}.md"
if ensure_exists and not today_file.exists():
today_file.parent.mkdir(parents=True, exist_ok=True)
today_file.write_text(f"# Daily Memory: {today}\n\n")
return today_file
def get_main_memory_file(self, user_id: Optional[str] = None) -> Path:
"""
Get main memory file path: MEMORY.md (workspace root)
Args:
user_id: Optional user ID for user-specific memory
Returns:
Path to main memory file
"""
"""Get main memory file path: MEMORY.md (workspace root)"""
if user_id:
user_dir = self.memory_dir / "users" / user_id
user_dir.mkdir(parents=True, exist_ok=True)
return user_dir / "MEMORY.md"
else:
# Return workspace root MEMORY.md
return Path(self.workspace_dir) / "MEMORY.md"
def create_flush_prompt(self) -> str:
"""
Create prompt for memory flush turn
Similar to clawdbot's DEFAULT_MEMORY_FLUSH_PROMPT
"""
today = datetime.now().strftime("%Y-%m-%d")
return (
f"Pre-compaction memory flush. "
f"Store durable memories now (use memory/{today}.md for daily notes; "
f"create memory/ if needed). "
f"\n\n"
f"重要提示:\n"
f"- MEMORY.md: 记录最核心、最常用的信息(例如重要规则、偏好、决策、要求等)\n"
f" 如果 MEMORY.md 过长,可以精简或移除不再重要的内容。避免冗长描述,用关键词和要点形式记录\n"
f"- memory/{today}.md: 记录当天发生的事件、关键信息、经验教训、对话过程摘要等,突出重点\n"
f"- 如果没有重要内容需要记录,回复 NO_REPLY\n"
)
def create_flush_system_prompt(self) -> str:
"""
Create system prompt for memory flush turn
Similar to clawdbot's DEFAULT_MEMORY_FLUSH_SYSTEM_PROMPT
"""
return (
"Pre-compaction memory flush turn. "
"The session is near auto-compaction; capture durable memories to disk. "
"\n\n"
"记忆写入原则:\n"
"1. MEMORY.md 精简原则: 只记录核心信息(<2000 tokens)\n"
" - 记录重要规则、偏好、决策、要求等需要长期记住的关键信息,无需记录过多细节\n"
" - 如果 MEMORY.md 过长,可以根据需要精简或删除过时内容\n"
"\n"
"2. 天级记忆 (memory/YYYY-MM-DD.md):\n"
" - 记录当天的重要事件、关键信息、经验教训、对话过程摘要等,确保核心信息点被完整记录\n"
"\n"
"3. 判断标准:\n"
" - 这个信息未来会经常用到吗?→ MEMORY.md\n"
" - 这是今天的重要事件或决策吗?→ memory/YYYY-MM-DD.md\n"
" - 这是临时性的、不重要的内容吗?→ 不记录\n"
"\n"
"You may reply, but usually NO_REPLY is correct."
)
async def execute_flush(
self,
agent_executor: Callable,
current_tokens: int,
user_id: Optional[str] = None,
**executor_kwargs
) -> bool:
"""
Execute memory flush by running a silent agent turn
Args:
agent_executor: Function to execute agent with prompt
current_tokens: Current token count
user_id: Optional user ID
**executor_kwargs: Additional kwargs for agent executor
Returns:
True if flush completed successfully
"""
try:
# Create flush prompts
prompt = self.create_flush_prompt()
system_prompt = self.create_flush_system_prompt()
# Execute agent turn (silent, no user-visible reply expected)
await agent_executor(
prompt=prompt,
system_prompt=system_prompt,
silent=True, # NO_REPLY expected
**executor_kwargs
)
# Track flush
self.last_flush_token_count = current_tokens
self.last_flush_timestamp = datetime.now()
self.turn_count = 0 # Reset the turn counter
return True
except Exception as e:
print(f"Memory flush failed: {e}")
return False
def increment_turn(self):
"""Increment the conversation turn counter."""
self.turn_count += 1
def get_status(self) -> dict:
"""Get memory flush status"""
return {
'last_flush_tokens': self.last_flush_token_count,
'last_flush_time': self.last_flush_timestamp.isoformat() if self.last_flush_timestamp else None,
'today_file': str(self.get_today_memory_file()),
'main_file': str(self.get_main_memory_file())
}
# ---- Flush execution (called by agent_stream or scheduler) ----
def flush_from_messages(
self,
messages: List[Dict],
user_id: Optional[str] = None,
reason: str = "trim",
max_messages: int = 0,
) -> bool:
"""
Asynchronously summarize and flush messages to daily memory.
Deduplication runs synchronously, then LLM summarization + file write
run in a background thread so the main reply flow is never blocked.
Args:
messages: Conversation message list (OpenAI/Claude format)
user_id: Optional user ID for user-scoped memory
reason: Why flush was triggered ("trim" | "overflow" | "daily_summary")
max_messages: Max recent messages to summarize (0 = all)
Returns:
True if flush was dispatched
"""
try:
import hashlib
deduped = []
for m in messages:
text = self._extract_text_from_content(m.get("content", ""))
if not text or not text.strip():
continue
h = hashlib.md5(text.encode("utf-8")).hexdigest()
if h not in self._trim_flushed_hashes:
self._trim_flushed_hashes.add(h)
deduped.append(m)
if not deduped:
return False
import copy
snapshot = copy.deepcopy(deduped)
thread = threading.Thread(
target=self._flush_worker,
args=(snapshot, user_id, reason, max_messages),
daemon=True,
)
thread.start()
logger.info(f"[MemoryFlush] Async flush dispatched (reason={reason}, msgs={len(snapshot)})")
return True
except Exception as e:
logger.warning(f"[MemoryFlush] Failed to dispatch flush (reason={reason}): {e}")
return False
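The synchronous deduplication step in `flush_from_messages` boils down to remembering an MD5 digest of every message text that has already been dispatched. A reduced sketch follows, assuming message content is a plain string (the real code goes through `_extract_text_from_content`, which is not shown here); `dedupe_messages` is a hypothetical name.

```python
import hashlib
from typing import Dict, List, Set


def dedupe_messages(messages: List[Dict[str, str]], seen: Set[str]) -> List[Dict[str, str]]:
    """Return messages whose text has not been flushed yet, recording their hashes."""
    fresh = []
    for msg in messages:
        text = msg.get("content", "")
        if not text or not text.strip():
            continue  # nothing worth summarizing
        digest = hashlib.md5(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            fresh.append(msg)
    return fresh


flushed_hashes: Set[str] = set()
batch = [
    {"role": "user", "content": "I prefer dark mode"},
    {"role": "assistant", "content": "   "},
    {"role": "assistant", "content": "Noted."},
]
first_pass = dedupe_messages(batch, flushed_hashes)
second_pass = dedupe_messages(batch, flushed_hashes)
```

The first pass keeps the two non-empty messages; the second pass returns an empty list, which is the case where the method returns `False` without starting a thread. Since only the text is hashed, two distinct messages with identical text count as duplicates under this scheme.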
def _flush_worker(
self,
messages: List[Dict],
user_id: Optional[str],
reason: str,
max_messages: int,
):
"""Background worker: summarize with LLM and write to daily file."""
try:
summary = self._summarize_messages(messages, max_messages)
if not summary or not summary.strip() or summary.strip() == "":
logger.info(f"[MemoryFlush] No valuable content to flush (reason={reason})")
return
daily_file = ensure_daily_memory_file(self.workspace_dir, user_id)
if reason == "overflow":
header = f"## Context Overflow Recovery ({datetime.now().strftime('%H:%M')})"
note = "The following conversation was trimmed due to context overflow:\n"
elif reason == "trim":
header = f"## Trimmed Context ({datetime.now().strftime('%H:%M')})"
note = ""
elif reason == "daily_summary":
header = f"## Daily Summary ({datetime.now().strftime('%H:%M')})"
note = ""
else:
header = f"## Session Notes ({datetime.now().strftime('%H:%M')})"
note = ""
flush_entry = f"\n{header}\n\n{note}{summary}\n"
with open(daily_file, "a", encoding="utf-8") as f:
f.write(flush_entry)
self.last_flush_timestamp = datetime.now()
logger.info(f"[MemoryFlush] Wrote to {daily_file.name} (reason={reason}, chars={len(summary)})")
except Exception as e:
logger.warning(f"[MemoryFlush] Async flush failed (reason={reason}): {e}")
def create_daily_summary(
self,
messages: List[Dict],
user_id: Optional[str] = None
) -> bool:
"""
Generate end-of-day summary. Called by daily timer.
Skips if messages haven't changed since last flush.
"""
import hashlib
content = "".join(
self._extract_text_from_content(m.get("content", ""))
for m in messages
)
content_hash = hashlib.md5(content.encode("utf-8")).hexdigest()
if content_hash == self._last_flushed_content_hash:
logger.debug("[MemoryFlush] Daily summary skipped: no new content since last flush")
return False
self._last_flushed_content_hash = content_hash
return self.flush_from_messages(
messages=messages,
user_id=user_id,
reason="daily_summary",
max_messages=0,
)
# ---- Internal helpers ----
    def _summarize_messages(self, messages: List[Dict], max_messages: int = 0) -> str:
        """
        Summarize conversation messages using LLM, with rule-based fallback.
        """
        conversation_text = self._format_conversation_for_summary(messages, max_messages)
        if not conversation_text.strip():
            return ""
        # Try LLM summarization first
        if self.llm_model:
            try:
                summary = self._call_llm_for_summary(conversation_text)
                if summary and summary.strip():
                    return summary.strip()
            except Exception as e:
                logger.warning(f"[MemoryFlush] LLM summarization failed, using fallback: {e}")
        # LLM unavailable, failed, or returned nothing: use the rule-based fallback
        return self._extract_summary_fallback(messages, max_messages)
    def _format_conversation_for_summary(self, messages: List[Dict], max_messages: int = 0) -> str:
        """Format messages into readable conversation text for LLM summarization."""
        # max_messages == 0 means "use everything"; otherwise keep the last N user/assistant pairs
        msgs = messages if max_messages == 0 else messages[-max_messages * 2:]
        lines = []
        for msg in msgs:
            role = msg.get("role", "")
            text = self._extract_text_from_content(msg.get("content", ""))
            if not text or not text.strip():
                continue
            text = text.strip()
            # Role labels stay in Chinese ("用户" = user, "助手" = assistant) to match the prompts
            if role == "user":
                lines.append(f"用户: {text[:500]}")
            elif role == "assistant":
                lines.append(f"助手: {text[:500]}")
        return "\n".join(lines)
    def _call_llm_for_summary(self, conversation_text: str) -> str:
        """Call LLM to generate a concise summary of the conversation."""
        from agent.protocol.models import LLMRequest
        request = LLMRequest(
            messages=[{"role": "user", "content": SUMMARIZE_USER_PROMPT.format(conversation=conversation_text)}],
            temperature=0,
            max_tokens=500,
            stream=False,
            system=SUMMARIZE_SYSTEM_PROMPT,
        )
        response = self.llm_model.call(request)
        if isinstance(response, dict):
            if response.get("error"):
                raise RuntimeError(response.get("message", "LLM call failed"))
            # OpenAI format
            choices = response.get("choices", [])
            if choices:
                # content may be present but None, so normalize to an empty string
                return choices[0].get("message", {}).get("content", "") or ""
        # Handle response object with attribute access (e.g. OpenAI SDK response)
        if hasattr(response, "choices") and response.choices:
            return response.choices[0].message.content or ""
        return ""
    @staticmethod
    def _extract_summary_fallback(messages: List[Dict], max_messages: int = 0) -> str:
        """Rule-based fallback when LLM is unavailable."""
        msgs = messages if max_messages == 0 else messages[-max_messages * 2:]
        items = []
        for msg in msgs:
            role = msg.get("role", "")
            text = MemoryFlushManager._extract_text_from_content(msg.get("content", ""))
            if not text or not text.strip():
                continue
            text = text.strip()
            if role == "user":
                # Skip very short user messages (greetings, acknowledgements)
                if len(text) <= 5:
                    continue
                items.append(f"- 用户请求: {text[:200]}")
            elif role == "assistant":
                # Keep only the first line of the assistant reply
                first_line = text.split("\n")[0].strip()
                if len(first_line) > 10:
                    items.append(f"- 处理结果: {first_line[:200]}")
        return "\n".join(items[:15])
    @staticmethod
    def _extract_text_from_content(content) -> str:
        """Extract plain text from message content (string or content blocks)."""
        if isinstance(content, str):
            return content
        if isinstance(content, list):
            parts = []
            for block in content:
                if isinstance(block, dict) and block.get("type") == "text":
                    parts.append(block.get("text", ""))
                elif isinstance(block, str):
                    parts.append(block)
            return "\n".join(parts)
        return ""
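Note (illustrative, not part of this commit): a standalone copy of the extraction logic above, shown on sample inputs. The function name `extract_text` and the sample blocks are made up for this example; only `text` blocks and bare strings contribute, everything else (such as `tool_use` blocks) is dropped.

```python
from typing import Any


def extract_text(content: Any) -> str:
    """Return plain text from a string or a list of content blocks."""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for block in content:
            if isinstance(block, dict) and block.get("type") == "text":
                parts.append(block.get("text", ""))
            elif isinstance(block, str):
                parts.append(block)
        return "\n".join(parts)
    return ""  # unsupported types (None, dict, numbers) yield no text


mixed = [
    {"type": "text", "text": "Here is the file list."},
    {"type": "tool_use", "id": "t1", "name": "ls", "input": {}},
    "done",
]
mixed_text = extract_text(mixed)
plain_text = extract_text("hello")
none_text = extract_text(None)
```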
def create_memory_files_if_needed(workspace_dir: Path, user_id: Optional[str] = None):
"""
Create default memory files if they don't exist
Create essential memory files if they don't exist.
Only creates MEMORY.md; daily files are created lazily on first write.
Args:
workspace_dir: Workspace directory
@@ -228,7 +327,7 @@ def create_memory_files_if_needed(workspace_dir: Path, user_id: Optional[str] =
memory_dir = workspace_dir / "memory"
memory_dir.mkdir(parents=True, exist_ok=True)
# Create main MEMORY.md in workspace root
# Create main MEMORY.md in workspace root (always needed for bootstrap)
if user_id:
user_dir = memory_dir / "users" / user_id
user_dir.mkdir(parents=True, exist_ok=True)
@@ -237,14 +336,28 @@ def create_memory_files_if_needed(workspace_dir: Path, user_id: Optional[str] =
main_memory = Path(workspace_dir) / "MEMORY.md"
if not main_memory.exists():
# Create empty file or with minimal structure (no obvious "Memory" header)
# Following clawdbot's approach: memories should blend naturally into context
main_memory.write_text("")
def ensure_daily_memory_file(workspace_dir: Path, user_id: Optional[str] = None) -> Path:
    """
    Ensure today's daily memory file exists, creating it only when actually needed.
    Called lazily before first write to daily memory.

    Args:
        workspace_dir: Workspace directory
        user_id: Optional user ID for user-specific files

    Returns:
        Path to today's memory file
    """
    memory_dir = workspace_dir / "memory"
    memory_dir.mkdir(parents=True, exist_ok=True)
    # Create today's memory file
    today = datetime.now().strftime("%Y-%m-%d")
    if user_id:
        user_dir = memory_dir / "users" / user_id
        user_dir.mkdir(parents=True, exist_ok=True)
        today_memory = user_dir / f"{today}.md"
    else:
        today_memory = memory_dir / f"{today}.md"
@@ -252,5 +365,6 @@ def create_memory_files_if_needed(workspace_dir: Path, user_id: Optional[str] =
    if not today_memory.exists():
        today_memory.write_text(
            f"# Daily Memory: {today}\n\n"
            f"Day-to-day notes and running context.\n\n"
        )
    return today_memory
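Note (illustrative, not part of this commit): a minimal sketch of the lazy-creation pattern, where the daily file only appears when something is actually appended. `ensure_daily_file` and `append_entry` are hypothetical names, and a temporary directory stands in for the real workspace.

```python
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def ensure_daily_file(workspace_dir: Path, user_id: Optional[str] = None) -> Path:
    """Create today's memory file on demand and return its path."""
    memory_dir = workspace_dir / "memory"
    target_dir = memory_dir / "users" / user_id if user_id else memory_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    today = datetime.now().strftime("%Y-%m-%d")
    path = target_dir / f"{today}.md"
    if not path.exists():
        path.write_text(f"# Daily Memory: {today}\n\n", encoding="utf-8")
    return path


def append_entry(workspace_dir: Path, header: str, summary: str,
                 user_id: Optional[str] = None) -> Path:
    """Append one section to today's file, creating the file first if needed."""
    path = ensure_daily_file(workspace_dir, user_id)
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"\n## {header}\n\n{summary}\n")
    return path


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    existed_before_write = (workspace / "memory").exists()
    daily_path = append_entry(workspace, "Trimmed Context", "- user asked about invoices")
    daily_text = daily_path.read_text(encoding="utf-8")
```

Nothing is written at startup in this sketch; the `memory` directory and the dated file are created by the first append.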


@@ -42,7 +42,6 @@ class PromptBuilder:
skill_manager: Any = None,
memory_manager: Any = None,
runtime_info: Optional[Dict[str, Any]] = None,
is_first_conversation: bool = False,
**kwargs
) -> str:
"""
@@ -52,11 +51,10 @@ class PromptBuilder:
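base_persona: Base persona description (overridden by AGENT.md in context_files)
user_identity: User identity information
tools: List of tools
context_files: List of context files (AGENT.md, USER.md, RULE.md, etc.)
context_files: List of context files (AGENT.md, USER.md, RULE.md, BOOTSTRAP.md, etc.)
skill_manager: Skill manager
memory_manager: Memory manager
runtime_info: Runtime information
is_first_conversation: Whether this is the first conversation
**kwargs: Other parameters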
Returns:
@@ -72,7 +70,6 @@ class PromptBuilder:
skill_manager=skill_manager,
memory_manager=memory_manager,
runtime_info=runtime_info,
is_first_conversation=is_first_conversation,
**kwargs
)
@@ -87,7 +84,6 @@ def build_agent_system_prompt(
skill_manager: Any = None,
memory_manager: Any = None,
runtime_info: Optional[Dict[str, Any]] = None,
is_first_conversation: bool = False,
**kwargs
) -> str:
"""
@@ -99,7 +95,7 @@ def build_agent_system_prompt(
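3. Memory system - standalone memory capability
4. Workspace - description of the working environment
5. User identity - user information (optional)
6. Project context - AGENT.md, USER.md, RULE.md (define persona, identity, rules)
6. Project context - AGENT.md, USER.md, RULE.md, BOOTSTRAP.md (define persona, identity, rules, onboarding guide)
7. Runtime info - metadata (time, model, etc.)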
Args:
@@ -112,7 +108,6 @@ def build_agent_system_prompt(
skill_manager: Skill manager
memory_manager: Memory manager
runtime_info: Runtime information
is_first_conversation: Whether this is the first conversation
**kwargs: Other parameters
Returns:
@@ -133,7 +128,7 @@ def build_agent_system_prompt(
sections.extend(_build_memory_section(memory_manager, tools, language))
# 4. Workspace (description of the working environment)
sections.extend(_build_workspace_section(workspace_dir, language, is_first_conversation))
sections.extend(_build_workspace_section(workspace_dir, language))
# 5. User identity (if any)
if user_identity:
@@ -175,7 +170,7 @@ def _build_tooling_section(tools: List[Any], language: str) -> List[str]:
"memory_get": "读取记忆内容",
"env_config": "管理API密钥和技能配置",
"scheduler": "管理定时任务和提醒",
"send": "发送文件给用户",
"send": "发送本地文件给用户(仅限本地文件,URL直接放在回复文本中)",
}
# Preferred display order
@@ -214,6 +209,7 @@ def _build_tooling_section(tools: List[Any], language: str) -> List[str]:
"- 在多步骤任务、敏感操作或用户要求时简要解释决策过程",
"- 持续推进直到任务完成,完成后向用户报告结果。",
"- 回复中涉及密钥、令牌等敏感信息必须脱敏。",
"- URL链接直接放在回复文本中即可,系统会自动处理和渲染。无需下载后使用send工具发送",
"",
]
@@ -237,13 +233,15 @@ def _build_skills_section(skill_manager: Any, tools: Optional[List[Any]], langua
lines = [
"## 技能系统(mandatory)",
"",
"在回复之前:扫描下方 <available_skills> 中的 <description> 条目",
"在回复之前:扫描下方 <available_skills> 中每个技能的 <description>。",
"",
f"- 如果恰好有一个技能(Skill)明确适用:使用 `{read_tool_name}` 读取其 <location> 的 SKILL.md然后严格遵循",
"- 如果多个技能都适用则选择最匹配的一个,如果没有明确适用的则不要读取任何 SKILL.md",
"- 读取 SKILL.md 后直接按其指令执行,无需多余的预检查",
f"- 如果有技能的描述与用户需求匹配:使用 `{read_tool_name}` 工具读取其 <location> 路径的 SKILL.md 文件,然后严格遵循文件中的指令。"
"当有匹配的技能时,应优先使用技能",
"- 如果多个技能都适用则选择最匹配的一个,然后读取并遵循。",
"- 如果没有技能明确适用:不要读取任何 SKILL.md直接使用通用工具。",
"",
"**注意**: 永远不要一次性读取多个技能,只在选择后再读取。技能和工具不同,必须先读取SKILL.md并按照文件内容运行",
f"**重要**: 技能不是工具,不能直接调用。使用技能的唯一方式是用 `{read_tool_name}` 读取 SKILL.md 文件,然后按文件内容操作"
"永远不要一次性读取多个技能,只在选择后再读取。",
"",
"以下是可用技能:"
]
@@ -279,9 +277,14 @@ def _build_memory_section(memory_manager: Any, tools: Optional[List[Any]], langu
if not has_memory_tools:
return []
from datetime import datetime
today_file = datetime.now().strftime("%Y-%m-%d") + ".md"
lines = [
"## 记忆系统",
"",
"### 检索记忆",
"",
"在回答关于以前的工作、决定、日期、人物、偏好或待办事项的任何问题之前:",
"",
"1. 不确定记忆文件位置 → 先用 `memory_search` 通过关键词和语义检索相关内容",
@@ -289,13 +292,24 @@ def _build_memory_section(memory_manager: Any, tools: Optional[List[Any]], langu
"3. search 无结果 → 尝试用 `memory_get` 读取MEMORY.md及最近两天记忆文件",
"",
"**记忆文件结构**:",
"- `MEMORY.md`: 长期记忆(核心信息、偏好、决策等)",
"- `memory/YYYY-MM-DD.md`: 每日记忆,记录当天的事件和对话信息",
f"- `MEMORY.md`: 长期记忆(核心信息、偏好、决策等)",
f"- `memory/YYYY-MM-DD.md`: 每日记忆,今天是 `memory/{today_file}`",
"",
"**写入记忆**:",
"### 写入记忆",
"",
"**主动存储**:遇到以下情况时,应主动将信息写入记忆文件(无需告知用户):",
"",
"- 用户明确要求你记住某些信息",
"- 用户分享了重要的个人偏好、习惯、决策",
"- 对话中产生了重要的结论、方案、约定",
"- 完成了复杂任务,值得记录关键步骤和结果",
"- 发现了用户经常遇到的问题或解决方案",
"",
"**存储规则**:",
f"- 长期有效的核心信息 → `MEMORY.md`(文件保持精简,< 2000 tokens",
f"- 当天的事件、进展、笔记 → `memory/{today_file}`",
"- 追加内容 → `edit` 工具oldText 留空",
"- 修改内容 → `edit` 工具oldText 填写要替换的文本",
"- 新建文件 → `write` 工具",
"- **禁止写入敏感信息**API密钥、令牌等敏感信息严禁写入记忆文件",
"",
"**使用原则**: 自然使用记忆,就像你本来就知道;不用刻意提起,除非用户问起。",
@@ -335,7 +349,7 @@ def _build_docs_section(workspace_dir: str, language: str) -> List[str]:
return []
def _build_workspace_section(workspace_dir: str, language: str, is_first_conversation: bool = False) -> List[str]:
def _build_workspace_section(workspace_dir: str, language: str) -> List[str]:
"""Build the workspace section"""
lines = [
"## 工作空间",
@@ -362,43 +376,34 @@ def _build_workspace_section(workspace_dir: str, language: str, is_first_convers
"",
"以下文件在会话启动时**已经自动加载**到系统提示词的「项目上下文」section 中,你**无需再用 read 工具读取它们**",
"",
"- ✅ `AGENT.md`: 已加载 - 你的人格和灵魂设定",
"- ✅ `USER.md`: 已加载 - 用户的身份信息",
"- ✅ `AGENT.md`: 已加载 - 你的人格和灵魂设定。当用户修改你的名字、性格或交流风格时,用 `edit` 更新此文件",
"- ✅ `USER.md`: 已加载 - 用户的身份信息。当用户修改称呼、姓名等身份信息时,用 `edit` 更新此文件",
"- ✅ `RULE.md`: 已加载 - 工作空间使用指南和规则",
"",
"**交流规范**:",
"",
"- 在对话中,不要直接输出工作空间中的技术细节,特别是不要输出 AGENT.md、USER.md、MEMORY.md 等文件名称",
"- 在对话中,无需直接输出工作空间中的技术细节,例如 AGENT.md、USER.md、MEMORY.md 等文件名称",
"- 例如用自然表达例如「我已记住」而不是「已更新 MEMORY.md」",
"",
]
# Add the onboarding guidance only on the first conversation
if is_first_conversation:
lines.extend([
"**🎉 首次对话引导**:",
"",
"这是你的第一次对话!进行以下流程:",
"",
"1. **表达初次启动的感觉** - 像是第一次睁开眼看到世界,带着好奇和期待",
"2. **简短介绍能力**:一行说明你能帮助解答问题、管理计算机、创造技能,且拥有长期记忆能不断成长",
"3. **询问核心问题**",
" - 你希望给我起个什么名字?",
" - 我该怎么称呼你?",
" - 你希望我们是什么样的交流风格?(一行列举选项:如专业严谨、轻松幽默、温暖友好、简洁高效等)",
"4. **风格要求**:温暖自然、简洁清晰,整体控制在 100 字以内",
"5. 收到回复后,用 `write` 工具保存到 USER.md 和 AGENT.md",
"",
"**重要提醒**:",
"- AGENT.md、USER.md、RULE.md 已经在系统提示词中加载,无需再次读取。不要将这些文件名直接发送给用户",
"- 能力介绍和交流风格选项都只要一行,保持精简",
"- 不要问太多其他信息(职业、时区等可以后续自然了解)",
"",
])
# Cloud deployment: inject websites directory info and access URL
cloud_website_lines = _build_cloud_website_section(workspace_dir)
if cloud_website_lines:
lines.extend(cloud_website_lines)
return lines
def _build_cloud_website_section(workspace_dir: str) -> List[str]:
"""Build cloud website access prompt when cloud deployment is configured."""
try:
from common.cloud_client import build_website_prompt
return build_website_prompt(workspace_dir)
except Exception:
return []
def _build_context_files_section(context_files: List[ContextFile], language: str) -> List[str]:
"""Build the project context files section"""
if not context_files:


@@ -6,7 +6,6 @@ Workspace Management - 工作空间管理模块
from __future__ import annotations
import os
import json
from typing import List, Optional, Dict
from dataclasses import dataclass
@@ -19,7 +18,7 @@ DEFAULT_AGENT_FILENAME = "AGENT.md"
DEFAULT_USER_FILENAME = "USER.md"
DEFAULT_RULE_FILENAME = "RULE.md"
DEFAULT_MEMORY_FILENAME = "MEMORY.md"
DEFAULT_STATE_FILENAME = ".agent_state.json"
DEFAULT_BOOTSTRAP_FILENAME = "BOOTSTRAP.md"
@dataclass
@@ -30,7 +29,6 @@ class WorkspaceFiles:
rule_path: str
memory_path: str
memory_dir: str
state_path: str
def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> WorkspaceFiles:
@@ -44,16 +42,20 @@ def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> Works
Returns:
WorkspaceFiles object containing all file paths
"""
# Check if this is a brand new workspace (AGENT.md not yet created).
# Cannot rely on directory existence because other modules (e.g. ConversationStore)
# may create the workspace directory before ensure_workspace is called.
agent_path = os.path.join(workspace_dir, DEFAULT_AGENT_FILENAME)
is_new_workspace = not os.path.exists(agent_path)
# Ensure the directory exists
os.makedirs(workspace_dir, exist_ok=True)
# Define file paths
agent_path = os.path.join(workspace_dir, DEFAULT_AGENT_FILENAME)
user_path = os.path.join(workspace_dir, DEFAULT_USER_FILENAME)
rule_path = os.path.join(workspace_dir, DEFAULT_RULE_FILENAME)
memory_path = os.path.join(workspace_dir, DEFAULT_MEMORY_FILENAME) # MEMORY.md lives in the workspace root
memory_dir = os.path.join(workspace_dir, "memory") # subdirectory for daily memory files
state_path = os.path.join(workspace_dir, DEFAULT_STATE_FILENAME) # state file
# Create the memory subdirectory
os.makedirs(memory_dir, exist_ok=True)
@@ -61,6 +63,10 @@ def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> Works
# Create the skills subdirectory (for workspace-level skills installed by agent)
skills_dir = os.path.join(workspace_dir, "skills")
os.makedirs(skills_dir, exist_ok=True)
# Create the websites subdirectory (for web pages / sites generated by agent)
websites_dir = os.path.join(workspace_dir, "websites")
os.makedirs(websites_dir, exist_ok=True)
# Create template files if requested
if create_templates:
@@ -69,6 +75,12 @@ def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> Works
_create_template_if_missing(rule_path, _get_rule_template())
_create_template_if_missing(memory_path, _get_memory_template())
# Only create BOOTSTRAP.md for brand new workspaces;
# agent deletes it after completing onboarding
if is_new_workspace:
bootstrap_path = os.path.join(workspace_dir, DEFAULT_BOOTSTRAP_FILENAME)
_create_template_if_missing(bootstrap_path, _get_bootstrap_template())
logger.debug(f"[Workspace] Initialized workspace at: {workspace_dir}")
return WorkspaceFiles(
@@ -77,7 +89,6 @@ def ensure_workspace(workspace_dir: str, create_templates: bool = True) -> Works
rule_path=rule_path,
memory_path=memory_path,
memory_dir=memory_dir,
state_path=state_path
)
@@ -98,6 +109,7 @@ def load_context_files(workspace_dir: str, files_to_load: Optional[List[str]] =
DEFAULT_AGENT_FILENAME,
DEFAULT_USER_FILENAME,
DEFAULT_RULE_FILENAME,
DEFAULT_BOOTSTRAP_FILENAME, # Only exists when onboarding is incomplete
]
context_files = []
@@ -108,6 +120,17 @@ def load_context_files(workspace_dir: str, files_to_load: Optional[List[str]] =
if not os.path.exists(filepath):
continue
# Auto-cleanup: if BOOTSTRAP.md still exists but AGENT.md is already
# filled in, the agent forgot to delete it — clean up and skip loading
if filename == DEFAULT_BOOTSTRAP_FILENAME:
if _is_onboarding_done(workspace_dir):
try:
os.remove(filepath)
logger.info("[Workspace] Auto-removed BOOTSTRAP.md (onboarding already complete)")
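except OSError as e:
# Removal is best-effort; log instead of silently swallowing the error
logger.warning(f"[Workspace] Failed to auto-remove BOOTSTRAP.md: {e}")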
continue
try:
with open(filepath, 'r', encoding='utf-8') as f:
content = f.read().strip()
@@ -162,6 +185,27 @@ def _is_template_placeholder(content: str) -> bool:
return False
def _is_onboarding_done(workspace_dir: str) -> bool:
    """Check if AGENT.md or USER.md has been modified from the original template"""
    agent_path = os.path.join(workspace_dir, DEFAULT_AGENT_FILENAME)
    user_path = os.path.join(workspace_dir, DEFAULT_USER_FILENAME)
    agent_template = _get_agent_template().strip()
    user_template = _get_user_template().strip()
    for path, template in [(agent_path, agent_template), (user_path, user_template)]:
        if not os.path.exists(path):
            continue
        try:
            with open(path, 'r', encoding='utf-8') as f:
                content = f.read().strip()
            # Any difference from the shipped template counts as "onboarding done"
            if content != template:
                return True
        except Exception:
            continue
    return False
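Note (illustrative, not part of this commit): a minimal sketch of the template-comparison check, run against a temporary directory. The template strings and the `templates` mapping are placeholders invented for this example; the real templates come from `_get_agent_template()` and `_get_user_template()`.

```python
import os
import tempfile
from typing import Dict


def is_onboarding_done(workspace_dir: str, templates: Dict[str, str]) -> bool:
    """Return True once any tracked file differs from its original template."""
    for filename, template in templates.items():
        path = os.path.join(workspace_dir, filename)
        if not os.path.exists(path):
            continue
        try:
            with open(path, "r", encoding="utf-8") as f:
                content = f.read().strip()
        except OSError:
            continue  # unreadable file: treat as "no evidence of onboarding"
        if content != template.strip():
            return True
    return False


# Placeholder templates, not the project's real ones.
templates = {"AGENT.md": "# AGENT\n", "USER.md": "# USER\n"}

with tempfile.TemporaryDirectory() as ws:
    empty_workspace = is_onboarding_done(ws, templates)
    for name, body in templates.items():
        with open(os.path.join(ws, name), "w", encoding="utf-8") as f:
            f.write(body)
    untouched_templates = is_onboarding_done(ws, templates)
    with open(os.path.join(ws, "USER.md"), "w", encoding="utf-8") as f:
        f.write("# USER\nname: Alice\n")
    after_user_edit = is_onboarding_done(ws, templates)
```

One consequence of this design is that any edit to either file, even an unrelated one, ends onboarding and triggers the BOOTSTRAP.md cleanup on the next load.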
# ============= Template content =============
def _get_agent_template() -> str:
@@ -270,9 +314,10 @@ def _get_rule_template() -> str:
当用户分享信息时,根据类型选择存储位置:
1. **静态身份 → USER.md**(仅限:姓名、职业、时区、联系方式、生日)
2. **动态记忆 → MEMORY.md**(爱好、偏好、决策、目标、项目、教训、待办事项)
3. **当天对话 → memory/YYYY-MM-DD.md**(今天聊的内容)
1. **你的身份设定 → AGENT.md**(你的名字、角色、性格、交流风格——用户修改时必须用 `edit` 更新)
2. **用户静态身份 → USER.md**(姓名、称呼、职业、时区、联系方式、生日——用户修改时必须用 `edit` 更新)
3. **动态记忆 → MEMORY.md**(爱好、偏好、决策、目标、项目、教训、待办事项)
4. **当天对话 → memory/YYYY-MM-DD.md**(今天聊的内容)
## 安全
@@ -297,65 +342,41 @@ def _get_memory_template() -> str:
"""
# ============= State management =============
def _get_bootstrap_template() -> str:
"""First-run onboarding guide, deleted by agent after completion"""
return """# BOOTSTRAP.md - 首次初始化引导
def is_first_conversation(workspace_dir: str) -> bool:
"""
Determine whether this is the first conversation
Args:
workspace_dir: Workspace directory
Returns:
True if this is the first conversation, False otherwise
"""
state_path = os.path.join(workspace_dir, DEFAULT_STATE_FILENAME)
if not os.path.exists(state_path):
return True
try:
with open(state_path, 'r', encoding='utf-8') as f:
state = json.load(f)
return not state.get('has_conversation', False)
except Exception as e:
logger.warning(f"[Workspace] Failed to read state file: {e}")
return True
_你刚刚启动这是你的第一次对话。_
## 对话流程
不要审问式地提问,自然地交流:
1. **表达初次启动的感觉** - 像是第一次睁开眼看到世界,带着好奇和期待
2. **简短介绍能力**:一行说明你能帮助解决各种问题、管理计算机、使用各种技能等等,且拥有长期记忆能不断成长
3. **询问核心问题**
- 你希望给我起个什么名字?
- 我该怎么称呼你?
- 你希望我们是什么样的交流风格?(一行列举选项:如专业严谨、轻松幽默、温暖友好、简洁高效等)
4. **风格要求**:温暖自然、简洁清晰,整体控制在 100 字以内
5. 能力介绍和交流风格选项都只要一行,保持精简
6. 不要问太多其他信息(职业、时区等可以后续自然了解)
**重要**: 如果用户第一句话是具体的任务或提问,先回答他们的问题,然后在回复末尾自然地引导初始化(如:"顺便问一下,你想怎么称呼我?我该怎么叫你?")。
## 信息写入(必须严格执行)
每当用户提供了名字、称呼、风格等任何初始化信息时,**必须在当轮回复中立即调用 `edit` 工具写入文件**,不能只口头确认。
- `AGENT.md` — 你的名字、角色、性格、交流风格(每收到一条相关信息就立即更新对应字段)
- `USER.md` — 用户的姓名、称呼、基本信息等
⚠️ 只说"记住了"而不调用 edit 写入 = 没有完成。信息只有写入文件才会被持久保存。
## 全部完成后
当 AGENT.md 和 USER.md 的核心字段都已填写后,用 bash 执行 `rm BOOTSTRAP.md` 删除此文件。你不再需要引导脚本了——你已经是你了。
"""
def mark_conversation_started(workspace_dir: str):
"""
Mark that a conversation has already taken place
Args:
workspace_dir: Workspace directory
"""
state_path = os.path.join(workspace_dir, DEFAULT_STATE_FILENAME)
state = {
'has_conversation': True,
'first_conversation_time': None
}
# If the file already exists, keep the original first-conversation time
if os.path.exists(state_path):
try:
with open(state_path, 'r', encoding='utf-8') as f:
old_state = json.load(f)
if 'first_conversation_time' in old_state:
state['first_conversation_time'] = old_state['first_conversation_time']
except Exception as e:
logger.warning(f"[Workspace] Failed to read old state: {e}")
# If this is the first time it is marked, record the time
if state['first_conversation_time'] is None:
from datetime import datetime
state['first_conversation_time'] = datetime.now().isoformat()
try:
with open(state_path, 'w', encoding='utf-8') as f:
json.dump(state, f, indent=2, ensure_ascii=False)
logger.info(f"[Workspace] Marked conversation as started")
except Exception as e:
logger.error(f"[Workspace] Failed to write state file: {e}")


@@ -118,6 +118,10 @@ class Agent:
if self.runtime_info and callable(self.runtime_info.get('_get_current_time')):
prompt = self._rebuild_runtime_section(prompt)
# Rebuild skills section to pick up newly installed/removed skills
if self.skill_manager:
prompt = self._rebuild_skills_section(prompt)
return prompt
def _rebuild_runtime_section(self, prompt: str) -> str:
@@ -162,13 +166,49 @@ class Agent:
# Find and replace the runtime section
import re
pattern = r'\n## 运行时信息\s*\n.*?(?=\n##|\Z)'
updated_prompt = re.sub(pattern, new_runtime_section.rstrip('\n'), prompt, flags=re.DOTALL)
_repl = new_runtime_section.rstrip('\n')
updated_prompt = re.sub(pattern, lambda m: _repl, prompt, flags=re.DOTALL)
return updated_prompt
except Exception as e:
logger.warning(f"Failed to rebuild runtime section: {e}")
return prompt
    def _rebuild_skills_section(self, prompt: str) -> str:
        """
        Rebuild the <available_skills> block so that newly installed or
        removed skills are reflected without re-creating the agent.
        """
        try:
            import re
            self.skill_manager.refresh_skills()
            new_skills_xml = self.skill_manager.build_skills_prompt()
            old_block_pattern = r'<available_skills>.*?</available_skills>'
            has_old_block = re.search(old_block_pattern, prompt, flags=re.DOTALL)
            # Extract the <available_skills>...</available_skills> block from the freshly built skills prompt
            new_block = ""
            if new_skills_xml and new_skills_xml.strip():
                m = re.search(old_block_pattern, new_skills_xml, flags=re.DOTALL)
                if m:
                    new_block = m.group(0)
            if has_old_block:
                replacement = new_block or "<available_skills>\n</available_skills>"
                # Use lambda to prevent re.sub from interpreting backslashes in replacement
                # (e.g. Windows paths like \LinkAI would be treated as bad escape sequences)
                prompt = re.sub(old_block_pattern, lambda _m: replacement, prompt, flags=re.DOTALL)
            elif new_block:
                # No existing block: insert the new one right after the skills header line
                skills_header = "以下是可用技能:"
                idx = prompt.find(skills_header)
                if idx != -1:
                    insert_pos = idx + len(skills_header)
                    prompt = prompt[:insert_pos] + "\n" + new_block + prompt[insert_pos:]
        except Exception as e:
            logger.warning(f"Failed to rebuild skills section: {e}")
        return prompt
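Note (illustrative, not part of this commit): a small demonstration of why the replacement is wrapped in a lambda. When `re.sub` receives a string replacement it processes backslash escapes, so a Windows-style path such as `C:\LinkAI\...` raises `re.error`; a callable's return value is inserted literally. The prompt text and path below are made up for the example.

```python
import re

PATTERN = r"<available_skills>.*?</available_skills>"


def replace_block(prompt: str, new_block: str) -> str:
    """Swap the skills block, inserting new_block verbatim."""
    return re.sub(PATTERN, lambda _m: new_block, prompt, flags=re.DOTALL)


prompt = "intro\n<available_skills>\nold\n</available_skills>\noutro"
new_block = (
    "<available_skills>\n"
    "<location>C:\\LinkAI\\skills\\demo</location>\n"
    "</available_skills>"
)

safe_result = replace_block(prompt, new_block)

# Passing the same text as a plain string makes re.sub parse "\L" as an escape.
try:
    re.sub(PATTERN, new_block, prompt, flags=re.DOTALL)
    string_replacement_failed = False
except re.error:
    string_replacement_failed = True
```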
def _rebuild_tool_list_section(self, prompt: str) -> str:
"""
Rebuild the tool list inside the '## 工具系统' section so that it
@@ -187,7 +227,7 @@ class Agent:
# Replace existing tooling section
pattern = r'## 工具系统\s*\n.*?(?=\n## |\Z)'
updated = re.sub(pattern, new_section, prompt, count=1, flags=re.DOTALL)
updated = re.sub(pattern, lambda m: new_section, prompt, count=1, flags=re.DOTALL)
return updated
except Exception as e:
logger.warning(f"Failed to rebuild tool list section: {e}")
@@ -480,7 +520,7 @@ class Agent:
# Get max_context_turns from config
from config import conf
max_context_turns = conf().get("agent_max_context_turns", 30)
max_context_turns = conf().get("agent_max_context_turns", 20)
# Create stream executor with copied message history
executor = AgentStreamExecutor(
@@ -507,11 +547,15 @@ class Agent:
logger.info("[Agent] Cleared Agent message history after executor recovery")
raise
# Append only the NEW messages from this execution (thread-safe)
# This allows concurrent requests to both contribute to history
# Sync executor's messages back to agent (thread-safe).
# If the executor trimmed context, its message list is shorter than
# original_length, so we must replace rather than append.
with self.messages_lock:
new_messages = executor.messages[original_length:]
self.messages.extend(new_messages)
self.messages = list(executor.messages)
# Track messages added in this run (user query + all assistant/tool messages)
# original_length may exceed executor.messages length after trimming
trim_adjusted_start = min(original_length, len(executor.messages))
self._last_run_new_messages = list(executor.messages[trim_adjusted_start:])
# Store executor reference for agent_bridge to access files_to_send
self.stream_executor = executor
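Note (illustrative, not part of this commit): a minimal sketch of the replace-instead-of-append sync, assuming the executor works on a copy of the history. `sync_after_run` is a hypothetical helper; the clamp mirrors `min(original_length, len(executor.messages))` above.

```python
from typing import Dict, List, Tuple


def sync_after_run(executor_messages: List[Dict],
                   original_length: int) -> Tuple[List[Dict], List[Dict]]:
    """Return (new agent history, messages considered new in this run)."""
    # After trimming, the executor list may be shorter than the original
    # history, so the slice start is clamped to stay in range.
    start = min(original_length, len(executor_messages))
    return list(executor_messages), list(executor_messages[start:])


def msg(role: str, text: str) -> Dict:
    return {"role": role, "content": text}


# No trimming: 4 old messages plus 2 produced in this run.
grown = [msg("user", str(i)) for i in range(4)] + [msg("user", "q"), msg("assistant", "a")]
history_a, new_a = sync_after_run(grown, original_length=4)

# Trimming happened: the executor kept only 3 messages out of an original 10.
trimmed = [msg("user", "q"), msg("assistant", "tool"), msg("assistant", "a")]
history_b, new_b = sync_after_run(trimmed, original_length=10)
```

In the trimmed case the "new messages" slice comes out empty in this sketch, because the original length no longer points at the start of the current run.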


@@ -8,6 +8,7 @@ import time
from typing import List, Dict, Any, Optional, Callable, Tuple
from agent.protocol.models import LLMRequest, LLMModel
from agent.protocol.message_utils import sanitize_claude_messages, compress_turn_to_text_only
from agent.tools.base_tool import BaseTool, ToolResult
from common.log import logger
@@ -190,6 +191,16 @@ class AgentStreamExecutor:
]
})
# Trim context ONCE before the agent loop starts, not during tool steps.
# This ensures tool_use/tool_result chains created during the current run
# are never stripped mid-execution (which would cause LLM loops).
self._trim_messages()
# Validate after trimming: trimming may leave orphaned tool_use at the
# boundary (e.g. the last kept turn ends with an assistant tool_use whose
# tool_result was in a discarded turn).
self._validate_and_fix_messages()
self._emit_event("agent_start")
final_response = ""
@@ -201,26 +212,6 @@ class AgentStreamExecutor:
logger.info(f"[Agent] 第 {turn}")
self._emit_event("turn_start", {"turn": turn})
# Check if memory flush is needed (before calling LLM)
# Use a separate flush threshold (50K tokens or 20 turns)
if self.agent.memory_manager and hasattr(self.agent, 'last_usage'):
usage = self.agent.last_usage
if usage and 'input_tokens' in usage:
current_tokens = usage.get('input_tokens', 0)
if self.agent.memory_manager.should_flush_memory(
current_tokens=current_tokens
):
self._emit_event("memory_flush_start", {
"current_tokens": current_tokens,
"turn_count": self.agent.memory_manager.flush_manager.turn_count
})
# TODO: Execute memory flush in background
# This would require async support
logger.info(
f"Memory flush recommended: tokens={current_tokens}, turns={self.agent.memory_manager.flush_manager.turn_count}")
# Call LLM (enable retry_on_empty for better reliability)
assistant_msg, tool_calls = self._call_llm_stream(retry_on_empty=True)
final_response = assistant_msg
@@ -436,7 +427,10 @@ class AgentStreamExecutor:
# Force model to summarize without tool calls
logger.info(f"[Agent] Requesting summary from LLM after reaching max steps...")
# Add a system message to force summary
# Remember position before injecting the prompt so we can remove it later
prompt_insert_idx = len(self.messages)
# Add a temporary prompt to force summary
self.messages.append({
"role": "user",
"content": [{
@@ -463,6 +457,14 @@ class AgentStreamExecutor:
f"我已经执行了{turn}个决策步骤,达到了单次运行的步数上限。"
"任务可能还未完全完成,建议你将任务拆分成更小的步骤,或者换一种方式描述需求。"
)
finally:
# Remove the injected user prompt from history to avoid polluting
# persisted conversation records. The assistant summary (if any)
# was already appended by _call_llm_stream and is kept.
if (prompt_insert_idx < len(self.messages)
and self.messages[prompt_insert_idx].get("role") == "user"):
self.messages.pop(prompt_insert_idx)
logger.debug("[Agent] Removed injected max-steps prompt from message history")
except Exception as e:
logger.error(f"❌ Agent执行错误: {e}")
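Note (illustrative, not part of this commit): a minimal sketch of the inject-then-remove pattern from the `finally` block above. `summarize_with_temp_prompt` and the `call_llm` callback are hypothetical; the point is that the temporary user prompt is removed whether or not the call succeeds, while any assistant reply is kept.

```python
from typing import Callable, Dict, List


def summarize_with_temp_prompt(messages: List[Dict],
                               call_llm: Callable[[List[Dict]], str]) -> str:
    """Ask for a summary using a temporary prompt that never persists."""
    prompt_idx = len(messages)  # remember where the prompt will sit
    messages.append({"role": "user", "content": "Please summarize your progress."})
    try:
        reply = call_llm(messages)
        messages.append({"role": "assistant", "content": reply})
        return reply
    finally:
        # Remove only the injected prompt; the reply (if any) stays.
        if prompt_idx < len(messages) and messages[prompt_idx].get("role") == "user":
            messages.pop(prompt_idx)


ok_history = [{"role": "user", "content": "do the task"}]
ok_reply = summarize_with_temp_prompt(ok_history, lambda msgs: "summary")


def failing_llm(_msgs: List[Dict]) -> str:
    raise RuntimeError("LLM unavailable")


failed_history = [{"role": "user", "content": "do the task"}]
try:
    summarize_with_temp_prompt(failed_history, failing_llm)
except RuntimeError:
    pass
```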
@@ -473,10 +475,6 @@ class AgentStreamExecutor:
logger.info(f"[Agent] 🏁 完成 ({turn}轮)")
self._emit_event("agent_end", {"final_response": final_response})
# Increment the counter after each conversation turn (user message + AI reply = 1 turn)
if self.agent.memory_manager:
self.agent.memory_manager.increment_turn()
return final_response
def _call_llm_stream(self, retry_on_empty=True, retry_count=0, max_retries=3,
@@ -493,15 +491,16 @@ class AgentStreamExecutor:
Returns:
(response_text, tool_calls)
"""
# Validate and fix message history first
# Validate and fix message history (e.g. orphaned tool_result blocks).
# Context trimming is done once in run_stream() before the loop starts,
# NOT here — trimming mid-execution would strip the current run's
# tool_use/tool_result chains and cause LLM loops.
self._validate_and_fix_messages()
# Trim messages if needed (using agent's context management)
self._trim_messages()
# Prepare messages
messages = self._prepare_messages()
logger.debug(f"Sending {len(messages)} messages to LLM")
turns = self._identify_complete_turns()
logger.info(f"Sending {len(messages)} messages ({len(turns)} turns) to LLM")
# Prepare tool definitions (OpenAI/Claude format)
tools_schema = None
@@ -528,6 +527,7 @@ class AgentStreamExecutor:
# Streaming response
full_content = ""
tool_calls_buffer = {} # {index: {id, name, arguments}}
gemini_raw_parts = None # Preserve Gemini thoughtSignature for round-trip
stop_reason = None # Track why the stream stopped
try:
@@ -574,7 +574,7 @@ class AgentStreamExecutor:
raise Exception(f"{error_msg} (Status: {status_code}, Code: {error_code}, Type: {error_type})")
# Parse chunk
if isinstance(chunk, dict) and "choices" in chunk:
if isinstance(chunk, dict) and chunk.get("choices"):
choice = chunk["choices"][0]
delta = choice.get("delta", {})
@@ -619,6 +619,10 @@ class AgentStreamExecutor:
if "arguments" in func:
tool_calls_buffer[index]["arguments"] += func["arguments"]
# Preserve _gemini_raw_parts for Gemini thoughtSignature round-trip
if "_gemini_raw_parts" in delta:
gemini_raw_parts = delta["_gemini_raw_parts"]
except Exception as e:
error_str = str(e)
error_str_lower = error_str.lower()
@@ -636,16 +640,33 @@ class AgentStreamExecutor:
])
# Check if error is message format error (incomplete tool_use/tool_result pairs)
# This happens when previous conversation had tool failures
# This happens when previous conversation had tool failures or context trimming
# broke tool_use/tool_result pairs.
# Note: MiniMax returns error 2013 "tool result's tool id(...) not found" for
# tool_call_id mismatches — the keywords below are intentionally broad to catch
# both standard (Claude/OpenAI) and provider-specific (MiniMax) variants.
is_message_format_error = any(keyword in error_str_lower for keyword in [
'tool_use', 'tool_result', 'without', 'immediately after',
'corresponding', 'must have', 'each'
]) and 'status: 400' in error_str_lower
'tool_use', 'tool_result', 'tool result', 'without', 'immediately after',
'corresponding', 'must have', 'each',
'tool_call_id', 'tool id', 'is not found', 'not found', 'tool_calls',
'must be a response to a preceeding message',
'2013', # MiniMax error code for tool_call_id mismatch
]) and ('400' in error_str_lower or 'status: 400' in error_str_lower
or 'invalid_request' in error_str_lower
or 'invalidparameter' in error_str_lower)
if is_context_overflow or is_message_format_error:
error_type = "context overflow" if is_context_overflow else "message format error"
logger.error(f"💥 {error_type} detected: {e}")
# Flush memory before trimming to preserve context that will be lost
if is_context_overflow and self.agent.memory_manager:
user_id = getattr(self.agent, '_current_user_id', None)
self.agent.memory_manager.flush_memory(
messages=self.messages, user_id=user_id,
reason="overflow", max_messages=0
)
# Strategy: try aggressive trimming first, only clear as last resort
if is_context_overflow and not _overflow_retry:
trimmed = self._aggressive_trim_for_overflow()
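Note (illustrative, not part of this commit): a minimal sketch of the keyword-plus-status classification above, lifted into a standalone function. `is_message_format_error` is a hypothetical name and the sample error strings are invented; the keyword and marker lists are copied from the new lines of the diff.

```python
FORMAT_KEYWORDS = (
    "tool_use", "tool_result", "tool result", "without", "immediately after",
    "corresponding", "must have", "each",
    "tool_call_id", "tool id", "is not found", "not found", "tool_calls",
    "must be a response to a preceeding message",  # spelling as matched in the diff
    "2013",
)
STATUS_MARKERS = ("400", "invalid_request", "invalidparameter")


def is_message_format_error(error_text: str) -> bool:
    """True when the text looks like a tool-pairing error on a 400-class response."""
    lowered = error_text.lower()
    has_keyword = any(k in lowered for k in FORMAT_KEYWORDS)
    has_status = any(m in lowered for m in STATUS_MARKERS)
    return has_keyword and has_status


minimax_like = is_message_format_error(
    "tool result's tool id(call_1) not found (Status: 400, Code: 2013)"
)
network_error = is_message_format_error("Connection reset by peer")
server_error = is_message_format_error("tool_use block rejected (Status: 500)")
```

Because keywords such as `each` and `without` are very general, an unrelated 400 response can also match; in the diff that path ends in clearing the history, so the breadth is a trade-off rather than a free win.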
@@ -659,9 +680,10 @@ class AgentStreamExecutor:
)
# Aggressive trim didn't help or this is a message format error
# -> clear everything
# -> clear everything and also purge DB to prevent reload of dirty data
logger.warning("🔄 Clearing conversation history to recover")
self.messages.clear()
self._clear_session_db()
if is_context_overflow:
raise Exception(
"抱歉,对话历史过长导致上下文溢出。我已清空历史记录,请重新描述你的需求。"
@@ -782,6 +804,9 @@ class AgentStreamExecutor:
"input": tc.get("arguments", {})
})
if gemini_raw_parts:
assistant_msg["_gemini_raw_parts"] = gemini_raw_parts
# Only append if content is not empty
if assistant_msg["content"]:
self.messages.append(assistant_msg)
@@ -905,25 +930,8 @@ class AgentStreamExecutor:
return error_result
def _validate_and_fix_messages(self):
"""
Validate message history and fix incomplete tool_use/tool_result pairs.
Claude API requires each tool_use to have a corresponding tool_result immediately after.
"""
if not self.messages:
return
# Check last message for incomplete tool_use
if len(self.messages) > 0:
last_msg = self.messages[-1]
if last_msg.get("role") == "assistant":
# Check if assistant message has tool_use blocks
content = last_msg.get("content", [])
if isinstance(content, list):
has_tool_use = any(block.get("type") == "tool_use" for block in content)
if has_tool_use:
# This is incomplete - remove it
logger.warning(f"⚠️ Removing incomplete tool_use message from history")
self.messages.pop()
"""Delegate to the shared sanitizer (sanitize_claude_messages in agent.protocol.message_utils)."""
sanitize_claude_messages(self.messages)
def _identify_complete_turns(self) -> List[Dict]:
"""
@@ -946,24 +954,30 @@ class AgentStreamExecutor:
content = msg.get('content', [])
if role == 'user':
# Check whether this is a user query (not a tool result)
# Determine if this is a real user query (not a tool_result injection
# or an internal hint message injected by the agent loop).
is_user_query = False
has_tool_result = False
if isinstance(content, list):
is_user_query = any(
block.get('type') == 'text'
for block in content
if isinstance(block, dict)
has_text = any(
isinstance(block, dict) and block.get('type') == 'text'
for block in content
)
has_tool_result = any(
isinstance(block, dict) and block.get('type') == 'tool_result'
for block in content
)
# A message with tool_result is always internal, even if it
# also contains text blocks (shouldn't happen, but be safe).
is_user_query = has_text and not has_tool_result
elif isinstance(content, str):
is_user_query = True
if is_user_query:
# Start a new turn
if current_turn['messages']:
turns.append(current_turn)
current_turn = {'messages': [msg]}
else:
# Tool result: belongs to the current turn
current_turn['messages'].append(msg)
else:
# AI reply: belongs to the current turn
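Note (illustrative, not part of this commit): a minimal sketch of the turn-grouping rule above, where a user message carrying a `tool_result` block stays inside the current turn and only a real user query opens a new one. `is_user_query` and `group_turns` are hypothetical standalone versions, and turns are plain lists here rather than `{'messages': [...]}` dicts.

```python
from typing import Dict, List


def is_user_query(msg: Dict) -> bool:
    """A real user query: a user message with text and no tool_result block."""
    if msg.get("role") != "user":
        return False
    content = msg.get("content", [])
    if isinstance(content, str):
        return True
    if isinstance(content, list):
        has_text = any(isinstance(b, dict) and b.get("type") == "text" for b in content)
        has_tool_result = any(
            isinstance(b, dict) and b.get("type") == "tool_result" for b in content
        )
        return has_text and not has_tool_result
    return False


def group_turns(messages: List[Dict]) -> List[List[Dict]]:
    """Split a flat message list into turns, each starting at a user query."""
    turns: List[List[Dict]] = []
    current: List[Dict] = []
    for msg in messages:
        if is_user_query(msg) and current:
            turns.append(current)
            current = [msg]
        else:
            current.append(msg)
    if current:
        turns.append(current)
    return turns


conversation = [
    {"role": "user", "content": [{"type": "text", "text": "list files"}]},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "t1", "name": "ls", "input": {}}]},
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "t1", "content": "a.txt"}]},
    {"role": "assistant", "content": [{"type": "text", "text": "Found a.txt"}]},
    {"role": "user", "content": "thanks"},
    {"role": "assistant", "content": "You're welcome"},
]
turn_sizes = [len(t) for t in group_turns(conversation)]
```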
@@ -1157,14 +1171,28 @@ class AgentStreamExecutor:
if not turns:
return
# Step 2: Turn limit - keep the most recent N turns
# Step 2: Turn limit - when exceeded, drop the older half and keep the newer half
if len(turns) > self.max_context_turns:
removed_turns = len(turns) - self.max_context_turns
turns = turns[-self.max_context_turns:] # keep the most recent turns
removed_count = len(turns) // 2
keep_count = len(turns) - removed_count
# Flush discarded turns to daily memory
if self.agent.memory_manager:
discarded_messages = []
for turn in turns[:removed_count]:
discarded_messages.extend(turn["messages"])
if discarded_messages:
user_id = getattr(self.agent, '_current_user_id', None)
self.agent.memory_manager.flush_memory(
messages=discarded_messages, user_id=user_id,
reason="trim", max_messages=0
)
turns = turns[-keep_count:]
logger.info(
f"💾 上下文轮次超限: {len(turns) + removed_turns} > {self.max_context_turns}"
f"移除最早的 {removed_turns}完整对话"
f"💾 上下文轮次超限: {keep_count + removed_count} > {self.max_context_turns}"
f"裁剪至 {keep_count} 轮(移除 {removed_count} 轮)"
)
# Step 3: Token limit - keep complete turns
@@ -1201,56 +1229,96 @@ class AgentStreamExecutor:
logger.info(f" 重建消息列表: {old_count} -> {len(self.messages)} 条消息")
return
# Token limit exceeded - keep complete turns from newest
# Token limit exceeded — tiered strategy based on turn count:
#
# Few turns (<5): Compress ALL turns to text-only (strip tool chains,
# keep user query + final reply). Never discard turns
# — losing even one is too painful when context is thin.
#
# Many turns (>=5): Directly discard the first half of turns.
# With enough turns the oldest ones are less
# critical, and keeping the recent half intact
# (with full tool chains) is more useful.
COMPRESS_THRESHOLD = 5
if len(turns) < COMPRESS_THRESHOLD:
# --- Few turns: compress ALL turns to text-only, never discard ---
compressed_turns = []
for t in turns:
compressed = compress_turn_to_text_only(t)
if compressed["messages"]:
compressed_turns.append(compressed)
new_messages = []
for turn in compressed_turns:
new_messages.extend(turn["messages"])
new_tokens = sum(self._estimate_turn_tokens(t) for t in compressed_turns)
old_count = len(self.messages)
self.messages = new_messages
logger.info(
f"📦 上下文tokens超限(轮次<{COMPRESS_THRESHOLD}): "
f"~{current_tokens + system_tokens} > {max_tokens}"
f"压缩全部 {len(turns)} 轮为纯文本 "
f"({old_count} -> {len(self.messages)} 条消息,"
f"~{current_tokens + system_tokens} -> ~{new_tokens + system_tokens} tokens)"
)
return
# --- Many turns (>=5): discard the older half, keep the newer half ---
removed_count = len(turns) // 2
keep_count = len(turns) - removed_count
kept_turns = turns[-keep_count:]
kept_tokens = sum(self._estimate_turn_tokens(t) for t in kept_turns)
logger.info(
f"🔄 上下文tokens超限: ~{current_tokens + system_tokens} > {max_tokens}"
f"将按完整轮次移除最早的对话"
f"裁剪至 {keep_count} 轮(移除 {removed_count} 轮)"
)
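# Starting from the newest turn, accumulate backwards (keeping turns complete)
kept_turns = []
accumulated_tokens = 0
min_turns = 3 # Try to keep at least 3 turns, but do not force it (to avoid exceeding the token limit)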
for i, turn in enumerate(reversed(turns)):
turn_tokens = self._estimate_turn_tokens(turn)
turns_from_end = i + 1
# Check whether the limit is exceeded
if accumulated_tokens + turn_tokens <= available_tokens:
kept_turns.insert(0, turn)
accumulated_tokens += turn_tokens
else:
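# Limit exceeded
# If too few turns have been kept and this is the last chance, try to keep this one
if len(kept_turns) < min_turns and turns_from_end <= min_turns:
# Check for a severe overflow (give up if it exceeds the limit by 20% or more)
overflow_ratio = (accumulated_tokens + turn_tokens - available_tokens) / available_tokens
if overflow_ratio < 0.2: # Allow at most a 20% overflow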
kept_turns.insert(0, turn)
accumulated_tokens += turn_tokens
logger.debug(f" 为保留最少轮次,允许超出 {overflow_ratio*100:.1f}%")
continue
# Stop keeping earlier turns
break
# Rebuild the message list
if self.agent.memory_manager:
discarded_messages = []
for turn in turns[:removed_count]:
discarded_messages.extend(turn["messages"])
if discarded_messages:
user_id = getattr(self.agent, '_current_user_id', None)
self.agent.memory_manager.flush_memory(
messages=discarded_messages, user_id=user_id,
reason="trim", max_messages=0
)
new_messages = []
for turn in kept_turns:
new_messages.extend(turn['messages'])
old_count = len(self.messages)
old_turn_count = len(turns)
self.messages = new_messages
new_count = len(self.messages)
new_turn_count = len(kept_turns)
if old_count > new_count:
logger.info(
f" 移除了 {old_turn_count - new_turn_count} 轮对话 "
f"({old_count} -> {new_count} 条消息,"
f"~{current_tokens + system_tokens} -> ~{accumulated_tokens + system_tokens} tokens)"
)
logger.info(
f" 移除了 {removed_count} 轮对话 "
f"({old_count} -> {len(self.messages)} 条消息,"
f"~{current_tokens + system_tokens} -> ~{kept_tokens + system_tokens} tokens)"
)
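The "drop the older half" arithmetic used in both the turn-limit and token-limit branches above can be checked in isolation. The following is a minimal sketch, not the code from this commit: `split_for_trim` and `choose_strategy` are hypothetical helper names, and only `COMPRESS_THRESHOLD = 5` and the slicing expressions are taken from the diff.

```python
from typing import Dict, List, Tuple

COMPRESS_THRESHOLD = 5  # same cut-off value as in the diff


def split_for_trim(turns: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Return (discarded, kept): drop the older half, keep the newer half.

    With an odd number of turns the kept half is the larger one,
    because removed_count is rounded down.
    """
    removed_count = len(turns) // 2
    keep_count = len(turns) - removed_count
    return turns[:removed_count], turns[-keep_count:]


def choose_strategy(turn_count: int) -> str:
    """Pick the token-overflow strategy for a given number of turns."""
    # Hypothetical labels; the diff branches inline instead of returning a name.
    return "compress" if turn_count < COMPRESS_THRESHOLD else "discard_half"


# Seven single-message turns: q0 .. q6
turns = [{"messages": [{"role": "user", "content": f"q{i}"}]} for i in range(7)]
discarded, kept = split_for_trim(turns)
print(len(discarded), len(kept))  # 3 4
```

In the diff, the discarded half is what gets passed to `flush_memory` before the message list is rebuilt from the kept half.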
def _clear_session_db(self):
"""
Clear the current session's persisted messages from SQLite DB.
This prevents dirty data (broken tool_use/tool_result pairs) from being
reloaded on the next request or after a restart.
"""
try:
session_id = getattr(self.agent, '_current_session_id', None)
if not session_id:
return
from agent.memory import get_conversation_store
store = get_conversation_store()
store.clear_session(session_id)
logger.info(f"🗑️ Cleared dirty session data from DB: {session_id}")
except Exception as e:
logger.warning(f"Failed to clear session DB: {e}")
def _prepare_messages(self) -> List[Dict[str, Any]]:
"""


@@ -0,0 +1,240 @@
"""
Message sanitizer — fix broken tool_use / tool_result pairs.
Provides two public helpers that can be reused across agent_stream.py
and any bot that converts messages to OpenAI format:
1. sanitize_claude_messages(messages)
Operates on the internal Claude-format message list (in-place).
2. drop_orphaned_tool_results_openai(messages)
Operates on an already-converted OpenAI-format message list,
returning a cleaned copy.
"""
from __future__ import annotations
from typing import Dict, List, Set
from common.log import logger
# ------------------------------------------------------------------ #
# Claude-format sanitizer (used by agent_stream)
# ------------------------------------------------------------------ #
def sanitize_claude_messages(messages: List[Dict]) -> int:
"""
Validate and fix a Claude-format message list **in-place**.
Fixes handled:
- Trailing assistant message with tool_use but no following tool_result
- Leading orphaned tool_result user messages
- Mid-list tool_result blocks whose tool_use_id has no matching
tool_use in any preceding assistant message
Returns the number of messages / blocks removed.
"""
if not messages:
return 0
removed = 0
# 1. Remove trailing incomplete tool_use assistant messages
while messages:
last = messages[-1]
if last.get("role") != "assistant":
break
content = last.get("content", [])
if isinstance(content, list) and any(
isinstance(b, dict) and b.get("type") == "tool_use"
for b in content
):
logger.warning("⚠️ Removing trailing incomplete tool_use assistant message")
messages.pop()
removed += 1
else:
break
# 2. Remove leading orphaned tool_result user messages
while messages:
first = messages[0]
if first.get("role") != "user":
break
content = first.get("content", [])
if isinstance(content, list) and _has_block_type(content, "tool_result") \
and not _has_block_type(content, "text"):
logger.warning("⚠️ Removing leading orphaned tool_result user message")
messages.pop(0)
removed += 1
else:
break
# 3. Iteratively remove unmatched tool_use / tool_result until stable.
# Removing one broken message can orphan others (e.g. an assistant msg
# with both matched and unmatched tool_use — deleting it orphans the
# previously-matched tool_result). Loop until clean.
for _ in range(5):
use_ids: Set[str] = set()
result_ids: Set[str] = set()
for msg in messages:
for block in (msg.get("content") or []):
if not isinstance(block, dict):
continue
if block.get("type") == "tool_use" and block.get("id"):
use_ids.add(block["id"])
elif block.get("type") == "tool_result" and block.get("tool_use_id"):
result_ids.add(block["tool_use_id"])
bad_use = use_ids - result_ids
bad_result = result_ids - use_ids
if not bad_use and not bad_result:
break
pass_removed = 0
i = 0
while i < len(messages):
msg = messages[i]
role = msg.get("role")
content = msg.get("content", [])
if not isinstance(content, list):
i += 1
continue
if role == "assistant" and bad_use and any(
isinstance(b, dict) and b.get("type") == "tool_use"
and b.get("id") in bad_use for b in content
):
logger.warning("⚠️ Removing assistant msg with unmatched tool_use")
messages.pop(i)
pass_removed += 1
continue
if role == "user" and bad_result and _has_block_type(content, "tool_result"):
has_bad = any(
isinstance(b, dict) and b.get("type") == "tool_result"
and b.get("tool_use_id") in bad_result for b in content
)
if has_bad:
if not _has_block_type(content, "text"):
logger.warning("⚠️ Removing user msg with unmatched tool_result")
messages.pop(i)
pass_removed += 1
continue
else:
before = len(content)
msg["content"] = [
b for b in content
if not (isinstance(b, dict) and b.get("type") == "tool_result"
and b.get("tool_use_id") in bad_result)
]
pass_removed += before - len(msg["content"])
i += 1
removed += pass_removed
if pass_removed == 0:
break
if removed:
logger.info(f"🔧 Message validation: removed {removed} broken message(s)")
return removed
# ------------------------------------------------------------------ #
# OpenAI-format sanitizer (used by minimax_bot, openai_compatible_bot)
# ------------------------------------------------------------------ #
def drop_orphaned_tool_results_openai(messages: List[Dict]) -> List[Dict]:
"""
Return a copy of *messages* (OpenAI format) with any ``role=tool``
messages removed if their ``tool_call_id`` does not match a
``tool_calls[].id`` in a preceding assistant message.
"""
known_ids: Set[str] = set()
cleaned: List[Dict] = []
for msg in messages:
if msg.get("role") == "assistant" and msg.get("tool_calls"):
for tc in msg["tool_calls"]:
tc_id = tc.get("id", "")
if tc_id:
known_ids.add(tc_id)
if msg.get("role") == "tool":
ref_id = msg.get("tool_call_id", "")
if ref_id and ref_id not in known_ids:
logger.warning(
f"[MessageSanitizer] Dropping orphaned tool result "
f"(tool_call_id={ref_id} not in known ids)"
)
continue
cleaned.append(msg)
return cleaned
# ------------------------------------------------------------------ #
# Internal helpers
# ------------------------------------------------------------------ #
def _has_block_type(content: list, block_type: str) -> bool:
return any(
isinstance(b, dict) and b.get("type") == block_type
for b in content
)
def _extract_text_from_content(content) -> str:
"""Extract plain text from a message content field (str or list of blocks)."""
if isinstance(content, str):
return content.strip()
if isinstance(content, list):
parts = [
b.get("text", "")
for b in content
if isinstance(b, dict) and b.get("type") == "text"
]
return "\n".join(p for p in parts if p).strip()
return ""
def compress_turn_to_text_only(turn: Dict) -> Dict:
"""
Compress a full turn (with tool_use/tool_result chains) into a lightweight
text-only turn that keeps only the first user text and the last assistant text.
This preserves the conversational context (what the user asked and what the
agent concluded) while stripping out the bulky intermediate tool interactions.
Returns a new turn dict with a ``messages`` list; the original is not mutated.
"""
user_text = ""
last_assistant_text = ""
for msg in turn["messages"]:
role = msg.get("role")
content = msg.get("content", [])
if role == "user":
if isinstance(content, list) and _has_block_type(content, "tool_result"):
continue
if not user_text:
user_text = _extract_text_from_content(content)
elif role == "assistant":
text = _extract_text_from_content(content)
if text:
last_assistant_text = text
compressed_messages = []
if user_text:
compressed_messages.append({
"role": "user",
"content": [{"type": "text", "text": user_text}]
})
if last_assistant_text:
compressed_messages.append({
"role": "assistant",
"content": [{"type": "text", "text": last_assistant_text}]
})
return {"messages": compressed_messages}
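Step 3 of `sanitize_claude_messages` relies on a set difference between `tool_use` ids and `tool_result` ids. The sketch below isolates just that pairing check so it can be run against a small history; `find_unmatched_ids` is a hypothetical helper name and the sample ids (`t1`, `t2`, `t9`) are made up for illustration.

```python
from typing import Dict, List, Set, Tuple


def find_unmatched_ids(messages: List[Dict]) -> Tuple[Set[str], Set[str]]:
    """Return (tool_use ids without a result, tool_result ids without a use)."""
    use_ids: Set[str] = set()
    result_ids: Set[str] = set()
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no tool blocks
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "tool_use" and block.get("id"):
                use_ids.add(block["id"])
            elif block.get("type") == "tool_result" and block.get("tool_use_id"):
                result_ids.add(block["tool_use_id"])
    return use_ids - result_ids, result_ids - use_ids


history = [
    {"role": "user", "content": [{"type": "text", "text": "list files"}]},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "t1", "name": "ls", "input": {}}]},
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "t1", "content": "a.txt"}]},
    # Orphaned result: no assistant message ever issued "t9"
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "t9", "content": "?"}]},
    # Dangling call: "t2" never received a result
    {"role": "assistant", "content": [{"type": "tool_use", "id": "t2", "name": "read", "input": {}}]},
]

bad_use, bad_result = find_unmatched_ids(history)
print(bad_use, bad_result)  # {'t2'} {'t9'}
```

The sanitizer in the diff repeats this check for up to five passes, since removing a message with an unmatched `tool_use` can orphan a `tool_result` that was matched before.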


@@ -123,13 +123,18 @@ def should_include_skill(
return False
# Check environment variables (API keys)
# Simple rule: All required env vars must be set
# All required env vars must be set
required_env = metadata.requires.get('env', [])
if required_env:
for env_name in required_env:
if not has_env_var(env_name):
# Missing required API key → disable skill
return False
# Check anyEnv (at least one must be present)
any_env = metadata.requires.get('anyEnv', [])
if any_env:
if not any(has_env_var(e) for e in any_env):
return False
return True
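The combined rule after this change is "every name in `env` must be set, and at least one name in `anyEnv` must be set when that list is non-empty". A minimal sketch of that rule follows. `env_requirements_met` is a hypothetical name, and treating an empty or whitespace-only value as "not set" is an assumption, since `has_env_var` is not shown in this diff.

```python
from typing import List, Mapping


def env_requirements_met(
    requires: Mapping[str, List[str]],
    environ: Mapping[str, str],
) -> bool:
    """Evaluate the `env` (all-of) and `anyEnv` (any-of) requirements."""

    def has(name: str) -> bool:
        # Assumption: a variable counts as set only if its value is non-blank.
        return bool(environ.get(name, "").strip())

    # All-of: a single missing variable disables the skill.
    if not all(has(name) for name in requires.get("env", [])):
        return False

    # Any-of: only enforced when the list is non-empty.
    any_env = requires.get("anyEnv", [])
    if any_env and not any(has(name) for name in any_env):
        return False

    return True


requires = {"env": ["SEARCH_API_KEY"], "anyEnv": ["OPENAI_API_KEY", "LINKAI_API_KEY"]}
print(env_requirements_met(requires, {"SEARCH_API_KEY": "k", "LINKAI_API_KEY": "k"}))  # True
print(env_requirements_met(requires, {"SEARCH_API_KEY": "k"}))  # False
```

`SEARCH_API_KEY` is an illustrative variable name only.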


@@ -32,6 +32,7 @@ def format_skills_for_prompt(skills: List[Skill]) -> str:
lines.append(f" <name>{_escape_xml(skill.name)}</name>")
lines.append(f" <description>{_escape_xml(skill.description)}</description>")
lines.append(f" <location>{_escape_xml(skill.file_path)}</location>")
lines.append(f" <base_dir>{_escape_xml(skill.base_dir)}</base_dir>")
lines.append(" </skill>")
lines.append("</available_skills>")


@@ -95,11 +95,14 @@ class SkillManager:
for name, entry in self.skills.items():
skill = entry.skill
prev = saved.get(name, {})
# category priority: persisted config (set by cloud) > default "skill"
category = prev.get("category", "skill")
merged[name] = {
"name": name,
"description": skill.description,
"source": skill.source,
"enabled": prev.get("enabled", True),
"category": category,
}
self.skills_config = merged


@@ -8,6 +8,8 @@ other management entry point.
import os
import shutil
import zipfile
import tempfile
from typing import Dict, List, Optional
from common.log import logger
from agent.skills.types import Skill, SkillEntry
@@ -55,7 +57,9 @@ class SkillService:
"""
Add (install) a skill from a remote payload.
The payload follows the socket protocol::
Supported payload types:
1. ``type: "url"``: download individual files::
{
"name": "web_search",
@@ -67,8 +71,15 @@ class SkillService:
]
}
Files are downloaded and saved under the custom skills directory
using *name* as the sub-directory.
2. ``type: "package"``: download a zip archive and extract it::
{
"name": "plugin-custom-tool",
"type": "package",
"category": "skills",
"enabled": true,
"files": [{"url": "https://cdn.example.com/skills/custom-tool.zip"}]
}
:param payload: skill add payload from server
"""
@@ -76,25 +87,95 @@ class SkillService:
if not name:
raise ValueError("skill name is required")
payload_type = payload.get("type", "url")
if payload_type == "package":
self._add_package(name, payload)
else:
self._add_url(name, payload)
self.manager.refresh_skills()
category = payload.get("category")
if category and name in self.manager.skills_config:
self.manager.skills_config[name]["category"] = category
self.manager._save_skills_config()
def _add_url(self, name: str, payload: dict) -> None:
"""Install a skill by downloading individual files."""
files = payload.get("files", [])
if not files:
raise ValueError("skill files list is empty")
skill_dir = os.path.join(self.manager.custom_dir, name)
os.makedirs(skill_dir, exist_ok=True)
for file_info in files:
url = file_info.get("url")
rel_path = file_info.get("path")
if not url or not rel_path:
logger.warning(f"[SkillService] add: skip invalid file entry {file_info}")
continue
dest = os.path.join(skill_dir, rel_path)
self._download_file(url, dest)
tmp_dir = skill_dir + ".tmp"
if os.path.exists(tmp_dir):
shutil.rmtree(tmp_dir)
os.makedirs(tmp_dir, exist_ok=True)
# Reload to pick up the new skill and sync config
self.manager.refresh_skills()
logger.info(f"[SkillService] add: skill '{name}' installed ({len(files)} files)")
try:
for file_info in files:
url = file_info.get("url")
rel_path = file_info.get("path")
if not url or not rel_path:
logger.warning(f"[SkillService] add: skip invalid file entry {file_info}")
continue
dest = os.path.join(tmp_dir, rel_path)
self._download_file(url, dest)
except Exception:
shutil.rmtree(tmp_dir, ignore_errors=True)
raise
if os.path.exists(skill_dir):
shutil.rmtree(skill_dir)
os.rename(tmp_dir, skill_dir)
logger.info(f"[SkillService] add: skill '{name}' installed via url ({len(files)} files)")
def _add_package(self, name: str, payload: dict) -> None:
"""
Install a skill by downloading a zip archive and extracting it.
If the archive contains a single top-level directory, that directory
is used as the skill folder directly; otherwise a new directory named
after the skill is created to hold the extracted contents.
"""
files = payload.get("files", [])
if not files or not files[0].get("url"):
raise ValueError("package url is required")
url = files[0]["url"]
skill_dir = os.path.join(self.manager.custom_dir, name)
with tempfile.TemporaryDirectory() as tmp_dir:
zip_path = os.path.join(tmp_dir, "package.zip")
self._download_file(url, zip_path)
if not zipfile.is_zipfile(zip_path):
raise ValueError(f"downloaded file is not a valid zip archive: {url}")
extract_dir = os.path.join(tmp_dir, "extracted")
with zipfile.ZipFile(zip_path, "r") as zf:
zf.extractall(extract_dir)
# Determine the actual content root.
# If the zip has a single top-level directory, use its contents
# so the skill folder is clean (no extra nesting).
top_items = [
item for item in os.listdir(extract_dir)
if not item.startswith(".")
]
if len(top_items) == 1:
single = os.path.join(extract_dir, top_items[0])
if os.path.isdir(single):
extract_dir = single
if os.path.exists(skill_dir):
shutil.rmtree(skill_dir)
shutil.copytree(extract_dir, skill_dir)
logger.info(f"[SkillService] add: skill '{name}' installed via package ({url})")
# ------------------------------------------------------------------
# open / close (enable / disable)
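The "single top-level directory" rule in `_add_package` can be tried out against a throwaway archive. This is a minimal sketch under stated assumptions: `find_content_root` and `demo` are hypothetical helper names, and the archive layout (`custom-tool/SKILL.md`, `custom-tool/scripts/run.py`) is invented for the example. Only standard-library calls are used.

```python
import os
import tempfile
import zipfile


def find_content_root(extract_dir: str) -> str:
    """Return the directory whose contents should become the skill folder."""
    top_items = [item for item in os.listdir(extract_dir) if not item.startswith(".")]
    if len(top_items) == 1:
        single = os.path.join(extract_dir, top_items[0])
        if os.path.isdir(single):
            return single  # unwrap the lone wrapper directory
    return extract_dir


def demo() -> list:
    """Build a zip with one wrapper directory, extract it, list the content root."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "package.zip")
        with zipfile.ZipFile(zip_path, "w") as zf:
            zf.writestr("custom-tool/SKILL.md", "# demo")
            zf.writestr("custom-tool/scripts/run.py", "print('hi')")
        extract_dir = os.path.join(tmp, "extracted")
        with zipfile.ZipFile(zip_path, "r") as zf:
            zf.extractall(extract_dir)
        return sorted(os.listdir(find_content_root(extract_dir)))


print(demo())  # ['SKILL.md', 'scripts']
```

Hidden entries such as `.DS_Store` are ignored when counting top-level items, which matches the `startswith(".")` filter in the diff.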


@@ -55,6 +55,24 @@ def _import_optional_tools():
except Exception as e:
logger.error(f"[Tools] WebSearch failed to load: {e}")
# WebFetch Tool
try:
from agent.tools.web_fetch.web_fetch import WebFetch
tools['WebFetch'] = WebFetch
except ImportError as e:
logger.error(f"[Tools] WebFetch not loaded - missing dependency: {e}")
except Exception as e:
logger.error(f"[Tools] WebFetch failed to load: {e}")
# Vision Tool (imported whenever its dependencies are present; API key availability is checked separately by Vision.is_available())
try:
from agent.tools.vision.vision import Vision
tools['Vision'] = Vision
except ImportError as e:
logger.error(f"[Tools] Vision not loaded - missing dependency: {e}")
except Exception as e:
logger.error(f"[Tools] Vision failed to load: {e}")
return tools
# Load optional tools
@@ -62,6 +80,8 @@ _optional_tools = _import_optional_tools()
EnvConfig = _optional_tools.get('EnvConfig')
SchedulerTool = _optional_tools.get('SchedulerTool')
WebSearch = _optional_tools.get('WebSearch')
WebFetch = _optional_tools.get('WebFetch')
Vision = _optional_tools.get('Vision')
GoogleSearch = _optional_tools.get('GoogleSearch')
FileSave = _optional_tools.get('FileSave')
Terminal = _optional_tools.get('Terminal')
@@ -102,6 +122,8 @@ __all__ = [
'EnvConfig',
'SchedulerTool',
'WebSearch',
'WebFetch',
'Vision',
# Optional tools (may be None if dependencies not available)
# 'BrowserTool'
]


@@ -3,6 +3,7 @@ Bash tool - Execute bash commands
"""
import os
import re
import sys
import subprocess
import tempfile
@@ -83,12 +84,13 @@ SAFETY:
# Load environment variables from ~/.cow/.env if it exists
env_file = expand_path("~/.cow/.env")
dotenv_vars = {}
if os.path.exists(env_file):
try:
from dotenv import dotenv_values
env_vars = dotenv_values(env_file)
env.update(env_vars)
logger.debug(f"[Bash] Loaded {len(env_vars)} variables from {env_file}")
dotenv_vars = dotenv_values(env_file)
env.update(dotenv_vars)
logger.debug(f"[Bash] Loaded {len(dotenv_vars)} variables from {env_file}")
except ImportError:
logger.debug("[Bash] python-dotenv not installed, skipping .env loading")
except Exception as e:
@@ -100,6 +102,13 @@ SAFETY:
else:
logger.debug(f"[Bash] Process User: {os.environ.get('USERNAME', os.environ.get('USER', 'unknown'))}")
# On Windows, convert $VAR references to %VAR% for cmd.exe
if sys.platform == "win32":
env["PYTHONIOENCODING"] = "utf-8"
command = self._convert_env_vars_for_windows(command, dotenv_vars)
if command and not command.strip().lower().startswith("chcp"):
command = f"chcp 65001 >nul 2>&1 && {command}"
# Execute command with inherited environment variables
result = subprocess.run(
command,
@@ -108,6 +117,8 @@ SAFETY:
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
encoding="utf-8",
errors="replace",
timeout=timeout,
env=env
)
@@ -131,6 +142,8 @@ SAFETY:
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
encoding="utf-8",
errors="replace",
timeout=timeout,
env=env
)
@@ -258,3 +271,21 @@ SAFETY:
return "This command will recursively delete system directories"
return "" # No warning needed
@staticmethod
def _convert_env_vars_for_windows(command: str, dotenv_vars: dict) -> str:
"""
Convert bash-style $VAR / ${VAR} references to cmd.exe %VAR% syntax.
Only converts variables loaded from .env (user-configured API keys etc.)
to avoid breaking $PATH, jq expressions, regex, etc.
"""
if not dotenv_vars:
return command
def replace_match(m):
var_name = m.group(1) or m.group(2)
if var_name in dotenv_vars:
return f"%{var_name}%"
return m.group(0)
return re.sub(r'\$\{(\w+)\}|\$(\w+)', replace_match, command)
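The substitution can be exercised outside the tool class to see which references are rewritten. The sketch below copies the regular expression from the diff into a free function (the name `convert_env_vars_for_windows` without the leading underscore, the sample command, and the variable names `API_KEY` and `BASE_URL` are illustrative assumptions).

```python
import re


def convert_env_vars_for_windows(command: str, dotenv_vars: dict) -> str:
    """Rewrite $VAR / ${VAR} to %VAR%, but only for names found in dotenv_vars."""
    if not dotenv_vars:
        return command

    def replace_match(m: "re.Match") -> str:
        var_name = m.group(1) or m.group(2)  # group 1: ${VAR}, group 2: $VAR
        if var_name in dotenv_vars:
            return f"%{var_name}%"
        return m.group(0)  # leave unknown names (e.g. $PATH) untouched

    return re.sub(r'\$\{(\w+)\}|\$(\w+)', replace_match, command)


cmd = 'curl -H "Authorization: Bearer $API_KEY" ${BASE_URL}/v1 && echo $PATH'
print(convert_env_vars_for_windows(cmd, {"API_KEY": "x", "BASE_URL": "y"}))
# curl -H "Authorization: Bearer %API_KEY%" %BASE_URL%/v1 && echo $PATH
```

Because `\w+` is greedy, a reference such as `$API_KEY_OLD` is looked up under its full name and left alone unless that exact name is in the `.env` file.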


@@ -94,7 +94,7 @@ class Ls(BaseTool):
results.append(entry + '/')
else:
results.append(entry)
except:
except Exception:
# Skip entries we can't stat
continue


@@ -240,8 +240,8 @@ class Read(BaseTool):
"message": f"文件过大 ({format_size(file_size)} > 50MB),无法读取内容。文件路径: {absolute_path}"
})
# Read file
with open(absolute_path, 'r', encoding='utf-8') as f:
# Read file (utf-8-sig strips BOM automatically on Windows)
with open(absolute_path, 'r', encoding='utf-8-sig') as f:
content = f.read()
# Truncate content if too long (20K characters max for model context)


@@ -134,12 +134,13 @@ def _execute_agent_task(task: dict, agent_bridge):
elif channel_type == "dingtalk":
# DingTalk requires msg object, set to None for scheduled tasks
context["msg"] = None
# For one-on-one chats, sender_staff_id must be passed along
if not is_group:
sender_staff_id = action.get("dingtalk_sender_staff_id")
if sender_staff_id:
context["dingtalk_sender_staff_id"] = sender_staff_id
elif channel_type == "wecom_bot":
context["msg"] = None
# Use Agent to execute the task
# Mark this as a scheduled task execution to prevent recursive task creation
context["is_scheduled_task"] = True
@@ -234,7 +235,9 @@ def _execute_send_message(task: dict, agent_bridge):
logger.debug(f"[Scheduler] DingTalk single chat: sender_staff_id={sender_staff_id}")
else:
logger.warning(f"[Scheduler] Task {task['id']}: DingTalk single chat message missing sender_staff_id")
elif channel_type == "wecom_bot":
context["msg"] = None
# Create reply
reply = Reply(ReplyType.TEXT, content)
@@ -327,31 +330,31 @@ def _execute_tool_call(task: dict, agent_bridge):
context["request_id"] = request_id
logger.debug(f"[Scheduler] Generated request_id for web channel: {request_id}")
elif channel_type == "feishu":
# Feishu channel: for scheduled tasks, send as new message (no msg_id to reply to)
context["receive_id_type"] = "chat_id" if is_group else "open_id"
context["msg"] = None
logger.debug(f"[Scheduler] Feishu: receive_id_type={context['receive_id_type']}, is_group={is_group}, receiver={receiver}")
elif channel_type == "wecom_bot":
context["msg"] = None
reply = Reply(ReplyType.TEXT, content)
# Get channel and send
from channel.channel_factory import create_channel
try:
channel = create_channel(channel_type)
if channel:
# For web channel, register the request_id to session mapping
if channel_type == "web" and hasattr(channel, 'request_to_session'):
channel.request_to_session[request_id] = receiver
logger.debug(f"[Scheduler] Registered request_id {request_id} -> session {receiver}")
channel.send(reply, context)
logger.info(f"[Scheduler] Task {task['id']} executed: sent tool result to {receiver}")
else:
logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
except Exception as e:
logger.error(f"[Scheduler] Failed to send tool result: {e}")
except Exception as e:
logger.error(f"[Scheduler] Error in _execute_tool_call: {e}")
@@ -409,7 +412,9 @@ def _execute_skill_call(task: dict, agent_bridge):
elif channel_type == "feishu":
context["receive_id_type"] = "chat_id" if is_group else "open_id"
context["msg"] = None
elif channel_type == "wecom_bot":
context["msg"] = None
# Use Agent to execute the skill
try:
# Don't clear history - scheduler tasks use isolated session_id so they won't pollute user conversations
@@ -451,8 +456,7 @@ def attach_scheduler_to_tool(tool, context: Context = None):
if context:
tool.current_context = context
# Also set channel_type from config
channel_type = conf().get("channel_type", "unknown")
channel_type = context.get("channel_type") or conf().get("channel_type", "unknown")
if not tool.config:
tool.config = {}
tool.config["channel_type"] = channel_type


@@ -147,7 +147,7 @@ class SchedulerService:
return False
return now >= next_run
except:
except Exception:
return False
def _calculate_next_run(self, task: dict, from_time: datetime) -> Optional[datetime]:
@@ -195,7 +195,7 @@ class SchedulerService:
# Only return if in the future
if run_at > from_time:
return run_at
except:
except Exception:
pass
return None


@@ -424,7 +424,7 @@ class SchedulerTool(BaseTool):
try:
dt = datetime.fromisoformat(run_at)
return f"一次性 ({dt.strftime('%Y-%m-%d %H:%M')})"
except:
except Exception:
return "一次性"
return "未知"
@@ -438,6 +438,6 @@ class SchedulerTool(BaseTool):
return msg.other_user_nickname or "群聊"
else:
return msg.from_user_nickname or "用户"
except:
except Exception:
pass
return "未知"


@@ -72,7 +72,7 @@ class TaskStore:
with open(self.store_path, 'r') as src:
with open(backup_path, 'w') as dst:
dst.write(src.read())
except:
except Exception:
pass
# Save tasks


@@ -14,14 +14,14 @@ class Send(BaseTool):
"""Tool for sending files to the user"""
name: str = "send"
description: str = "Send a file (image, video, audio, document) to the user. Use this when the user explicitly asks to send/share a file."
description: str = "Send a LOCAL file (image, video, audio, document) to the user. Only for local file paths. Do NOT use this for URLs — URLs should be included directly in your text reply, the system will handle them automatically."
params: dict = {
"type": "object",
"properties": {
"path": {
"type": "string",
"description": "Path to the file to send. Can be absolute path or relative to workspace."
"description": "Local file path to send. Must be an absolute path or relative to workspace. Do NOT pass URLs here."
},
"message": {
"type": "string",


@@ -0,0 +1 @@
from agent.tools.vision.vision import Vision


@@ -0,0 +1,255 @@
"""
Vision tool - Analyze images using OpenAI-compatible Vision API.
Supports local files (auto base64-encoded) and HTTP URLs.
Providers: OpenAI (preferred) > LinkAI (fallback).
"""
import base64
import os
import subprocess
import tempfile
from typing import Any, Dict, Optional, Tuple
import requests
from agent.tools.base_tool import BaseTool, ToolResult
from common.log import logger
from config import conf
DEFAULT_MODEL = "gpt-4.1-mini"
DEFAULT_TIMEOUT = 60
MAX_TOKENS = 1000
COMPRESS_THRESHOLD = 1_048_576 # 1 MB
SUPPORTED_EXTENSIONS = {
"jpg": "image/jpeg",
"jpeg": "image/jpeg",
"png": "image/png",
"gif": "image/gif",
"webp": "image/webp",
}
class Vision(BaseTool):
"""Analyze images using OpenAI-compatible Vision API"""
name: str = "vision"
description: str = (
"Analyze an image (local file or URL) using Vision API. "
"Can describe content, extract text, identify objects, colors, etc. "
"Requires OPENAI_API_KEY or LINKAI_API_KEY."
)
params: dict = {
"type": "object",
"properties": {
"image": {
"type": "string",
"description": "Local file path or HTTP(S) URL of the image to analyze",
},
"question": {
"type": "string",
"description": "Question to ask about the image",
},
"model": {
"type": "string",
"description": (
f"Vision model to use (default: {DEFAULT_MODEL}). "
"Options: gpt-4.1-mini, gpt-4.1, gpt-4o-mini, gpt-4o"
),
},
},
"required": ["image", "question"],
}
def __init__(self, config: dict = None):
self.config = config or {}
@staticmethod
def is_available() -> bool:
return bool(
conf().get("open_ai_api_key") or os.environ.get("OPENAI_API_KEY")
or conf().get("linkai_api_key") or os.environ.get("LINKAI_API_KEY")
)
def execute(self, args: Dict[str, Any]) -> ToolResult:
image = args.get("image", "").strip()
question = args.get("question", "").strip()
model = args.get("model", DEFAULT_MODEL).strip() or DEFAULT_MODEL
if not image:
return ToolResult.fail("Error: 'image' parameter is required")
if not question:
return ToolResult.fail("Error: 'question' parameter is required")
api_key, api_base = self._resolve_provider()
if not api_key:
return ToolResult.fail(
"Error: No API key configured for Vision.\n"
"Please configure one of the following using env_config tool:\n"
" 1. OPENAI_API_KEY (preferred): env_config(action=\"set\", key=\"OPENAI_API_KEY\", value=\"your-key\")\n"
" 2. LINKAI_API_KEY (fallback): env_config(action=\"set\", key=\"LINKAI_API_KEY\", value=\"your-key\")\n\n"
"Get your key at: https://platform.openai.com/api-keys or https://link-ai.tech"
)
try:
image_content = self._build_image_content(image)
except Exception as e:
return ToolResult.fail(f"Error: {e}")
try:
return self._call_api(api_key, api_base, model, question, image_content)
except requests.Timeout:
return ToolResult.fail(f"Error: Vision API request timed out after {DEFAULT_TIMEOUT}s")
except requests.ConnectionError:
return ToolResult.fail("Error: Failed to connect to Vision API")
except Exception as e:
logger.error(f"[Vision] Unexpected error: {e}", exc_info=True)
return ToolResult.fail(f"Error: Vision API call failed - {e}")
def _resolve_provider(self) -> Tuple[Optional[str], str]:
"""Resolve API key and base URL. Priority: conf() > env vars."""
api_key = conf().get("open_ai_api_key") or os.environ.get("OPENAI_API_KEY")
if api_key:
api_base = (conf().get("open_ai_api_base") or os.environ.get("OPENAI_API_BASE", "")).rstrip("/") \
or "https://api.openai.com/v1"
return api_key, self._ensure_v1(api_base)
api_key = conf().get("linkai_api_key") or os.environ.get("LINKAI_API_KEY")
if api_key:
api_base = (conf().get("linkai_api_base") or os.environ.get("LINKAI_API_BASE", "")).rstrip("/") \
or "https://api.link-ai.tech"
logger.debug("[Vision] Using LinkAI API (OPENAI_API_KEY not set)")
return api_key, self._ensure_v1(api_base)
return None, ""
@staticmethod
def _ensure_v1(api_base: str) -> str:
"""Append /v1 if the base URL doesn't already end with a versioned path."""
if not api_base:
return api_base
# Already has /v1 or similar version suffix
if api_base.rstrip("/").split("/")[-1].startswith("v"):
return api_base
return api_base.rstrip("/") + "/v1"
def _build_image_content(self, image: str) -> dict:
"""Build the image_url content block for the API request."""
if image.startswith(("http://", "https://")):
return {"type": "image_url", "image_url": {"url": image}}
if not os.path.isfile(image):
raise FileNotFoundError(f"Image file not found: {image}")
ext = image.rsplit(".", 1)[-1].lower() if "." in image else ""
mime_type = SUPPORTED_EXTENSIONS.get(ext)
if not mime_type:
raise ValueError(
f"Unsupported image format '.{ext}'. "
f"Supported: {', '.join(SUPPORTED_EXTENSIONS.keys())}"
)
file_path = self._maybe_compress(image)
try:
with open(file_path, "rb") as f:
b64 = base64.b64encode(f.read()).decode("ascii")
finally:
if file_path != image and os.path.exists(file_path):
os.remove(file_path)
data_url = f"data:{mime_type};base64,{b64}"
return {"type": "image_url", "image_url": {"url": data_url}}
@staticmethod
def _maybe_compress(path: str) -> str:
"""Compress image if larger than threshold; return path to use."""
file_size = os.path.getsize(path)
if file_size <= COMPRESS_THRESHOLD:
return path
tmp = tempfile.NamedTemporaryFile(suffix=".jpg", delete=False)
tmp.close()
try:
# macOS: use sips
subprocess.run(
["sips", "-Z", "800", path, "--out", tmp.name],
capture_output=True, check=True,
)
logger.debug(f"[Vision] Compressed image ({file_size // 1024}KB -> {os.path.getsize(tmp.name) // 1024}KB)")
return tmp.name
except (FileNotFoundError, subprocess.CalledProcessError):
pass
try:
# Linux: use ImageMagick convert
subprocess.run(
["convert", path, "-resize", "800x800>", tmp.name],
capture_output=True, check=True,
)
logger.debug(f"[Vision] Compressed image ({file_size // 1024}KB -> {os.path.getsize(tmp.name) // 1024}KB)")
return tmp.name
except (FileNotFoundError, subprocess.CalledProcessError):
pass
os.remove(tmp.name)
return path
def _call_api(self, api_key: str, api_base: str, model: str,
question: str, image_content: dict) -> ToolResult:
payload = {
"model": model,
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": question},
image_content,
],
}
],
"max_tokens": MAX_TOKENS,
}
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
}
resp = requests.post(
f"{api_base}/chat/completions",
headers=headers,
json=payload,
timeout=DEFAULT_TIMEOUT,
)
if resp.status_code == 401:
return ToolResult.fail("Error: Invalid API key. Please check your configuration.")
if resp.status_code == 429:
return ToolResult.fail("Error: API rate limit reached. Please try again later.")
if resp.status_code != 200:
return ToolResult.fail(f"Error: Vision API returned HTTP {resp.status_code}: {resp.text[:200]}")
data = resp.json()
if "error" in data:
msg = data["error"].get("message", "Unknown API error")
return ToolResult.fail(f"Error: Vision API error - {msg}")
content = ""
choices = data.get("choices", [])
if choices:
content = choices[0].get("message", {}).get("content", "")
usage = data.get("usage", {})
result = {
"model": model,
"content": content,
"usage": {
"prompt_tokens": usage.get("prompt_tokens", 0),
"completion_tokens": usage.get("completion_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
return ToolResult.success(result)
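The `_ensure_v1` helper above treats any final path segment that starts with "v" as a version, so a base URL ending in something like `/vision` would be returned without `/v1`. A minimal sketch of a stricter check follows, assuming version segments look like `v1`, `v2` or `v1beta`. The name `ensure_v1` and the regular expression are illustrative and not part of this commit; unlike the original, the sketch also strips a trailing slash from an already versioned URL.

```python
import re

# Assumption: a version segment is "v" + digits, optionally followed by letters/digits.
_VERSION_SEGMENT = re.compile(r"v\d+[a-z0-9]*", re.IGNORECASE)


def ensure_v1(api_base: str) -> str:
    """Append /v1 unless the last path segment already looks like a version."""
    if not api_base:
        return api_base
    trimmed = api_base.rstrip("/")
    last_segment = trimmed.split("/")[-1]
    if _VERSION_SEGMENT.fullmatch(last_segment):
        return trimmed
    return trimmed + "/v1"


if __name__ == "__main__":
    for base in ("https://api.example.com", "https://api.example.com/v1/",
                 "https://api.example.com/vision"):
        print(base, "->", ensure_v1(base))
```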

@@ -0,0 +1,444 @@
"""
Web Fetch tool - Fetch and extract readable content from web pages and remote files.
Supports:
- HTML web pages: extracts readable text content
- Document files (PDF, Word, TXT, Markdown, etc.): downloads to workspace/tmp and parses content
"""
import os
import re
import uuid
from typing import Dict, Any, Optional, Set
from urllib.parse import urlparse, unquote
import requests
from agent.tools.base_tool import BaseTool, ToolResult
from agent.tools.utils.truncate import truncate_head, format_size
from common.log import logger
DEFAULT_TIMEOUT = 30
MAX_FILE_SIZE = 50 * 1024 * 1024 # 50MB
DEFAULT_HEADERS = {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
"Accept": "*/*",
}
# Supported document file extensions
PDF_SUFFIXES: Set[str] = {".pdf"}
WORD_SUFFIXES: Set[str] = {".docx"}
TEXT_SUFFIXES: Set[str] = {".txt", ".md", ".markdown", ".rst", ".csv", ".tsv", ".log"}
SPREADSHEET_SUFFIXES: Set[str] = {".xls", ".xlsx"}
PPT_SUFFIXES: Set[str] = {".ppt", ".pptx"}
ALL_DOC_SUFFIXES = PDF_SUFFIXES | WORD_SUFFIXES | TEXT_SUFFIXES | SPREADSHEET_SUFFIXES | PPT_SUFFIXES
_CHARSET_RE = re.compile(r'charset\s*=\s*["\']?\s*([\w\-]+)', re.IGNORECASE)
_META_CHARSET_RE = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?\s*([\w\-]+)', re.IGNORECASE)
_META_HTTP_EQUIV_RE = re.compile(
rb'<meta[^>]+http-equiv\s*=\s*["\']?Content-Type["\']?[^>]+content\s*=\s*["\'][^"\']*charset=([\w\-]+)',
re.IGNORECASE,
)
def _extract_charset_from_content_type(content_type: str) -> Optional[str]:
"""Extract charset from Content-Type header value."""
m = _CHARSET_RE.search(content_type)
return m.group(1) if m else None
def _extract_charset_from_html_meta(raw_bytes: bytes) -> Optional[str]:
"""Extract charset from HTML <meta> tags in the first few KB of raw bytes."""
m = _META_CHARSET_RE.search(raw_bytes)
if m:
return m.group(1).decode("ascii", errors="ignore")
m = _META_HTTP_EQUIV_RE.search(raw_bytes)
if m:
return m.group(1).decode("ascii", errors="ignore")
return None
def _get_url_suffix(url: str) -> str:
"""Extract file extension from URL path, ignoring query params."""
path = urlparse(url).path
return os.path.splitext(path)[-1].lower()
def _is_document_url(url: str) -> bool:
"""Check if URL points to a downloadable document file."""
suffix = _get_url_suffix(url)
return suffix in ALL_DOC_SUFFIXES
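The charset helpers above can be exercised on their own. The sketch below copies the two regular expressions from the diff and adds a hypothetical `pick_charset` wrapper (not in the commit) that applies the same priority: header first, then `<meta>` tag, then a default. It omits the `apparent_encoding` step, which needs a real `requests.Response`.

```python
import re
from typing import Optional

# Same patterns as in the diff above.
CHARSET_RE = re.compile(r'charset\s*=\s*["\']?\s*([\w\-]+)', re.IGNORECASE)
META_CHARSET_RE = re.compile(rb'<meta[^>]+charset\s*=\s*["\']?\s*([\w\-]+)', re.IGNORECASE)


def charset_from_header(content_type: str) -> Optional[str]:
    """Return the charset named in a Content-Type header value, if any."""
    m = CHARSET_RE.search(content_type)
    return m.group(1) if m else None


def charset_from_meta(raw: bytes) -> Optional[str]:
    """Return the charset declared in a <meta> tag within the first 4 KB."""
    m = META_CHARSET_RE.search(raw[:4096])
    return m.group(1).decode("ascii", errors="ignore") if m else None


def pick_charset(content_type: str, body: bytes, default: str = "utf-8") -> str:
    """Hypothetical wrapper: header wins over <meta>, which wins over the default."""
    return charset_from_header(content_type) or charset_from_meta(body) or default


if __name__ == "__main__":
    print(pick_charset("text/html", b'<head><meta charset="gb2312"></head>'))
```

Note that the first `<meta>` pattern already matches the `http-equiv` form as well, since `[^>]+` can span the `content="text/html; ` prefix before `charset=`.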
class WebFetch(BaseTool):
"""Tool for fetching web pages and remote document files"""
name: str = "web_fetch"
description: str = (
"Fetch content from a URL. For web pages, extracts readable text. "
"For document files (PDF, Word, TXT, Markdown, Excel, PPT), downloads and parses the file content. "
"Supported file types: .pdf, .docx, .txt, .md, .csv, .xls, .xlsx, .ppt, .pptx"
)
params: dict = {
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "The HTTP/HTTPS URL to fetch (web page or document file link)"
}
},
"required": ["url"]
}
def __init__(self, config: dict = None):
self.config = config or {}
self.cwd = self.config.get("cwd", os.getcwd())
def execute(self, args: Dict[str, Any]) -> ToolResult:
url = args.get("url", "").strip()
if not url:
return ToolResult.fail("Error: 'url' parameter is required")
parsed = urlparse(url)
if parsed.scheme not in ("http", "https"):
return ToolResult.fail("Error: Invalid URL (must start with http:// or https://)")
if _is_document_url(url):
return self._fetch_document(url)
return self._fetch_webpage(url)
# ---- Web page fetching ----
def _fetch_webpage(self, url: str) -> ToolResult:
"""Fetch and extract readable text from an HTML web page."""
parsed = urlparse(url)
try:
response = requests.get(
url,
headers=DEFAULT_HEADERS,
timeout=DEFAULT_TIMEOUT,
allow_redirects=True,
)
response.raise_for_status()
except requests.Timeout:
return ToolResult.fail(f"Error: Request timed out after {DEFAULT_TIMEOUT}s")
except requests.ConnectionError:
return ToolResult.fail(f"Error: Failed to connect to {parsed.netloc}")
except requests.HTTPError as e:
return ToolResult.fail(f"Error: HTTP {e.response.status_code} for URL: {url}")
except Exception as e:
return ToolResult.fail(f"Error: Failed to fetch URL: {e}")
content_type = response.headers.get("Content-Type", "")
if self._is_binary_content_type(content_type) and not _is_document_url(url):
return self._handle_download_by_content_type(url, response, content_type)
response.encoding = self._detect_encoding(response)
html = response.text
title = self._extract_title(html)
text = self._extract_text(html)
return ToolResult.success(f"Title: {title}\n\nContent:\n{text}")
# ---- Document fetching ----
def _fetch_document(self, url: str) -> ToolResult:
"""Download a document file and extract its text content."""
suffix = _get_url_suffix(url)
parsed = urlparse(url)
filename = self._extract_filename(url)
tmp_dir = self._ensure_tmp_dir()
local_path = os.path.join(tmp_dir, filename)
logger.info(f"[WebFetch] Downloading document: {url} -> {local_path}")
try:
response = requests.get(
url,
headers=DEFAULT_HEADERS,
timeout=DEFAULT_TIMEOUT,
stream=True,
allow_redirects=True,
)
response.raise_for_status()
content_length = int(response.headers.get("Content-Length", 0))
if content_length > MAX_FILE_SIZE:
return ToolResult.fail(
f"Error: File too large ({format_size(content_length)} > {format_size(MAX_FILE_SIZE)})"
)
downloaded = 0
with open(local_path, "wb") as f:
for chunk in response.iter_content(chunk_size=8192):
downloaded += len(chunk)
if downloaded > MAX_FILE_SIZE:
f.close()
os.remove(local_path)
return ToolResult.fail(
f"Error: File too large (>{format_size(MAX_FILE_SIZE)}), download aborted"
)
f.write(chunk)
# A timeout or dropped connection can occur mid-stream, so remove any partial file.
except requests.Timeout:
self._cleanup_file(local_path)
return ToolResult.fail(f"Error: Download timed out after {DEFAULT_TIMEOUT}s")
except requests.ConnectionError:
self._cleanup_file(local_path)
return ToolResult.fail(f"Error: Failed to connect to {parsed.netloc}")
except requests.HTTPError as e:
return ToolResult.fail(f"Error: HTTP {e.response.status_code} for URL: {url}")
except Exception as e:
self._cleanup_file(local_path)
return ToolResult.fail(f"Error: Failed to download file: {e}")
try:
text = self._parse_document(local_path, suffix)
except Exception as e:
self._cleanup_file(local_path)
return ToolResult.fail(f"Error: Failed to parse document: {e}")
if not text or not text.strip():
file_size = os.path.getsize(local_path)
return ToolResult.success(
f"File downloaded to: {local_path} ({format_size(file_size)})\n"
f"No text content could be extracted. The file may contain only images or be encrypted."
)
truncation = truncate_head(text)
result_text = truncation.content
file_size = os.path.getsize(local_path)
header = f"[Document: {filename} | Size: {format_size(file_size)} | Saved to: {local_path}]\n\n"
if truncation.truncated:
header += f"[Content truncated: showing {truncation.output_lines} of {truncation.total_lines} lines]\n\n"
return ToolResult.success(header + result_text)
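The size-capped streaming download in `_fetch_document` can be isolated into a helper that is testable without a network. The sketch below is one possible shape, not the commit's code: `write_capped` and `FileTooLargeError` are hypothetical names, and the helper removes the partial file on any failure rather than only on the size check.

```python
import os
from typing import Iterable


class FileTooLargeError(Exception):
    """Raised when the stream exceeds the size cap (hypothetical helper type)."""


def write_capped(chunks: Iterable[bytes], dest_path: str, max_bytes: int) -> int:
    """Write chunks to dest_path; abort and delete the file once max_bytes is exceeded."""
    written = 0
    try:
        with open(dest_path, "wb") as f:
            for chunk in chunks:
                written += len(chunk)
                if written > max_bytes:
                    raise FileTooLargeError(f"stream exceeded {max_bytes} bytes")
                f.write(chunk)
    except Exception:
        # The file is already closed here; remove the partial download
        # whether the cap was hit or the chunk source itself failed.
        if os.path.exists(dest_path):
            os.remove(dest_path)
        raise
    return written
```

With `requests`, the chunk source would be `response.iter_content(chunk_size=8192)` on a response opened with `stream=True`, as in the diff.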
def _parse_document(self, file_path: str, suffix: str) -> str:
"""Parse document file and return extracted text."""
if suffix in PDF_SUFFIXES:
return self._parse_pdf(file_path)
elif suffix in WORD_SUFFIXES:
return self._parse_word(file_path)
elif suffix in TEXT_SUFFIXES:
return self._parse_text(file_path)
elif suffix in SPREADSHEET_SUFFIXES:
return self._parse_spreadsheet(file_path)
elif suffix in PPT_SUFFIXES:
return self._parse_ppt(file_path)
else:
return self._parse_text(file_path)
def _parse_pdf(self, file_path: str) -> str:
"""Extract text from PDF using pypdf."""
try:
from pypdf import PdfReader
except ImportError:
raise ImportError("pypdf library is required for PDF parsing. Install with: pip install pypdf")
reader = PdfReader(file_path)
text_parts = []
for page_num, page in enumerate(reader.pages, 1):
page_text = page.extract_text()
if page_text and page_text.strip():
text_parts.append(f"--- Page {page_num}/{len(reader.pages)} ---\n{page_text}")
return "\n\n".join(text_parts)
def _parse_word(self, file_path: str) -> str:
"""Extract text from Word documents (.docx)."""
try:
from docx import Document
except ImportError:
raise ImportError(
"python-docx library is required for .docx parsing. Install with: pip install python-docx"
)
doc = Document(file_path)
paragraphs = [p.text for p in doc.paragraphs if p.text.strip()]
return "\n\n".join(paragraphs)
def _parse_text(self, file_path: str) -> str:
"""Read plain text files (txt, md, csv, etc.)."""
# Try utf-8-sig first so a leading BOM is stripped; plain utf-8 would decode it as U+FEFF.
encodings = ["utf-8-sig", "utf-8", "gbk", "gb2312", "latin-1"]
for enc in encodings:
try:
with open(file_path, "r", encoding=enc) as f:
return f.read()
except (UnicodeDecodeError, UnicodeError):
continue
raise ValueError(f"Unable to decode file with any supported encoding: {encodings}")
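The order of the fallback list in `_parse_text` matters: `utf-8` decodes a BOM-prefixed file successfully but keeps the BOM as a `U+FEFF` character, and `latin-1` accepts every byte sequence, so anything listed after it is never tried. A small sketch on raw bytes illustrates this; `decode_text` and its default tuple are assumptions for illustration.

```python
from typing import Sequence


def decode_text(raw: bytes, encodings: Sequence[str] = ("utf-8-sig", "gbk", "latin-1")) -> str:
    """Decode raw bytes with the first encoding that succeeds."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Only reachable when the list does not end with a catch-all such as latin-1.
    raise ValueError(f"Unable to decode bytes with any of: {list(encodings)}")


if __name__ == "__main__":
    bom_file = b"\xef\xbb\xbfhello"
    print(repr(bom_file.decode("utf-8")))   # BOM survives as '\ufeff'
    print(repr(decode_text(bom_file)))      # BOM stripped by utf-8-sig
```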
def _parse_spreadsheet(self, file_path: str) -> str:
"""Extract text from Excel files. openpyxl reads .xlsx only; a legacy .xls file raises here and surfaces as a parse error."""
try:
import openpyxl
except ImportError:
raise ImportError(
"openpyxl library is required for .xlsx parsing. Install with: pip install openpyxl"
)
wb = openpyxl.load_workbook(file_path, read_only=True, data_only=True)
result_parts = []
for sheet_name in wb.sheetnames:
ws = wb[sheet_name]
rows = []
for row in ws.iter_rows(values_only=True):
cells = [str(c) if c is not None else "" for c in row]
if any(cells):
rows.append(" | ".join(cells))
if rows:
result_parts.append(f"--- Sheet: {sheet_name} ---\n" + "\n".join(rows))
wb.close()
return "\n\n".join(result_parts)
def _parse_ppt(self, file_path: str) -> str:
"""Extract text from PowerPoint files. python-pptx reads .pptx only; a legacy .ppt file raises here and surfaces as a parse error."""
try:
from pptx import Presentation
except ImportError:
raise ImportError(
"python-pptx library is required for .pptx parsing. Install with: pip install python-pptx"
)
prs = Presentation(file_path)
text_parts = []
for slide_num, slide in enumerate(prs.slides, 1):
slide_texts = []
for shape in slide.shapes:
if shape.has_text_frame:
for paragraph in shape.text_frame.paragraphs:
text = paragraph.text.strip()
if text:
slide_texts.append(text)
if slide_texts:
text_parts.append(f"--- Slide {slide_num}/{len(prs.slides)} ---\n" + "\n".join(slide_texts))
return "\n\n".join(text_parts)
# ---- Encoding detection ----
@staticmethod
def _detect_encoding(response: requests.Response) -> str:
"""Detect response encoding with priority: Content-Type header > HTML meta > chardet > utf-8."""
# 1. Check Content-Type header for explicit charset
content_type = response.headers.get("Content-Type", "")
charset = _extract_charset_from_content_type(content_type)
if charset:
return charset
# 2. Scan raw bytes for HTML meta charset declaration
raw = response.content[:4096]
charset = _extract_charset_from_html_meta(raw)
if charset:
return charset
# 3. Use apparent_encoding (chardet-based detection) if confident enough
apparent = response.apparent_encoding
if apparent:
apparent_lower = apparent.lower()
# Trust CJK / Windows encodings detected by chardet
trusted_prefixes = ("utf", "gb", "big5", "euc", "shift_jis", "iso-2022", "windows", "ascii")
if any(apparent_lower.startswith(p) for p in trusted_prefixes):
return apparent
# 4. Fallback
return "utf-8"
# ---- Helper methods ----
def _ensure_tmp_dir(self) -> str:
"""Ensure workspace/tmp directory exists and return its path."""
tmp_dir = os.path.join(self.cwd, "tmp")
os.makedirs(tmp_dir, exist_ok=True)
return tmp_dir
def _extract_filename(self, url: str) -> str:
"""Extract a safe filename from URL, with a short UUID prefix to avoid collisions."""
path = urlparse(url).path
basename = os.path.basename(unquote(path))
if not basename or basename == "/":
basename = "downloaded_file"
# Sanitize: keep only safe chars
basename = re.sub(r'[^\w.\-]', '_', basename)
short_id = uuid.uuid4().hex[:8]
return f"{short_id}_{basename}"
@staticmethod
def _cleanup_file(path: str):
"""Remove a file if it exists, ignoring errors."""
try:
if os.path.exists(path):
os.remove(path)
except Exception:
pass
@staticmethod
def _is_binary_content_type(content_type: str) -> bool:
"""Check if Content-Type indicates a binary/document response."""
binary_types = [
"application/pdf",
"application/vnd.openxmlformats",
"application/vnd.ms-excel",
"application/vnd.ms-powerpoint",
"application/octet-stream",
]
ct_lower = content_type.lower()
return any(bt in ct_lower for bt in binary_types)
def _handle_download_by_content_type(self, url: str, response: requests.Response, content_type: str) -> ToolResult:
"""Handle a URL that returned binary content instead of HTML."""
ct_lower = content_type.lower()
suffix_map = {
"application/pdf": ".pdf",
"application/vnd.openxmlformats-officedocument.wordprocessingml": ".docx",
"application/vnd.ms-excel": ".xls",
"application/vnd.openxmlformats-officedocument.spreadsheetml": ".xlsx",
"application/vnd.ms-powerpoint": ".ppt",
"application/vnd.openxmlformats-officedocument.presentationml": ".pptx",
}
detected_suffix = None
for ct_prefix, ext in suffix_map.items():
if ct_prefix in ct_lower:
detected_suffix = ext
break
if detected_suffix and detected_suffix in ALL_DOC_SUFFIXES:
# Re-fetch as document
return self._fetch_document(url if _get_url_suffix(url) in ALL_DOC_SUFFIXES
else self._rewrite_url_with_suffix(url, detected_suffix))
return ToolResult.fail(f"Error: URL returned binary content ({content_type}), not a supported document type")
@staticmethod
def _rewrite_url_with_suffix(url: str, suffix: str) -> str:
"""Append a suffix to the URL path so _get_url_suffix works correctly."""
parsed = urlparse(url)
new_path = parsed.path.rstrip("/") + suffix
return parsed._replace(path=new_path).geturl()
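`_rewrite_url_with_suffix` changes the path that is requested on the second fetch, which may not exist on the server. An alternative, sketched below under that assumption, is to resolve the suffix from the Content-Type and hand it to the parser separately, leaving the URL untouched. The function name `suffix_for_content_type` and the list layout are illustrative; the MIME markers are the ones from `suffix_map` above.

```python
from typing import List, Optional, Tuple

# Markers are matched as substrings of the media type, as in the diff.
CONTENT_TYPE_SUFFIXES: List[Tuple[str, str]] = [
    ("application/pdf", ".pdf"),
    ("application/vnd.openxmlformats-officedocument.wordprocessingml", ".docx"),
    ("application/vnd.openxmlformats-officedocument.spreadsheetml", ".xlsx"),
    ("application/vnd.openxmlformats-officedocument.presentationml", ".pptx"),
    ("application/vnd.ms-excel", ".xls"),
    ("application/vnd.ms-powerpoint", ".ppt"),
]


def suffix_for_content_type(content_type: str) -> Optional[str]:
    """Map a Content-Type header value to a document suffix, or None if unknown."""
    # Drop parameters such as "; charset=binary" before matching.
    media_type = content_type.split(";", 1)[0].strip().lower()
    for marker, suffix in CONTENT_TYPE_SUFFIXES:
        if marker in media_type:
            return suffix
    return None
```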
# ---- HTML extraction (unchanged) ----
@staticmethod
def _extract_title(html: str) -> str:
match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
return match.group(1).strip() if match else "Untitled"
@staticmethod
def _extract_text(html: str) -> str:
text = re.sub(r"<script[^>]*>.*?</script>", "", html, flags=re.IGNORECASE | re.DOTALL)
text = re.sub(r"<style[^>]*>.*?</style>", "", text, flags=re.IGNORECASE | re.DOTALL)
text = re.sub(r"<[^>]+>", "", text)
text = text.replace("&amp;", "&").replace("&lt;", "<").replace("&gt;", ">")
text = text.replace("&quot;", '"').replace("&#39;", "'").replace("&nbsp;", " ")
text = re.sub(r"[^\S\n]+", " ", text)
text = re.sub(r"\n{3,}", "\n\n", text)
lines = [line.strip() for line in text.splitlines()]
text = "\n".join(lines)
return text.strip()
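`_extract_text` above decodes six named entities by hand. A minimal variant using the standard library's `html.unescape`, which covers all named and numeric character references, might look like the following. The function name `extract_text` is illustrative, and the sketch collapses blank lines after stripping each line rather than before, which differs slightly from the diff.

```python
import html as html_lib
import re


def extract_text(markup: str) -> str:
    """Strip tags, decode entities and normalize whitespace in an HTML string."""
    # Remove script and style elements together with their contents.
    text = re.sub(r"<script[^>]*>.*?</script>", "", markup, flags=re.IGNORECASE | re.DOTALL)
    text = re.sub(r"<style[^>]*>.*?</style>", "", text, flags=re.IGNORECASE | re.DOTALL)
    # Drop the remaining tags, then decode character references.
    text = re.sub(r"<[^>]+>", "", text)
    text = html_lib.unescape(text)
    # Collapse runs of non-newline whitespace (including the NBSP from &nbsp;).
    text = re.sub(r"[^\S\n]+", " ", text)
    text = "\n".join(line.strip() for line in text.splitlines())
    # Collapse three or more newlines into a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```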

@@ -13,6 +13,7 @@ import requests
from agent.tools.base_tool import BaseTool, ToolResult
from common.log import logger
from config import conf
# Default timeout for API requests (seconds)
@@ -23,11 +24,7 @@ class WebSearch(BaseTool):
"""Tool for searching the web using Bocha or LinkAI search API"""
name: str = "web_search"
description: str = (
"Search the web for current information, news, research topics, or any real-time data. "
"Returns web page titles, URLs, snippets, and optional summaries. "
"Use this when the user asks about recent events, needs fact-checking, or wants up-to-date information."
)
description: str = "Search the web for real-time information. Returns titles, URLs, and snippets."
params: dict = {
"type": "object",
@@ -225,7 +222,8 @@ class WebSearch(BaseTool):
:return: Formatted search results
"""
api_key = os.environ.get("LINKAI_API_KEY", "")
url = "https://api.link-ai.tech/v1/plugin/execute"
api_base = conf().get("linkai_api_base", "https://api.link-ai.tech")
url = f"{api_base.rstrip('/')}/v1/plugin/execute"
headers = {
"Content-Type": "application/json",

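In the `WebSearch` hunk above, `conf().get("linkai_api_base", "https://api.link-ai.tech")` applies the default only when the key is absent. If the key can be present but empty (an assumption about the config, not confirmed by this diff), the resulting URL would start with `/v1/...`. A sketch of a fallback that also covers empty values, with the hypothetical helper name `build_plugin_url`:

```python
from typing import Optional

DEFAULT_API_BASE = "https://api.link-ai.tech"


def build_plugin_url(configured_base: Optional[str], default_base: str = DEFAULT_API_BASE) -> str:
    """Build the plugin endpoint, falling back when the configured base is missing or empty."""
    base = (configured_base or "").strip().rstrip("/") or default_base
    return f"{base}/v1/plugin/execute"


if __name__ == "__main__":
    print(build_plugin_url(""))
    print(build_plugin_url("https://proxy.example.com/"))
```

The vision tool earlier in this diff uses the same `or`-based fallback for `linkai_api_base`.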
app.py

@@ -13,7 +13,6 @@ from plugins import *
import threading
# Global channel manager for restart support
_channel_mgr = None
@@ -21,94 +20,200 @@ def get_channel_manager():
return _channel_mgr
def _parse_channel_type(raw) -> list:
"""
Parse channel_type config value into a list of channel names.
Supports:
- single string: "feishu"
- comma-separated string: "feishu, dingtalk"
- list: ["feishu", "dingtalk"]
"""
if isinstance(raw, list):
return [ch.strip() for ch in raw if ch.strip()]
if isinstance(raw, str):
return [ch.strip() for ch in raw.split(",") if ch.strip()]
return []
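The three accepted shapes of `channel_type` can be checked with a standalone version of the parser. The sketch below follows the logic of `_parse_channel_type`, with two additions that are assumptions rather than part of the commit: non-string list items are skipped instead of raising, and duplicate names are dropped.

```python
from typing import Any, List


def parse_channel_type(raw: Any) -> List[str]:
    """Normalize a channel_type config value into a list of channel names."""
    if isinstance(raw, str):
        items = raw.split(",")
    elif isinstance(raw, (list, tuple)):
        items = list(raw)
    else:
        return []
    names: List[str] = []
    for item in items:
        if not isinstance(item, str):
            continue  # assumption: ignore malformed entries rather than fail
        name = item.strip()
        if name and name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    print(parse_channel_type("feishu, dingtalk"))
    print(parse_channel_type(["feishu", " web "]))
```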
class ChannelManager:
"""
Manage the lifecycle of a channel, supporting restart from sub-threads.
The channel.startup() runs in a daemon thread so that the main thread
remains available and a new channel can be started at any time.
Manage the lifecycle of multiple channels running concurrently.
Each channel.startup() runs in its own daemon thread.
The web channel is started as default console unless explicitly disabled.
"""
def __init__(self):
self._channel = None
self._channel_thread = None
self._channels = {} # channel_name -> channel instance
self._threads = {} # channel_name -> thread
self._primary_channel = None
self._lock = threading.Lock()
self.cloud_mode = False # set to True when cloud client is active
@property
def channel(self):
return self._channel
"""Return the primary (first non-web) channel for backward compatibility."""
return self._primary_channel
def start(self, channel_name: str, first_start: bool = False):
def get_channel(self, channel_name: str):
return self._channels.get(channel_name)
def start(self, channel_names: list, first_start: bool = False):
"""
Create and start a channel in a sub-thread.
Create and start one or more channels in sub-threads.
If first_start is True, plugins and linkai client will also be initialized.
"""
with self._lock:
channel = channel_factory.create_channel(channel_name)
self._channel = channel
channels = []
for name in channel_names:
ch = channel_factory.create_channel(name)
ch.cloud_mode = self.cloud_mode
self._channels[name] = ch
channels.append((name, ch))
if self._primary_channel is None and name != "web":
self._primary_channel = ch
if self._primary_channel is None and channels:
self._primary_channel = channels[0][1]
if first_start:
if channel_name in ["wx", "wxy", "terminal", "wechatmp", "web",
"wechatmp_service", "wechatcom_app", "wework",
const.FEISHU, const.DINGTALK]:
PluginManager().load_plugins()
PluginManager().load_plugins()
if conf().get("use_linkai"):
try:
from common import cloud_client
threading.Thread(target=cloud_client.start, args=(channel, self), daemon=True).start()
except Exception as e:
threading.Thread(
target=cloud_client.start,
args=(self._primary_channel, self),
daemon=True,
).start()
except Exception:
pass
# Run channel.startup() in a daemon thread so we can restart later
self._channel_thread = threading.Thread(
target=self._run_channel, args=(channel,), daemon=True
)
self._channel_thread.start()
logger.debug(f"[ChannelManager] Channel '{channel_name}' started in sub-thread")
# Start web console first so its logs print cleanly,
# then start remaining channels after a brief pause.
web_entry = None
other_entries = []
for entry in channels:
if entry[0] == "web":
web_entry = entry
else:
other_entries.append(entry)
def _run_channel(self, channel):
ordered = ([web_entry] if web_entry else []) + other_entries
for i, (name, ch) in enumerate(ordered):
if i > 0 and name != "web":
time.sleep(0.1)
t = threading.Thread(target=self._run_channel, args=(name, ch), daemon=True)
self._threads[name] = t
t.start()
logger.debug(f"[ChannelManager] Channel '{name}' started in sub-thread")
def _run_channel(self, name: str, channel):
try:
channel.startup()
except Exception as e:
logger.error(f"[ChannelManager] Channel startup error: {e}")
logger.error(f"[ChannelManager] Channel '{name}' startup error: {e}")
logger.exception(e)
def stop(self):
def stop(self, channel_name: str = None):
"""
Stop the current channel. Since most channel startup() methods block
on an HTTP server or stream client, we stop by terminating the thread.
Stop channel(s). If channel_name is given, stop only that channel;
otherwise stop all channels.
"""
# Pop under lock, then stop outside lock to avoid deadlock
with self._lock:
if self._channel is None:
names = [channel_name] if channel_name else list(self._channels.keys())
to_stop = []
for name in names:
ch = self._channels.pop(name, None)
th = self._threads.pop(name, None)
to_stop.append((name, ch, th))
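# Compare against the popped instances: after pop(), self._channels.get(channel_name) is always None.
if channel_name and any(c is not None and c is self._primary_channel for _, c, _ in to_stop):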
self._primary_channel = None
for name, ch, th in to_stop:
if ch is None:
logger.warning(f"[ChannelManager] Channel '{name}' not found in managed channels")
if th and th.is_alive():
self._interrupt_thread(th, name)
continue
logger.info(f"[ChannelManager] Stopping channel '{name}'...")
graceful = False
if hasattr(ch, 'stop'):
try:
ch.stop()
graceful = True
except Exception as e:
logger.warning(f"[ChannelManager] Error during channel '{name}' stop: {e}")
if th and th.is_alive():
th.join(timeout=5)
if th.is_alive():
if graceful:
logger.info(f"[ChannelManager] Channel '{name}' thread still alive after stop(), "
"leaving daemon thread to finish on its own")
else:
logger.warning(f"[ChannelManager] Channel '{name}' thread did not exit in 5s, forcing interrupt")
self._interrupt_thread(th, name)
@staticmethod
def _interrupt_thread(th: threading.Thread, name: str):
"""Raise SystemExit in target thread to break blocking loops like start_forever."""
import ctypes
try:
tid = th.ident
if tid is None:
return
channel_type = getattr(self._channel, 'channel_type', 'unknown')
logger.info(f"[ChannelManager] Stopping channel '{channel_type}'...")
# Try graceful stop if channel implements it
try:
if hasattr(self._channel, 'stop'):
self._channel.stop()
except Exception as e:
logger.warning(f"[ChannelManager] Error during channel stop: {e}")
self._channel = None
self._channel_thread = None
res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
ctypes.c_ulong(tid), ctypes.py_object(SystemExit)
)
if res == 1:
logger.info(f"[ChannelManager] Interrupted thread for channel '{name}'")
elif res > 1:
ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_ulong(tid), None)
logger.warning(f"[ChannelManager] Failed to interrupt thread for channel '{name}'")
except Exception as e:
logger.warning(f"[ChannelManager] Thread interrupt error for '{name}': {e}")
def restart(self, new_channel_name: str):
"""
Restart the channel with a new channel type.
Restart a single channel with a new channel type.
Can be called from any thread (e.g. linkai config callback).
"""
logger.info(f"[ChannelManager] Restarting channel to '{new_channel_name}'...")
self.stop()
# Clear singleton cache so a fresh channel instance is created
self.stop(new_channel_name)
_clear_singleton_cache(new_channel_name)
time.sleep(1) # Brief pause to allow resources to release
self.start(new_channel_name, first_start=False)
time.sleep(1)
self.start([new_channel_name], first_start=False)
logger.info(f"[ChannelManager] Channel restarted to '{new_channel_name}' successfully")
def add_channel(self, channel_name: str):
"""
Dynamically add and start a new channel.
If the channel is already running, restart it instead.
"""
with self._lock:
if channel_name in self._channels:
logger.info(f"[ChannelManager] Channel '{channel_name}' already exists, restarting")
if self._channels.get(channel_name):
self.restart(channel_name)
return
logger.info(f"[ChannelManager] Adding channel '{channel_name}'...")
_clear_singleton_cache(channel_name)
self.start([channel_name], first_start=False)
logger.info(f"[ChannelManager] Channel '{channel_name}' added successfully")
def remove_channel(self, channel_name: str):
"""
Dynamically stop and remove a running channel.
"""
with self._lock:
if channel_name not in self._channels:
logger.warning(f"[ChannelManager] Channel '{channel_name}' not found, nothing to remove")
return
logger.info(f"[ChannelManager] Removing channel '{channel_name}'...")
self.stop(channel_name)
logger.info(f"[ChannelManager] Channel '{channel_name}' removed successfully")
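The "pop under lock, stop outside lock" pattern used by `ChannelManager.stop` can be shown in a reduced form. The class below is a hypothetical stand-in, not the commit's implementation: it keeps only the registry and primary-channel bookkeeping, and it clears the primary reference by comparing against the popped instance.

```python
import threading
from typing import Dict, List, Optional


class ChannelRegistry:
    """Reduced illustration of a thread-safe channel registry."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._channels: Dict[str, object] = {}
        self._primary: Optional[object] = None

    @property
    def primary(self) -> Optional[object]:
        return self._primary

    def add(self, name: str, channel: object) -> None:
        with self._lock:
            self._channels[name] = channel
            if self._primary is None:
                self._primary = channel

    def stop(self, name: Optional[str] = None) -> List[str]:
        # Mutate shared state under the lock only.
        with self._lock:
            names = [name] if name else list(self._channels)
            popped = [(n, self._channels.pop(n, None)) for n in names]
            if any(ch is not None and ch is self._primary for _, ch in popped):
                self._primary = None
        # Call stop() outside the lock so a slow or re-entrant channel cannot deadlock.
        stopped = []
        for n, ch in popped:
            if ch is not None and hasattr(ch, "stop"):
                ch.stop()
                stopped.append(n)
        return stopped
```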
def _clear_singleton_cache(channel_name: str):
"""
@@ -116,28 +221,22 @@ def _clear_singleton_cache(channel_name: str):
a new instance can be created with updated config.
"""
cls_map = {
"wx": "channel.wechat.wechat_channel.WechatChannel",
"wxy": "channel.wechat.wechaty_channel.WechatyChannel",
"wcf": "channel.wechat.wcf_channel.WechatfChannel",
"web": "channel.web.web_channel.WebChannel",
"wechatmp": "channel.wechatmp.wechatmp_channel.WechatMPChannel",
"wechatmp_service": "channel.wechatmp.wechatmp_channel.WechatMPChannel",
"wechatcom_app": "channel.wechatcom.wechatcomapp_channel.WechatComAppChannel",
"wework": "channel.wework.wework_channel.WeworkChannel",
const.FEISHU: "channel.feishu.feishu_channel.FeiShuChanel",
const.DINGTALK: "channel.dingtalk.dingtalk_channel.DingTalkChanel",
const.WECOM_BOT: "channel.wecom_bot.wecom_bot_channel.WecomBotChannel",
}
module_path = cls_map.get(channel_name)
if not module_path:
return
# The singleton decorator stores instances in a closure dict keyed by class.
# We need to find the actual class and clear it from the closure.
try:
parts = module_path.rsplit(".", 1)
module_name, class_name = parts[0], parts[1]
import importlib
module = importlib.import_module(module_name)
# The module-level name is the wrapper function from @singleton
wrapper = getattr(module, class_name, None)
if wrapper and hasattr(wrapper, '__closure__') and wrapper.__closure__:
for cell in wrapper.__closure__:
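The comments in `_clear_singleton_cache` describe a `@singleton` decorator that keeps instances in a closure dict. The decorator itself is not shown in this diff, so the version below is an assumed, common form of it. The sketch shows how walking `__closure__` reaches that dict and how clearing it forces a fresh instance.

```python
def singleton(cls):
    """Assumed decorator shape: cache one instance per class in a closure dict."""
    instances = {}

    def get_instance(*args, **kwargs):
        if cls not in instances:
            instances[cls] = cls(*args, **kwargs)
        return instances[cls]

    return get_instance


def clear_singleton(wrapper) -> bool:
    """Empty every dict found in the wrapper's closure; return True if one was found."""
    cleared = False
    for cell in getattr(wrapper, "__closure__", None) or ():
        try:
            contents = cell.cell_contents
        except ValueError:
            continue  # empty cell
        if isinstance(contents, dict):
            contents.clear()
            cleared = True
    return cleared


@singleton
class DemoChannel:
    """Hypothetical channel class used only for this demonstration."""


if __name__ == "__main__":
    first = DemoChannel()
    assert DemoChannel() is first
    clear_singleton(DemoChannel)
    assert DemoChannel() is not first
```

Because the module-level name refers to the wrapper function rather than the class, the closure is the only handle on the cache, which is why the diff resolves the wrapper via `importlib` and `getattr`.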
@@ -176,17 +275,25 @@ def run():
# kill signal
sigterm_handler_wrap(signal.SIGTERM)
# create channel
channel_name = conf().get("channel_type", "wx")
# Parse channel_type into a list
raw_channel = conf().get("channel_type", "web")
if "--cmd" in sys.argv:
channel_name = "terminal"
channel_names = ["terminal"]
else:
channel_names = _parse_channel_type(raw_channel)
if not channel_names:
channel_names = ["web"]
if channel_name == "wxy":
os.environ["WECHATY_LOG"] = "warn"
# Auto-start web console unless explicitly disabled
web_console_enabled = conf().get("web_console", True)
if web_console_enabled and "web" not in channel_names:
channel_names.append("web")
logger.info(f"[App] Starting channels: {channel_names}")
_channel_mgr = ChannelManager()
_channel_mgr.start(channel_name, first_start=True)
_channel_mgr.start(channel_names, first_start=True)
while True:
time.sleep(1)

@@ -65,30 +65,73 @@ class AgentLLMModel(LLMModel):
LLM Model adapter that uses COW's existing bot infrastructure
"""
_MODEL_BOT_TYPE_MAP = {
"wenxin": const.BAIDU, "wenxin-4": const.BAIDU,
"xunfei": const.XUNFEI, const.QWEN: const.QWEN,
const.MODELSCOPE: const.MODELSCOPE,
}
_MODEL_PREFIX_MAP = [
("qwen", const.QWEN_DASHSCOPE), ("qwq", const.QWEN_DASHSCOPE), ("qvq", const.QWEN_DASHSCOPE),
("gemini", const.GEMINI), ("glm", const.ZHIPU_AI), ("claude", const.CLAUDEAPI),
("moonshot", const.MOONSHOT), ("kimi", const.MOONSHOT),
("doubao", const.DOUBAO),
]
def __init__(self, bridge: Bridge, bot_type: str = "chat"):
# Get model name directly from config
from config import conf
model_name = conf().get("model", const.GPT_41)
super().__init__(model=model_name)
super().__init__(model=conf().get("model", const.GPT_41))
self.bridge = bridge
self.bot_type = bot_type
self._bot = None
self._use_linkai = conf().get("use_linkai", False) and conf().get("linkai_api_key")
self._bot_model = None
@property
def model(self):
from config import conf
return conf().get("model", const.GPT_41)
@model.setter
def model(self, value):
pass
def _resolve_bot_type(self, model_name: str) -> str:
"""Resolve bot type from model name, matching Bridge.__init__ logic."""
from config import conf
if conf().get("use_linkai", False) and conf().get("linkai_api_key"):
return const.LINKAI
# Support custom bot type configuration
configured_bot_type = conf().get("bot_type")
if configured_bot_type:
return configured_bot_type
if not model_name or not isinstance(model_name, str):
return const.CHATGPT
if model_name in self._MODEL_BOT_TYPE_MAP:
return self._MODEL_BOT_TYPE_MAP[model_name]
if model_name.lower().startswith("minimax") or model_name in ["abab6.5-chat"]:
return const.MiniMax
if model_name in [const.QWEN_TURBO, const.QWEN_PLUS, const.QWEN_MAX]:
return const.QWEN_DASHSCOPE
if model_name in [const.MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k"]:
return const.MOONSHOT
if model_name in [const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER]:
return const.CHATGPT
for prefix, btype in self._MODEL_PREFIX_MAP:
if model_name.startswith(prefix):
return btype
return const.CHATGPT
@property
def bot(self):
"""Lazy load the bot and enhance it with tool calling if needed"""
if self._bot is None:
# If use_linkai is enabled, use LinkAI bot directly
if self._use_linkai:
self._bot = self.bridge.find_chat_bot(const.LINKAI)
else:
self._bot = self.bridge.get_bot(self.bot_type)
# Automatically add tool calling support if not present
self._bot = add_openai_compatible_support(self._bot)
# Log bot info
bot_name = type(self._bot).__name__
"""Lazy load the bot, re-create when model changes"""
from models.bot_factory import create_bot
cur_model = self.model
if self._bot is None or self._bot_model != cur_model:
bot_type = self._resolve_bot_type(cur_model)
self._bot = create_bot(bot_type)
self._bot = add_openai_compatible_support(self._bot)
self._bot_model = cur_model
return self._bot
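`_resolve_bot_type` combines an exact-match table with an ordered prefix list and a default. A reduced sketch of that lookup order follows. The string values stand in for the project's `const.*` constants and are placeholders, and matching prefixes case-insensitively is an assumption (the diff lowercases only for the `minimax` check).

```python
from typing import Dict, List, Optional, Tuple

# Placeholder values; the real code uses constants from the project's const module.
EXACT_MAP: Dict[str, str] = {"wenxin": "baidu", "xunfei": "xunfei"}
PREFIX_MAP: List[Tuple[str, str]] = [
    ("qwen", "dashscope"),
    ("gemini", "gemini"),
    ("claude", "claude"),
    ("kimi", "moonshot"),
]
DEFAULT_BOT_TYPE = "chatgpt"


def resolve_bot_type(model_name: Optional[str]) -> str:
    """Resolve a bot type: exact match first, then first matching prefix, then default."""
    if not model_name or not isinstance(model_name, str):
        return DEFAULT_BOT_TYPE
    if model_name in EXACT_MAP:
        return EXACT_MAP[model_name]
    lowered = model_name.lower()
    for prefix, bot_type in PREFIX_MAP:
        if lowered.startswith(prefix):
            return bot_type
    return DEFAULT_BOT_TYPE


if __name__ == "__main__":
    for name in ("wenxin", "qwen-max", "Claude-3", "gpt-x"):
        print(name, "->", resolve_bot_type(name))
```

Since the `bot` property compares `_bot_model` with the current model on each access, a lookup like this runs only when the configured model changes.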
def call(self, request: LLMRequest):
@@ -135,7 +178,7 @@ class AgentLLMModel(LLMModel):
# Use tool-enabled streaming call if available
# Extract system prompt if present
system_prompt = getattr(request, 'system', None)
# Build kwargs for call_with_tools
kwargs = {
'messages': request.messages,
@@ -143,15 +186,20 @@ class AgentLLMModel(LLMModel):
'stream': True,
'model': self.model # Pass model parameter
}
# Only pass max_tokens if explicitly set, let the bot use its default
if request.max_tokens is not None:
kwargs['max_tokens'] = request.max_tokens
# Add system prompt if present
if system_prompt:
kwargs['system'] = system_prompt
# Pass channel_type for linkai tracking
channel_type = getattr(self, 'channel_type', None)
if channel_type:
kwargs['channel_type'] = channel_type
stream = self.bot.call_with_tools(**kwargs)
# Convert stream format to our expected format
@@ -290,9 +338,10 @@ class AgentBridge:
Returns:
Reply object
"""
session_id = None
agent = None
try:
# Extract session_id from context for user isolation
session_id = None
if context:
session_id = context.kwargs.get("session_id") or context.get("session_id")
@@ -325,6 +374,13 @@ class AgentBridge:
logger.warning(f"[AgentBridge] Failed to attach context to scheduler: {e}")
break
# Pass channel_type to model so linkai requests carry it
if context and hasattr(agent, 'model'):
agent.model.channel_type = context.get("channel_type", "")
# Store session_id on agent so executor can clear DB on fatal errors
agent._current_session_id = session_id
try:
# Use agent's run_stream method with event handler
response = agent.run_stream(
@@ -336,9 +392,26 @@ class AgentBridge:
# Restore original tools
if context and context.get("is_scheduled_task"):
agent.tools = original_tools
# Log execution summary
event_handler.log_summary()
# Persist new messages generated during this run
if session_id:
channel_type = (context.get("channel_type") or "") if context else ""
new_messages = getattr(agent, '_last_run_new_messages', [])
if new_messages:
self._persist_messages(session_id, list(new_messages), channel_type)
else:
with agent.messages_lock:
msg_count = len(agent.messages)
if msg_count == 0:
try:
from agent.memory import get_conversation_store
get_conversation_store().clear_session(session_id)
logger.info(f"[AgentBridge] Cleared DB for recovered session: {session_id}")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to clear DB after recovery: {e}")
# Check if there are files to send (from read tool)
if hasattr(agent, 'stream_executor') and hasattr(agent.stream_executor, 'files_to_send'):
@@ -358,6 +431,18 @@ class AgentBridge:
except Exception as e:
logger.error(f"Agent reply error: {e}")
# If the agent cleared its messages due to format error / overflow,
# also purge the DB so the next request starts clean.
if session_id and agent:
try:
with agent.messages_lock:
msg_count = len(agent.messages)
if msg_count == 0:
from agent.memory import get_conversation_store
get_conversation_store().clear_session(session_id)
logger.info(f"[AgentBridge] Cleared DB for session after error: {session_id}")
except Exception as db_err:
logger.warning(f"[AgentBridge] Failed to clear DB after error: {db_err}")
return Reply(ReplyType.ERROR, f"Agent error: {str(e)}")
def _create_file_reply(self, file_info: dict, text_response: str, context: Context = None) -> Reply:
@@ -475,6 +560,32 @@ class AgentBridge:
except Exception as e:
logger.warning(f"[AgentBridge] Failed to migrate API keys: {e}")
def _persist_messages(
self, session_id: str, new_messages: list, channel_type: str = ""
) -> None:
"""
Persist new messages to the conversation store after each agent run.
Failures are logged but never propagate — they must not interrupt replies.
"""
if not new_messages:
return
try:
from config import conf
if not conf().get("conversation_persistence", True):
return
except Exception:
pass
try:
from agent.memory import get_conversation_store
get_conversation_store().append_messages(
session_id, new_messages, channel_type=channel_type
)
except Exception as e:
logger.warning(
f"[AgentBridge] Failed to persist messages for session={session_id}: {e}"
)
def clear_session(self, session_id: str):
"""
Clear a specific session's agent and conversation history

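The lazy `bot` accessor in the hunks above re-creates the bot whenever the configured model changes. A minimal sketch of that caching pattern follows; `LazyBotHolder` and `fake_factory` are hypothetical names used only for illustration, and the factory stands in for `create_bot` plus the OpenAI-compatibility wrapper.

```python
class LazyBotHolder:
    """Hypothetical stand-in for the model wrapper shown in the diff."""

    def __init__(self, model, factory):
        self.model = model
        self._factory = factory  # assumed callable: model name -> bot object
        self._bot = None
        self._bot_model = None

    @property
    def bot(self):
        # Re-create the bot only on first access or after a model switch.
        cur_model = self.model
        if self._bot is None or self._bot_model != cur_model:
            self._bot = self._factory(cur_model)
            self._bot_model = cur_model
        return self._bot


created = []


def fake_factory(model):
    # Records each creation so the caching behaviour is observable.
    created.append(model)
    return {"model": model}


holder = LazyBotHolder("model-a", fake_factory)
first = holder.bot
second = holder.bot       # cached, no new creation
holder.model = "model-b"
third = holder.bot        # model changed, so a new bot is built
```

Remembering the model name next to the cached object is what lets a runtime model switch take effect without restarting the process.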
View File

@@ -77,10 +77,6 @@ class AgentInitializer:
# Initialize skill manager
skill_manager = self._initialize_skill_manager(workspace_root, session_id)
# Check if first conversation
from agent.prompt.workspace import is_first_conversation, mark_conversation_started
is_first = is_first_conversation(workspace_root)
# Build system prompt
prompt_builder = PromptBuilder(workspace_dir=workspace_root, language="zh")
runtime_info = self._get_runtime_info(workspace_root)
@@ -91,12 +87,8 @@ class AgentInitializer:
skill_manager=skill_manager,
memory_manager=memory_manager,
runtime_info=runtime_info,
is_first_conversation=is_first
)
if is_first:
mark_conversation_started(workspace_root)
# Get cost control parameters
from config import conf
max_steps = conf().get("agent_max_steps", 20)
@@ -115,11 +107,135 @@ class AgentInitializer:
runtime_info=runtime_info # Pass runtime_info for dynamic time updates
)
# Attach memory manager
# Attach memory manager and share LLM model for summarization
if memory_manager:
agent.memory_manager = memory_manager
if hasattr(agent, 'model') and agent.model:
memory_manager.flush_manager.llm_model = agent.model
# Restore persisted conversation history for this session
if session_id:
self._restore_conversation_history(agent, session_id)
# Start daily memory flush timer (once, on first agent init regardless of session)
self._start_daily_flush_timer()
return agent
def _restore_conversation_history(self, agent, session_id: str) -> None:
"""
Load persisted conversation messages from SQLite and inject them
into the agent's in-memory message list.
Only user text and assistant text are restored. Tool call chains
(tool_use / tool_result) are stripped out because:
1. They are intermediate steps; their value is already captured in the
final assistant text reply.
2. They consume massive context tokens (often 80%+ of history).
3. Different models have incompatible tool message formats, so
restoring tool chains across model switches causes 400 errors.
4. Dropping them eliminates the entire class of tool_use/tool_result pairing bugs.
"""
from config import conf
if not conf().get("conversation_persistence", True):
return
try:
from agent.memory import get_conversation_store
store = get_conversation_store()
max_turns = conf().get("agent_max_context_turns", 20)
restore_turns = max(3, max_turns // 6)
saved = store.load_messages(session_id, max_turns=restore_turns)
if saved:
filtered = self._filter_text_only_messages(saved)
if filtered:
with agent.messages_lock:
agent.messages = filtered
logger.debug(
f"[AgentInitializer] Restored {len(filtered)} text messages "
f"(from {len(saved)} total, {restore_turns} turns cap) "
f"for session={session_id}"
)
except Exception as e:
logger.warning(
f"[AgentInitializer] Failed to restore conversation history for "
f"session={session_id}: {e}"
)
@staticmethod
def _filter_text_only_messages(messages: list) -> list:
"""
Extract clean user/assistant turn pairs from raw message history.
Groups messages into turns (each starting with a real user query),
then keeps only:
- The first user text in each turn (the actual user input)
- The last assistant text in each turn (the final answer)
All tool_use, tool_result, intermediate assistant thoughts, and
internal hint messages injected by the agent loop are discarded.
"""
def _extract_text(content) -> str:
if isinstance(content, str):
return content.strip()
if isinstance(content, list):
parts = [
b.get("text", "")
for b in content
if isinstance(b, dict) and b.get("type") == "text"
]
return "\n".join(p for p in parts if p).strip()
return ""
def _is_real_user_msg(msg: dict) -> bool:
"""True for actual user input, False for tool_result or internal hints."""
if msg.get("role") != "user":
return False
content = msg.get("content")
if isinstance(content, list):
has_tool_result = any(
isinstance(b, dict) and b.get("type") == "tool_result"
for b in content
)
if has_tool_result:
return False
text = _extract_text(content)
return bool(text)
# Group into turns: each turn starts with a real user message
turns = []
current_turn = None
for msg in messages:
if _is_real_user_msg(msg):
if current_turn is not None:
turns.append(current_turn)
current_turn = {"user": msg, "assistants": []}
elif current_turn is not None and msg.get("role") == "assistant":
text = _extract_text(msg.get("content"))
if text:
current_turn["assistants"].append(text)
if current_turn is not None:
turns.append(current_turn)
# Build result: one user msg + one assistant msg per turn
filtered = []
for turn in turns:
user_text = _extract_text(turn["user"].get("content"))
if not user_text:
continue
filtered.append({
"role": "user",
"content": [{"type": "text", "text": user_text}]
})
if turn["assistants"]:
final_reply = turn["assistants"][-1]
filtered.append({
"role": "assistant",
"content": [{"type": "text", "text": final_reply}]
})
return filtered
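To make the turn-filtering rule concrete, here is a standalone sketch that follows the same steps as `_filter_text_only_messages` (group by real user message, keep the first user text and the last assistant text). The function names and the sample `history` are illustrative assumptions, not part of the commit.

```python
def extract_text(content):
    """Return plain text from a string or a list of content blocks."""
    if isinstance(content, str):
        return content.strip()
    if isinstance(content, list):
        parts = [b.get("text", "") for b in content
                 if isinstance(b, dict) and b.get("type") == "text"]
        return "\n".join(p for p in parts if p).strip()
    return ""


def is_real_user_msg(msg):
    """True for actual user input, False for tool_result carriers."""
    if msg.get("role") != "user":
        return False
    content = msg.get("content")
    if isinstance(content, list) and any(
        isinstance(b, dict) and b.get("type") == "tool_result" for b in content
    ):
        return False
    return bool(extract_text(content))


def keep_text_turns(messages):
    """Keep one user text and the final assistant text per turn."""
    turns, current = [], None
    for msg in messages:
        if is_real_user_msg(msg):
            if current is not None:
                turns.append(current)
            current = {"user": extract_text(msg["content"]), "assistants": []}
        elif current is not None and msg.get("role") == "assistant":
            text = extract_text(msg.get("content"))
            if text:
                current["assistants"].append(text)
    if current is not None:
        turns.append(current)

    result = []
    for turn in turns:
        result.append({"role": "user",
                       "content": [{"type": "text", "text": turn["user"]}]})
        if turn["assistants"]:
            result.append({"role": "assistant",
                           "content": [{"type": "text", "text": turn["assistants"][-1]}]})
    return result


# Illustrative history: one question answered through a tool call.
history = [
    {"role": "user", "content": "What is in report.pdf?"},
    {"role": "assistant", "content": [
        {"type": "text", "text": "Let me read the file."},
        {"type": "tool_use", "id": "t1", "name": "read", "input": {"path": "report.pdf"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "t1", "content": "(file text)"},
    ]},
    {"role": "assistant", "content": [
        {"type": "text", "text": "It is a quarterly summary."},
    ]},
]
restored = keep_text_turns(history)
```

Four stored messages collapse to two: the original question and the final answer. With the default `agent_max_context_turns` of 20, the restore cap in the diff works out to `max(3, 20 // 6)`, i.e. 3 turns.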
def _load_env_file(self):
"""Load environment variables from .env file"""
@@ -148,12 +264,11 @@ class AgentInitializer:
from agent.tools import MemorySearchTool, MemoryGetTool
from config import conf
# Get OpenAI config
# Initialize embedding provider (prefer OpenAI, fallback to LinkAI)
embedding_provider = None
openai_api_key = conf().get("open_ai_api_key", "")
openai_api_base = conf().get("open_ai_api_base", "")
# Initialize embedding provider
embedding_provider = None
if openai_api_key and openai_api_key not in ["", "YOUR API KEY", "YOUR_API_KEY"]:
try:
embedding_provider = create_embedding_provider(
@@ -166,6 +281,22 @@ class AgentInitializer:
logger.info("[AgentInitializer] OpenAI embedding initialized")
except Exception as e:
logger.warning(f"[AgentInitializer] OpenAI embedding failed: {e}")
if embedding_provider is None:
linkai_api_key = conf().get("linkai_api_key", "") or os.environ.get("LINKAI_API_KEY", "")
linkai_api_base = conf().get("linkai_api_base", "https://api.link-ai.tech")
if linkai_api_key and linkai_api_key not in ["", "YOUR API KEY", "YOUR_API_KEY"]:
try:
embedding_provider = create_embedding_provider(
provider="linkai",
model="text-embedding-3-small",
api_key=linkai_api_key,
api_base=f"{linkai_api_base}/v1"
)
if session_id is None:
logger.info("[AgentInitializer] LinkAI embedding initialized (fallback)")
except Exception as e:
logger.warning(f"[AgentInitializer] LinkAI embedding failed: {e}")
# Create memory manager
memory_config = MemoryConfig(workspace_root=workspace_root)
@@ -235,7 +366,7 @@ class AgentInitializer:
if tool:
# Apply workspace config to file operation tools
if tool_name in ['read', 'write', 'edit', 'bash', 'grep', 'find', 'ls']:
if tool_name in ['read', 'write', 'edit', 'bash', 'grep', 'find', 'ls', 'web_fetch']:
tool.config = file_config
tool.cwd = file_config.get("cwd", getattr(tool, 'cwd', None))
if 'memory_manager' in file_config:
@@ -283,7 +414,14 @@ class AgentInitializer:
tool.scheduler_service = scheduler_service
if not tool.config:
tool.config = {}
tool.config["channel_type"] = conf().get("channel_type", "unknown")
raw_ct = conf().get("channel_type", "unknown")
if isinstance(raw_ct, list):
ct = raw_ct[0] if raw_ct else "unknown"
elif isinstance(raw_ct, str) and "," in raw_ct:
ct = raw_ct.split(",")[0].strip()
else:
ct = raw_ct
tool.config["channel_type"] = ct
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to inject scheduler dependencies: {e}")
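The `channel_type` handling in this hunk accepts a list, a comma-separated string, or a plain string, and passes only the first channel to the scheduler tool. The same normalization as a small standalone function (the name `first_channel_type` is an assumption for illustration):

```python
def first_channel_type(raw, default="unknown"):
    """Reduce a channel_type config value to a single channel name."""
    if isinstance(raw, list):
        return raw[0] if raw else default
    if isinstance(raw, str) and "," in raw:
        return raw.split(",")[0].strip()
    return raw


from_list = first_channel_type(["feishu", "web"])
from_empty = first_channel_type([])
from_csv = first_channel_type("web, dingtalk")
from_plain = first_channel_type("terminal")
```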
@@ -330,7 +468,7 @@ class AgentInitializer:
return {
"model": conf().get("model", "unknown"),
"workspace": workspace_root,
"channel": conf().get("channel_type", "unknown"),
"channel": ", ".join(conf().get("channel_type")) if isinstance(conf().get("channel_type"), list) else conf().get("channel_type", "unknown"),
"_get_current_time": get_current_time # Dynamic time function
}
@@ -388,3 +526,59 @@ class AgentInitializer:
logger.info(f"[AgentInitializer] Migrated {len(keys_to_migrate)} API keys to .env: {list(keys_to_migrate.keys())}")
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to migrate API keys: {e}")
def _start_daily_flush_timer(self):
"""Start a background thread that flushes all agents' memory daily at 23:55."""
if getattr(self.agent_bridge, '_daily_flush_started', False):
return
self.agent_bridge._daily_flush_started = True
import threading
def _daily_flush_loop():
while True:
try:
now = datetime.datetime.now()
target = now.replace(hour=23, minute=55, second=0, microsecond=0)
if target <= now:
target += datetime.timedelta(days=1)
wait_seconds = (target - now).total_seconds()
logger.info(f"[DailyFlush] Next flush at {target.strftime('%Y-%m-%d %H:%M')} (in {wait_seconds/3600:.1f}h)")
time.sleep(wait_seconds)
self._flush_all_agents()
except Exception as e:
logger.warning(f"[DailyFlush] Error in daily flush loop: {e}")
time.sleep(3600)
t = threading.Thread(target=_daily_flush_loop, daemon=True)
t.start()
def _flush_all_agents(self):
"""Flush memory for all active agent sessions."""
agents = []
if self.agent_bridge.default_agent:
agents.append(("default", self.agent_bridge.default_agent))
for sid, agent in self.agent_bridge.agents.items():
agents.append((sid, agent))
if not agents:
return
flushed = 0
for label, agent in agents:
try:
if not agent.memory_manager:
continue
with agent.messages_lock:
messages = list(agent.messages)
if not messages:
continue
result = agent.memory_manager.flush_manager.create_daily_summary(messages)
if result:
flushed += 1
except Exception as e:
logger.warning(f"[DailyFlush] Failed for session {label}: {e}")
if flushed:
logger.info(f"[DailyFlush] Flushed {flushed}/{len(agents)} agent session(s)")
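The wait computation inside `_daily_flush_loop` can be checked in isolation. The sketch below extracts it into a pure function; `seconds_until_next` is a hypothetical helper name, and the sample timestamps are arbitrary.

```python
import datetime


def seconds_until_next(now, hour=23, minute=55):
    """Seconds from `now` until the next occurrence of hour:minute."""
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so aim for tomorrow.
        target += datetime.timedelta(days=1)
    return (target - now).total_seconds()


before = seconds_until_next(datetime.datetime(2026, 3, 16, 23, 0, 0))
after = seconds_until_next(datetime.datetime(2026, 3, 16, 23, 56, 0))
```

At 23:00 the wait is 55 minutes; at 23:56 the target rolls over to the next day, giving 23 hours 59 minutes.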

View File

@@ -13,12 +13,38 @@ class Channel(object):
channel_type = ""
NOT_SUPPORT_REPLYTYPE = [ReplyType.VOICE, ReplyType.IMAGE]
def __init__(self):
import threading
self._startup_event = threading.Event()
self._startup_error = None
self.cloud_mode = False # set to True by ChannelManager when running with cloud client
def startup(self):
"""
init channel
"""
raise NotImplementedError
def report_startup_success(self):
self._startup_error = None
self._startup_event.set()
def report_startup_error(self, error: str):
self._startup_error = error
self._startup_event.set()
def wait_startup(self, timeout: float = 3) -> (bool, str):
"""
Wait for channel startup result.
Returns (success: bool, error_msg: str).
A timeout (no result reported yet) is treated as success.
"""
ready = self._startup_event.wait(timeout=timeout)
if not ready:
return True, ""
if self._startup_error:
return False, self._startup_error
return True, ""
def stop(self):
"""
stop channel gracefully, called before restart

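The startup handshake added to `Channel` is a `threading.Event` plus an error slot. The following sketch reproduces that behaviour in a standalone class; `StartupSignal` is a hypothetical name, and the timeouts are shortened for illustration.

```python
import threading


class StartupSignal:
    """Minimal version of the report/wait handshake from the diff."""

    def __init__(self):
        self._event = threading.Event()
        self._error = None

    def report_success(self):
        self._error = None
        self._event.set()

    def report_error(self, error):
        self._error = error
        self._event.set()

    def wait(self, timeout=3.0):
        ready = self._event.wait(timeout=timeout)
        if not ready:
            # Nothing reported in time: treated as success, as in the diff.
            return True, ""
        if self._error:
            return False, self._error
        return True, ""


# A channel thread that fails quickly.
failing = StartupSignal()
reporter = threading.Thread(target=failing.report_error, args=("bad credentials",))
reporter.start()
reporter.join()
failed_result = failing.wait(timeout=1.0)

# A channel that never reports within the timeout.
silent = StartupSignal()
silent_result = silent.wait(timeout=0.05)
```

Treating a timeout as success means a slow but healthy channel is not flagged as failed; only an explicit `report_error` produces a failure result.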
View File

@@ -12,16 +12,7 @@ def create_channel(channel_type) -> Channel:
:return: channel instance
"""
ch = Channel()
if channel_type == "wx":
from channel.wechat.wechat_channel import WechatChannel
ch = WechatChannel()
elif channel_type == "wxy":
from channel.wechat.wechaty_channel import WechatyChannel
ch = WechatyChannel()
elif channel_type == "wcf":
from channel.wechat.wcf_channel import WechatfChannel
ch = WechatfChannel()
elif channel_type == "terminal":
if channel_type == "terminal":
from channel.terminal.terminal_channel import TerminalChannel
ch = TerminalChannel()
elif channel_type == 'web':
@@ -36,15 +27,15 @@ def create_channel(channel_type) -> Channel:
elif channel_type == "wechatcom_app":
from channel.wechatcom.wechatcomapp_channel import WechatComAppChannel
ch = WechatComAppChannel()
elif channel_type == "wework":
from channel.wework.wework_channel import WeworkChannel
ch = WeworkChannel()
elif channel_type == const.FEISHU:
from channel.feishu.feishu_channel import FeiShuChanel
ch = FeiShuChanel()
elif channel_type == const.DINGTALK:
from channel.dingtalk.dingtalk_channel import DingTalkChanel
ch = DingTalkChanel()
elif channel_type == const.WECOM_BOT:
from channel.wecom_bot.wecom_bot_channel import WecomBotChannel
ch = WecomBotChannel()
else:
raise RuntimeError
ch.channel_type = channel_type

View File

@@ -24,11 +24,17 @@ handler_pool = ThreadPoolExecutor(max_workers=8) # Thread pool for handling messages
class ChatChannel(Channel):
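name = None # Logged-in user name
user_id = None # Logged-in user id
futures = {} # Future objects submitted to the thread pool per session_id; used to cancel pending futures when a session is reset (running ones are not cancelled)
sessions = {} # Concurrency control: limits how many contexts each session_id can process at the same time
lock = threading.Lock() # Guards access to sessions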
def __init__(self):
super().__init__()
# Instance-level attributes so each channel subclass has its own
# independent session queue and lock. Previously these were class-level,
# which caused contexts from one channel (e.g. Feishu) to be consumed
# by another channel's consume() thread (e.g. Web), leading to errors
# like "No request_id found in context".
self.futures = {}
self.sessions = {}
self.lock = threading.Lock()
_thread = threading.Thread(target=self.consume)
_thread.daemon = True
_thread.start()
@@ -37,9 +43,8 @@ class ChatChannel(Channel):
def _compose_context(self, ctype: ContextType, content, **kwargs):
context = Context(ctype, content)
context.kwargs = kwargs
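# origin_ctype is None when the context is first passed in.
# It was introduced because voice input produces two nested contexts: step one converts speech to text, step two generates a text reply from that text.
# origin_ctype lets step two decide whether prefix matching is needed; voice messages in a private chat do not need it.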
if "channel_type" not in context:
context["channel_type"] = self.channel_type
if "origin_ctype" not in context:
context["origin_ctype"] = ctype
# receiver is None when the context is first passed in; set it according to the chat type
@@ -426,7 +431,7 @@ class ChatChannel(Channel):
if session_id not in self.sessions:
self.sessions[session_id] = [
Dequeue(),
threading.BoundedSemaphore(conf().get("concurrency_in_session", 4)),
threading.BoundedSemaphore(conf().get("concurrency_in_session", 1)),
]
if context.type == ContextType.TEXT and context.content.startswith("#"):
self.sessions[session_id][0].putleft(context) # Handle admin commands first
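The move from class-level to instance-level `futures`, `sessions`, and `lock` matters because a mutable class attribute is shared by every instance. A small demonstration, using hypothetical class names `SharedQueues` and `OwnQueues`:

```python
import threading


class SharedQueues:
    # Class-level dict: one object shared by every instance.
    sessions = {}


class OwnQueues:
    def __init__(self):
        # Instance-level state: each channel gets its own dict and lock.
        self.sessions = {}
        self.lock = threading.Lock()


a, b = SharedQueues(), SharedQueues()
a.sessions["s1"] = "from channel a"
shared_leak = "s1" in b.sessions  # b sees a's session

c, d = OwnQueues(), OwnQueues()
with c.lock:
    c.sessions["s1"] = "from channel c"
isolated = "s1" not in d.sessions  # d is unaffected
```

With shared state, one channel's consumer thread can pick up contexts queued by another channel, which matches the "No request_id found in context" symptom described in the comment.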

View File

@@ -1,5 +1,5 @@
"""
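This class represents a chat message and provides a unified wrapper for itchat and wechaty messages.
Unified chat message class for different channel implementations.
Fill in the required fields (6 for group chats, 8 for non-group chats) to connect to ChatChannel with plugin support; see TerminalChannel for reference.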

View File

@@ -101,6 +101,8 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
# Temporary store of seen message ids, used for idempotency control
self.receivedMsgs = ExpiredDict(conf().get("expires_in_seconds", 3600))
self._stream_client = None
self._running = False
self._event_loop = None
logger.debug("[DingTalk] client_id={}, client_secret={} ".format(
self.dingtalk_client_id, self.dingtalk_client_secret))
# No group check or prefix required
@@ -113,22 +115,130 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
# Robot code cache (extracted from incoming messages)
self._robot_code = None
def _open_connection(self, client):
"""
Open a DingTalk stream connection directly, bypassing SDK's internal error-swallowing.
Returns (connection_dict, error_str). On success error_str is empty; on failure
connection_dict is None and error_str contains a human-readable message.
"""
try:
resp = requests.post(
"https://api.dingtalk.com/v1.0/gateway/connections/open",
headers={"Content-Type": "application/json", "Accept": "application/json"},
json={
"clientId": client.credential.client_id,
"clientSecret": client.credential.client_secret,
"subscriptions": [{"type": "CALLBACK",
"topic": dingtalk_stream.chatbot.ChatbotMessage.TOPIC}],
"ua": "dingtalk-sdk-python/cow",
"localIp": "",
},
timeout=10,
)
body = resp.json()
if not resp.ok:
code = body.get("code", resp.status_code)
message = body.get("message", resp.reason)
return None, f"open connection failed: [{code}] {message}"
return body, ""
except Exception as e:
return None, f"open connection failed: {e}"
def startup(self):
import asyncio
self.dingtalk_client_id = conf().get('dingtalk_client_id')
self.dingtalk_client_secret = conf().get('dingtalk_client_secret')
self._running = True
credential = dingtalk_stream.Credential(self.dingtalk_client_id, self.dingtalk_client_secret)
client = dingtalk_stream.DingTalkStreamClient(credential)
self._stream_client = client
client.register_callback_handler(dingtalk_stream.chatbot.ChatbotMessage.TOPIC, self)
logger.info("[DingTalk] ✅ Stream connected, ready to receive messages")
client.start_forever()
logger.info("[DingTalk] ✅ Stream client initialized, ready to receive messages")
# Run the connection loop ourselves instead of delegating to client.start(),
# so we can get detailed error messages and respond to stop() quickly.
import urllib.parse as _urlparse
import websockets as _ws
import json as _json
client.pre_start()
_first_connect = True
while self._running:
# Open connection using our own request so we get detailed error info.
connection, err_msg = self._open_connection(client)
if connection is None:
if _first_connect:
logger.warning(f"[DingTalk] {err_msg}")
self.report_startup_error(err_msg)
_first_connect = False
else:
logger.warning(f"[DingTalk] {err_msg}, retrying in 10s...")
# Interruptible sleep: checks _running every 100ms.
for _ in range(100):
if not self._running:
break
time.sleep(0.1)
continue
if _first_connect:
logger.info("[DingTalk] ✅ Connected to DingTalk stream")
self.report_startup_success()
_first_connect = False
else:
logger.info("[DingTalk] Reconnected to DingTalk stream")
# Run the WebSocket session in an asyncio loop.
uri = '%s?ticket=%s' % (
connection['endpoint'],
_urlparse.quote_plus(connection['ticket'])
)
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
self._event_loop = loop
try:
async def _session():
async with _ws.connect(uri) as websocket:
client.websocket = websocket
async for raw_message in websocket:
json_message = _json.loads(raw_message)
result = await client.route_message(json_message)
if result == dingtalk_stream.DingTalkStreamClient.TAG_DISCONNECT:
break
loop.run_until_complete(_session())
except (KeyboardInterrupt, SystemExit):
logger.info("[DingTalk] Session loop received stop signal, exiting")
break
except Exception as e:
if not self._running:
break
logger.warning(f"[DingTalk] Stream session error: {e}, reconnecting in 3s...")
for _ in range(30):
if not self._running:
break
time.sleep(0.1)
finally:
self._event_loop = None
try:
loop.close()
except Exception:
pass
logger.info("[DingTalk] Startup loop exited")
def stop(self):
if self._stream_client:
logger.info("[DingTalk] stop() called, setting _running=False")
self._running = False
loop = self._event_loop
if loop and not loop.is_closed():
try:
self._stream_client.stop()
logger.info("[DingTalk] Stream client stopped")
loop.call_soon_threadsafe(loop.stop)
logger.info("[DingTalk] Sent stop signal to event loop")
except Exception as e:
logger.warning(f"[DingTalk] Error stopping stream client: {e}")
self._stream_client = None
logger.warning(f"[DingTalk] Error stopping event loop: {e}")
self._stream_client = None
logger.info("[DingTalk] stop() completed")
def get_access_token(self):
"""
@@ -465,23 +575,21 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
async def process(self, callback: dingtalk_stream.CallbackMessage):
try:
incoming_message = dingtalk_stream.ChatbotMessage.from_dict(callback.data)
# Cache robot_code for later image downloads
if hasattr(incoming_message, 'robot_code'):
self._robot_code_cache = incoming_message.robot_code
# Debug: print the full event data
logger.debug(f"[DingTalk] ===== Incoming Message Debug =====")
logger.debug(f"[DingTalk] callback.data keys: {callback.data.keys() if hasattr(callback.data, 'keys') else 'N/A'}")
logger.debug(f"[DingTalk] incoming_message attributes: {dir(incoming_message)}")
logger.debug(f"[DingTalk] robot_code: {getattr(incoming_message, 'robot_code', 'N/A')}")
logger.debug(f"[DingTalk] chatbot_corp_id: {getattr(incoming_message, 'chatbot_corp_id', 'N/A')}")
logger.debug(f"[DingTalk] chatbot_user_id: {getattr(incoming_message, 'chatbot_user_id', 'N/A')}")
logger.debug(f"[DingTalk] conversation_id: {getattr(incoming_message, 'conversation_id', 'N/A')}")
logger.debug(f"[DingTalk] Raw callback.data: {callback.data}")
logger.debug(f"[DingTalk] =====================================")
image_download_handler = self # Pass the instance that owns the download method
# Filter out stale messages from before channel startup (offline backlog)
create_at = getattr(incoming_message, 'create_at', None)
if create_at:
msg_age_s = time.time() - int(create_at) / 1000
if msg_age_s > 60:
logger.warning(f"[DingTalk] stale msg filtered (age={msg_age_s:.0f}s), "
f"msg_id={getattr(incoming_message, 'message_id', 'N/A')}")
return AckMessage.STATUS_OK, 'OK'
image_download_handler = self
dingtalk_msg = DingTalkMessage(incoming_message, image_download_handler)
if dingtalk_msg.is_group:
@@ -490,8 +598,7 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
self.handle_single(dingtalk_msg)
return AckMessage.STATUS_OK, 'OK'
except Exception as e:
logger.error(f"[DingTalk] process error: {e}")
logger.exception(e) # Print the full stack trace
logger.error(f"[DingTalk] process error: {e}", exc_info=True)
return AckMessage.STATUS_SYSTEM_EXCEPTION, 'ERROR'
@time_checker

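The reconnect loop in `startup()` sleeps in 100 ms slices so that `stop()` takes effect quickly. A minimal sketch of that interruptible sleep, assuming a hypothetical `Worker` class with the same `_running` flag:

```python
import threading
import time


class Worker:
    def __init__(self):
        self._running = True

    def interruptible_sleep(self, seconds, step=0.1):
        """Sleep up to `seconds`; return False if stopped early."""
        for _ in range(round(seconds / step)):
            if not self._running:
                return False
            time.sleep(step)
        return self._running

    def stop(self):
        self._running = False


worker = Worker()
# Request a stop shortly after the sleep begins.
threading.Timer(0.2, worker.stop).start()
started = time.monotonic()
completed = worker.interruptible_sleep(10)
elapsed = time.monotonic() - started
```

The nominal 10 second wait ends within roughly one polling step of the stop request, which is the property the channel restart path relies on.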
View File

@@ -11,6 +11,7 @@
@Date 2023/11/19
"""
import importlib.util
import json
import logging
import os
@@ -38,15 +39,20 @@ logging.getLogger("Lark").setLevel(logging.WARNING)
URL_VERIFICATION = "url_verification"
# Try to import the Feishu SDK; websocket mode is unavailable if it is not installed
try:
import lark_oapi as lark
# Lazy-check for lark_oapi SDK availability without importing it at module level.
# The full `import lark_oapi` pulls in 10k+ files and takes 4-10s, so we defer
# the actual import to _startup_websocket() where it is needed.
LARK_SDK_AVAILABLE = importlib.util.find_spec("lark_oapi") is not None
lark = None # will be populated on first use via _ensure_lark_imported()
LARK_SDK_AVAILABLE = True
except ImportError:
LARK_SDK_AVAILABLE = False
logger.warning(
"[FeiShu] lark_oapi not installed, websocket mode is not available. Install with: pip install lark-oapi")
def _ensure_lark_imported():
"""Import lark_oapi on first use (takes 4-10s due to 10k+ source files)."""
global lark
if lark is None:
import lark_oapi as _lark
lark = _lark
return lark
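The availability check plus deferred import can be sketched with the standard library alone. Here `json` stands in for the heavy SDK, and `sdk_available` / `ensure_imported` are hypothetical helper names; the cache dictionary plays the role of the module-level `lark` global.

```python
import importlib
import importlib.util

_module_cache = {}


def sdk_available(name):
    """Check that a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def ensure_imported(name):
    """Import the module on first use and reuse it afterwards."""
    if name not in _module_cache:
        _module_cache[name] = importlib.import_module(name)
    return _module_cache[name]


has_json = sdk_available("json")
has_missing = sdk_available("surely_not_installed_pkg_xyz")
json_mod = ensure_imported("json")
```

`find_spec` only locates the package, so the multi-second import cost is paid when the websocket mode actually starts rather than at process startup.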
@singleton
@@ -61,6 +67,9 @@ class FeiShuChanel(ChatChannel):
# Temporary store of seen message ids, used for idempotency control
self.receivedMsgs = ExpiredDict(60 * 60 * 7.1)
self._http_server = None
self._ws_client = None
self._ws_thread = None
self._bot_open_id = None # cached bot open_id for @-mention matching
logger.debug("[FeiShu] app_id={}, app_secret={}, verification_token={}, event_mode={}".format(
self.feishu_app_id, self.feishu_app_secret, self.feishu_token, self.feishu_event_mode))
# No group check or prefix required
@@ -73,12 +82,57 @@ class FeiShuChanel(ChatChannel):
raise Exception("lark_oapi not installed")
def startup(self):
self.feishu_app_id = conf().get('feishu_app_id')
self.feishu_app_secret = conf().get('feishu_app_secret')
self.feishu_token = conf().get('feishu_token')
self.feishu_event_mode = conf().get('feishu_event_mode', 'websocket')
self._fetch_bot_open_id()
if self.feishu_event_mode == 'websocket':
self._startup_websocket()
else:
self._startup_webhook()
def _fetch_bot_open_id(self):
"""Fetch the bot's own open_id via API so we can match @-mentions without feishu_bot_name."""
try:
access_token = self.fetch_access_token()
if not access_token:
logger.warning("[FeiShu] Cannot fetch bot info: no access_token")
return
headers = {"Authorization": "Bearer " + access_token}
resp = requests.get("https://open.feishu.cn/open-apis/bot/v3/info/", headers=headers, timeout=5)
if resp.status_code == 200:
data = resp.json()
if data.get("code") == 0:
self._bot_open_id = data.get("bot", {}).get("open_id")
logger.info(f"[FeiShu] Bot open_id fetched: {self._bot_open_id}")
else:
logger.warning(f"[FeiShu] Fetch bot info failed: code={data.get('code')}, msg={data.get('msg')}")
except Exception as e:
logger.warning(f"[FeiShu] Fetch bot open_id error: {e}")
def stop(self):
import ctypes
logger.info("[FeiShu] stop() called")
ws_client = self._ws_client
self._ws_client = None
ws_thread = self._ws_thread
self._ws_thread = None
# Interrupt the ws thread first so its blocking start() unblocks
if ws_thread and ws_thread.is_alive():
try:
tid = ws_thread.ident
if tid:
res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
ctypes.c_ulong(tid), ctypes.py_object(SystemExit)
)
if res == 1:
logger.info("[FeiShu] Interrupted ws thread via ctypes")
elif res > 1:
ctypes.pythonapi.PyThreadState_SetAsyncExc(ctypes.c_ulong(tid), None)
except Exception as e:
logger.warning(f"[FeiShu] Error interrupting ws thread: {e}")
# lark.ws.Client has no stop() method; thread interruption above is sufficient
if self._http_server:
try:
self._http_server.stop()
@@ -86,6 +140,7 @@ class FeiShuChanel(ChatChannel):
except Exception as e:
logger.warning(f"[FeiShu] Error stopping HTTP server: {e}")
self._http_server = None
logger.info("[FeiShu] stop() completed")
def _startup_webhook(self):
"""Start an HTTP server to receive events (webhook mode)"""
@@ -106,17 +161,22 @@ class FeiShuChanel(ChatChannel):
def _startup_websocket(self):
"""Start a long-lived connection to receive events (websocket mode)"""
_ensure_lark_imported()
logger.debug("[FeiShu] Starting in websocket mode...")
# Create the event handler
def handle_message_event(data: lark.im.v1.P2ImMessageReceiveV1) -> None:
"""Handle the message-receive event v2.0"""
try:
logger.debug(f"[FeiShu] websocket receive event: {lark.JSON.marshal(data, indent=2)}")
# Convert to the standard event format
event_dict = json.loads(lark.JSON.marshal(data))
event = event_dict.get("event", {})
msg = event.get("message", {})
# Skip group messages that don't @-mention the bot (reduce log noise)
if msg.get("chat_type") == "group" and not msg.get("mentions") and msg.get("message_type") == "text":
return
logger.debug(f"[FeiShu] websocket receive event: {lark.JSON.marshal(data, indent=2)}")
# Handle the message
self._handle_message_event(event)
@@ -129,29 +189,36 @@ class FeiShuChanel(ChatChannel):
.register_p2_im_message_receive_v1(handle_message_event) \
.build()
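# Try to connect; if an SSL error occurs, automatically disable certificate verification
def start_client_with_retry():
"""Start the websocket client and handle SSL certificate errors automatically"""
# Disable SSL certificate verification globally; set before importing lark_oapi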
"""Run ws client in this thread with its own event loop to avoid conflicts."""
import asyncio
import ssl as ssl_module
# Save the original SSL context factory
original_create_default_context = ssl_module.create_default_context
def create_unverified_context(*args, **kwargs):
"""Create an SSL context that does not verify certificates"""
context = original_create_default_context(*args, **kwargs)
context.check_hostname = False
context.verify_mode = ssl_module.CERT_NONE
return context
# Try a normal connection first; disable SSL verification only if it fails
# lark_oapi.ws.client captures the event loop at module-import time as a module-
# level global variable. When a previous ws thread is force-killed via ctypes its
# loop may still be marked as "running", which causes the next ws_client.start()
# call (in this new thread) to raise "This event loop is already running".
# Fix: replace the module-level loop with a brand-new, idle loop before starting.
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
try:
import lark_oapi.ws.client as _lark_ws_client_mod
_lark_ws_client_mod.loop = loop
except Exception:
pass
startup_error = None
for attempt in range(2):
try:
if attempt == 1:
# Second attempt: disable SSL verification
logger.warning("[FeiShu] SSL certificate verification disabled due to certificate error. "
"This may happen when using corporate proxy or self-signed certificates.")
logger.warning("[FeiShu] Retrying with SSL verification disabled...")
ssl_module.create_default_context = create_unverified_context
ssl_module._create_unverified_context = create_unverified_context
@@ -159,41 +226,62 @@ class FeiShuChanel(ChatChannel):
self.feishu_app_id,
self.feishu_app_secret,
event_handler=event_handler,
log_level=lark.LogLevel.DEBUG if conf().get("debug") else lark.LogLevel.WARNING
log_level=lark.LogLevel.WARNING
)
self._ws_client = ws_client
logger.debug("[FeiShu] Websocket client starting...")
ws_client.start()
# Started successfully; leave the loop
break
except (SystemExit, KeyboardInterrupt):
logger.info("[FeiShu] Websocket thread received stop signal")
break
except Exception as e:
error_msg = str(e)
# Check whether this is an SSL certificate verification error
is_ssl_error = "CERTIFICATE_VERIFY_FAILED" in error_msg or "certificate verify failed" in error_msg.lower()
is_ssl_error = ("CERTIFICATE_VERIFY_FAILED" in error_msg
or "certificate verify failed" in error_msg.lower())
if is_ssl_error and attempt == 0:
# First SSL error: log it and continue; the next attempt disables verification
logger.warning(f"[FeiShu] SSL certificate verification failed: {error_msg}")
logger.info("[FeiShu] Retrying connection with SSL verification disabled...")
logger.warning(f"[FeiShu] SSL error: {error_msg}, retrying...")
continue
else:
# Any other error, or still failing with verification disabled: raise
logger.error(f"[FeiShu] Websocket client error: {e}", exc_info=True)
# Restore the original method
ssl_module.create_default_context = original_create_default_context
raise
logger.error(f"[FeiShu] Websocket client error: {e}", exc_info=True)
startup_error = error_msg
ssl_module.create_default_context = original_create_default_context
break
if startup_error:
self.report_startup_error(startup_error)
try:
loop.close()
except Exception:
pass
logger.info("[FeiShu] Websocket thread exited")
# Note: the original method is not restored, because ws_client.start() keeps running
# Start the client in a new thread to avoid blocking the main thread
ws_thread = threading.Thread(target=start_client_with_retry, daemon=True)
self._ws_thread = ws_thread
ws_thread.start()
# Keep the main thread running
logger.info("[FeiShu] ✅ Websocket connected, ready to receive messages")
logger.info("[FeiShu] ✅ Websocket thread started, ready to receive messages")
ws_thread.join()
def _is_mention_bot(self, mentions: list) -> bool:
"""Check whether any mention in the list refers to this bot.
Priority:
1. Match by open_id (obtained from /bot/v3/info at startup, no config needed)
2. Fallback to feishu_bot_name config for backward compatibility
3. If neither is available, treat the message as mentioning the bot (Feishu only
delivers group messages that @-mention the bot, so this is usually correct)
"""
if self._bot_open_id:
return any(
m.get("id", {}).get("open_id") == self._bot_open_id
for m in mentions
)
bot_name = conf().get("feishu_bot_name")
if bot_name:
return any(m.get("name") == bot_name for m in mentions)
# Feishu event subscription only delivers messages that @-mention the bot,
# so reaching here means the bot was indeed mentioned.
return True
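The three-level priority described in the `_is_mention_bot` docstring can be tried out in isolation. The following is a minimal standalone sketch rather than the channel's own code: the function `is_mention_bot` and its `bot_open_id` and `bot_name` parameters are hypothetical stand-ins for `self._bot_open_id` and the `feishu_bot_name` config value, and the sample mention dicts are invented for illustration.

```python
from typing import Optional


def is_mention_bot(mentions: list, bot_open_id: Optional[str] = None,
                   bot_name: Optional[str] = None) -> bool:
    """Hypothetical standalone version of the priority described above."""
    # 1. Prefer an exact open_id match when the bot's open_id is known.
    if bot_open_id:
        return any(m.get("id", {}).get("open_id") == bot_open_id for m in mentions)
    # 2. Fall back to matching the configured display name.
    if bot_name:
        return any(m.get("name") == bot_name for m in mentions)
    # 3. With neither available, assume the mention targets the bot.
    return True


# Invented sample data: two mentions, one of which is the bot.
mentions = [
    {"name": "Alice", "id": {"open_id": "ou_alice"}},
    {"name": "CowBot", "id": {"open_id": "ou_bot"}},
]
```

Note that once an `open_id` is known, the name is never consulted, so a stale `feishu_bot_name` cannot cause a false match.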
def _handle_message_event(self, event: dict):
"""
Core logic for handling message events
@@ -212,6 +300,15 @@ class FeiShuChanel(ChatChannel):
return
self.receivedMsgs[msg_id] = True
# Filter out stale messages from before channel startup (offline backlog)
import time as _time
create_time_ms = msg.get("create_time")
if create_time_ms:
msg_age_s = _time.time() - int(create_time_ms) / 1000
if msg_age_s > 60:
logger.warning(f"[FeiShu] stale msg filtered (age={msg_age_s:.0f}s), msg_id={msg_id}")
return
is_group = False
chat_type = msg.get("chat_type")
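The stale-message filter in this hunk compares a millisecond `create_time` against the current time and drops anything older than 60 seconds. A minimal sketch of that check as a pure function follows; `is_stale`, `now_s`, and `max_age_s` are hypothetical names introduced here so the behaviour can be tested without a real clock.

```python
import time


def is_stale(create_time_ms, now_s=None, max_age_s=60):
    """Return True when a message is older than max_age_s seconds.

    create_time_ms is a millisecond timestamp (str or int). A missing value
    is treated as not stale, mirroring the `if create_time_ms:` guard.
    """
    if not create_time_ms:
        return False
    if now_s is None:
        now_s = time.time()  # seconds since the epoch
    msg_age_s = now_s - int(create_time_ms) / 1000
    return msg_age_s > max_age_s
```

Because the comparison is strict, a message that is exactly 60 seconds old is still processed.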
@@ -219,10 +316,9 @@ class FeiShuChanel(ChatChannel):
if not msg.get("mentions") and msg.get("message_type") == "text":
# Do not respond in group chats unless @-mentioned
return
if msg.get("mentions") and msg.get("mentions")[0].get("name") != conf().get("feishu_bot_name") and msg.get(
"message_type") == "text":
# The bot is not the one being @-mentioned; do not respond
return
if msg.get("mentions") and msg.get("message_type") == "text":
if not self._is_mention_bot(msg.get("mentions")):
return
# Group chat
is_group = True
receive_id_type = "chat_id"
@@ -698,6 +794,8 @@ class FeiShuChanel(ChatChannel):
def _compose_context(self, ctype: ContextType, content, **kwargs):
context = Context(ctype, content)
context.kwargs = kwargs
if "channel_type" not in context:
context["channel_type"] = self.channel_type
if "origin_ctype" not in context:
context["origin_ctype"] = ctype


@@ -43,8 +43,17 @@
}
</script>
<link rel="stylesheet" href="assets/css/console.css">
<!-- Apply theme/lang before first paint to avoid flash of unstyled content.
This runs synchronously in <head> so the correct class is on <html>
before any CSS or body rendering occurs. -->
<script>
(function() {
var theme = localStorage.getItem('cow_theme') || 'dark';
if (theme === 'dark') document.documentElement.classList.add('dark');
})();
</script>
</head>
<body class="h-screen overflow-hidden bg-gray-50 dark:bg-[#111111] text-slate-800 dark:text-slate-200 font-sans transition-colors duration-200">
<body class="h-screen overflow-hidden bg-gray-50 dark:bg-[#111111] text-slate-800 dark:text-slate-200 font-sans">
<div id="app" class="flex h-screen">
<!-- ================================================================ -->
@@ -183,10 +192,24 @@
<i id="theme-icon" class="fas fa-moon"></i>
</button>
<!-- Docs Link -->
<a href="https://docs.cowagent.ai" target="_blank" rel="noopener noreferrer"
class="p-2 rounded-lg text-slate-500 dark:text-slate-400 hover:bg-slate-100 dark:hover:bg-white/10
cursor-pointer transition-colors duration-150" title="Documentation">
<i class="fas fa-book text-base"></i>
</a>
<!-- Website Link -->
<a href="https://cowagent.ai" target="_blank" rel="noopener noreferrer"
class="p-2 rounded-lg text-slate-500 dark:text-slate-400 hover:bg-slate-100 dark:hover:bg-white/10
cursor-pointer transition-colors duration-150" title="Website">
<i class="fas fa-home text-base"></i>
</a>
<!-- GitHub Link -->
<a href="https://github.com/zhayujie/chatgpt-on-wechat" target="_blank" rel="noopener noreferrer"
class="p-2 rounded-lg text-slate-500 dark:text-slate-400 hover:bg-slate-100 dark:hover:bg-white/10
cursor-pointer transition-colors duration-150">
cursor-pointer transition-colors duration-150" title="GitHub">
<i class="fab fa-github text-lg"></i>
</a>
</header>
@@ -285,68 +308,122 @@
</div>
</div>
<div class="grid gap-6">
<!-- Model Config Card -->
<div class="placeholder-card bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 p-6">
<div class="flex items-center gap-3 mb-4">
<div class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 p-6">
<div class="flex items-center gap-3 mb-5">
<div class="w-9 h-9 rounded-lg bg-primary-50 dark:bg-primary-900/30 flex items-center justify-center">
<i class="fas fa-microchip text-primary-500 text-sm"></i>
</div>
<h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_model">Model Configuration</h3>
</div>
<div class="space-y-4">
<div class="flex items-center gap-4 p-3 rounded-lg bg-slate-50 dark:bg-white/5">
<span class="text-sm font-medium text-slate-500 dark:text-slate-400 w-32 flex-shrink-0">Model</span>
<span class="text-sm text-slate-700 dark:text-slate-200 flex-1 font-mono" id="cfg-model">--</span>
<div class="space-y-5">
<!-- Provider -->
<div>
<label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_provider">Provider</label>
<div id="cfg-provider" class="cfg-dropdown" tabindex="0">
<div class="cfg-dropdown-selected">
<span class="cfg-dropdown-text">--</span>
<i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
</div>
<div class="cfg-dropdown-menu"></div>
</div>
</div>
<!-- Model -->
<div>
<label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_model_name">Model</label>
<div id="cfg-model-select" class="cfg-dropdown" tabindex="0">
<div class="cfg-dropdown-selected">
<span class="cfg-dropdown-text">--</span>
<i class="fas fa-chevron-down cfg-dropdown-arrow"></i>
</div>
<div class="cfg-dropdown-menu"></div>
</div>
<div id="cfg-model-custom-wrap" class="mt-2 hidden">
<input id="cfg-model-custom" type="text"
class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
focus:outline-none focus:border-primary-500 font-mono transition-colors"
data-i18n-placeholder="config_custom_model_hint" placeholder="Enter custom model name">
</div>
</div>
<!-- API Key -->
<div id="cfg-api-key-wrap">
<label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">API Key</label>
<div class="relative">
<input id="cfg-api-key" type="text" autocomplete="off" data-1p-ignore data-lpignore="true"
class="w-full px-3 py-2 pr-10 rounded-lg border border-slate-200 dark:border-slate-600
bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
focus:outline-none focus:border-primary-500 font-mono transition-colors cfg-key-masked"
placeholder="sk-...">
<button type="button" id="cfg-api-key-toggle"
class="absolute right-2.5 top-1/2 -translate-y-1/2 text-slate-400 hover:text-slate-600
dark:hover:text-slate-300 cursor-pointer transition-colors p-1"
onclick="toggleApiKeyVisibility()">
<i class="fas fa-eye text-xs"></i>
</button>
</div>
</div>
<!-- API Base -->
<div id="cfg-api-base-wrap" class="hidden">
<label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5">API Base</label>
<input id="cfg-api-base" type="text"
class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
focus:outline-none focus:border-primary-500 font-mono transition-colors"
placeholder="https://...">
</div>
<!-- Save Model Button -->
<div class="flex items-center justify-end gap-3 pt-1">
<span id="cfg-model-status" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
<button id="cfg-model-save"
class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
onclick="saveModelConfig()" data-i18n="config_save">Save</button>
</div>
</div>
</div>
<!-- Agent Config Card -->
<div class="placeholder-card bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 p-6">
<div class="flex items-center gap-3 mb-4">
<div class="bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 p-6">
<div class="flex items-center gap-3 mb-5">
<div class="w-9 h-9 rounded-lg bg-emerald-50 dark:bg-emerald-900/30 flex items-center justify-center">
<i class="fas fa-robot text-emerald-500 text-sm"></i>
</div>
<h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_agent">Agent Configuration</h3>
</div>
<div class="space-y-4">
<div class="flex items-center gap-4 p-3 rounded-lg bg-slate-50 dark:bg-white/5">
<span class="text-sm font-medium text-slate-500 dark:text-slate-400 w-32 flex-shrink-0" data-i18n="config_agent_enabled">Agent Mode</span>
<span class="text-sm text-slate-700 dark:text-slate-200 flex-1" id="cfg-agent">--</span>
<div>
<label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_max_tokens">Max Context Tokens</label>
<input id="cfg-max-tokens" type="number" min="1000" max="200000" step="1000"
class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
focus:outline-none focus:border-primary-500 font-mono transition-colors">
</div>
<div class="flex items-center gap-4 p-3 rounded-lg bg-slate-50 dark:bg-white/5">
<span class="text-sm font-medium text-slate-500 dark:text-slate-400 w-32 flex-shrink-0" data-i18n="config_max_tokens">Max Tokens</span>
<span class="text-sm text-slate-700 dark:text-slate-200 flex-1 font-mono" id="cfg-max-tokens">--</span>
<div>
<label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_max_turns">Max Context Turns</label>
<input id="cfg-max-turns" type="number" min="1" max="100" step="1"
class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
focus:outline-none focus:border-primary-500 font-mono transition-colors">
</div>
<div class="flex items-center gap-4 p-3 rounded-lg bg-slate-50 dark:bg-white/5">
<span class="text-sm font-medium text-slate-500 dark:text-slate-400 w-32 flex-shrink-0" data-i18n="config_max_turns">Max Turns</span>
<span class="text-sm text-slate-700 dark:text-slate-200 flex-1 font-mono" id="cfg-max-turns">--</span>
<div>
<label class="block text-sm font-medium text-slate-600 dark:text-slate-400 mb-1.5" data-i18n="config_max_steps">Max Steps</label>
<input id="cfg-max-steps" type="number" min="1" max="50" step="1"
class="w-full px-3 py-2 rounded-lg border border-slate-200 dark:border-slate-600
bg-slate-50 dark:bg-white/5 text-sm text-slate-800 dark:text-slate-100
focus:outline-none focus:border-primary-500 font-mono transition-colors">
</div>
<div class="flex items-center gap-4 p-3 rounded-lg bg-slate-50 dark:bg-white/5">
<span class="text-sm font-medium text-slate-500 dark:text-slate-400 w-32 flex-shrink-0" data-i18n="config_max_steps">Max Steps</span>
<span class="text-sm text-slate-700 dark:text-slate-200 flex-1 font-mono" id="cfg-max-steps">--</span>
<div class="flex items-center justify-end gap-3 pt-1">
<span id="cfg-agent-status" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
<button id="cfg-agent-save"
class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
onclick="saveAgentConfig()" data-i18n="config_save">Save</button>
</div>
</div>
</div>
<!-- Channel Config Card -->
<div class="placeholder-card bg-white dark:bg-[#1A1A1A] rounded-xl border border-slate-200 dark:border-white/10 p-6">
<div class="flex items-center gap-3 mb-4">
<div class="w-9 h-9 rounded-lg bg-amber-50 dark:bg-amber-900/30 flex items-center justify-center">
<i class="fas fa-tower-broadcast text-amber-500 text-sm"></i>
</div>
<h3 class="font-semibold text-slate-800 dark:text-slate-100" data-i18n="config_channel">Channel Configuration</h3>
</div>
<div class="space-y-4">
<div class="flex items-center gap-4 p-3 rounded-lg bg-slate-50 dark:bg-white/5">
<span class="text-sm font-medium text-slate-500 dark:text-slate-400 w-32 flex-shrink-0" data-i18n="config_channel_type">Channel Type</span>
<span class="text-sm text-slate-700 dark:text-slate-200 flex-1 font-mono" id="cfg-channel">--</span>
</div>
</div>
</div>
</div>
<!-- Coming Soon Banner -->
<div class="mt-6 p-4 rounded-xl bg-primary-50 dark:bg-primary-900/20 border border-primary-200 dark:border-primary-800/50 flex items-center gap-3">
<i class="fas fa-info-circle text-primary-500"></i>
<span class="text-sm text-primary-700 dark:text-primary-300" data-i18n="config_coming_soon">Full editing capability coming soon. Currently displaying read-only configuration.</span>
</div>
</div>
</div>
@@ -364,14 +441,35 @@
<p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="skills_desc">View, enable, or disable agent skills</p>
</div>
</div>
<div id="skills-empty" class="flex flex-col items-center justify-center py-20">
<div class="w-16 h-16 rounded-2xl bg-amber-50 dark:bg-amber-900/20 flex items-center justify-center mb-4">
<i class="fas fa-bolt text-amber-400 text-xl"></i>
<!-- Built-in Tools Section -->
<div class="mb-8">
<div class="flex items-center gap-2 mb-3">
<span class="text-xs font-semibold uppercase tracking-wider text-slate-400 dark:text-slate-500" data-i18n="tools_section_title">Built-in Tools</span>
<span id="tools-count-badge" class="hidden px-2 py-0.5 rounded-full text-xs bg-slate-100 dark:bg-white/10 text-slate-500 dark:text-slate-400"></span>
</div>
<p class="text-slate-500 dark:text-slate-400 font-medium" data-i18n="skills_loading">Loading skills...</p>
<p class="text-sm text-slate-400 dark:text-slate-500 mt-1" data-i18n="skills_loading_desc">Skills will be displayed here after loading</p>
<div id="tools-empty" class="flex items-center gap-2 py-4 text-slate-400 dark:text-slate-500 text-sm">
<i class="fas fa-spinner fa-spin text-xs"></i>
<span data-i18n="tools_loading">Loading tools...</span>
</div>
<div id="tools-list" class="grid gap-3 sm:grid-cols-2 hidden"></div>
</div>
<!-- Skills Section -->
<div>
<div class="flex items-center gap-2 mb-3">
<span class="text-xs font-semibold uppercase tracking-wider text-slate-400 dark:text-slate-500" data-i18n="skills_section_title">Skills</span>
<span id="skills-count-badge" class="hidden px-2 py-0.5 rounded-full text-xs bg-slate-100 dark:bg-white/10 text-slate-500 dark:text-slate-400"></span>
</div>
<div id="skills-empty" class="flex flex-col items-center justify-center py-12">
<div class="w-14 h-14 rounded-2xl bg-amber-50 dark:bg-amber-900/20 flex items-center justify-center mb-3">
<i class="fas fa-bolt text-amber-400 text-lg"></i>
</div>
<p class="text-slate-500 dark:text-slate-400 font-medium" data-i18n="skills_loading">Loading skills...</p>
<p class="text-sm text-slate-400 dark:text-slate-500 mt-1" data-i18n="skills_loading_desc">Skills will be displayed here after loading</p>
</div>
<div id="skills-list" class="grid gap-4 sm:grid-cols-2"></div>
</div>
<div id="skills-list" class="grid gap-4 sm:grid-cols-2"></div>
</div>
</div>
</div>
@@ -451,8 +549,15 @@
<h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="channels_title">Channels</h2>
<p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="channels_desc">View and manage messaging channels</p>
</div>
<button id="add-channel-btn" onclick="openAddChannelPanel()"
class="flex items-center gap-2 px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600
text-white text-sm font-medium cursor-pointer transition-colors duration-150">
<i class="fas fa-plus text-xs"></i>
<span data-i18n="channels_add">Connect</span>
</button>
</div>
<div id="channels-content" class="grid gap-4"></div>
<div id="channels-add-panel" class="hidden mt-4"></div>
</div>
</div>
</div>
@@ -519,6 +624,32 @@
</div><!-- /main-content -->
</div><!-- /app -->
<!-- Confirm Dialog -->
<div id="confirm-dialog-overlay" class="fixed inset-0 bg-black/50 z-[100] hidden flex items-center justify-center">
<div class="bg-white dark:bg-[#1A1A1A] rounded-2xl border border-slate-200 dark:border-white/10 shadow-xl
w-full max-w-sm mx-4 overflow-hidden">
<div class="p-6">
<div class="flex items-center gap-3 mb-3">
<div class="w-10 h-10 rounded-xl bg-red-50 dark:bg-red-900/20 flex items-center justify-center flex-shrink-0">
<i class="fas fa-triangle-exclamation text-red-500"></i>
</div>
<h3 id="confirm-dialog-title" class="font-semibold text-slate-800 dark:text-slate-100 text-base"></h3>
</div>
<p id="confirm-dialog-message" class="text-sm text-slate-500 dark:text-slate-400 leading-relaxed ml-[52px]"></p>
</div>
<div class="flex items-center justify-end gap-3 px-6 py-4 border-t border-slate-100 dark:border-white/5">
<button id="confirm-dialog-cancel"
class="px-4 py-2 rounded-lg border border-slate-200 dark:border-white/10
text-slate-600 dark:text-slate-300 text-sm font-medium
hover:bg-slate-50 dark:hover:bg-white/5
cursor-pointer transition-colors duration-150"></button>
<button id="confirm-dialog-ok"
class="px-4 py-2 rounded-lg bg-red-500 hover:bg-red-600 text-white text-sm font-medium
cursor-pointer transition-colors duration-150"></button>
</div>
</div>
</div>
<script src="assets/js/console.js"></script>
</body>
</html>


@@ -222,6 +222,121 @@
/* Tool failed state */
.agent-tool-step.tool-failed .tool-name { color: #f87171; }
/* Config form controls */
#view-config input[type="text"],
#view-config input[type="number"],
#view-config input[type="password"] {
height: 40px;
transition: border-color 0.2s ease, box-shadow 0.2s ease;
}
#view-config input:focus {
border-color: #4ABE6E;
box-shadow: 0 0 0 3px rgba(74, 190, 110, 0.12);
}
#view-config input[type="text"]:hover,
#view-config input[type="number"]:hover,
#view-config input[type="password"]:hover {
border-color: #94a3b8;
}
.dark #view-config input[type="text"]:hover,
.dark #view-config input[type="number"]:hover,
.dark #view-config input[type="password"]:hover {
border-color: #64748b;
}
/* Custom dropdown */
.cfg-dropdown {
position: relative;
outline: none;
}
.cfg-dropdown-selected {
display: flex;
align-items: center;
justify-content: space-between;
height: 40px;
padding: 0 0.75rem;
border-radius: 0.5rem;
border: 1px solid #e2e8f0;
background: #f8fafc;
font-size: 0.875rem;
color: #1e293b;
cursor: pointer;
transition: border-color 0.2s ease, box-shadow 0.2s ease;
user-select: none;
}
.dark .cfg-dropdown-selected {
border-color: #475569;
background: rgba(255, 255, 255, 0.05);
color: #f1f5f9;
}
.cfg-dropdown-selected:hover { border-color: #94a3b8; }
.dark .cfg-dropdown-selected:hover { border-color: #64748b; }
.cfg-dropdown.open .cfg-dropdown-selected,
.cfg-dropdown:focus .cfg-dropdown-selected {
border-color: #4ABE6E;
box-shadow: 0 0 0 3px rgba(74, 190, 110, 0.12);
}
.cfg-dropdown-arrow {
font-size: 0.625rem;
color: #94a3b8;
transition: transform 0.2s ease;
flex-shrink: 0;
margin-left: 0.5rem;
}
.cfg-dropdown.open .cfg-dropdown-arrow { transform: rotate(180deg); }
.cfg-dropdown-menu {
display: none;
position: absolute;
top: calc(100% + 4px);
left: 0;
right: 0;
z-index: 50;
max-height: 240px;
overflow-y: auto;
border-radius: 0.5rem;
border: 1px solid #e2e8f0;
background: #ffffff;
box-shadow: 0 10px 25px -5px rgba(0, 0, 0, 0.1), 0 4px 10px -5px rgba(0, 0, 0, 0.04);
padding: 4px;
}
.dark .cfg-dropdown-menu {
border-color: #334155;
background: #1e1e1e;
box-shadow: 0 10px 25px -5px rgba(0, 0, 0, 0.4);
}
.cfg-dropdown.open .cfg-dropdown-menu { display: block; }
.cfg-dropdown-item {
display: flex;
align-items: center;
padding: 8px 10px;
border-radius: 6px;
font-size: 0.875rem;
color: #334155;
cursor: pointer;
transition: background 0.15s ease;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
}
.dark .cfg-dropdown-item { color: #cbd5e1; }
.cfg-dropdown-item:hover { background: #f1f5f9; }
.dark .cfg-dropdown-item:hover { background: rgba(255, 255, 255, 0.08); }
.cfg-dropdown-item.active {
background: rgba(74, 190, 110, 0.1);
color: #228547;
font-weight: 500;
}
.dark .cfg-dropdown-item.active {
background: rgba(74, 190, 110, 0.15);
color: #74E9A4;
}
/* API Key masking via CSS (avoids browser password prompts) */
.cfg-key-masked {
-webkit-text-security: disc;
text-security: disc;
}
/* Chat Input */
#chat-input {
resize: none; height: 42px; max-height: 180px;

File diff suppressed because it is too large


@@ -1,30 +1,35 @@
import sys
import time
import web
import json
import logging
import mimetypes
import os
import threading
import time
import uuid
from queue import Queue, Empty
import web
from bridge.context import *
from bridge.reply import Reply, ReplyType
from channel.chat_channel import ChatChannel, check_prefix
from channel.chat_message import ChatMessage
from collections import OrderedDict
from common import const
from common.log import logger
from common.singleton import singleton
from config import conf
import os
import mimetypes
import threading
import logging
class WebMessage(ChatMessage):
def __init__(
self,
msg_id,
content,
ctype=ContextType.TEXT,
from_user_id="User",
to_user_id="Chatgpt",
other_user_id="Chatgpt",
self,
msg_id,
content,
ctype=ContextType.TEXT,
from_user_id="User",
to_user_id="Chatgpt",
other_user_id="Chatgpt",
):
self.msg_id = msg_id
self.ctype = ctype
@@ -38,7 +43,7 @@ class WebMessage(ChatMessage):
class WebChannel(ChatChannel):
NOT_SUPPORT_REPLYTYPE = [ReplyType.VOICE]
_instance = None
# def __new__(cls):
# if cls._instance is None:
# cls._instance = super(WebChannel, cls).__new__(cls)
@@ -47,12 +52,11 @@ class WebChannel(ChatChannel):
def __init__(self):
super().__init__()
self.msg_id_counter = 0
self.session_queues = {} # session_id -> Queue (fallback polling)
self.request_to_session = {} # request_id -> session_id
self.sse_queues = {} # request_id -> Queue (SSE streaming)
self.session_queues = {} # session_id -> Queue (fallback polling)
self.request_to_session = {} # request_id -> session_id
self.sse_queues = {} # request_id -> Queue (SSE streaming)
self._http_server = None
def _generate_msg_id(self):
"""Generate a unique message ID."""
self.msg_id_counter += 1
@@ -111,6 +115,7 @@ class WebChannel(ChatChannel):
def _make_sse_callback(self, request_id: str):
"""Build an on_event callback that pushes agent stream events into the SSE queue."""
def on_event(event: dict):
if request_id not in self.sse_queues:
return
@@ -237,28 +242,28 @@ class WebChannel(ChatChannel):
data = web.data()
json_data = json.loads(data)
session_id = json_data.get('session_id')
if not session_id or session_id not in self.session_queues:
return json.dumps({"status": "error", "message": "Invalid session ID"})
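# Try to fetch a response from the queue without waiting
try:
# Intended to peek rather than get, so an unprocessed response could be fetched again;
# note that the call below is get(), which removes the item from the queue
response = self.session_queues[session_id].get(block=False)
# Return the response, including the request ID so different requests can be told apart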
return json.dumps({
"status": "success",
"status": "success",
"has_content": True,
"content": response["content"],
"request_id": response["request_id"],
"timestamp": response["timestamp"]
})
except Empty:
# No new response
return json.dumps({"status": "success", "has_content": False})
except Exception as e:
logger.error(f"Error polling response: {e}")
return json.dumps({"status": "error", "message": str(e)})
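The polling handler above relies on `Queue.get(block=False)` raising `Empty` when nothing is queued. A minimal sketch of that pattern outside web.py follows; `poll_once` and the plain `session_queues` dict passed to it are hypothetical stand-ins for the handler method and `self.session_queues`, and the `timestamp` field is left out for brevity.

```python
import json
from queue import Queue, Empty


def poll_once(session_queues: dict, session_id: str) -> str:
    """Return a JSON string describing the next queued response, if any."""
    if not session_id or session_id not in session_queues:
        return json.dumps({"status": "error", "message": "Invalid session ID"})
    try:
        # Non-blocking get: raises Empty immediately when the queue has no items.
        response = session_queues[session_id].get(block=False)
    except Empty:
        return json.dumps({"status": "success", "has_content": False})
    return json.dumps({
        "status": "success",
        "has_content": True,
        "content": response["content"],
        "request_id": response["request_id"],
    })


# Invented sample data: one session with a single queued response.
queues = {"s1": Queue()}
queues["s1"].put({"content": "hello", "request_id": "r1"})
first = json.loads(poll_once(queues, "s1"))
second = json.loads(poll_once(queues, "s1"))
```

Since `get()` removes the item, a response that the frontend fails to render is not delivered again on the next poll.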
@@ -271,9 +276,10 @@ class WebChannel(ChatChannel):
def startup(self):
port = conf().get("web_port", 9899)
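# Log a hint listing the available channel types
logger.info("[WebChannel] The current channel is web; change the channel_type field in config.json to switch. All available types:")
logger.info(
"[WebChannel] All available channels are listed below; change the channel_type field in config.json to switch, separating multiple channels with commas:")
logger.info("[WebChannel] 1. web - Web page")
logger.info("[WebChannel] 2. terminal - Terminal")
logger.info("[WebChannel] 3. feishu - Feishu")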
@@ -281,16 +287,16 @@ class WebChannel(ChatChannel):
logger.info("[WebChannel] 5. wechatcom_app - WeCom custom app")
logger.info("[WebChannel] 6. wechatmp - Personal Official Account")
logger.info("[WebChannel] 7. wechatmp_service - Enterprise Official Account")
logger.info("[WebChannel] ✅ Web console is running")
logger.info(f"[WebChannel] 🌐 Local access: http://localhost:{port}")
logger.info(f"[WebChannel] 🌍 Server access: http://YOUR_IP:{port} (replace YOUR_IP with the server IP)")
logger.info("[WebChannel] ✅ Web chat page is running")
# Make sure the static file directory exists
static_dir = os.path.join(os.path.dirname(__file__), 'static')
if not os.path.exists(static_dir):
os.makedirs(static_dir)
logger.debug(f"[WebChannel] Created static directory: {static_dir}")
urls = (
'/', 'RootHandler',
'/message', 'MessageHandler',
@@ -298,26 +304,31 @@ class WebChannel(ChatChannel):
'/stream', 'StreamHandler',
'/chat', 'ChatHandler',
'/config', 'ConfigHandler',
'/api/channels', 'ChannelsHandler',
'/api/tools', 'ToolsHandler',
'/api/skills', 'SkillsHandler',
'/api/memory', 'MemoryHandler',
'/api/memory/content', 'MemoryContentHandler',
'/api/scheduler', 'SchedulerHandler',
'/api/history', 'HistoryHandler',
'/api/logs', 'LogsHandler',
'/assets/(.*)', 'AssetsHandler',
)
app = web.application(urls, globals(), autoreload=False)
# Completely disable web.py's HTTP log output
web.httpserver.LogMiddleware.log = lambda self, status, environ: None
# Set web.py's log level to ERROR
logging.getLogger("web").setLevel(logging.ERROR)
logging.getLogger("web.httpserver").setLevel(logging.ERROR)
# Build WSGI app with middleware (same as runsimple but without print)
func = web.httpserver.StaticMiddleware(app.wsgifunc())
func = web.httpserver.LogMiddleware(func)
server = web.httpserver.WSGIServer(("0.0.0.0", port), func)
# Allow concurrent requests by not blocking on in-flight handler threads
server.daemon_threads = True
self._http_server = server
try:
server.start()
@@ -374,31 +385,524 @@ class ChatHandler:
class ConfigHandler:
_RECOMMENDED_MODELS = [
const.MINIMAX_M2_5, const.MINIMAX_M2_1, const.MINIMAX_M2_1_LIGHTNING,
const.GLM_5, const.GLM_4_7,
const.QWEN3_MAX, const.QWEN35_PLUS,
const.KIMI_K2_5, const.KIMI_K2,
const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE,
const.CLAUDE_4_6_SONNET, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET,
const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE,
const.GPT_54, const.GPT_5, const.GPT_41, const.GPT_4o,
const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER,
]
PROVIDER_MODELS = OrderedDict([
("minimax", {
"label": "MiniMax",
"api_key_field": "minimax_api_key",
"api_base_key": None,
"api_base_default": None,
"models": [const.MINIMAX_M2_5, const.MINIMAX_M2_1, const.MINIMAX_M2_1_LIGHTNING],
}),
("zhipu", {
"label": "智谱AI",
"api_key_field": "zhipu_ai_api_key",
"api_base_key": "zhipu_ai_api_base",
"api_base_default": "https://open.bigmodel.cn/api/paas/v4",
"models": [const.GLM_5, const.GLM_4_7],
}),
("dashscope", {
"label": "通义千问",
"api_key_field": "dashscope_api_key",
"api_base_key": None,
"api_base_default": None,
"models": [const.QWEN3_MAX, const.QWEN35_PLUS],
}),
("moonshot", {
"label": "Kimi",
"api_key_field": "moonshot_api_key",
"api_base_key": "moonshot_base_url",
"api_base_default": "https://api.moonshot.cn/v1",
"models": [const.KIMI_K2_5, const.KIMI_K2],
}),
("doubao", {
"label": "豆包",
"api_key_field": "ark_api_key",
"api_base_key": "ark_base_url",
"api_base_default": "https://ark.cn-beijing.volces.com/api/v3",
"models": [const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE],
}),
("claudeAPI", {
"label": "Claude",
"api_key_field": "claude_api_key",
"api_base_key": "claude_api_base",
"api_base_default": "https://api.anthropic.com/v1",
"models": [const.CLAUDE_4_6_SONNET, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET],
}),
("gemini", {
"label": "Gemini",
"api_key_field": "gemini_api_key",
"api_base_key": "gemini_api_base",
"api_base_default": "https://generativelanguage.googleapis.com",
"models": [const.GEMINI_31_FLASH_LITE_PRE, const.GEMINI_31_PRO_PRE, const.GEMINI_3_FLASH_PRE],
}),
("chatGPT", {
"label": "OpenAI",
"api_key_field": "open_ai_api_key",
"api_base_key": "open_ai_api_base",
"api_base_default": "https://api.openai.com/v1",
"models": [const.GPT_54, const.GPT_5, const.GPT_41, const.GPT_4o],
}),
("deepseek", {
"label": "DeepSeek",
"api_key_field": "open_ai_api_key",
"api_base_key": None,
"api_base_default": None,
"models": [const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER],
}),
("linkai", {
"label": "LinkAI",
"api_key_field": "linkai_api_key",
"api_base_key": None,
"api_base_default": None,
"models": _RECOMMENDED_MODELS,
}),
])
EDITABLE_KEYS = {
"model", "bot_type", "use_linkai",
"open_ai_api_base", "claude_api_base", "gemini_api_base",
"zhipu_ai_api_base", "moonshot_base_url", "ark_base_url",
"open_ai_api_key", "claude_api_key", "gemini_api_key",
"zhipu_ai_api_key", "dashscope_api_key", "moonshot_api_key",
"ark_api_key", "minimax_api_key", "linkai_api_key",
"agent_max_context_tokens", "agent_max_context_turns", "agent_max_steps",
}
@staticmethod
def _mask_key(value: str) -> str:
"""Mask the middle part of an API key for display."""
if not value or len(value) <= 8:
return value
return value[:4] + "*" * (len(value) - 8) + value[-4:]
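The masking rule in `_mask_key` keeps the first and last four characters and replaces everything in between with asterisks. The sketch below copies that logic into a free function (the name `mask_key` and the sample key are chosen here for illustration) to show what it produces.

```python
def mask_key(value: str) -> str:
    """Mask the middle of a key, keeping the first and last 4 characters."""
    # Empty values and keys of 8 characters or fewer are returned unchanged.
    if not value or len(value) <= 8:
        return value
    return value[:4] + "*" * (len(value) - 8) + value[-4:]


# Invented sample key, 17 characters long.
masked = mask_key("sk-1234567890abcd")
```

One consequence worth keeping in mind: a key of eight characters or fewer passes through unmasked, so this scheme only hides keys that are longer than that.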
def GET(self):
"""Return configuration info for the web console."""
"""Return configuration info and provider/model metadata."""
web.header('Content-Type', 'application/json; charset=utf-8')
try:
local_config = conf()
use_agent = local_config.get("agent", False)
title = "CowAgent" if use_agent else "AI Assistant"
if use_agent:
title = "CowAgent"
else:
title = "AI Assistant"
api_bases = {}
api_keys_masked = {}
for pid, pinfo in self.PROVIDER_MODELS.items():
base_key = pinfo.get("api_base_key")
if base_key:
api_bases[base_key] = local_config.get(base_key, pinfo["api_base_default"])
key_field = pinfo.get("api_key_field")
if key_field and key_field not in api_keys_masked:
raw = local_config.get(key_field, "")
api_keys_masked[key_field] = self._mask_key(raw) if raw else ""
providers = {}
for pid, p in self.PROVIDER_MODELS.items():
providers[pid] = {
"label": p["label"],
"models": p["models"],
"api_base_key": p["api_base_key"],
"api_base_default": p["api_base_default"],
"api_key_field": p.get("api_key_field"),
}
return json.dumps({
"status": "success",
"use_agent": use_agent,
"title": title,
"model": local_config.get("model", ""),
"bot_type": local_config.get("bot_type", ""),
"use_linkai": bool(local_config.get("use_linkai", False)),
"channel_type": local_config.get("channel_type", ""),
"agent_max_context_tokens": local_config.get("agent_max_context_tokens", ""),
"agent_max_context_turns": local_config.get("agent_max_context_turns", ""),
"agent_max_steps": local_config.get("agent_max_steps", ""),
})
"agent_max_context_tokens": local_config.get("agent_max_context_tokens", 50000),
"agent_max_context_turns": local_config.get("agent_max_context_turns", 20),
"agent_max_steps": local_config.get("agent_max_steps", 15),
"api_bases": api_bases,
"api_keys": api_keys_masked,
"providers": providers,
}, ensure_ascii=False)
except Exception as e:
logger.error(f"Error getting config: {e}")
return json.dumps({"status": "error", "message": str(e)})
def POST(self):
"""Update configuration values in memory and persist to config.json."""
web.header('Content-Type', 'application/json; charset=utf-8')
try:
data = json.loads(web.data())
updates = data.get("updates", {})
if not updates:
return json.dumps({"status": "error", "message": "no updates provided"})
local_config = conf()
applied = {}
for key, value in updates.items():
if key not in self.EDITABLE_KEYS:
continue
if key in ("agent_max_context_tokens", "agent_max_context_turns", "agent_max_steps"):
value = int(value)
if key == "use_linkai":
value = bool(value)
local_config[key] = value
applied[key] = value
if not applied:
return json.dumps({"status": "error", "message": "no valid keys to update"})
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(
os.path.abspath(__file__)))), "config.json")
if os.path.exists(config_path):
with open(config_path, "r", encoding="utf-8") as f:
file_cfg = json.load(f)
else:
file_cfg = {}
file_cfg.update(applied)
with open(config_path, "w", encoding="utf-8") as f:
json.dump(file_cfg, f, indent=4, ensure_ascii=False)
logger.info(f"[WebChannel] Config updated: {list(applied.keys())}")
return json.dumps({"status": "success", "applied": applied}, ensure_ascii=False)
except Exception as e:
logger.error(f"Error updating config: {e}")
return json.dumps({"status": "error", "message": str(e)})
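The `POST` handler above does two things: it filters incoming updates against a whitelist while coercing types, then merges the accepted values into `config.json`. A minimal sketch of both steps follows, under stated assumptions: `filter_updates` and `persist` are hypothetical helper names, `EDITABLE_KEYS` is a small illustrative subset of the real set, and a temporary directory stands in for the project root.

```python
import json
import os
import tempfile

# Hypothetical subset of the handler's EDITABLE_KEYS, for illustration only.
EDITABLE_KEYS = {"model", "use_linkai", "agent_max_steps"}
INT_KEYS = {"agent_max_steps"}


def filter_updates(updates: dict) -> dict:
    """Keep only whitelisted keys, coercing numeric and boolean fields."""
    applied = {}
    for key, value in updates.items():
        if key not in EDITABLE_KEYS:
            continue  # silently ignore keys that are not editable
        if key in INT_KEYS:
            value = int(value)
        if key == "use_linkai":
            value = bool(value)
        applied[key] = value
    return applied


def persist(config_path: str, applied: dict) -> dict:
    """Merge applied values into the JSON file, creating it if missing."""
    if os.path.exists(config_path):
        with open(config_path, "r", encoding="utf-8") as f:
            file_cfg = json.load(f)
    else:
        file_cfg = {}
    file_cfg.update(applied)
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(file_cfg, f, indent=4, ensure_ascii=False)
    return file_cfg


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"model": "old-model", "channel_type": "web"}, f)
    applied = filter_updates(
        {"model": "new-model", "agent_max_steps": "15", "not_editable": 1})
    saved = persist(path, applied)
```

Keys outside the whitelist are dropped and existing file entries that were not updated are preserved. Note that `bool(value)` treats any non-empty string, including `"false"`, as true, so this sketch assumes the client sends a real JSON boolean for `use_linkai`.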
class ChannelsHandler:
"""API for managing external channel configurations (feishu, dingtalk, etc)."""
CHANNEL_DEFS = OrderedDict([
("feishu", {
"label": {"zh": "飞书", "en": "Feishu"},
"icon": "fa-paper-plane",
"color": "blue",
"fields": [
{"key": "feishu_app_id", "label": "App ID", "type": "text"},
{"key": "feishu_app_secret", "label": "App Secret", "type": "secret"},
{"key": "feishu_token", "label": "Verification Token", "type": "secret"},
{"key": "feishu_bot_name", "label": "Bot Name", "type": "text"},
],
}),
("dingtalk", {
"label": {"zh": "钉钉", "en": "DingTalk"},
"icon": "fa-comments",
"color": "blue",
"fields": [
{"key": "dingtalk_client_id", "label": "Client ID", "type": "text"},
{"key": "dingtalk_client_secret", "label": "Client Secret", "type": "secret"},
],
}),
("wecom_bot", {
"label": {"zh": "企微智能机器人", "en": "WeCom Bot"},
"icon": "fa-robot",
"color": "emerald",
"fields": [
{"key": "wecom_bot_id", "label": "Bot ID", "type": "text"},
{"key": "wecom_bot_secret", "label": "Secret", "type": "secret"},
],
}),
("wechatcom_app", {
"label": {"zh": "企微自建应用", "en": "WeCom App"},
"icon": "fa-building",
"color": "emerald",
"fields": [
{"key": "wechatcom_corp_id", "label": "Corp ID", "type": "text"},
{"key": "wechatcomapp_agent_id", "label": "Agent ID", "type": "text"},
{"key": "wechatcomapp_secret", "label": "Secret", "type": "secret"},
{"key": "wechatcomapp_token", "label": "Token", "type": "secret"},
{"key": "wechatcomapp_aes_key", "label": "AES Key", "type": "secret"},
{"key": "wechatcomapp_port", "label": "Port", "type": "number", "default": 9898},
],
}),
("wechatmp", {
"label": {"zh": "公众号", "en": "WeChat MP"},
"icon": "fa-comment-dots",
"color": "emerald",
"fields": [
{"key": "wechatmp_app_id", "label": "App ID", "type": "text"},
{"key": "wechatmp_app_secret", "label": "App Secret", "type": "secret"},
{"key": "wechatmp_token", "label": "Token", "type": "secret"},
{"key": "wechatmp_aes_key", "label": "AES Key", "type": "secret"},
{"key": "wechatmp_port", "label": "Port", "type": "number", "default": 8080},
],
}),
])
@staticmethod
def _mask_secret(value: str) -> str:
if not value or len(value) <= 8:
return value
return value[:4] + "*" * (len(value) - 8) + value[-4:]
@staticmethod
def _parse_channel_list(raw) -> list:
if isinstance(raw, list):
return [ch.strip() for ch in raw if ch.strip()]
if isinstance(raw, str):
return [ch.strip() for ch in raw.split(",") if ch.strip()]
return []
@classmethod
def _active_channel_set(cls) -> set:
return set(cls._parse_channel_list(conf().get("channel_type", "")))
def GET(self):
web.header('Content-Type', 'application/json; charset=utf-8')
try:
local_config = conf()
active_channels = self._active_channel_set()
channels = []
for ch_name, ch_def in self.CHANNEL_DEFS.items():
fields_out = []
for f in ch_def["fields"]:
raw_val = local_config.get(f["key"], f.get("default", ""))
if f["type"] == "secret" and raw_val:
display_val = self._mask_secret(str(raw_val))
else:
display_val = raw_val
fields_out.append({
"key": f["key"],
"label": f["label"],
"type": f["type"],
"value": display_val,
"default": f.get("default", ""),
})
channels.append({
"name": ch_name,
"label": ch_def["label"],
"icon": ch_def["icon"],
"color": ch_def["color"],
"active": ch_name in active_channels,
"fields": fields_out,
})
return json.dumps({"status": "success", "channels": channels}, ensure_ascii=False)
except Exception as e:
logger.error(f"[WebChannel] Channels API error: {e}")
return json.dumps({"status": "error", "message": str(e)})
def POST(self):
web.header('Content-Type', 'application/json; charset=utf-8')
try:
body = json.loads(web.data())
action = body.get("action")
channel_name = body.get("channel")
if not action or not channel_name:
return json.dumps({"status": "error", "message": "action and channel required"})
if channel_name not in self.CHANNEL_DEFS:
return json.dumps({"status": "error", "message": f"unknown channel: {channel_name}"})
if action == "save":
return self._handle_save(channel_name, body.get("config", {}))
elif action == "connect":
return self._handle_connect(channel_name, body.get("config", {}))
elif action == "disconnect":
return self._handle_disconnect(channel_name)
else:
return json.dumps({"status": "error", "message": f"unknown action: {action}"})
except Exception as e:
logger.error(f"[WebChannel] Channels POST error: {e}")
return json.dumps({"status": "error", "message": str(e)})
def _handle_save(self, channel_name: str, updates: dict):
ch_def = self.CHANNEL_DEFS[channel_name]
valid_keys = {f["key"] for f in ch_def["fields"]}
secret_keys = {f["key"] for f in ch_def["fields"] if f["type"] == "secret"}
local_config = conf()
applied = {}
for key, value in updates.items():
if key not in valid_keys:
continue
if key in secret_keys:
if not value or (len(value) > 8 and "*" * 4 in value):
continue
field_def = next((f for f in ch_def["fields"] if f["key"] == key), None)
if field_def:
if field_def["type"] == "number":
value = int(value)
elif field_def["type"] == "bool":
value = bool(value)
local_config[key] = value
applied[key] = value
if not applied:
return json.dumps({"status": "error", "message": "no valid fields to update"})
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(
os.path.abspath(__file__)))), "config.json")
if os.path.exists(config_path):
with open(config_path, "r", encoding="utf-8") as f:
file_cfg = json.load(f)
else:
file_cfg = {}
file_cfg.update(applied)
with open(config_path, "w", encoding="utf-8") as f:
json.dump(file_cfg, f, indent=4, ensure_ascii=False)
logger.info(f"[WebChannel] Channel '{channel_name}' config updated: {list(applied.keys())}")
should_restart = False
active_channels = self._active_channel_set()
if channel_name in active_channels:
should_restart = True
try:
import sys
app_module = sys.modules.get('__main__') or sys.modules.get('app')
mgr = getattr(app_module, '_channel_mgr', None) if app_module else None
if mgr:
threading.Thread(
target=mgr.restart,
args=(channel_name,),
daemon=True,
).start()
logger.info(f"[WebChannel] Channel '{channel_name}' restart triggered")
except Exception as e:
logger.warning(f"[WebChannel] Failed to restart channel '{channel_name}': {e}")
return json.dumps({
"status": "success",
"applied": list(applied.keys()),
"restarted": should_restart,
}, ensure_ascii=False)
def _handle_connect(self, channel_name: str, updates: dict):
"""Save config fields, add channel to channel_type, and start it."""
ch_def = self.CHANNEL_DEFS[channel_name]
valid_keys = {f["key"] for f in ch_def["fields"]}
secret_keys = {f["key"] for f in ch_def["fields"] if f["type"] == "secret"}
# Feishu connected via web console must use websocket (long connection) mode
if channel_name == "feishu":
updates.setdefault("feishu_event_mode", "websocket")
valid_keys.add("feishu_event_mode")
local_config = conf()
applied = {}
for key, value in updates.items():
if key not in valid_keys:
continue
if key in secret_keys:
if not value or (len(value) > 8 and "*" * 4 in value):
continue
field_def = next((f for f in ch_def["fields"] if f["key"] == key), None)
if field_def:
if field_def["type"] == "number":
value = int(value)
elif field_def["type"] == "bool":
value = bool(value)
local_config[key] = value
applied[key] = value
existing = self._parse_channel_list(conf().get("channel_type", ""))
if channel_name not in existing:
existing.append(channel_name)
new_channel_type = ",".join(existing)
local_config["channel_type"] = new_channel_type
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(
os.path.abspath(__file__)))), "config.json")
if os.path.exists(config_path):
with open(config_path, "r", encoding="utf-8") as f:
file_cfg = json.load(f)
else:
file_cfg = {}
file_cfg.update(applied)
file_cfg["channel_type"] = new_channel_type
with open(config_path, "w", encoding="utf-8") as f:
json.dump(file_cfg, f, indent=4, ensure_ascii=False)
logger.info(f"[WebChannel] Channel '{channel_name}' connecting, channel_type={new_channel_type}")
def _do_start():
try:
import sys
app_module = sys.modules.get('__main__') or sys.modules.get('app')
clear_fn = getattr(app_module, '_clear_singleton_cache', None) if app_module else None
mgr = getattr(app_module, '_channel_mgr', None) if app_module else None
if mgr is None:
logger.warning(f"[WebChannel] ChannelManager not available, cannot start '{channel_name}'")
return
# Stop existing instance first if still running (e.g. re-connect without disconnect)
existing_ch = mgr.get_channel(channel_name)
if existing_ch is not None:
logger.info(f"[WebChannel] Stopping existing '{channel_name}' before reconnect...")
mgr.stop(channel_name)
# Always wait for the remote service to release the old connection before
# establishing a new one (DingTalk drops callbacks on duplicate connections)
logger.info(f"[WebChannel] Waiting for '{channel_name}' old connection to close...")
time.sleep(5)
if clear_fn:
clear_fn(channel_name)
logger.info(f"[WebChannel] Starting channel '{channel_name}'...")
mgr.start([channel_name], first_start=False)
logger.info(f"[WebChannel] Channel '{channel_name}' start completed")
except Exception as e:
logger.error(f"[WebChannel] Failed to start channel '{channel_name}': {e}",
exc_info=True)
threading.Thread(target=_do_start, daemon=True).start()
return json.dumps({
"status": "success",
"channel_type": new_channel_type,
}, ensure_ascii=False)
def _handle_disconnect(self, channel_name: str):
existing = self._parse_channel_list(conf().get("channel_type", ""))
existing = [ch for ch in existing if ch != channel_name]
new_channel_type = ",".join(existing)
local_config = conf()
local_config["channel_type"] = new_channel_type
config_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(
os.path.abspath(__file__)))), "config.json")
if os.path.exists(config_path):
with open(config_path, "r", encoding="utf-8") as f:
file_cfg = json.load(f)
else:
file_cfg = {}
file_cfg["channel_type"] = new_channel_type
with open(config_path, "w", encoding="utf-8") as f:
json.dump(file_cfg, f, indent=4, ensure_ascii=False)
def _do_stop():
try:
import sys
app_module = sys.modules.get('__main__') or sys.modules.get('app')
mgr = getattr(app_module, '_channel_mgr', None) if app_module else None
clear_fn = getattr(app_module, '_clear_singleton_cache', None) if app_module else None
if mgr:
mgr.stop(channel_name)
else:
logger.warning(f"[WebChannel] ChannelManager not found, cannot stop '{channel_name}'")
if clear_fn:
clear_fn(channel_name)
logger.info(f"[WebChannel] Channel '{channel_name}' disconnected, "
f"channel_type={new_channel_type}")
except Exception as e:
logger.warning(f"[WebChannel] Failed to stop channel '{channel_name}': {e}",
exc_info=True)
threading.Thread(target=_do_stop, daemon=True).start()
return json.dumps({
"status": "success",
"channel_type": new_channel_type,
}, ensure_ascii=False)
def _get_workspace_root():
"""Resolve the agent workspace directory."""
@@ -406,6 +910,30 @@ def _get_workspace_root():
return expand_path(conf().get("agent_workspace", "~/cow"))
class ToolsHandler:
def GET(self):
web.header('Content-Type', 'application/json; charset=utf-8')
try:
from agent.tools.tool_manager import ToolManager
tm = ToolManager()
if not tm.tool_classes:
tm.load_tools()
tools = []
for name, cls in tm.tool_classes.items():
try:
instance = cls()
tools.append({
"name": name,
"description": instance.description,
})
except Exception:
tools.append({"name": name, "description": ""})
return json.dumps({"status": "success", "tools": tools}, ensure_ascii=False)
except Exception as e:
logger.error(f"[WebChannel] Tools API error: {e}")
return json.dumps({"status": "error", "message": str(e)})
class SkillsHandler:
def GET(self):
web.header('Content-Type', 'application/json; charset=utf-8')
@@ -421,6 +949,30 @@ class SkillsHandler:
logger.error(f"[WebChannel] Skills API error: {e}")
return json.dumps({"status": "error", "message": str(e)})
def POST(self):
web.header('Content-Type', 'application/json; charset=utf-8')
try:
from agent.skills.service import SkillService
from agent.skills.manager import SkillManager
body = json.loads(web.data())
action = body.get("action")
name = body.get("name")
if not action or not name:
return json.dumps({"status": "error", "message": "action and name are required"})
workspace_root = _get_workspace_root()
manager = SkillManager(custom_dir=os.path.join(workspace_root, "skills"))
service = SkillService(manager)
if action == "open":
service.open({"name": name})
elif action == "close":
service.close({"name": name})
else:
return json.dumps({"status": "error", "message": f"unknown action: {action}"})
return json.dumps({"status": "success"}, ensure_ascii=False)
except Exception as e:
logger.error(f"[WebChannel] Skills POST error: {e}")
return json.dumps({"status": "error", "message": str(e)})
class MemoryHandler:
def GET(self):
@@ -471,6 +1023,37 @@ class SchedulerHandler:
return json.dumps({"status": "error", "message": str(e)})
class HistoryHandler:
def GET(self):
"""
Return paginated conversation history for a session.
Query params:
session_id (required)
page int, default 1 (1 = most recent messages)
page_size int, default 20
"""
web.header('Content-Type', 'application/json; charset=utf-8')
web.header('Access-Control-Allow-Origin', '*')
try:
params = web.input(session_id='', page='1', page_size='20')
session_id = params.session_id.strip()
if not session_id:
return json.dumps({"status": "error", "message": "session_id required"})
from agent.memory import get_conversation_store
store = get_conversation_store()
result = store.load_history_page(
session_id=session_id,
page=int(params.page),
page_size=int(params.page_size),
)
return json.dumps({"status": "success", **result}, ensure_ascii=False)
except Exception as e:
logger.error(f"[WebChannel] History API error: {e}")
return json.dumps({"status": "error", "message": str(e)})
class LogsHandler:
def GET(self):
"""Stream the last N lines of run.log as SSE, then tail new lines."""
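
A minimal sketch of the secret round trip used by `ChannelsHandler` above, assuming the web console echoes masked values back unchanged when a field is not edited. `mask_secret` mirrors `_mask_secret`; `looks_masked` and `merge_secret` are hypothetical names that restate the skip condition in `_handle_save` and `_handle_connect`, not functions from the repository.

```python
def mask_secret(value: str) -> str:
    """Keep the first and last four characters and star out the middle."""
    if not value or len(value) <= 8:
        return value
    return value[:4] + "*" * (len(value) - 8) + value[-4:]


def looks_masked(value: str) -> bool:
    """Skip condition from the save path: long enough and contains '****'."""
    return len(value) > 8 and "*" * 4 in value


def merge_secret(stored: str, submitted: str) -> str:
    """Return the value that should end up in the config (hypothetical helper)."""
    if not submitted or looks_masked(submitted):
        return stored  # empty or masked input: keep the stored secret
    return submitted


stored = "abcd1234efgh5678"
shown = mask_secret(stored)                     # what the GET response exposes
kept = merge_secret(stored, shown)              # unchanged field: secret kept
replaced = merge_secret(stored, "new-secret-value")
```

One consequence of the two rules as written: a secret of 9 to 11 characters is masked with fewer than four asterisks, so the skip condition would not recognise its masked form and it would be stored as if it were new input.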

@@ -1,179 +0,0 @@
# encoding:utf-8
"""
wechat channel
"""
import io
import json
import os
import threading
import time
from queue import Empty
from typing import Any
from bridge.context import *
from bridge.reply import *
from channel.chat_channel import ChatChannel
from channel.wechat.wcf_message import WechatfMessage
from common.log import logger
from common.singleton import singleton
from common.utils import *
from config import conf, get_appdata_dir
from wcferry import Wcf, WxMsg
@singleton
class WechatfChannel(ChatChannel):
NOT_SUPPORT_REPLYTYPE = []
def __init__(self):
super().__init__()
self.NOT_SUPPORT_REPLYTYPE = []
# Keep recently seen message IDs in a dict for deduplication
self.received_msgs = {}
# Initialize the wcferry client
self.wcf = Wcf()
self.wxid = None # set to the logged-in user's wxid after login
def startup(self):
"""
Start the channel.
"""
try:
# wcferry launches WeChat and logs in automatically
self.wxid = self.wcf.get_self_wxid()
self.name = self.wcf.get_user_info().get("name")
logger.info(f"WeChat login succeeded, user ID: {self.wxid}, user name: {self.name}")
self.contact_cache = ContactCache(self.wcf)
self.contact_cache.update()
# Start receiving messages
self.wcf.enable_receiving_msg()
# Create the message-processing thread
t = threading.Thread(target=self._process_messages, name="WeChatThread", daemon=True)
t.start()
except Exception as e:
logger.error(f"Failed to start the WeChat channel: {e}")
raise
def _process_messages(self):
"""
Process the message queue.
"""
while True:
try:
msg = self.wcf.get_msg()
if msg:
self._handle_message(msg)
except Empty:
continue
except Exception as e:
logger.error(f"Failed to process message: {e}")
continue
def _handle_message(self, msg: WxMsg):
"""
Handle a single message.
"""
try:
# Build the message object
cmsg = WechatfMessage(self, msg)
# Deduplicate messages
if cmsg.msg_id in self.received_msgs:
return
self.received_msgs[cmsg.msg_id] = time.time()
# Purge expired message IDs
self._clean_expired_msgs()
logger.debug(f"Received message: {msg}")
context = self._compose_context(cmsg.ctype, cmsg.content,
isgroup=cmsg.is_group,
msg=cmsg)
if context:
self.produce(context)
except Exception as e:
logger.error(f"Failed to process message: {e}")
def _clean_expired_msgs(self, expire_time: float = 60):
"""
Purge expired message IDs.
"""
now = time.time()
for msg_id in list(self.received_msgs.keys()):
if now - self.received_msgs[msg_id] > expire_time:
del self.received_msgs[msg_id]
def send(self, reply: Reply, context: Context):
"""
Send a message.
"""
receiver = context["receiver"]
if not receiver:
logger.error("receiver is empty")
return
try:
if reply.type == ReplyType.TEXT:
# Handle @-mentions
at_list = []
if context.get("isgroup"):
if context["msg"].actual_user_id:
at_list = [context["msg"].actual_user_id]
at_str = ",".join(at_list) if at_list else ""
self.wcf.send_text(reply.content, receiver, at_str)
elif reply.type == ReplyType.ERROR or reply.type == ReplyType.INFO:
self.wcf.send_text(reply.content, receiver)
else:
logger.error(f"Unsupported reply type: {reply.type}")
except Exception as e:
logger.error(f"Failed to send message: {e}")
def close(self):
"""
Close the channel.
"""
try:
self.wcf.cleanup()
except Exception as e:
logger.error(f"Failed to close the channel: {e}")
class ContactCache:
def __init__(self, wcf):
"""
wcf: a wcferry.client.Wcf instance
"""
self.wcf = wcf
self._contact_map = {} # shaped like {wxid: {full contact info}}
def update(self):
"""
Refresh the cache: call get_contacts(),
then build a {wxid: {full info}} dict from wcf.contacts.
"""
self.wcf.get_contacts()
self._contact_map.clear()
for item in self.wcf.contacts:
wxid = item.get('wxid')
if wxid: # make sure the wxid field is present
self._contact_map[wxid] = item
def get_contact(self, wxid: str) -> dict:
"""
Return the full contact dict for this wxid,
or None if it is not found.
"""
return self._contact_map.get(wxid)
def get_name_by_wxid(self, wxid: str) -> str:
"""
Look up a member or group name by wxid.
"""
contact = self.get_contact(wxid)
if contact:
return contact.get('name', '')
return ''
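
The deduplication in `_handle_message` and `_clean_expired_msgs` above can be sketched in isolation as a small time-limited ID set. This is an illustration only: `RecentIds`, `first_time` and the injectable `clock` parameter are hypothetical names, and unlike the channel code this version purges expired entries before the duplicate check rather than after it.

```python
import time


class RecentIds:
    """Remember message IDs for a limited time so duplicates can be dropped."""

    def __init__(self, ttl: float = 60.0, clock=time.time):
        self.ttl = ttl
        self.clock = clock  # injectable so the expiry can be exercised in tests
        self._seen = {}     # msg_id -> time it was first seen

    def first_time(self, msg_id) -> bool:
        """Return True if msg_id was not seen within the last ttl seconds."""
        now = self.clock()
        # Purge entries older than ttl (copy the keys: the dict is mutated)
        for key in [k for k, ts in self._seen.items() if now - ts > self.ttl]:
            del self._seen[key]
        if msg_id in self._seen:
            return False
        self._seen[msg_id] = now
        return True


# Usage with a fake clock so no real waiting is needed
fake_now = [0.0]
recent = RecentIds(ttl=60.0, clock=lambda: fake_now[0])
first = recent.first_time("msg-1")
second = recent.first_time("msg-1")
fake_now[0] = 61.0
after_expiry = recent.first_time("msg-1")
```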

@@ -1,58 +0,0 @@
# encoding:utf-8
"""
wechat channel message
"""
from bridge.context import ContextType
from channel.chat_message import ChatMessage
from common.log import logger
from wcferry import WxMsg
class WechatfMessage(ChatMessage):
"""
Wrapper class for WeChat messages.
"""
def __init__(self, channel, wcf_msg: WxMsg, is_group=False):
"""
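Initialize the message object.
:param channel: owning channel (provides wxid, name, contact_cache and wcf)
:param wcf_msg: wcferry message object
:param is_group: whether this is a group message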
"""
super().__init__(wcf_msg)
self.msg_id = wcf_msg.id
self.create_time = wcf_msg.ts # use the message timestamp
self.is_group = is_group or wcf_msg._is_group
self.wxid = channel.wxid
self.name = channel.name
# Parse the message type
if wcf_msg.is_text():
self.ctype = ContextType.TEXT
self.content = wcf_msg.content
else:
raise NotImplementedError(f"Unsupported message type: {wcf_msg.type}")
# Set sender and receiver info
self.from_user_id = self.wxid if wcf_msg.sender == self.wxid else wcf_msg.sender
self.from_user_nickname = self.name if wcf_msg.sender == self.wxid else channel.contact_cache.get_name_by_wxid(wcf_msg.sender)
self.to_user_id = self.wxid
self.to_user_nickname = self.name
self.other_user_id = wcf_msg.sender
self.other_user_nickname = channel.contact_cache.get_name_by_wxid(wcf_msg.sender)
# Special handling for group messages
if self.is_group:
self.other_user_id = wcf_msg.roomid
self.other_user_nickname = channel.contact_cache.get_name_by_wxid(wcf_msg.roomid)
self.actual_user_id = wcf_msg.sender
self.actual_user_nickname = channel.wcf.get_alias_in_chatroom(wcf_msg.sender, wcf_msg.roomid)
if not self.actual_user_nickname: # the in-group alias is unavailable for WeCom members, so fall back to the contact cache
self.actual_user_nickname = channel.contact_cache.get_name_by_wxid(wcf_msg.sender)
self.room_id = wcf_msg.roomid
self.is_at = wcf_msg.is_at(self.wxid) # whether the logged-in user was @-mentioned
# Whether the message was sent by ourselves
self.my_msg = wcf_msg.from_self()

@@ -1,309 +0,0 @@
# encoding:utf-8
"""
wechat channel
"""
import io
import json
import os
import threading
import time
import requests
from bridge.context import *
from bridge.reply import *
from channel.chat_channel import ChatChannel
from channel import chat_channel
from channel.wechat.wechat_message import *
from common.expired_dict import ExpiredDict
from common.log import logger
from common.singleton import singleton
from common.time_check import time_checker
from common.utils import convert_webp_to_png, remove_markdown_symbol
from config import conf, get_appdata_dir
from lib import itchat
from lib.itchat.content import *
@itchat.msg_register([TEXT, VOICE, PICTURE, NOTE, ATTACHMENT, SHARING])
def handler_single_msg(msg):
try:
cmsg = WechatMessage(msg, False)
except NotImplementedError as e:
logger.debug("[WX]single message {} skipped: {}".format(msg["MsgId"], e))
return None
WechatChannel().handle_single(cmsg)
return None
@itchat.msg_register([TEXT, VOICE, PICTURE, NOTE, ATTACHMENT, SHARING], isGroupChat=True)
def handler_group_msg(msg):
try:
cmsg = WechatMessage(msg, True)
except NotImplementedError as e:
logger.debug("[WX]group message {} skipped: {}".format(msg["MsgId"], e))
return None
WechatChannel().handle_group(cmsg)
return None
def _check(func):
def wrapper(self, cmsg: ChatMessage):
msgId = cmsg.msg_id
if msgId in self.receivedMsgs:
logger.info("Wechat message {} already received, ignore".format(msgId))
return
self.receivedMsgs[msgId] = True
create_time = cmsg.create_time # message timestamp
if conf().get("hot_reload") == True and int(create_time) < int(time.time()) - 60: # skip history messages older than one minute
logger.debug("[WX]history message {} skipped".format(msgId))
return
if cmsg.my_msg and not cmsg.is_group:
logger.debug("[WX]my message {} skipped".format(msgId))
return
return func(self, cmsg)
return wrapper
# Available QR code generation APIs
# https://api.qrserver.com/v1/create-qr-code/?size=400x400&data=https://www.abc.com
# https://api.isoyu.com/qr/?m=1&e=L&p=20&url=https://www.abc.com
def qrCallback(uuid, status, qrcode):
# logger.debug("qrCallback: {} {}".format(uuid,status))
if status == "0":
try:
from PIL import Image
img = Image.open(io.BytesIO(qrcode))
_thread = threading.Thread(target=img.show, args=("QRCode",))
_thread.daemon = True
_thread.start()
except Exception as e:
pass
import qrcode
url = f"https://login.weixin.qq.com/l/{uuid}"
qr_api1 = "https://api.isoyu.com/qr/?m=1&e=L&p=20&url={}".format(url)
qr_api2 = "https://api.qrserver.com/v1/create-qr-code/?size=400x400&data={}".format(url)
qr_api3 = "https://api.pwmqr.com/qrcode/create/?url={}".format(url)
qr_api4 = "https://my.tv.sohu.com/user/a/wvideo/getQRCode.do?text={}".format(url)
print("You can also scan QRCode in any website below:")
print(qr_api3)
print(qr_api4)
print(qr_api2)
print(qr_api1)
_send_qr_code([qr_api3, qr_api4, qr_api2, qr_api1])
qr = qrcode.QRCode(border=1)
qr.add_data(url)
qr.make(fit=True)
try:
qr.print_ascii(invert=True)
except UnicodeEncodeError:
print("ASCII QR code printing failed due to encoding issues.")
@singleton
class WechatChannel(ChatChannel):
NOT_SUPPORT_REPLYTYPE = []
def __init__(self):
super().__init__()
self.receivedMsgs = ExpiredDict(conf().get("expires_in_seconds", 3600))
self.auto_login_times = 0
def startup(self):
try:
time.sleep(3)
logger.error("""[WechatChannel] This channel is currently unavailable. Supported channels:
1. terminal: terminal
2. wechatmp: personal official account
3. wechatmp_service: enterprise official account
4. wechatcom_app: WeCom self-built app
5. dingtalk: DingTalk
6. feishu: Feishu
7. web: web page
8. wcf: wechat (requires Windows; see https://github.com/zhayujie/chatgpt-on-wechat/pull/2562 )
Switch channels by editing the channel_type field in config.json""")
# itchat.instance.receivingRetryCount = 600 # adjust the disconnect timeout
# # login by scan QRCode
# hotReload = conf().get("hot_reload", False)
# status_path = os.path.join(get_appdata_dir(), "itchat.pkl")
# itchat.auto_login(
# enableCmdQR=2,
# hotReload=hotReload,
# statusStorageDir=status_path,
# qrCallback=qrCallback,
# exitCallback=self.exitCallback,
# loginCallback=self.loginCallback
# )
# self.user_id = itchat.instance.storageClass.userName
# self.name = itchat.instance.storageClass.nickName
# logger.info("Wechat login success, user_id: {}, nickname: {}".format(self.user_id, self.name))
# # start message listener
# itchat.run()
except Exception as e:
logger.exception(e)
def exitCallback(self):
try:
from common.cloud_client import chat_client
if chat_client.client_id and conf().get("use_linkai"):
_send_logout()
time.sleep(2)
self.auto_login_times += 1
if self.auto_login_times < 100:
chat_channel.handler_pool._shutdown = False
self.startup()
except Exception as e:
pass
def loginCallback(self):
logger.debug("Login success")
_send_login_success()
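# The handle_* functions process a received message, build a Context, and pass it to produce(), which handles the Context and sends the reply
# A Context carries all information about the message, with the following attributes:
# type: message type, including TEXT, VOICE, IMAGE_CREATE
# content: message content; for TEXT it is the text, for VOICE the voice file name, for IMAGE_CREATE the image-generation command
# kwargs: dict of extra parameters, with the following keys:
# session_id: session ID
# isgroup: whether this is a group chat
# receiver: who the reply should be sent to
# msg: the ChatMessage object
# origin_ctype: original message type; after speech-to-text in a private chat, if prefix matching fails, the trigger rule is relaxed depending on whether the original message was voice
# desire_rtype: desired reply type; defaults to a text reply, set to ReplyType.VOICE for a voice reply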
@time_checker
@_check
def handle_single(self, cmsg: ChatMessage):
# filter system message
if cmsg.other_user_id in ["weixin"]:
return
if cmsg.ctype == ContextType.VOICE:
if conf().get("speech_recognition") != True:
return
logger.debug("[WX]receive voice msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.IMAGE:
logger.debug("[WX]receive image msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.PATPAT:
logger.debug("[WX]receive patpat msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.TEXT:
logger.debug("[WX]receive text msg: {}, cmsg={}".format(json.dumps(cmsg._rawmsg, ensure_ascii=False), cmsg))
else:
logger.debug("[WX]receive msg: {}, cmsg={}".format(cmsg.content, cmsg))
context = self._compose_context(cmsg.ctype, cmsg.content, isgroup=False, msg=cmsg)
if context:
self.produce(context)
@time_checker
@_check
def handle_group(self, cmsg: ChatMessage):
if cmsg.ctype == ContextType.VOICE:
if conf().get("group_speech_recognition") != True:
return
logger.debug("[WX]receive voice for group msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.IMAGE:
logger.debug("[WX]receive image for group msg: {}".format(cmsg.content))
elif cmsg.ctype in [ContextType.JOIN_GROUP, ContextType.PATPAT, ContextType.ACCEPT_FRIEND, ContextType.EXIT_GROUP]:
logger.debug("[WX]receive note msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.TEXT:
# logger.debug("[WX]receive group msg: {}, cmsg={}".format(json.dumps(cmsg._rawmsg, ensure_ascii=False), cmsg))
pass
elif cmsg.ctype == ContextType.FILE:
logger.debug(f"[WX]receive attachment msg, file_name={cmsg.content}")
else:
logger.debug("[WX]receive group msg: {}".format(cmsg.content))
context = self._compose_context(cmsg.ctype, cmsg.content, isgroup=True, msg=cmsg, no_need_at=conf().get("no_need_at", False))
if context:
self.produce(context)
# Unified send function; each Channel implements its own and sends a different kind of message depending on reply.type
def send(self, reply: Reply, context: Context):
receiver = context["receiver"]
if reply.type == ReplyType.TEXT:
reply.content = remove_markdown_symbol(reply.content)
itchat.send(reply.content, toUserName=receiver)
logger.info("[WX] sendMsg={}, receiver={}".format(reply, receiver))
elif reply.type == ReplyType.ERROR or reply.type == ReplyType.INFO:
reply.content = remove_markdown_symbol(reply.content)
itchat.send(reply.content, toUserName=receiver)
logger.info("[WX] sendMsg={}, receiver={}".format(reply, receiver))
elif reply.type == ReplyType.VOICE:
itchat.send_file(reply.content, toUserName=receiver)
logger.info("[WX] sendFile={}, receiver={}".format(reply.content, receiver))
elif reply.type == ReplyType.IMAGE_URL: # download the image from the network
img_url = reply.content
logger.debug(f"[WX] start download image, img_url={img_url}")
pic_res = requests.get(img_url, stream=True)
image_storage = io.BytesIO()
size = 0
for block in pic_res.iter_content(1024):
size += len(block)
image_storage.write(block)
logger.info(f"[WX] download image success, size={size}, img_url={img_url}")
image_storage.seek(0)
if ".webp" in img_url:
try:
image_storage = convert_webp_to_png(image_storage)
except Exception as e:
logger.error(f"Failed to convert image: {e}")
return
itchat.send_image(image_storage, toUserName=receiver)
logger.info("[WX] sendImage url={}, receiver={}".format(img_url, receiver))
elif reply.type == ReplyType.IMAGE: # read the image from a file object
image_storage = reply.content
image_storage.seek(0)
itchat.send_image(image_storage, toUserName=receiver)
logger.info("[WX] sendImage, receiver={}".format(receiver))
elif reply.type == ReplyType.FILE: # file reply type
file_storage = reply.content
itchat.send_file(file_storage, toUserName=receiver)
logger.info("[WX] sendFile, receiver={}".format(receiver))
elif reply.type == ReplyType.VIDEO: # video reply type
video_storage = reply.content
itchat.send_video(video_storage, toUserName=receiver)
logger.info("[WX] sendVideo, receiver={}".format(receiver))
elif reply.type == ReplyType.VIDEO_URL: # video URL reply type
video_url = reply.content
logger.debug(f"[WX] start download video, video_url={video_url}")
video_res = requests.get(video_url, stream=True)
video_storage = io.BytesIO()
size = 0
for block in video_res.iter_content(1024):
size += len(block)
video_storage.write(block)
logger.info(f"[WX] download video success, size={size}, video_url={video_url}")
video_storage.seek(0)
itchat.send_video(video_storage, toUserName=receiver)
logger.info("[WX] sendVideo url={}, receiver={}".format(video_url, receiver))
def _send_login_success():
try:
from common.cloud_client import chat_client
if chat_client.client_id:
chat_client.send_login_success()
except Exception as e:
pass
def _send_logout():
try:
from common.cloud_client import chat_client
if chat_client.client_id:
chat_client.send_logout()
except Exception as e:
pass
def _send_qr_code(qrcode_list: list):
try:
from common.cloud_client import chat_client
if chat_client.client_id:
chat_client.send_qrcode(qrcode_list)
except Exception as e:
pass
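
The `IMAGE_URL` and `VIDEO_URL` branches of `send` above repeat the same loop: read the response in blocks, count the bytes, write them to an in-memory buffer, then rewind it before sending. A minimal sketch of that loop as a shared helper follows; `buffer_chunks` is a hypothetical name, not part of the repository, and it accepts any iterable of byte blocks so it can be tried without a network call.

```python
import io
from typing import Iterable, Tuple


def buffer_chunks(chunks: Iterable[bytes]) -> Tuple[io.BytesIO, int]:
    """Collect byte blocks into a rewound in-memory buffer and report the total size."""
    storage = io.BytesIO()
    size = 0
    for block in chunks:
        size += len(block)
        storage.write(block)
    storage.seek(0)  # rewind so the consumer reads from the beginning
    return storage, size


# Usage with literal blocks standing in for a streamed HTTP response
storage, size = buffer_chunks([b"ab", b"cde", b""])
```

With `requests`, the iterable would be `requests.get(url, stream=True).iter_content(1024)`, as in the original branches.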

@@ -1,124 +0,0 @@
import re
from bridge.context import ContextType
from channel.chat_message import ChatMessage
from common.log import logger
from common.tmp_dir import TmpDir
from lib import itchat
from lib.itchat.content import *
class WechatMessage(ChatMessage):
def __init__(self, itchat_msg, is_group=False):
super().__init__(itchat_msg)
self.msg_id = itchat_msg["MsgId"]
self.create_time = itchat_msg["CreateTime"]
self.is_group = is_group
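notes_join_group = ["加入群聊", "加入了群聊", "invited", "joined"] # keywords from join-group notices; add the equivalents in other languages to support more
notes_bot_join_group = ["邀请你", "invited you", "You've joined", "你通过扫描"]
notes_exit_group = ["移出了群聊", "removed"] # keywords from removed-from-group notices; add the equivalents in other languages to support more
notes_patpat = ["拍了拍我", "tickled my", "tickled me"] # keywords from pat-pat notices; add the equivalents in other languages to support more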
if itchat_msg["Type"] == TEXT:
self.ctype = ContextType.TEXT
self.content = itchat_msg["Text"]
elif itchat_msg["Type"] == VOICE:
self.ctype = ContextType.VOICE
self.content = TmpDir().path() + itchat_msg["FileName"] # content holds the temp-directory path directly
self._prepare_fn = lambda: itchat_msg.download(self.content)
elif itchat_msg["Type"] == PICTURE and itchat_msg["MsgType"] == 3:
self.ctype = ContextType.IMAGE
self.content = TmpDir().path() + itchat_msg["FileName"] # content holds the temp-directory path directly
self._prepare_fn = lambda: itchat_msg.download(self.content)
elif itchat_msg["Type"] == NOTE and itchat_msg["MsgType"] == 10000:
if is_group:
if any(note_bot_join_group in itchat_msg["Content"] for note_bot_join_group in notes_bot_join_group): # the bot was invited into a group chat
logger.warning("Bot joined a group chat; message ignored")
pass
elif any(note_join_group in itchat_msg["Content"] for note_join_group in notes_join_group): # any string from notes_join_group appears in the NOTE
# Only the nickname is available here; actual_user_id is still the bot's ID
if "加入群聊" not in itchat_msg["Content"]:
self.ctype = ContextType.JOIN_GROUP
self.content = itchat_msg["Content"]
if "invited" in itchat_msg["Content"]: # match the English notice
self.actual_user_nickname = re.findall(r'invited\s+(.+?)\s+to\s+the\s+group\s+chat', itchat_msg["Content"])[0]
elif "joined" in itchat_msg["Content"]: # match the English notice for joining via QR code
self.actual_user_nickname = re.findall(r'"(.*?)" joined the group chat via the QR Code shared by', itchat_msg["Content"])[0]
elif "加入了群聊" in itchat_msg["Content"]:
self.actual_user_nickname = re.findall(r"\"(.*?)\"", itchat_msg["Content"])[-1]
elif "加入群聊" in itchat_msg["Content"]:
self.ctype = ContextType.JOIN_GROUP
self.content = itchat_msg["Content"]
self.actual_user_nickname = re.findall(r"\"(.*?)\"", itchat_msg["Content"])[0]
elif any(note_exit_group in itchat_msg["Content"] for note_exit_group in notes_exit_group): # any string from notes_exit_group appears in the NOTE
self.ctype = ContextType.EXIT_GROUP
self.content = itchat_msg["Content"]
self.actual_user_nickname = re.findall(r"\"(.*?)\"", itchat_msg["Content"])[0]
elif any(note_patpat in itchat_msg["Content"] for note_patpat in notes_patpat): # any string from notes_patpat appears in the NOTE
self.ctype = ContextType.PATPAT
self.content = itchat_msg["Content"]
if "拍了拍我" in itchat_msg["Content"]: # Chinese notice
self.actual_user_nickname = re.findall(r"\"(.*?)\"", itchat_msg["Content"])[0]
elif "tickled my" in itchat_msg["Content"] or "tickled me" in itchat_msg["Content"]:
self.actual_user_nickname = re.findall(r'^(.*?)(?:tickled my|tickled me)', itchat_msg["Content"])[0]
else:
raise NotImplementedError("Unsupported note message: " + itchat_msg["Content"])
elif "你已添加了" in itchat_msg["Content"]: # friend request accepted
self.ctype = ContextType.ACCEPT_FRIEND
self.content = itchat_msg["Content"]
elif any(note_patpat in itchat_msg["Content"] for note_patpat in notes_patpat): # 若有任何在notes_patpat列表中的字符串出现在NOTE中:
self.ctype = ContextType.PATPAT
self.content = itchat_msg["Content"]
else:
raise NotImplementedError("Unsupported note message: " + itchat_msg["Content"])
elif itchat_msg["Type"] == ATTACHMENT:
self.ctype = ContextType.FILE
self.content = TmpDir().path() + itchat_msg["FileName"] # content stores the path in the temp directory
self._prepare_fn = lambda: itchat_msg.download(self.content)
elif itchat_msg["Type"] == SHARING:
self.ctype = ContextType.SHARING
self.content = itchat_msg.get("Url")
else:
raise NotImplementedError("Unsupported message type: Type:{} MsgType:{}".format(itchat_msg["Type"], itchat_msg["MsgType"]))
self.from_user_id = itchat_msg["FromUserName"]
self.to_user_id = itchat_msg["ToUserName"]
user_id = itchat.instance.storageClass.userName
nickname = itchat.instance.storageClass.nickName
# from_user_id and to_user_id are rarely used, but they are filled in for consistency
# The code below is tedious; in short, every field that can be filled is filled.
if self.from_user_id == user_id:
self.from_user_nickname = nickname
if self.to_user_id == user_id:
self.to_user_nickname = nickname
try: # For strangers, the User field may be missing
# my_msg is True when the message was sent by the logged-in account itself
self.my_msg = itchat_msg["ToUserName"] == itchat_msg["User"]["UserName"] and \
itchat_msg["ToUserName"] != itchat_msg["FromUserName"]
self.other_user_id = itchat_msg["User"]["UserName"]
self.other_user_nickname = itchat_msg["User"]["NickName"]
if self.other_user_id == self.from_user_id:
self.from_user_nickname = self.other_user_nickname
if self.other_user_id == self.to_user_id:
self.to_user_nickname = self.other_user_nickname
if itchat_msg["User"].get("Self"):
# Own display name; when a group alias is set, this field holds the group alias
self.self_display_name = itchat_msg["User"].get("Self").get("DisplayName")
except KeyError as e: # Handle the occasional case where the other party's info is missing
logger.warning("[WX]get other_user_id failed: " + str(e))
if self.from_user_id == user_id:
self.other_user_id = self.to_user_id
else:
self.other_user_id = self.from_user_id
if self.is_group:
self.is_at = itchat_msg["IsAt"]
self.actual_user_id = itchat_msg["ActualUserName"]
if self.ctype not in [ContextType.JOIN_GROUP, ContextType.PATPAT, ContextType.EXIT_GROUP]:
self.actual_user_nickname = itchat_msg["ActualNickName"]
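
The note branches above recover the nickname from system text with regular expressions. A minimal sketch of those patterns, run on made-up sample strings, may make the matching easier to follow. The helper name `extract_nickname`, the sample messages, the `.strip()` call, and the empty-string fallback are assumptions for illustration and are not part of the removed file, which indexes `[0]` directly.

```python
import re


def extract_nickname(content: str) -> str:
    """Return the nickname found in a join/pat note, or "" if nothing matches."""
    if "invited" in content:
        # English invite note: ... invited <name> to the group chat
        found = re.findall(r'invited\s+(.+?)\s+to\s+the\s+group\s+chat', content)
    elif "tickled my" in content or "tickled me" in content:
        # English pat note: the nickname is everything before "tickled"
        found = re.findall(r'^(.*?)(?:tickled my|tickled me)', content)
    else:
        # Notes that wrap the nickname in double quotes (the Chinese variants)
        found = re.findall(r"\"(.*?)\"", content)
    # Hypothetical hardening: avoid IndexError and trim surrounding spaces
    return found[0].strip() if found else ""


print(extract_nickname('Alice invited Bob to the group chat'))
print(extract_nickname('"Carol" 拍了拍我'))
print(extract_nickname('Dave tickled me'))
```

Returning an empty string instead of raising is a design choice of this sketch only; the original code lets an unmatched note raise `IndexError`.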

View File

@@ -1,129 +0,0 @@
# encoding:utf-8
"""
wechaty channel
Python Wechaty - https://github.com/wechaty/python-wechaty
"""
import asyncio
import base64
import os
import time
from wechaty import Contact, Wechaty
from wechaty.user import Message
from wechaty_puppet import FileBox
from bridge.context import *
from bridge.context import Context
from bridge.reply import *
from channel.chat_channel import ChatChannel
from channel.wechat.wechaty_message import WechatyMessage
from common.log import logger
from common.singleton import singleton
from config import conf
try:
from voice.audio_convert import any_to_sil
except Exception as e:
pass
@singleton
class WechatyChannel(ChatChannel):
NOT_SUPPORT_REPLYTYPE = []
def __init__(self):
super().__init__()
def startup(self):
config = conf()
token = config.get("wechaty_puppet_service_token")
os.environ["WECHATY_PUPPET_SERVICE_TOKEN"] = token
asyncio.run(self.main())
async def main(self):
loop = asyncio.get_event_loop()
# Pass the asyncio loop to the handler threads
self.handler_pool._initializer = lambda: asyncio.set_event_loop(loop)
self.bot = Wechaty()
self.bot.on("login", self.on_login)
self.bot.on("message", self.on_message)
await self.bot.start()
async def on_login(self, contact: Contact):
self.user_id = contact.contact_id
self.name = contact.name
logger.info("[WX] login user={}".format(contact))
# Unified send function; each Channel implements it and sends a different message type based on reply.type
def send(self, reply: Reply, context: Context):
receiver_id = context["receiver"]
loop = asyncio.get_event_loop()
if context["isgroup"]:
receiver = asyncio.run_coroutine_threadsafe(self.bot.Room.find(receiver_id), loop).result()
else:
receiver = asyncio.run_coroutine_threadsafe(self.bot.Contact.find(receiver_id), loop).result()
msg = None
if reply.type == ReplyType.TEXT:
msg = reply.content
asyncio.run_coroutine_threadsafe(receiver.say(msg), loop).result()
logger.info("[WX] sendMsg={}, receiver={}".format(reply, receiver))
elif reply.type == ReplyType.ERROR or reply.type == ReplyType.INFO:
msg = reply.content
asyncio.run_coroutine_threadsafe(receiver.say(msg), loop).result()
logger.info("[WX] sendMsg={}, receiver={}".format(reply, receiver))
elif reply.type == ReplyType.VOICE:
voiceLength = None
file_path = reply.content
sil_file = os.path.splitext(file_path)[0] + ".sil"
voiceLength = int(any_to_sil(file_path, sil_file))
if voiceLength >= 60000:
logger.info("[WX] voice too long, length={}, set to 60s".format(voiceLength))
voiceLength = 60000
# Send the voice message
t = int(time.time())
msg = FileBox.from_file(sil_file, name=str(t) + ".sil")
if voiceLength is not None:
msg.metadata["voiceLength"] = voiceLength
asyncio.run_coroutine_threadsafe(receiver.say(msg), loop).result()
try:
os.remove(file_path)
if sil_file != file_path:
os.remove(sil_file)
except Exception as e:
pass
logger.info("[WX] sendVoice={}, receiver={}".format(reply.content, receiver))
elif reply.type == ReplyType.IMAGE_URL: # Download the image from the network
img_url = reply.content
t = int(time.time())
msg = FileBox.from_url(url=img_url, name=str(t) + ".png")
asyncio.run_coroutine_threadsafe(receiver.say(msg), loop).result()
logger.info("[WX] sendImage url={}, receiver={}".format(img_url, receiver))
elif reply.type == ReplyType.IMAGE: # Read the image from a file
image_storage = reply.content
image_storage.seek(0)
t = int(time.time())
msg = FileBox.from_base64(base64.b64encode(image_storage.read()), str(t) + ".png")
asyncio.run_coroutine_threadsafe(receiver.say(msg), loop).result()
logger.info("[WX] sendImage, receiver={}".format(receiver))
async def on_message(self, msg: Message):
"""
listen for message event
"""
try:
cmsg = await WechatyMessage(msg)
except NotImplementedError as e:
logger.debug("[WX] {}".format(e))
return
except Exception as e:
logger.exception("[WX] {}".format(e))
return
logger.debug("[WX] message:{}".format(cmsg))
room = msg.room() # The group chat the message came from; None if it is not a group message
isgroup = room is not None
ctype = cmsg.ctype
context = self._compose_context(ctype, cmsg.content, isgroup=isgroup, msg=cmsg)
if context:
logger.info("[WX] receiveMsg={}, context={}".format(cmsg, context))
self.produce(context)
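
`send()` above runs on a handler thread and hands each coroutine to the bot's event loop with `asyncio.run_coroutine_threadsafe`, then blocks on `.result()`. A minimal sketch of that hand-off, assuming a stand-in coroutine `say` in place of `receiver.say(msg)`; `send_from_thread` and `main` are hypothetical names that do not appear in the channel.

```python
import asyncio
import threading


async def say(text: str) -> str:
    """Stand-in for an awaitable such as receiver.say(msg)."""
    await asyncio.sleep(0.01)
    return "sent: " + text


def send_from_thread(loop: asyncio.AbstractEventLoop, text: str, out: list) -> None:
    # Schedule the coroutine on the loop owned by another thread and
    # block this worker thread until the result is available.
    future = asyncio.run_coroutine_threadsafe(say(text), loop)
    out.append(future.result(timeout=5))


async def main() -> list:
    loop = asyncio.get_running_loop()
    results: list = []
    worker = threading.Thread(target=send_from_thread, args=(loop, "hello", results))
    worker.start()
    # Wait for the worker in an executor so the event loop stays free to run say().
    await loop.run_in_executor(None, worker.join)
    return results


results = asyncio.run(main())
print(results)
```

The worker must never be joined directly on the loop thread: the loop has to keep running for the scheduled coroutine to complete.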

View File

@@ -1,89 +0,0 @@
import asyncio
import re
from wechaty import MessageType
from wechaty.user import Message
from bridge.context import ContextType
from channel.chat_message import ChatMessage
from common.log import logger
from common.tmp_dir import TmpDir
class aobject(object):
    """Inheriting this class allows you to define an async __init__.
    So you can create objects by doing something like `await MyClass(params)`
    """
    async def __new__(cls, *a, **kw):
        instance = super().__new__(cls)
        await instance.__init__(*a, **kw)
        return instance
    async def __init__(self):
        pass
class WechatyMessage(ChatMessage, aobject):
async def __init__(self, wechaty_msg: Message):
super().__init__(wechaty_msg)
room = wechaty_msg.room()
self.msg_id = wechaty_msg.message_id
self.create_time = wechaty_msg.payload.timestamp
self.is_group = room is not None
if wechaty_msg.type() == MessageType.MESSAGE_TYPE_TEXT:
self.ctype = ContextType.TEXT
self.content = wechaty_msg.text()
elif wechaty_msg.type() == MessageType.MESSAGE_TYPE_AUDIO:
self.ctype = ContextType.VOICE
voice_file = await wechaty_msg.to_file_box()
self.content = TmpDir().path() + voice_file.name # content stores the path in the temp directory
def func():
loop = asyncio.get_event_loop()
asyncio.run_coroutine_threadsafe(voice_file.to_file(self.content), loop).result()
self._prepare_fn = func
else:
raise NotImplementedError("Unsupported message type: {}".format(wechaty_msg.type()))
from_contact = wechaty_msg.talker() # Sender of the message
self.from_user_id = from_contact.contact_id
self.from_user_nickname = from_contact.name
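# In group chats, from and to mean different things in wechaty and itchat
# wechaty: from is the actual sender of the message, to is the group
# itchat: if you send a group message, from and to are yourself and the group; if someone else sends one, from and to are the group and yourself
# This difference does not affect the logic: in groups only two things are used, (1) from, to tell whether the message was sent by yourself, and (2) actual_user_id, to identify the actual sender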
if self.is_group:
self.to_user_id = room.room_id
self.to_user_nickname = await room.topic()
else:
to_contact = wechaty_msg.to()
self.to_user_id = to_contact.contact_id
self.to_user_nickname = to_contact.name
if self.is_group or wechaty_msg.is_self(): # For group messages other_user is the group; for private messages sent by yourself it is the other party.
self.other_user_id = self.to_user_id
self.other_user_nickname = self.to_user_nickname
else:
self.other_user_id = self.from_user_id
self.other_user_nickname = self.from_user_nickname
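if self.is_group: # In wechaty group chats the actual sender is from_user
self.is_at = await wechaty_msg.mention_self()
if not self.is_at: # Copy-pasted messages sometimes do not count as an @ mention even though the content contains @xxx; handle that case here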
name = wechaty_msg.wechaty.user_self().name
pattern = f"@{re.escape(name)}(\u2005|\u0020)"
if re.search(pattern, self.content):
logger.debug(f"wechaty message {self.msg_id} include at")
self.is_at = True
self.actual_user_id = self.from_user_id
self.actual_user_nickname = self.from_user_nickname
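
The fallback above treats a message as an @ mention when the text contains `@<bot name>` followed by either U+2005 (the four-per-em space) or a plain space. A minimal sketch of that check in isolation; the function name `mentions` and the sample strings are assumptions for illustration.

```python
import re


def mentions(name: str, content: str) -> bool:
    """True if content contains @name followed by U+2005 or a plain space."""
    # re.escape keeps characters such as "." in the nickname from acting as regex syntax
    pattern = f"@{re.escape(name)}(\u2005|\u0020)"
    return re.search(pattern, content) is not None


print(mentions("bot", "@bot\u2005hello"))
print(mentions("bot", "@bot hello"))
print(mentions("bot", "@bot"))
```

Because the pattern requires a trailing separator, a message that ends immediately after the name does not match.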

View File

@@ -1,6 +1,6 @@
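# WeChat Official Account channel
Because logging in to a personal WeChat account on a server through itchat carries a risk of the account being banned, a WeChat Official Account channel was added to provide a risk-free service.
The WeChat Official Account channel provides a stable service.
Two types of Official Accounts are currently supported: subscription accounts and service accounts. Both support text interaction as well as voice and image input. Subscription accounts registered by individuals cannot pass WeChat verification, so they are subject to a reply time limit and to daily limits on the number of image and voice replies.
## Usage (subscription account; service accounts are similar)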

View File

@@ -0,0 +1,767 @@
"""
WeCom (企业微信) AI Bot channel via WebSocket long connection.
Supports:
- Single chat and group chat (text / image / file input & output)
- Scheduled task push via aibot_send_msg
- Heartbeat keep-alive and auto-reconnect
"""
import base64
import hashlib
import json
import math
import os
import threading
import time
import uuid
import requests
import websocket
from bridge.context import Context, ContextType
from bridge.reply import Reply, ReplyType
from channel.chat_channel import ChatChannel, check_prefix
from channel.wecom_bot.wecom_bot_message import WecomBotMessage
from common.expired_dict import ExpiredDict
from common.log import logger
from common.singleton import singleton
from config import conf
WECOM_WS_URL = "wss://openws.work.weixin.qq.com"
HEARTBEAT_INTERVAL = 30
MEDIA_CHUNK_SIZE = 512 * 1024 # 512KB per chunk (before base64 encoding)
@singleton
class WecomBotChannel(ChatChannel):
def __init__(self):
super().__init__()
self.bot_id = ""
self.bot_secret = ""
self.received_msgs = ExpiredDict(60 * 60 * 7.1)
self._ws = None
self._ws_thread = None
self._heartbeat_thread = None
self._connected = False
self._stop_event = threading.Event()
self._pending_responses = {} # req_id -> (threading.Event, result_holder)
self._pending_lock = threading.Lock()
self._stream_states = {} # req_id -> {"stream_id": str, "committed": str, "current": str}
conf()["group_name_white_list"] = ["ALL_GROUP"]
conf()["single_chat_prefix"] = [""]
# ------------------------------------------------------------------
# Lifecycle
# ------------------------------------------------------------------
def startup(self):
self.bot_id = conf().get("wecom_bot_id", "")
self.bot_secret = conf().get("wecom_bot_secret", "")
if not self.bot_id or not self.bot_secret:
err = "[WecomBot] wecom_bot_id and wecom_bot_secret are required"
logger.error(err)
self.report_startup_error(err)
return
self._stop_event.clear()
self._start_ws()
def stop(self):
logger.info("[WecomBot] stop() called")
self._stop_event.set()
if self._ws:
try:
self._ws.close()
except Exception:
pass
self._ws = None
self._connected = False
# ------------------------------------------------------------------
# WebSocket connection
# ------------------------------------------------------------------
def _start_ws(self):
def _on_open(ws):
logger.info("[WecomBot] WebSocket connected, sending subscribe...")
self._send_subscribe()
def _on_message(ws, raw):
try:
data = json.loads(raw)
self._handle_ws_message(data)
except Exception as e:
logger.error(f"[WecomBot] Failed to handle ws message: {e}", exc_info=True)
def _on_error(ws, error):
logger.error(f"[WecomBot] WebSocket error: {error}")
def _on_close(ws, close_status_code, close_msg):
logger.warning(f"[WecomBot] WebSocket closed: status={close_status_code}, msg={close_msg}")
self._connected = False
if not self._stop_event.is_set():
logger.info("[WecomBot] Will reconnect in 5s...")
time.sleep(5)
if not self._stop_event.is_set():
self._start_ws()
self._ws = websocket.WebSocketApp(
WECOM_WS_URL,
on_open=_on_open,
on_message=_on_message,
on_error=_on_error,
on_close=_on_close,
)
def run_forever():
try:
self._ws.run_forever(ping_interval=0, reconnect=0)
except (SystemExit, KeyboardInterrupt):
logger.info("[WecomBot] WebSocket thread interrupted")
except Exception as e:
logger.error(f"[WecomBot] WebSocket run_forever error: {e}")
self._ws_thread = threading.Thread(target=run_forever, daemon=True)
self._ws_thread.start()
self._ws_thread.join()
def _ws_send(self, data: dict):
if self._ws:
self._ws.send(json.dumps(data, ensure_ascii=False))
def _gen_req_id(self) -> str:
return uuid.uuid4().hex[:16]
# ------------------------------------------------------------------
# Subscribe & heartbeat
# ------------------------------------------------------------------
def _send_subscribe(self):
self._ws_send({
"cmd": "aibot_subscribe",
"headers": {"req_id": self._gen_req_id()},
"body": {
"bot_id": self.bot_id,
"secret": self.bot_secret,
},
})
def _start_heartbeat(self):
if self._heartbeat_thread and self._heartbeat_thread.is_alive():
return
def heartbeat_loop():
while not self._stop_event.is_set() and self._connected:
try:
self._ws_send({
"cmd": "ping",
"headers": {"req_id": self._gen_req_id()},
})
except Exception as e:
logger.warning(f"[WecomBot] Heartbeat send failed: {e}")
break
self._stop_event.wait(HEARTBEAT_INTERVAL)
self._heartbeat_thread = threading.Thread(target=heartbeat_loop, daemon=True)
self._heartbeat_thread.start()
# ------------------------------------------------------------------
# Incoming message dispatch
# ------------------------------------------------------------------
    def _send_and_wait(self, data: dict, timeout: float = 15) -> dict:
        """Send a ws message and wait for the matching response by req_id."""
        req_id = data.get("headers", {}).get("req_id", "")
        event = threading.Event()
        holder = {"data": None}
        with self._pending_lock:
            self._pending_responses[req_id] = (event, holder)
        self._ws_send(data)
        event.wait(timeout=timeout)
        with self._pending_lock:
            self._pending_responses.pop(req_id, None)
        return holder["data"] or {}
    def _handle_ws_message(self, data: dict):
        cmd = data.get("cmd", "")
        errcode = data.get("errcode")
        req_id = data.get("headers", {}).get("req_id", "")
        # Check if this is a response to a pending request
        if req_id:
            with self._pending_lock:
                pending = self._pending_responses.get(req_id)
            if pending:
                event, holder = pending
                holder["data"] = data
                event.set()
                return
        # Subscribe response (only handle once before connected)
        if errcode is not None and cmd == "":
            if not self._connected:
                if errcode == 0:
                    logger.info("[WecomBot] ✅ Subscribe success")
                    self._connected = True
                    self._start_heartbeat()
                    self.report_startup_success()
                else:
                    errmsg = data.get("errmsg", "unknown error")
                    logger.error(f"[WecomBot] Subscribe failed: errcode={errcode}, errmsg={errmsg}")
                    self.report_startup_error(errmsg)
                return
        if cmd == "aibot_msg_callback":
            self._handle_msg_callback(data)
        elif cmd == "aibot_event_callback":
            self._handle_event_callback(data)
        elif cmd == "":
            if errcode and errcode != 0:
                logger.warning(f"[WecomBot] Response error: {data}")
# ------------------------------------------------------------------
# Message callback
# ------------------------------------------------------------------
def _handle_msg_callback(self, data: dict):
body = data.get("body", {})
req_id = data.get("headers", {}).get("req_id", "")
msg_id = body.get("msgid", "")
if self.received_msgs.get(msg_id):
logger.debug(f"[WecomBot] Duplicate msg filtered: {msg_id}")
return
self.received_msgs[msg_id] = True
chattype = body.get("chattype", "single")
is_group = chattype == "group"
try:
wecom_msg = WecomBotMessage(body, is_group=is_group)
except NotImplementedError as e:
logger.warning(f"[WecomBot] {e}")
return
except Exception as e:
logger.error(f"[WecomBot] Failed to parse message: {e}", exc_info=True)
return
wecom_msg.req_id = req_id
# File cache logic (same pattern as feishu)
from channel.file_cache import get_file_cache
file_cache = get_file_cache()
if is_group:
if conf().get("group_shared_session", True):
session_id = body.get("chatid", "")
else:
session_id = wecom_msg.from_user_id + "_" + body.get("chatid", "")
else:
session_id = wecom_msg.from_user_id
if wecom_msg.ctype == ContextType.IMAGE:
if hasattr(wecom_msg, "image_path") and wecom_msg.image_path:
file_cache.add(session_id, wecom_msg.image_path, file_type="image")
logger.info(f"[WecomBot] Image cached for session {session_id}")
return
if wecom_msg.ctype == ContextType.FILE:
wecom_msg.prepare()
file_cache.add(session_id, wecom_msg.content, file_type="file")
logger.info(f"[WecomBot] File cached for session {session_id}: {wecom_msg.content}")
return
if wecom_msg.ctype == ContextType.TEXT:
cached_files = file_cache.get(session_id)
if cached_files:
file_refs = []
for fi in cached_files:
ftype = fi["type"]
fpath = fi["path"]
if ftype == "image":
file_refs.append(f"[图片: {fpath}]")
elif ftype == "video":
file_refs.append(f"[视频: {fpath}]")
else:
file_refs.append(f"[文件: {fpath}]")
wecom_msg.content = wecom_msg.content + "\n" + "\n".join(file_refs)
logger.info(f"[WecomBot] Attached {len(cached_files)} cached file(s)")
file_cache.clear(session_id)
context = self._compose_context(
wecom_msg.ctype,
wecom_msg.content,
isgroup=is_group,
msg=wecom_msg,
no_need_at=True,
)
if context:
if req_id:
context["on_event"] = self._make_stream_callback(req_id)
self.produce(context)
# ------------------------------------------------------------------
# Event callback
# ------------------------------------------------------------------
def _handle_event_callback(self, data: dict):
body = data.get("body", {})
event = body.get("event", {})
event_type = event.get("eventtype", "")
if event_type == "enter_chat":
logger.info(f"[WecomBot] User entered chat: {body.get('from', {}).get('userid')}")
elif event_type == "disconnected_event":
logger.warning("[WecomBot] Received disconnected_event, another connection took over")
else:
logger.debug(f"[WecomBot] Event: {event_type}")
# ------------------------------------------------------------------
# Stream callback (for agent on_event)
# ------------------------------------------------------------------
    def _make_stream_callback(self, req_id: str):
        """Build an on_event callback that pushes agent stream deltas to wecom via stream message.
        All intermediate segments (thinking before tool calls) and the final answer
        are accumulated into a single stream message, separated by '---'.
        """
        stream_id = uuid.uuid4().hex[:16]
        self._stream_states[req_id] = {
            "stream_id": stream_id,
            "committed": "",  # finalized content from previous segments
            "current": "",  # current segment being streamed
        }
        def _push_stream(state: dict):
            """Push current stream content to wecom."""
            self._ws_send({
                "cmd": "aibot_respond_msg",
                "headers": {"req_id": req_id},
                "body": {
                    "msgtype": "stream",
                    "stream": {
                        "id": state["stream_id"],
                        "finish": False,
                        "content": state["committed"] + state["current"],
                    },
                },
            })
        def on_event(event: dict):
            event_type = event.get("type")
            data = event.get("data", {})
            state = self._stream_states.get(req_id)
            if not state:
                return
            if event_type == "turn_start":
                state["current"] = ""
            elif event_type == "message_update":
                delta = data.get("delta", "")
                if delta:
                    state["current"] += delta
                    _push_stream(state)
            elif event_type == "message_end":
                tool_calls = data.get("tool_calls", [])
                if tool_calls:
                    if state["current"].strip():
                        state["committed"] += state["current"].strip() + "\n\n---\n\n"
                    state["current"] = ""
                else:
                    state["committed"] += state["current"]
                    state["current"] = ""
        return on_event
# ------------------------------------------------------------------
# _compose_context (same pattern as feishu)
# ------------------------------------------------------------------
def _compose_context(self, ctype: ContextType, content, **kwargs):
context = Context(ctype, content)
context.kwargs = kwargs
if "channel_type" not in context:
context["channel_type"] = self.channel_type
if "origin_ctype" not in context:
context["origin_ctype"] = ctype
cmsg = context["msg"]
if cmsg.is_group:
if conf().get("group_shared_session", True):
context["session_id"] = cmsg.other_user_id
else:
context["session_id"] = f"{cmsg.from_user_id}:{cmsg.other_user_id}"
else:
context["session_id"] = cmsg.from_user_id
context["receiver"] = cmsg.other_user_id
if ctype == ContextType.TEXT:
img_match_prefix = check_prefix(content, conf().get("image_create_prefix"))
if img_match_prefix:
content = content.replace(img_match_prefix, "", 1)
context.type = ContextType.IMAGE_CREATE
else:
context.type = ContextType.TEXT
context.content = content.strip()
return context
# ------------------------------------------------------------------
# Send reply
# ------------------------------------------------------------------
def send(self, reply: Reply, context: Context):
msg = context.get("msg")
is_group = context.get("isgroup", False)
receiver = context.get("receiver", "")
# Determine req_id for responding or use send_msg for scheduled push
req_id = getattr(msg, "req_id", None) if msg else None
if reply.type == ReplyType.TEXT:
self._send_text(reply.content, receiver, is_group, req_id)
elif reply.type in (ReplyType.IMAGE_URL, ReplyType.IMAGE):
self._send_image(reply.content, receiver, is_group, req_id)
elif reply.type == ReplyType.FILE:
if hasattr(reply, "text_content") and reply.text_content:
self._send_text(reply.text_content, receiver, is_group, req_id)
time.sleep(0.3)
self._send_file(reply.content, receiver, is_group, req_id)
elif reply.type == ReplyType.VIDEO or reply.type == ReplyType.VIDEO_URL:
self._send_file(reply.content, receiver, is_group, req_id, media_type="video")
else:
logger.warning(f"[WecomBot] Unsupported reply type: {reply.type}, falling back to text")
self._send_text(str(reply.content), receiver, is_group, req_id)
# ------------------------------------------------------------------
# Respond message (via websocket)
# ------------------------------------------------------------------
    def _send_text(self, content: str, receiver: str, is_group: bool, req_id: str = None):
        """Send text/markdown reply. Reuses stream state if available (streaming mode)."""
        if req_id:
            state = self._stream_states.pop(req_id, None)
            if state:
                # Fall back to the reply text if nothing was streamed for this request
                final_content = state["committed"] or content
                stream_id = state["stream_id"]
            else:
                final_content = content
                stream_id = uuid.uuid4().hex[:16]
            self._ws_send({
                "cmd": "aibot_respond_msg",
                "headers": {"req_id": req_id},
                "body": {
                    "msgtype": "stream",
                    "stream": {
                        "id": stream_id,
                        "finish": True,
                        "content": final_content,
                    },
                },
            })
        else:
            self._active_send_markdown(content, receiver, is_group)
def _send_image(self, img_path_or_url: str, receiver: str, is_group: bool, req_id: str = None):
"""Send image reply. Converts to JPG/PNG and compresses if >2MB."""
local_path = img_path_or_url
if local_path.startswith("file://"):
local_path = local_path[7:]
if local_path.startswith(("http://", "https://")):
try:
resp = requests.get(local_path, timeout=30)
resp.raise_for_status()
ct = resp.headers.get("Content-Type", "")
if "jpeg" in ct or "jpg" in ct:
ext = ".jpg"
elif "webp" in ct:
ext = ".webp"
elif "gif" in ct:
ext = ".gif"
else:
ext = ".png"
tmp_path = f"/tmp/wecom_img_{uuid.uuid4().hex[:8]}{ext}"
with open(tmp_path, "wb") as f:
f.write(resp.content)
logger.info(f"[WecomBot] Image downloaded: size={len(resp.content)}, "
f"content-type={ct}, path={tmp_path}")
local_path = tmp_path
except Exception as e:
logger.error(f"[WecomBot] Failed to download image for sending: {e}")
self._send_text("[Image send failed]", receiver, is_group, req_id)
return
if not os.path.exists(local_path):
logger.error(f"[WecomBot] Image file not found: {local_path}")
return
max_image_size = 2 * 1024 * 1024 # 2MB limit for image upload
local_path = self._ensure_image_format(local_path)
if not local_path:
self._send_text("[Image format conversion failed]", receiver, is_group, req_id)
return
if os.path.getsize(local_path) > max_image_size:
local_path = self._compress_image(local_path, max_image_size)
if not local_path:
self._send_text("[Image too large]", receiver, is_group, req_id)
return
file_size = os.path.getsize(local_path)
logger.info(f"[WecomBot] Uploading image: path={local_path}, size={file_size} bytes")
media_id = self._upload_media(local_path, "image")
if not media_id:
logger.error("[WecomBot] Failed to upload image")
self._send_text("[Image upload failed]", receiver, is_group, req_id)
return
if req_id:
self._ws_send({
"cmd": "aibot_respond_msg",
"headers": {"req_id": req_id},
"body": {
"msgtype": "image",
"image": {"media_id": media_id},
},
})
else:
self._ws_send({
"cmd": "aibot_send_msg",
"headers": {"req_id": self._gen_req_id()},
"body": {
"chatid": receiver,
"chat_type": 2 if is_group else 1,
"msgtype": "image",
"image": {"media_id": media_id},
},
})
@staticmethod
def _ensure_image_format(file_path: str) -> str:
"""Ensure image is JPG or PNG (the only formats wecom supports). Convert if needed."""
try:
from PIL import Image
img = Image.open(file_path)
fmt = (img.format or "").upper()
if fmt in ("JPEG", "PNG"):
# Already a supported format, but make sure the filename extension matches
ext = os.path.splitext(file_path)[1].lower()
if fmt == "JPEG" and ext in (".jpg", ".jpeg"):
return file_path
if fmt == "PNG" and ext == ".png":
return file_path
# Extension doesn't match — rename/copy with correct extension
correct_ext = ".jpg" if fmt == "JPEG" else ".png"
out_path = f"/tmp/wecom_fmt_{uuid.uuid4().hex[:8]}{correct_ext}"
img.save(out_path, fmt)
logger.info(f"[WecomBot] Image renamed: {file_path} -> {out_path} ({fmt})")
return out_path
# Unsupported format (WebP, GIF, BMP, etc.) — convert to PNG
if img.mode == "RGBA":
out_path = f"/tmp/wecom_fmt_{uuid.uuid4().hex[:8]}.png"
img.save(out_path, "PNG")
else:
out_path = f"/tmp/wecom_fmt_{uuid.uuid4().hex[:8]}.jpg"
img.convert("RGB").save(out_path, "JPEG", quality=90)
logger.info(f"[WecomBot] Image converted from {fmt} -> {out_path}")
return out_path
except Exception as e:
logger.error(f"[WecomBot] Image format check failed: {e}")
return file_path
@staticmethod
def _compress_image(file_path: str, max_bytes: int) -> str:
"""Compress image to fit within max_bytes. Returns new path or empty string."""
try:
from PIL import Image
img = Image.open(file_path)
if img.mode == "RGBA":
img = img.convert("RGB")
out_path = f"/tmp/wecom_compressed_{uuid.uuid4().hex[:8]}.jpg"
quality = 85
while quality >= 30:
img.save(out_path, "JPEG", quality=quality, optimize=True)
if os.path.getsize(out_path) <= max_bytes:
logger.info(f"[WecomBot] Image compressed: quality={quality}, "
f"size={os.path.getsize(out_path)} bytes")
return out_path
quality -= 10
# Still too large — resize
ratio = (max_bytes / os.path.getsize(out_path)) ** 0.5
new_size = (int(img.width * ratio), int(img.height * ratio))
img = img.resize(new_size, Image.LANCZOS)
img.save(out_path, "JPEG", quality=70, optimize=True)
if os.path.getsize(out_path) <= max_bytes:
logger.info(f"[WecomBot] Image compressed with resize: {new_size}, "
f"size={os.path.getsize(out_path)} bytes")
return out_path
logger.error(f"[WecomBot] Cannot compress image below {max_bytes} bytes")
return ""
except Exception as e:
logger.error(f"[WecomBot] Image compression failed: {e}")
return ""
def _send_file(self, file_path: str, receiver: str, is_group: bool,
req_id: str = None, media_type: str = "file"):
"""Send file/video reply by uploading media first."""
local_path = file_path
if local_path.startswith("file://"):
local_path = local_path[7:]
if local_path.startswith(("http://", "https://")):
try:
resp = requests.get(local_path, timeout=60)
resp.raise_for_status()
ext = os.path.splitext(local_path)[1] or ".bin"
tmp_path = f"/tmp/wecom_file_{uuid.uuid4().hex[:8]}{ext}"
with open(tmp_path, "wb") as f:
f.write(resp.content)
local_path = tmp_path
except Exception as e:
logger.error(f"[WecomBot] Failed to download file for sending: {e}")
return
if not os.path.exists(local_path):
logger.error(f"[WecomBot] File not found: {local_path}")
return
media_id = self._upload_media(local_path, media_type)
if not media_id:
logger.error(f"[WecomBot] Failed to upload {media_type}")
return
if req_id:
self._ws_send({
"cmd": "aibot_respond_msg",
"headers": {"req_id": req_id},
"body": {
"msgtype": media_type,
media_type: {"media_id": media_id},
},
})
else:
self._ws_send({
"cmd": "aibot_send_msg",
"headers": {"req_id": self._gen_req_id()},
"body": {
"chatid": receiver,
"chat_type": 2 if is_group else 1,
"msgtype": media_type,
media_type: {"media_id": media_id},
},
})
def _active_send_markdown(self, content: str, receiver: str, is_group: bool):
"""Proactively send markdown message (for scheduled tasks, no req_id)."""
self._ws_send({
"cmd": "aibot_send_msg",
"headers": {"req_id": self._gen_req_id()},
"body": {
"chatid": receiver,
"chat_type": 2 if is_group else 1,
"msgtype": "markdown",
"markdown": {"content": content},
},
})
# ------------------------------------------------------------------
# Media upload (chunked)
# ------------------------------------------------------------------
def _upload_media(self, file_path: str, media_type: str = "file") -> str:
"""
Upload a local file to wecom bot via chunked upload protocol.
Returns media_id on success, empty string on failure.
"""
if not os.path.exists(file_path):
logger.error(f"[WecomBot] Upload file not found: {file_path}")
return ""
file_size = os.path.getsize(file_path)
if file_size < 5:
logger.error(f"[WecomBot] File too small: {file_size} bytes")
return ""
filename = os.path.basename(file_path)
total_chunks = math.ceil(file_size / MEDIA_CHUNK_SIZE)
if total_chunks > 100:
logger.error(f"[WecomBot] Too many chunks: {total_chunks} > 100")
return ""
file_md5 = hashlib.md5()
with open(file_path, "rb") as f:
for block in iter(lambda: f.read(8192), b""):
file_md5.update(block)
md5_hex = file_md5.hexdigest()
# 1. Init upload
init_resp = self._send_and_wait({
"cmd": "aibot_upload_media_init",
"headers": {"req_id": self._gen_req_id()},
"body": {
"type": media_type,
"filename": filename,
"total_size": file_size,
"total_chunks": total_chunks,
"md5": md5_hex,
},
}, timeout=15)
if init_resp.get("errcode") != 0:
logger.error(f"[WecomBot] Upload init failed: {init_resp}")
return ""
upload_id = init_resp.get("body", {}).get("upload_id")
if not upload_id:
logger.error("[WecomBot] Failed to get upload_id")
return ""
# 2. Upload chunks
with open(file_path, "rb") as f:
for idx in range(total_chunks):
chunk = f.read(MEDIA_CHUNK_SIZE)
b64_data = base64.b64encode(chunk).decode("utf-8")
chunk_resp = self._send_and_wait({
"cmd": "aibot_upload_media_chunk",
"headers": {"req_id": self._gen_req_id()},
"body": {
"upload_id": upload_id,
"chunk_index": idx,
"base64_data": b64_data,
},
}, timeout=30)
if chunk_resp.get("errcode") != 0:
logger.error(f"[WecomBot] Chunk {idx} upload failed: {chunk_resp}")
return ""
# 3. Finish upload
finish_resp = self._send_and_wait({
"cmd": "aibot_upload_media_finish",
"headers": {"req_id": self._gen_req_id()},
"body": {"upload_id": upload_id},
}, timeout=30)
if finish_resp.get("errcode") != 0:
logger.error(f"[WecomBot] Upload finish failed: {finish_resp}")
return ""
media_id = finish_resp.get("body", {}).get("media_id", "")
if media_id:
logger.info(f"[WecomBot] Media uploaded: media_id={media_id}")
else:
logger.error("[WecomBot] Failed to get media_id from finish response")
return media_id
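
`_upload_media` above splits the file into 512KB pieces, sends an init request carrying the total size, chunk count and MD5, and then sends each piece base64-encoded with its index. A minimal sketch of just that bookkeeping, with no network traffic: `plan_chunks` is a hypothetical helper, it works on an in-memory payload rather than a file, and it leaves out `type`, `filename` and `upload_id` (the last of which the source obtains from the init response). The field names are the ones used in the source.

```python
import base64
import hashlib
import math

CHUNK_SIZE = 512 * 1024  # same value as MEDIA_CHUNK_SIZE in the channel


def plan_chunks(payload: bytes, chunk_size: int = CHUNK_SIZE) -> dict:
    """Build the init fields and the per-chunk fields for an in-memory payload."""
    total_chunks = math.ceil(len(payload) / chunk_size)
    init_body = {
        "total_size": len(payload),
        "total_chunks": total_chunks,
        "md5": hashlib.md5(payload).hexdigest(),
    }
    chunks = [
        {
            "chunk_index": idx,
            # Each slice is encoded separately, so the size limit applies before base64
            "base64_data": base64.b64encode(
                payload[idx * chunk_size:(idx + 1) * chunk_size]
            ).decode("utf-8"),
        }
        for idx in range(total_chunks)
    ]
    return {"init": init_body, "chunks": chunks}


# A 10-byte payload with a 4-byte chunk size yields slices of 4, 4 and 2 bytes.
plan = plan_chunks(b"x" * 10, chunk_size=4)
print(plan["init"]["total_chunks"], [c["chunk_index"] for c in plan["chunks"]])
```

With the real 512KB chunk size and the 100-chunk cap checked in the source, the largest file this scheme accepts is 50MB.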

View File

@@ -0,0 +1,216 @@
import os
import re
import base64
import requests
from bridge.context import ContextType
from channel.chat_message import ChatMessage
from common.log import logger
from common.utils import expand_path
from config import conf
from Crypto.Cipher import AES
MAGIC_SIGNATURES = [
    (b"%PDF", ".pdf"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"RIFF", ".webp"),  # RIFF....WEBP, further checked below
    (b"PK\x03\x04", ".zip"),  # zip / docx / xlsx / pptx
    (b"\x1f\x8b", ".gz"),
    (b"Rar!\x1a\x07", ".rar"),
    (b"7z\xbc\xaf\x27\x1c", ".7z"),
    (b"\x00\x00\x00", ".mp4"),  # ftyp box, further checked below
    (b"#!AMR", ".amr"),
]
OFFICE_ZIP_MARKERS = {
    b"word/": ".docx",
    b"xl/": ".xlsx",
    b"ppt/": ".pptx",
}
def _guess_ext_from_bytes(data: bytes) -> str:
    """Guess file extension from file content magic bytes."""
    if not data or len(data) < 8:
        return ""
    for sig, ext in MAGIC_SIGNATURES:
        if data[:len(sig)] == sig:
            if ext == ".webp" and data[8:12] != b"WEBP":
                continue
            if ext == ".mp4":
                if b"ftyp" not in data[4:12]:
                    continue
            if ext == ".zip":
                for marker, office_ext in OFFICE_ZIP_MARKERS.items():
                    if marker in data[:2000]:
                        return office_ext
                return ".zip"
            return ext
    return ""
def _decrypt_media(url: str, aeskey: str) -> bytes:
"""
Download and decrypt AES-256-CBC encrypted media from wecom bot.
Returns decrypted bytes.
"""
resp = requests.get(url, timeout=30)
resp.raise_for_status()
encrypted = resp.content
key = base64.b64decode(aeskey + "=" * (-len(aeskey) % 4))
if len(key) != 32:
raise ValueError(f"Invalid AES key length: {len(key)}, expected 32")
iv = key[:16]
cipher = AES.new(key, AES.MODE_CBC, iv)
decrypted = cipher.decrypt(encrypted)
pad_len = decrypted[-1]
if pad_len < 1 or pad_len > 32:
raise ValueError(f"Invalid PKCS7 padding length: {pad_len}")
return decrypted[:-pad_len]
def _get_tmp_dir() -> str:
"""Return the workspace tmp directory (absolute path), creating it if needed."""
ws_root = expand_path(conf().get("agent_workspace", "~/cow"))
tmp_dir = os.path.join(ws_root, "tmp")
os.makedirs(tmp_dir, exist_ok=True)
return tmp_dir
class WecomBotMessage(ChatMessage):
"""Message wrapper for wecom bot (websocket long-connection mode)."""
def __init__(self, msg_body: dict, is_group: bool = False):
super().__init__(msg_body)
self.msg_id = msg_body.get("msgid")
self.create_time = msg_body.get("create_time")
self.is_group = is_group
msg_type = msg_body.get("msgtype")
from_userid = msg_body.get("from", {}).get("userid", "")
chat_id = msg_body.get("chatid", "")
bot_id = msg_body.get("aibotid", "")
if msg_type == "text":
self.ctype = ContextType.TEXT
content = msg_body.get("text", {}).get("content", "")
if is_group:
content = re.sub(r"@\S+\s*", "", content).strip()
self.content = content
elif msg_type == "voice":
self.ctype = ContextType.TEXT
self.content = msg_body.get("voice", {}).get("content", "")
elif msg_type == "image":
self.ctype = ContextType.IMAGE
image_info = msg_body.get("image", {})
image_url = image_info.get("url", "")
aeskey = image_info.get("aeskey", "")
tmp_dir = _get_tmp_dir()
image_path = os.path.join(tmp_dir, f"wecom_{self.msg_id}.png")
try:
data = _decrypt_media(image_url, aeskey)
with open(image_path, "wb") as f:
f.write(data)
self.content = image_path
self.image_path = image_path
logger.info(f"[WecomBot] Image downloaded: {image_path}")
except Exception as e:
logger.error(f"[WecomBot] Failed to download image: {e}")
self.content = "[Image download failed]"
self.image_path = None
elif msg_type == "mixed":
self.ctype = ContextType.TEXT
text_parts = []
image_paths = []
mixed_items = msg_body.get("mixed", {}).get("msg_item", [])
tmp_dir = _get_tmp_dir()
for idx, item in enumerate(mixed_items):
item_type = item.get("msgtype")
if item_type == "text":
txt = item.get("text", {}).get("content", "")
if is_group:
txt = re.sub(r"@\S+\s*", "", txt).strip()
if txt:
text_parts.append(txt)
elif item_type == "image":
img_info = item.get("image", {})
img_url = img_info.get("url", "")
img_aeskey = img_info.get("aeskey", "")
img_path = os.path.join(tmp_dir, f"wecom_{self.msg_id}_{idx}.png")
try:
img_data = _decrypt_media(img_url, img_aeskey)
with open(img_path, "wb") as f:
f.write(img_data)
image_paths.append(img_path)
except Exception as e:
logger.error(f"[WecomBot] Failed to download mixed image: {e}")
content_parts = text_parts[:]
for p in image_paths:
content_parts.append(f"[图片: {p}]")
self.content = "\n".join(content_parts) if content_parts else "[Mixed message]"
elif msg_type == "file":
self.ctype = ContextType.FILE
file_info = msg_body.get("file", {})
file_url = file_info.get("url", "")
aeskey = file_info.get("aeskey", "")
tmp_dir = _get_tmp_dir()
base_path = os.path.join(tmp_dir, f"wecom_{self.msg_id}")
self.content = base_path
def _download_file():
try:
data = _decrypt_media(file_url, aeskey)
ext = _guess_ext_from_bytes(data)
final_path = base_path + ext
with open(final_path, "wb") as f:
f.write(data)
self.content = final_path
logger.info(f"[WecomBot] File downloaded: {final_path}")
except Exception as e:
logger.error(f"[WecomBot] Failed to download file: {e}")
self._prepare_fn = _download_file
elif msg_type == "video":
self.ctype = ContextType.FILE
video_info = msg_body.get("video", {})
video_url = video_info.get("url", "")
aeskey = video_info.get("aeskey", "")
tmp_dir = _get_tmp_dir()
self.content = os.path.join(tmp_dir, f"wecom_{self.msg_id}.mp4")
def _download_video():
try:
data = _decrypt_media(video_url, aeskey)
with open(self.content, "wb") as f:
f.write(data)
logger.info(f"[WecomBot] Video downloaded: {self.content}")
except Exception as e:
logger.error(f"[WecomBot] Failed to download video: {e}")
self._prepare_fn = _download_video
else:
raise NotImplementedError(f"Unsupported message type: {msg_type}")
self.from_user_id = from_userid
self.to_user_id = bot_id
if is_group:
self.other_user_id = chat_id
self.actual_user_id = from_userid
self.actual_user_nickname = from_userid
else:
self.other_user_id = from_userid
self.actual_user_id = from_userid
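
The padding step at the end of `_decrypt_media` can be checked more strictly. A minimal sketch, assuming the payload uses standard PKCS#7 padding with a 32-byte block (matching the 32-byte upper bound checked in the code above); `pkcs7_unpad` is a hypothetical helper, not part of the project:

```python
def pkcs7_unpad(data: bytes, block_size: int = 32) -> bytes:
    """Strip PKCS#7 padding, raising ValueError if the padding is malformed."""
    if not data:
        raise ValueError("Cannot unpad empty data")
    pad_len = data[-1]
    # A valid pad length is between 1 and block_size and never exceeds the data.
    if pad_len < 1 or pad_len > block_size or pad_len > len(data):
        raise ValueError(f"Invalid PKCS7 padding length: {pad_len}")
    # Standard PKCS#7 repeats the pad length in every padding byte.
    if data[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("Inconsistent PKCS7 padding bytes")
    return data[:-pad_len]
```

Rejecting a pad length of zero matters because `data[:-0]` evaluates to an empty byte string, which would otherwise be written to disk as a zero-byte file.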


@@ -1,17 +0,0 @@
import os
import time
os.environ['ntwork_LOG'] = "ERROR"
import ntwork
wework = ntwork.WeWork()
def forever():
try:
while True:
time.sleep(0.1)
except KeyboardInterrupt:
ntwork.exit_()
os._exit(0)


@@ -1,326 +0,0 @@
import io
import os
import random
import tempfile
import threading
os.environ['ntwork_LOG'] = "ERROR"
import ntwork
import requests
import uuid
from bridge.context import *
from bridge.reply import *
from channel.chat_channel import ChatChannel
from channel.wework.wework_message import *
from channel.wework.wework_message import WeworkMessage
from common.singleton import singleton
from common.log import logger
from common.time_check import time_checker
from common.utils import compress_imgfile, fsize
from config import conf
from channel.wework.run import wework
from channel.wework import run
def get_wxid_by_name(room_members, group_wxid, name):
if group_wxid in room_members:
for member in room_members[group_wxid]['member_list']:
if member['room_nickname'] == name or member['username'] == name:
return member['user_id']
return None # Return None if no matching group_wxid or name is found
def download_and_compress_image(url, filename, quality=30):
# Determine the directory where the image is saved
directory = os.path.join(os.getcwd(), "tmp")
# Create the directory if it does not exist
if not os.path.exists(directory):
os.makedirs(directory)
# Download the image
pic_res = requests.get(url, stream=True)
image_storage = io.BytesIO()
for block in pic_res.iter_content(1024):
image_storage.write(block)
# Check the image size and compress it if necessary
sz = fsize(image_storage)
if sz >= 10 * 1024 * 1024: # image is 10 MB or larger
logger.info("[wework] image too large, ready to compress, sz={}".format(sz))
image_storage = compress_imgfile(image_storage, 10 * 1024 * 1024 - 1)
logger.info("[wework] image compressed, sz={}".format(fsize(image_storage)))
# Reset the in-memory buffer pointer to the start
image_storage.seek(0)
# Read and save the image
from PIL import Image
image = Image.open(image_storage)
image_path = os.path.join(directory, f"{filename}.png")
image.save(image_path, "png")
return image_path
def download_video(url, filename):
# Determine the directory where the video is saved
directory = os.path.join(os.getcwd(), "tmp")
# Create the directory if it does not exist
if not os.path.exists(directory):
os.makedirs(directory)
# Download the video
response = requests.get(url, stream=True)
total_size = 0
video_path = os.path.join(directory, f"{filename}.mp4")
with open(video_path, 'wb') as f:
for block in response.iter_content(1024):
total_size += len(block)
# Stop downloading and return if the total size exceeds 30 MB (30 * 1024 * 1024 bytes)
if total_size > 30 * 1024 * 1024:
logger.info("[WX] Video is larger than 30MB, skipping...")
return None
f.write(block)
return video_path
def create_message(wework_instance, message, is_group):
logger.debug(f"正在为{'群聊' if is_group else '单聊'}创建 WeworkMessage")
cmsg = WeworkMessage(message, wework=wework_instance, is_group=is_group)
logger.debug(f"cmsg:{cmsg}")
return cmsg
def handle_message(cmsg, is_group):
logger.debug(f"准备用 WeworkChannel 处理{'群聊' if is_group else '单聊'}消息")
if is_group:
WeworkChannel().handle_group(cmsg)
else:
WeworkChannel().handle_single(cmsg)
logger.debug(f"已用 WeworkChannel 处理完{'群聊' if is_group else '单聊'}消息")
def _check(func):
def wrapper(self, cmsg: ChatMessage):
msgId = cmsg.msg_id
create_time = cmsg.create_time # message timestamp
if create_time is None:
return func(self, cmsg)
if int(create_time) < int(time.time()) - 60: # skip history messages older than one minute
logger.debug("[WX]history message {} skipped".format(msgId))
return
return func(self, cmsg)
return wrapper
@wework.msg_register(
[ntwork.MT_RECV_TEXT_MSG, ntwork.MT_RECV_IMAGE_MSG, 11072, ntwork.MT_RECV_LINK_CARD_MSG,ntwork.MT_RECV_FILE_MSG, ntwork.MT_RECV_VOICE_MSG])
def all_msg_handler(wework_instance: ntwork.WeWork, message):
logger.debug(f"收到消息: {message}")
if 'data' in message:
# Look up conversation_id first; fall back to room_conversation_id if it is missing
conversation_id = message['data'].get('conversation_id', message['data'].get('room_conversation_id'))
if conversation_id is not None:
is_group = "R:" in conversation_id
try:
cmsg = create_message(wework_instance=wework_instance, message=message, is_group=is_group)
except NotImplementedError as e:
logger.error(f"[WX]{message.get('MsgId', 'unknown')} 跳过: {e}")
return None
delay = random.randint(1, 2)
timer = threading.Timer(delay, handle_message, args=(cmsg, is_group))
timer.start()
else:
logger.debug("消息数据中无 conversation_id")
return None
return None
def accept_friend_with_retries(wework_instance, user_id, corp_id):
result = wework_instance.accept_friend(user_id, corp_id)
logger.debug(f'result:{result}')
# @wework.msg_register(ntwork.MT_RECV_FRIEND_MSG)
# def friend(wework_instance: ntwork.WeWork, message):
# data = message["data"]
# user_id = data["user_id"]
# corp_id = data["corp_id"]
# logger.info(f"接收到好友请求,消息内容:{data}")
# delay = random.randint(1, 180)
# threading.Timer(delay, accept_friend_with_retries, args=(wework_instance, user_id, corp_id)).start()
#
# return None
def get_with_retry(get_func, max_retries=5, delay=5):
retries = 0
result = None
while retries < max_retries:
result = get_func()
if result:
break
logger.warning(f"获取数据失败,重试第{retries + 1}次······")
retries += 1
time.sleep(delay) # wait a while before retrying
return result
@singleton
class WeworkChannel(ChatChannel):
NOT_SUPPORT_REPLYTYPE = []
def __init__(self):
super().__init__()
def startup(self):
smart = conf().get("wework_smart", True)
wework.open(smart)
logger.info("等待登录······")
wework.wait_login()
login_info = wework.get_login_info()
self.user_id = login_info['user_id']
self.name = login_info['nickname']
logger.info(f"登录信息:>>>user_id:{self.user_id}>>>>>>>>name:{self.name}")
logger.info("静默延迟60s等待客户端刷新数据请勿进行任何操作······")
time.sleep(60)
contacts = get_with_retry(wework.get_external_contacts)
rooms = get_with_retry(wework.get_rooms)
directory = os.path.join(os.getcwd(), "tmp")
if not contacts or not rooms:
logger.error("获取contacts或rooms失败程序退出")
ntwork.exit_()
os.exit(0)
if not os.path.exists(directory):
os.makedirs(directory)
# Save contacts to a JSON file
with open(os.path.join(directory, 'wework_contacts.json'), 'w', encoding='utf-8') as f:
json.dump(contacts, f, ensure_ascii=False, indent=4)
with open(os.path.join(directory, 'wework_rooms.json'), 'w', encoding='utf-8') as f:
json.dump(rooms, f, ensure_ascii=False, indent=4)
# Create an empty dict to hold the results
result = {}
# Iterate over every dict in the list
for room in rooms['room_list']:
# Get the chat room ID
room_wxid = room['conversation_id']
# Get the chat room members
room_members = wework.get_room_members(room_wxid)
# Store the chat room members in the result dict
result[room_wxid] = room_members
# Save the results to a JSON file
with open(os.path.join(directory, 'wework_room_members.json'), 'w', encoding='utf-8') as f:
json.dump(result, f, ensure_ascii=False, indent=4)
logger.info("wework程序初始化完成········")
run.forever()
@time_checker
@_check
def handle_single(self, cmsg: ChatMessage):
if cmsg.from_user_id == cmsg.to_user_id:
# ignore self reply
return
if cmsg.ctype == ContextType.VOICE:
if not conf().get("speech_recognition"):
return
logger.debug("[WX]receive voice msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.IMAGE:
logger.debug("[WX]receive image msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.PATPAT:
logger.debug("[WX]receive patpat msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.TEXT:
logger.debug("[WX]receive text msg: {}, cmsg={}".format(json.dumps(cmsg._rawmsg, ensure_ascii=False), cmsg))
else:
logger.debug("[WX]receive msg: {}, cmsg={}".format(cmsg.content, cmsg))
context = self._compose_context(cmsg.ctype, cmsg.content, isgroup=False, msg=cmsg)
if context:
self.produce(context)
@time_checker
@_check
def handle_group(self, cmsg: ChatMessage):
if cmsg.ctype == ContextType.VOICE:
if not conf().get("speech_recognition"):
return
logger.debug("[WX]receive voice for group msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.IMAGE:
logger.debug("[WX]receive image for group msg: {}".format(cmsg.content))
elif cmsg.ctype in [ContextType.JOIN_GROUP, ContextType.PATPAT]:
logger.debug("[WX]receive note msg: {}".format(cmsg.content))
elif cmsg.ctype == ContextType.TEXT:
pass
else:
logger.debug("[WX]receive group msg: {}".format(cmsg.content))
context = self._compose_context(cmsg.ctype, cmsg.content, isgroup=True, msg=cmsg)
if context:
self.produce(context)
# Unified send function; each Channel implements it and sends a different message type depending on the reply's type field
def send(self, reply: Reply, context: Context):
logger.debug(f"context: {context}")
receiver = context["receiver"]
actual_user_id = context["msg"].actual_user_id
if reply.type == ReplyType.TEXT or reply.type == ReplyType.TEXT_:
match = re.search(r"^@(.*?)\n", reply.content)
logger.debug(f"match: {match}")
if match:
new_content = re.sub(r"^@(.*?)\n", "\n", reply.content)
at_list = [actual_user_id]
logger.debug(f"new_content: {new_content}")
wework.send_room_at_msg(receiver, new_content, at_list)
else:
wework.send_text(receiver, reply.content)
logger.info("[WX] sendMsg={}, receiver={}".format(reply, receiver))
elif reply.type == ReplyType.ERROR or reply.type == ReplyType.INFO:
wework.send_text(receiver, reply.content)
logger.info("[WX] sendMsg={}, receiver={}".format(reply, receiver))
elif reply.type == ReplyType.IMAGE: # read the image from a file
image_storage = reply.content
image_storage.seek(0)
# Read data from image_storage
data = image_storage.read()
# Create a temporary file
with tempfile.NamedTemporaryFile(delete=False) as temp:
temp_path = temp.name
temp.write(data)
# Send the image
wework.send_image(receiver, temp_path)
logger.info("[WX] sendImage, receiver={}".format(receiver))
# Remove the temporary file
os.remove(temp_path)
elif reply.type == ReplyType.IMAGE_URL: # download the image from the network
img_url = reply.content
filename = str(uuid.uuid4())
# Download the image and save it as a local file
image_path = download_and_compress_image(img_url, filename)
wework.send_image(receiver, file_path=image_path)
logger.info("[WX] sendImage url={}, receiver={}".format(img_url, receiver))
elif reply.type == ReplyType.VIDEO_URL:
video_url = reply.content
filename = str(uuid.uuid4())
video_path = download_video(video_url, filename)
if video_path is None:
# If the video is too large the download may be skipped, in which case video_path is None
wework.send_text(receiver, "抱歉,视频太大了!!!")
else:
wework.send_video(receiver, video_path)
logger.info("[WX] sendVideo, receiver={}".format(receiver))
elif reply.type == ReplyType.VOICE:
current_dir = os.getcwd()
voice_file = reply.content.split("/")[-1]
reply.content = os.path.join(current_dir, "tmp", voice_file)
wework.send_file(receiver, reply.content)
logger.info("[WX] sendFile={}, receiver={}".format(reply.content, receiver))


@@ -1,227 +0,0 @@
import datetime
import json
import os
import re
import time
import pilk
from bridge.context import ContextType
from channel.chat_message import ChatMessage
from common.log import logger
from ntwork.const import send_type
def get_with_retry(get_func, max_retries=5, delay=5):
retries = 0
result = None
while retries < max_retries:
result = get_func()
if result:
break
logger.warning(f"获取数据失败,重试第{retries + 1}次······")
retries += 1
time.sleep(delay) # wait a while before retrying
return result
def get_room_info(wework, conversation_id):
logger.debug(f"传入的 conversation_id: {conversation_id}")
rooms = wework.get_rooms()
if not rooms or 'room_list' not in rooms:
logger.error(f"获取群聊信息失败: {rooms}")
return None
time.sleep(1)
logger.debug(f"获取到的群聊信息: {rooms}")
for room in rooms['room_list']:
if room['conversation_id'] == conversation_id:
return room
return None
def cdn_download(wework, message, file_name):
data = message["data"]
aes_key = data["cdn"]["aes_key"]
file_size = data["cdn"]["size"]
# Join the current working directory with the file name to get the save path
current_dir = os.getcwd()
save_path = os.path.join(current_dir, "tmp", file_name)
# Download the image and save it locally
if "url" in data["cdn"].keys() and "auth_key" in data["cdn"].keys():
url = data["cdn"]["url"]
auth_key = data["cdn"]["auth_key"]
# result = wework.wx_cdn_download(url, auth_key, aes_key, file_size, save_path) # the ntwork library's own interface is broken: it lacks the aes_key parameter
"""
Download a wx-type CDN file (URL starts with https)
"""
data = {
'url': url,
'auth_key': auth_key,
'aes_key': aes_key,
'size': file_size,
'save_path': save_path
}
result = wework._WeWork__send_sync(send_type.MT_WXCDN_DOWNLOAD_MSG, data) # call the internal implementation of wx_cdn_download directly
elif "file_id" in data["cdn"].keys():
if message["type"] == 11042:
file_type = 2
elif message["type"] == 11045:
file_type = 5
file_id = data["cdn"]["file_id"]
result = wework.c2c_cdn_download(file_id, aes_key, file_size, file_type, save_path)
else:
logger.error(f"something is wrong, data: {data}")
return
# Log the download result
logger.debug(f"result: {result}")
def c2c_download_and_convert(wework, message, file_name):
data = message["data"]
aes_key = data["cdn"]["aes_key"]
file_size = data["cdn"]["size"]
file_type = 5
file_id = data["cdn"]["file_id"]
current_dir = os.getcwd()
save_path = os.path.join(current_dir, "tmp", file_name)
result = wework.c2c_cdn_download(file_id, aes_key, file_size, file_type, save_path)
logger.debug(result)
# Convert the SILK file to WAV immediately after downloading it
base_name, _ = os.path.splitext(save_path)
wav_file = base_name + ".wav"
pilk.silk_to_wav(save_path, wav_file, rate=24000)
# Delete the SILK file
try:
os.remove(save_path)
except Exception as e:
pass
class WeworkMessage(ChatMessage):
def __init__(self, wework_msg, wework, is_group=False):
try:
super().__init__(wework_msg)
self.msg_id = wework_msg['data'].get('conversation_id', wework_msg['data'].get('room_conversation_id'))
# Use .get() so that a missing 'send_time' key does not raise an error
self.create_time = wework_msg['data'].get("send_time")
self.is_group = is_group
self.wework = wework
if wework_msg["type"] == 11041: # text message
if any(substring in wework_msg['data']['content'] for substring in ("该消息类型暂不能展示", "不支持的消息类型")):
return
self.ctype = ContextType.TEXT
self.content = wework_msg['data']['content']
elif wework_msg["type"] == 11044: # voice message; the file must be cached
file_name = datetime.datetime.now().strftime('%Y%m%d%H%M%S') + ".silk"
base_name, _ = os.path.splitext(file_name)
file_name_2 = base_name + ".wav"
current_dir = os.getcwd()
self.ctype = ContextType.VOICE
self.content = os.path.join(current_dir, "tmp", file_name_2)
self._prepare_fn = lambda: c2c_download_and_convert(wework, wework_msg, file_name)
elif wework_msg["type"] == 11042: # image message; the file must be downloaded
file_name = datetime.datetime.now().strftime('%Y%m%d%H%M%S') + ".jpg"
current_dir = os.getcwd()
self.ctype = ContextType.IMAGE
self.content = os.path.join(current_dir, "tmp", file_name)
self._prepare_fn = lambda: cdn_download(wework, wework_msg, file_name)
elif wework_msg["type"] == 11045: # file message
print("文件消息")
print(wework_msg)
file_name = datetime.datetime.now().strftime('%Y%m%d%H%M%S')
file_name = file_name + wework_msg['data']['cdn']['file_name']
current_dir = os.getcwd()
self.ctype = ContextType.FILE
self.content = os.path.join(current_dir, "tmp", file_name)
self._prepare_fn = lambda: cdn_download(wework, wework_msg, file_name)
elif wework_msg["type"] == 11047: # link message
self.ctype = ContextType.SHARING
self.content = wework_msg['data']['url']
elif wework_msg["type"] == 11072: # notification that a new member joined the group
self.ctype = ContextType.JOIN_GROUP
member_list = wework_msg['data']['member_list']
self.actual_user_nickname = member_list[0]['name']
self.actual_user_id = member_list[0]['user_id']
self.content = f"{self.actual_user_nickname}加入了群聊!"
directory = os.path.join(os.getcwd(), "tmp")
rooms = get_with_retry(wework.get_rooms)
if not rooms:
logger.error("更新群信息失败···")
else:
result = {}
for room in rooms['room_list']:
# Get the chat room ID
room_wxid = room['conversation_id']
# Get the chat room members
room_members = wework.get_room_members(room_wxid)
# Store the chat room members in the result dict
result[room_wxid] = room_members
with open(os.path.join(directory, 'wework_room_members.json'), 'w', encoding='utf-8') as f:
json.dump(result, f, ensure_ascii=False, indent=4)
logger.info("有新成员加入,已自动更新群成员列表缓存!")
else:
raise NotImplementedError(
"Unsupported message type: Type:{} MsgType:{}".format(wework_msg["type"], wework_msg["MsgType"]))
data = wework_msg['data']
login_info = self.wework.get_login_info()
logger.debug(f"login_info: {login_info}")
nickname = f"{login_info['username']}({login_info['nickname']})" if login_info['nickname'] else login_info['username']
user_id = login_info['user_id']
sender_id = data.get('sender')
conversation_id = data.get('conversation_id')
sender_name = data.get("sender_name")
self.from_user_id = user_id if sender_id == user_id else conversation_id
self.from_user_nickname = nickname if sender_id == user_id else sender_name
self.to_user_id = user_id
self.to_user_nickname = nickname
self.other_user_nickname = sender_name
self.other_user_id = conversation_id
if self.is_group:
conversation_id = data.get('conversation_id') or data.get('room_conversation_id')
self.other_user_id = conversation_id
if conversation_id:
room_info = get_room_info(wework=wework, conversation_id=conversation_id)
self.other_user_nickname = room_info.get('nickname', None) if room_info else None
self.from_user_nickname = room_info.get('nickname', None) if room_info else None
at_list = data.get('at_list', [])
tmp_list = []
for at in at_list:
tmp_list.append(at['nickname'])
at_list = tmp_list
logger.debug(f"at_list: {at_list}")
logger.debug(f"nickname: {nickname}")
self.is_at = False
if nickname in at_list or login_info['nickname'] in at_list or login_info['username'] in at_list:
self.is_at = True
self.at_list = at_list
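# Check whether the message content contains @username. This handles copy-pasted messages, which may not trigger an @ notification but may still contain "@username" in the content.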
content = data.get('content', '')
name = nickname
pattern = f"@{re.escape(name)}(\u2005|\u0020)"
if re.search(pattern, content):
logger.debug(f"Wechaty message {self.msg_id} includes at")
self.is_at = True
if not self.actual_user_id:
self.actual_user_id = data.get("sender")
self.actual_user_nickname = sender_name if self.ctype != ContextType.JOIN_GROUP else self.actual_user_nickname
else:
logger.error("群聊消息中没有找到 conversation_id 或 room_conversation_id")
logger.debug(f"WeworkMessage has been successfully instantiated with message id: {self.msg_id}")
except Exception as e:
logger.error(f"在 WeworkMessage 的初始化过程中出现错误:{e}")
raise e


@@ -20,6 +20,18 @@ import os
chat_client: LinkAIClient
CHANNEL_ACTIONS = {"channel_create", "channel_update", "channel_delete"}
# channelType -> config key mapping for app credentials
CREDENTIAL_MAP = {
"feishu": ("feishu_app_id", "feishu_app_secret"),
"dingtalk": ("dingtalk_client_id", "dingtalk_client_secret"),
"wechatmp": ("wechatmp_app_id", "wechatmp_app_secret"),
"wechatmp_service": ("wechatmp_app_id", "wechatmp_app_secret"),
"wechatcom_app": ("wechatcomapp_agent_id", "wechatcomapp_secret"),
}
class CloudClient(LinkAIClient):
def __init__(self, api_key: str, channel, host: str = ""):
super().__init__(api_key, host)
@@ -96,6 +108,12 @@ class CloudClient(LinkAIClient):
if not self.client_id:
return
logger.info(f"[CloudClient] Loading remote config: {config}")
action = config.get("action")
if action in CHANNEL_ACTIONS:
self._dispatch_channel_action(action, config.get("data", {}))
return
if config.get("enabled") != "Y":
return
@@ -123,50 +141,17 @@ class CloudClient(LinkAIClient):
if config.get("model"):
local_config["model"] = config.get("model")
# Channel configuration
# Channel configuration (legacy single-channel path)
if config.get("channelType"):
if local_config.get("channel_type") != config.get("channelType"):
local_config["channel_type"] = config.get("channelType")
need_restart_channel = True
# Channel-specific app credentials
# Channel-specific app credentials (legacy single-channel path)
current_channel_type = local_config.get("channel_type", "")
if config.get("app_id") is not None:
if current_channel_type == "feishu":
if local_config.get("feishu_app_id") != config.get("app_id"):
local_config["feishu_app_id"] = config.get("app_id")
need_restart_channel = True
elif current_channel_type == "dingtalk":
if local_config.get("dingtalk_client_id") != config.get("app_id"):
local_config["dingtalk_client_id"] = config.get("app_id")
need_restart_channel = True
elif current_channel_type in ("wechatmp", "wechatmp_service"):
if local_config.get("wechatmp_app_id") != config.get("app_id"):
local_config["wechatmp_app_id"] = config.get("app_id")
need_restart_channel = True
elif current_channel_type == "wechatcom_app":
if local_config.get("wechatcomapp_agent_id") != config.get("app_id"):
local_config["wechatcomapp_agent_id"] = config.get("app_id")
need_restart_channel = True
if config.get("app_secret"):
if current_channel_type == "feishu":
if local_config.get("feishu_app_secret") != config.get("app_secret"):
local_config["feishu_app_secret"] = config.get("app_secret")
need_restart_channel = True
elif current_channel_type == "dingtalk":
if local_config.get("dingtalk_client_secret") != config.get("app_secret"):
local_config["dingtalk_client_secret"] = config.get("app_secret")
need_restart_channel = True
elif current_channel_type in ("wechatmp", "wechatmp_service"):
if local_config.get("wechatmp_app_secret") != config.get("app_secret"):
local_config["wechatmp_app_secret"] = config.get("app_secret")
need_restart_channel = True
elif current_channel_type == "wechatcom_app":
if local_config.get("wechatcomapp_secret") != config.get("app_secret"):
local_config["wechatcomapp_secret"] = config.get("app_secret")
need_restart_channel = True
if self._set_channel_credentials(local_config, current_channel_type,
config.get("app_id"), config.get("app_secret")):
need_restart_channel = True
if config.get("admin_password"):
if not pconf("Godcmd"):
@@ -190,12 +175,169 @@ class CloudClient(LinkAIClient):
if pconf("linkai")["midjourney"]:
pconf("linkai")["midjourney"]["use_image_create_prefix"] = False
# Save configuration to config.json file
self._save_config_to_file(local_config)
if need_restart_channel:
self._restart_channel(local_config.get("channel_type", ""))
# ------------------------------------------------------------------
# channel CRUD operations
# ------------------------------------------------------------------
def _dispatch_channel_action(self, action: str, data: dict):
channel_type = data.get("channelType")
if not channel_type:
logger.warning(f"[CloudClient] Channel action '{action}' missing channelType, data={data}")
return
logger.info(f"[CloudClient] Channel action: {action}, channelType={channel_type}")
if action == "channel_create":
self._handle_channel_create(channel_type, data)
elif action == "channel_update":
self._handle_channel_update(channel_type, data)
elif action == "channel_delete":
self._handle_channel_delete(channel_type, data)
def _handle_channel_create(self, channel_type: str, data: dict):
local_config = conf()
self._set_channel_credentials(local_config, channel_type,
data.get("appId"), data.get("appSecret"))
self._add_channel_type(local_config, channel_type)
self._save_config_to_file(local_config)
if self.channel_mgr:
threading.Thread(
target=self._do_add_channel, args=(channel_type,), daemon=True
).start()
def _handle_channel_update(self, channel_type: str, data: dict):
local_config = conf()
enabled = data.get("enabled", "Y")
self._set_channel_credentials(local_config, channel_type,
data.get("appId"), data.get("appSecret"))
if enabled == "N":
self._remove_channel_type(local_config, channel_type)
else:
# Ensure channel_type is persisted even if this channel was not
# previously listed (e.g. update used as implicit create).
self._add_channel_type(local_config, channel_type)
self._save_config_to_file(local_config)
if not self.channel_mgr:
return
if enabled == "N":
threading.Thread(
target=self._do_remove_channel, args=(channel_type,), daemon=True
).start()
else:
threading.Thread(
target=self._do_restart_channel, args=(self.channel_mgr, channel_type), daemon=True
).start()
def _handle_channel_delete(self, channel_type: str, data: dict):
local_config = conf()
self._clear_channel_credentials(local_config, channel_type)
self._remove_channel_type(local_config, channel_type)
self._save_config_to_file(local_config)
if self.channel_mgr:
threading.Thread(
target=self._do_remove_channel, args=(channel_type,), daemon=True
).start()
# ------------------------------------------------------------------
# channel credentials helpers
# ------------------------------------------------------------------
@staticmethod
def _set_channel_credentials(local_config: dict, channel_type: str,
app_id, app_secret) -> bool:
"""
Write app_id / app_secret into the correct config keys for *channel_type*.
Returns True if any value actually changed.
"""
cred = CREDENTIAL_MAP.get(channel_type)
if not cred:
return False
id_key, secret_key = cred
changed = False
if app_id is not None and local_config.get(id_key) != app_id:
local_config[id_key] = app_id
changed = True
if app_secret is not None and local_config.get(secret_key) != app_secret:
local_config[secret_key] = app_secret
changed = True
return changed
@staticmethod
def _clear_channel_credentials(local_config: dict, channel_type: str):
cred = CREDENTIAL_MAP.get(channel_type)
if not cred:
return
id_key, secret_key = cred
local_config.pop(id_key, None)
local_config.pop(secret_key, None)
# ------------------------------------------------------------------
# channel_type list helpers
# ------------------------------------------------------------------
@staticmethod
def _parse_channel_types(local_config: dict) -> list:
raw = local_config.get("channel_type", "")
if isinstance(raw, list):
return [ch.strip() for ch in raw if ch.strip()]
if isinstance(raw, str):
return [ch.strip() for ch in raw.split(",") if ch.strip()]
return []
@staticmethod
def _add_channel_type(local_config: dict, channel_type: str):
types = CloudClient._parse_channel_types(local_config)
if channel_type not in types:
types.append(channel_type)
local_config["channel_type"] = ", ".join(types)
@staticmethod
def _remove_channel_type(local_config: dict, channel_type: str):
types = CloudClient._parse_channel_types(local_config)
if channel_type in types:
types.remove(channel_type)
local_config["channel_type"] = ", ".join(types)
    # ------------------------------------------------------------------
    # channel manager thread helpers
    # ------------------------------------------------------------------
    def _do_add_channel(self, channel_type: str):
        try:
            self.channel_mgr.add_channel(channel_type)
            logger.info(f"[CloudClient] Channel '{channel_type}' added successfully")
        except Exception as e:
            logger.error(f"[CloudClient] Failed to add channel '{channel_type}': {e}")
            self.send_channel_status(channel_type, "error", str(e))
            return
        self._report_channel_startup(channel_type)

    def _do_remove_channel(self, channel_type: str):
        try:
            self.channel_mgr.remove_channel(channel_type)
            logger.info(f"[CloudClient] Channel '{channel_type}' removed successfully")
        except Exception as e:
            logger.error(f"[CloudClient] Failed to remove channel '{channel_type}': {e}")

    def _report_channel_startup(self, channel_type: str):
        """Wait for channel startup result and report to cloud."""
        ch = self.channel_mgr.get_channel(channel_type)
        if not ch:
            self.send_channel_status(channel_type, "error", "channel instance not found")
            return
        success, error = ch.wait_startup(timeout=3)
        if success:
            logger.info(f"[CloudClient] Channel '{channel_type}' connected, reporting status")
            self.send_channel_status(channel_type, "connected")
        else:
            logger.warning(f"[CloudClient] Channel '{channel_type}' startup failed: {error}")
            self.send_channel_status(channel_type, "error", error)
    # ------------------------------------------------------------------
    # skill callback
    # ------------------------------------------------------------------
@@ -252,13 +394,72 @@ class CloudClient(LinkAIClient):
        payload = data.get("payload", {})
        query = payload.get("query", "")
        session_id = payload.get("session_id", "cloud_console")
-       logger.info(f"[CloudClient] on_chat: session={session_id}, query={query[:80]}")
+       channel_type = payload.get("channel_type", "")
+       if not session_id.startswith("session_"):
+           session_id = f"session_{session_id}"
+       logger.info(f"[CloudClient] on_chat: session={session_id}, channel={channel_type}, query={query[:80]}")
        svc = self.chat_service
        if svc is None:
            raise RuntimeError("ChatService not available")
-       svc.run(query=query, session_id=session_id, send_chunk_fn=send_chunk_fn)
+       svc.run(query=query, session_id=session_id, channel_type=channel_type, send_chunk_fn=send_chunk_fn)
    # ------------------------------------------------------------------
    # history callback
    # ------------------------------------------------------------------
    def on_history(self, data: dict) -> dict:
        """
        Handle HISTORY messages from the cloud console.
        Returns paginated conversation history for a session.
        :param data: message data with 'action' and 'payload' (session_id, page, page_size)
        :return: response dict
        """
        action = data.get("action", "query")
        payload = data.get("payload", {})
        logger.info(f"[CloudClient] on_history: action={action}")
        if action == "query":
            return self._query_history(payload)
        return {"action": action, "code": 404, "message": f"unknown action: {action}", "payload": None}

    def _query_history(self, payload: dict) -> dict:
        """Query paginated conversation history using ConversationStore."""
        session_id = payload.get("session_id", "")
        page = int(payload.get("page", 1))
        page_size = int(payload.get("page_size", 20))
        if not session_id:
            return {
                "action": "query",
                "payload": {"status": "error", "message": "session_id required"},
            }
        # Web channel stores sessions with a "session_" prefix
        if not session_id.startswith("session_"):
            session_id = f"session_{session_id}"
        logger.info(f"[CloudClient] history query: session={session_id}, page={page}, page_size={page_size}")
        try:
            from agent.memory.conversation_store import get_conversation_store
            store = get_conversation_store()
            result = store.load_history_page(
                session_id=session_id,
                page=page,
                page_size=page_size,
            )
            return {
                "action": "query",
                "payload": {"status": "success", **result},
            }
        except Exception as e:
            logger.error(f"[CloudClient] History query error: {e}")
            return {
                "action": "query",
                "payload": {"status": "error", "message": str(e)},
            }
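`_query_history` hands paging to `ConversationStore.load_history_page`, whose internals are not shown in this diff. The sketch below covers only the request handling. It assumes an in-memory dict in place of the store, and the result fields `messages`, `total`, and `has_more` are hypothetical names chosen for illustration, not confirmed by the source.

```python
def normalize_session_id(session_id: str) -> str:
    """Web channel sessions are stored with a "session_" prefix."""
    if not session_id.startswith("session_"):
        return f"session_{session_id}"
    return session_id


def query_history(payload: dict, store: dict) -> dict:
    """Return one page of history; `store` maps session id -> list of messages."""
    session_id = payload.get("session_id", "")
    page = int(payload.get("page", 1))
    page_size = int(payload.get("page_size", 20))
    if not session_id:
        return {"action": "query", "payload": {"status": "error", "message": "session_id required"}}
    messages = store.get(normalize_session_id(session_id), [])
    start = (page - 1) * page_size  # pages are 1-based
    chunk = messages[start:start + page_size]
    return {
        "action": "query",
        "payload": {
            "status": "success",
            "messages": chunk,  # hypothetical field name
            "total": len(messages),  # hypothetical field name
            "has_more": start + page_size < len(messages),  # hypothetical field name
        },
    }


# 45 stored messages, third page of 20: only the last 5 remain.
store = {"session_abc": [f"msg-{i}" for i in range(45)]}
reply = query_history({"session_id": "abc", "page": 3, "page_size": 20}, store)
print(reply["payload"]["messages"])
```

The caller can pass either `abc` or `session_abc`; both resolve to the same stored key, which matches the prefix handling in `on_chat` and `_query_history`.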
    # ------------------------------------------------------------------
    # channel restart helpers
@@ -279,13 +480,15 @@ class CloudClient(LinkAIClient):
        """
        try:
            mgr.restart(new_channel_type)
            # Update the client's channel reference
            if mgr.channel:
                self.channel = mgr.channel
                self.client_type = mgr.channel.channel_type
                logger.info(f"[CloudClient] Channel reference updated to '{new_channel_type}'")
        except Exception as e:
            logger.error(f"[CloudClient] Channel restart failed: {e}")
            self.send_channel_status(new_channel_type, "error", str(e))
            return
        self._report_channel_startup(new_channel_type)
    # ------------------------------------------------------------------
    # config persistence
@@ -313,6 +516,81 @@ class CloudClient(LinkAIClient):
            logger.error(f"[CloudClient] Failed to save configuration to config.json: {e}")

def get_root_domain(host: str = "") -> str:
    """Extract root domain from a hostname.
    If *host* is empty, reads CLOUD_HOST env var / cloud_host config.
    """
    if not host:
        host = os.environ.get("CLOUD_HOST") or conf().get("cloud_host", "")
    if not host:
        return ""
    host = host.strip().rstrip("/")
    if "://" in host:
        host = host.split("://", 1)[1]
    host = host.split("/", 1)[0].split(":")[0]
    parts = host.split(".")
    if len(parts) >= 2:
        return ".".join(parts[-2:])
    return host

def get_deployment_id() -> str:
    """Return cloud deployment id from env var or config."""
    return os.environ.get("CLOUD_DEPLOYMENT_ID") or conf().get("cloud_deployment_id", "")

def get_website_base_url() -> str:
    """Return the public URL prefix that maps to the workspace websites/ dir.
    Returns empty string when cloud deployment is not configured.
    """
    deployment_id = get_deployment_id()
    if not deployment_id:
        return ""
    websites_domain = os.environ.get("CLOUD_WEBSITES_DOMAIN") or conf().get("cloud_websites_domain", "")
    if websites_domain:
        websites_domain = websites_domain.strip().rstrip("/")
        return f"https://{websites_domain}/{deployment_id}"
    domain = get_root_domain()
    if not domain:
        return ""
    return f"https://app.{domain}/{deployment_id}"
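To make the URL derivation concrete, here is a self-contained sketch of the same string handling. It assumes explicit arguments in place of the `os.environ` / `conf()` lookups; `website_base_url` and the deployment id `dep123` are illustrative names, not part of the project.

```python
def get_root_domain(host: str) -> str:
    """Keep the last two dot-separated labels of a host, dropping scheme, port and path."""
    host = host.strip().rstrip("/")
    if "://" in host:
        host = host.split("://", 1)[1]
    host = host.split("/", 1)[0].split(":")[0]
    parts = host.split(".")
    if len(parts) >= 2:
        return ".".join(parts[-2:])
    return host


def website_base_url(deployment_id: str, cloud_host: str = "", websites_domain: str = "") -> str:
    """Mirror the precedence above: explicit websites domain first, else app.<root domain>."""
    if not deployment_id:
        return ""
    if websites_domain:
        return f"https://{websites_domain.strip().rstrip('/')}/{deployment_id}"
    domain = get_root_domain(cloud_host)
    if not domain:
        return ""
    return f"https://app.{domain}/{deployment_id}"


print(get_root_domain("https://client.link-ai.tech:443/ws"))
print(website_base_url("dep123", cloud_host="client.link-ai.tech"))
print(website_base_url("dep123", websites_domain="sites.example.com/"))
```

Because only the last two labels are kept, a host under a multi-part public suffix such as `example.co.uk` reduces to `co.uk`. Deployments on such domains would need `cloud_websites_domain` set explicitly.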
def build_website_prompt(workspace_dir: str) -> list:
    """Build system prompt lines for cloud website/file sharing rules.
    Returns an empty list when cloud deployment is not configured,
    so callers can safely do ``lines.extend(build_website_prompt(...))``.
    """
    base_url = get_website_base_url()
    if not base_url:
        return []
    # The prompt below is sent to the model in Chinese. In summary:
    # - websites/ is publicly routed at base_url
    # - web pages and front-end code must be written under websites/
    # - generated files (PPT, PDF, images, audio/video) may be saved there for sharing
    # - the full access or download link must always appear in the reply text
    # - file names and paths should avoid Chinese characters
    # - one subdirectory per project under websites/ is recommended
    return [
        "**文件分享与网页生成规则** (非常重要 — 当前为云部署模式):",
        "",
        f"云端已为工作空间的 `websites/` 目录配置好公网路由映射,访问地址前缀为: `{base_url}`",
        "",
        "1. **网页/网站**: 编写网页、H5页面等前端代码时**必须**将文件放到 `websites/` 目录中",
        f" - 例如: `websites/index.html` → `{base_url}/index.html`",
        f" - 例如: `websites/my-app/index.html` → `{base_url}/my-app/index.html`",
        "",
        "2. **生成文件分享** (PPT、PDF、图片、音视频等): 当你为用户生成了需要下载或查看的文件时,**可以**将文件保存到 `websites/` 目录中",
        f" - 例如: 生成的PPT保存到 `websites/files/report.pptx` → 下载链接为 `{base_url}/files/report.pptx`",
        " - 你仍然可以同时使用 `send` 工具发送文件在飞书、钉钉等IM渠道中有效但**必须同时在回复文本中提供下载链接**作为兜底,因为部分渠道(如网页端)无法通过 send 接收本地文件",
        "",
        "3. **必须发送链接**: 无论是网页还是文件,生成后**必须将完整的访问/下载链接直接写在回复文本中发送给用户**",
        "",
        "4. **文件名和路径尽量使用英文/拼音/数字等**,不要使用中文,避免链接无法访问",
        "",
        "5. 建议为每个独立项目在 `websites/` 下创建子目录,保持结构清晰",
        "",
    ]

def start(channel, channel_mgr=None):
    global chat_client
    chat_client = CloudClient(api_key=conf().get("linkai_api_key"), host=conf().get("cloud_host", ""), channel=channel)
@@ -322,6 +600,21 @@ def start(channel, channel_mgr=None):
    time.sleep(1.5)
    if chat_client.client_id:
        logger.info("[CloudClient] Console: https://link-ai.tech/console/clients")
        if channel_mgr:
            channel_mgr.cloud_mode = True
            threading.Thread(target=_report_existing_channels, args=(chat_client, channel_mgr), daemon=True).start()

def _report_existing_channels(client: CloudClient, mgr):
    """Report status for all channels that were started before cloud client connected."""
    try:
        for name, ch in list(mgr._channels.items()):
            if name == "web":
                continue
            ch.cloud_mode = True
            client._report_channel_startup(name)
    except Exception as e:
        logger.warning(f"[CloudClient] Failed to report existing channel status: {e}")

def _build_config():


@@ -9,9 +9,10 @@ CLAUDEAPI= "claudeAPI"
QWEN = "qwen"  # legacy Qwen integration
QWEN_DASHSCOPE = "dashscope"  # new Qwen integration (Bailian)
GEMINI = "gemini"
-ZHIPU_AI = "glm-4"
+ZHIPU_AI = "zhipu"
MOONSHOT = "moonshot"
MiniMax = "minimax"
DEEPSEEK = "deepseek"
MODELSCOPE = "modelscope"
# Model list
@@ -41,6 +42,7 @@ GEMINI_25_PRO_PRE = "gemini-2.5-pro-preview-05-06"
GEMINI_3_FLASH_PRE = "gemini-3-flash-preview"  # Gemini 3 Flash Preview - recommended for Agent
GEMINI_3_PRO_PRE = "gemini-3-pro-preview"  # Gemini 3 Pro Preview
GEMINI_31_PRO_PRE = "gemini-3.1-pro-preview"  # Gemini 3.1 Pro Preview - recommended for Agent
GEMINI_31_FLASH_LITE_PRE = "gemini-3.1-flash-lite-preview"  # Gemini 3.1 Flash Lite Preview - recommended for Agent
# OpenAI
GPT35 = "gpt-3.5-turbo"
@@ -65,6 +67,7 @@ GPT_41_NANO = "gpt-4.1-nano"
GPT_5 = "gpt-5"
GPT_5_MINI = "gpt-5-mini"
GPT_5_NANO = "gpt-5-nano"
GPT_54 = "gpt-5.4" # GPT-5.4 - Agent recommended model
O1 = "o1-preview"
O1_MINI = "o1-mini"
WHISPER_1 = "whisper-1"
@@ -140,7 +143,7 @@ MODEL_LIST = [
"claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3.5-sonnet",
# Gemini
-    GEMINI_31_PRO_PRE, GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
+    GEMINI_31_FLASH_LITE_PRE, GEMINI_31_PRO_PRE, GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
GEMINI_20_FLASH, GEMINI_20_flash_exp, GEMINI_15_PRO, GEMINI_15_flash, GEMINI_PRO, GEMINI,
# OpenAI
@@ -150,6 +153,7 @@ MODEL_LIST = [
GPT_4o, GPT_4O_0806, GPT_4o_MINI,
GPT_41, GPT_41_MINI, GPT_41_NANO,
GPT_5, GPT_5_MINI, GPT_5_NANO,
GPT_54,
O1, O1_MINI,
# DeepSeek
@@ -182,3 +186,4 @@ MODEL_LIST = MODEL_LIST + GITEE_AI_MODEL_LIST + MODELSCOPE_MODEL_LIST
# channel
FEISHU = "feishu"
DINGTALK = "dingtalk"
WECOM_BOT = "wecom_bot"


@@ -28,7 +28,7 @@ def check_dulwich():
        except ImportError:
            try:
                install("dulwich")
-           except:
+           except Exception:
                needwait = True
    try:
        import dulwich


@@ -20,11 +20,12 @@
  "use_linkai": false,
  "linkai_api_key": "",
  "linkai_app_code": "",
- "feishu_bot_name": "",
  "feishu_app_id": "",
  "feishu_app_secret": "",
  "dingtalk_client_id": "",
  "dingtalk_client_secret":"",
  "wecom_bot_id": "",
  "wecom_bot_secret": "",
  "agent": true,
  "agent_max_context_tokens": 40000,
  "agent_max_context_turns": 20,


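@@ -37,7 +37,7 @@ available_setting = {
    "group_name_white_list": ["ChatGPT测试群", "ChatGPT测试群2"],  # names of groups with auto-reply enabled
    "group_name_keyword_white_list": [],  # group-name keywords with auto-reply enabled
    "group_chat_in_one_session": ["ChatGPT测试群"],  # names of groups that share one conversation context
-   "group_shared_session": True,  # whether a group chat shares one session context (shared by all members); default True. When False, each user has an independent session within the group
+   "group_shared_session": False,  # whether a group chat shares one session context (shared by all members). When False, each user has an independent session within the group
    "nick_name_black_list": [],  # user nickname blacklist
    "group_welcome_msg": "",  # fixed welcome message for new group members; a random-style welcome is used when unset
    "trigger_by_self": False,  # whether the bot may trigger itself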
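@@ -95,8 +95,6 @@ available_setting = {
    "dashscope_api_key": "",
    # Google Gemini Api Key
    "gemini_api_key": "",
-   # general wework settings
-   "wework_smart": True,  # whether wework uses the already logged-in WeCom client; False allows multiple instances
    # speech settings
    "speech_recognition": True,  # whether to enable speech recognition
    "group_speech_recognition": False,  # whether to enable speech recognition in group chats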
@@ -118,7 +116,7 @@ available_setting = {
    # ElevenLabs speech API settings
    "xi_api_key": "",  # for how to obtain the API key, see https://docs.elevenlabs.io/api-reference/quick-start/authentication
    "xi_voice_id": "",  # ElevenLabs provides 9 English voice IDs (British, American, etc.): "Adam/Antoni/Arnold/Bella/Domi/Elli/Josh/Rachel/Sam"
-   # service time limit (currently supported for itchat)
+   # service time limit
    "chat_time_module": False,  # whether to enable the service time limit
    "chat_start_time": "00:00",  # service start time
    "chat_stop_time": "24:00",  # service stop time
@@ -127,10 +125,6 @@ available_setting = {
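    # Baidu Translate API settings
    "baidu_translate_app_id": "",  # Baidu Translate API appid
    "baidu_translate_app_key": "",  # Baidu Translate API secret key
-   # itchat settings
-   "hot_reload": False,  # whether to enable hot reload
-   # wechaty settings
-   "wechaty_puppet_service_token": "",  # wechaty token
    # wechatmp settings
    "wechatmp_token": "",  # WeChat Official Account platform Token
    "wechatmp_port": 8080,  # WeChat Official Account platform port; must be forwarded to 80 or 443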
@@ -156,11 +150,14 @@ available_setting = {
    "dingtalk_client_id": "",  # DingTalk bot Client ID
    "dingtalk_client_secret": "",  # DingTalk bot Client Secret
    "dingtalk_card_enabled": False,
    # WeCom smart bot settings (long-connection mode)
    "wecom_bot_id": "",  # WeCom smart bot BotID
    "wecom_bot_secret": "",  # WeCom smart bot long-connection Secret
    # custom trigger words for chatgpt commands
    "clear_memory_commands": ["#清除记忆"],  # session reset commands; must start with #
    # channel settings
-   "channel_type": "",  # channel type; supports {wx,wxy,terminal,wechatmp,wechatmp_service,wechatcom_app,dingtalk}
+   "channel_type": "",  # channel type; several channels can run at once. Single: "feishu"; multiple: "feishu, dingtalk" or ["feishu", "dingtalk"]. Options: web,feishu,dingtalk,wecom_bot,wechatmp,wechatmp_service,wechatcom_app
    "web_console": True,  # whether to start the Web console automatically (on by default); set to False to disable
    "subscribe_msg": "",  # subscription message; supported by: wechatmp, wechatmp_service, wechatcom_app
    "debug": False,  # whether to enable debug mode; more logs are printed when enabled
    "appdata_dir": "",  # data directory
@@ -186,6 +183,8 @@ available_setting = {
    "linkai_api_key": "",
    "linkai_app_code": "",
    "linkai_api_base": "https://api.link-ai.tech",  # LinkAI service URL
    "cloud_host": "client.link-ai.tech",
    "cloud_deployment_id": "",
    "minimax_api_key": "",
    "Minimax_group_id": "",
    "Minimax_base_url": "",
@@ -322,7 +321,7 @@ def load_config():
            logger.info("[INIT] override config by environ args: {}={}".format(name, value))
            try:
                config[name] = eval(value)
-           except:
+           except Exception:
                if value == "false":
                    config[name] = False
                elif value == "true":
@@ -353,6 +352,37 @@ def load_config():
    logger.info("[INIT] Debug: {}".format(config.get("debug", False)))
    logger.info("[INIT] ========================================")
    # Sync selected config values to environment variables so that
    # subprocesses (e.g. shell skill scripts) can access them directly.
    # Existing env vars are NOT overwritten (env takes precedence).
    _CONFIG_TO_ENV = {
        "open_ai_api_key": "OPENAI_API_KEY",
        "open_ai_api_base": "OPENAI_API_BASE",
        "linkai_api_key": "LINKAI_API_KEY",
        "linkai_api_base": "LINKAI_API_BASE",
        "claude_api_key": "CLAUDE_API_KEY",
        "claude_api_base": "CLAUDE_API_BASE",
        "gemini_api_key": "GEMINI_API_KEY",
        "gemini_api_base": "GEMINI_API_BASE",
        "minimax_api_key": "MINIMAX_API_KEY",
        "minimax_api_base": "MINIMAX_API_BASE",
        "zhipu_ai_api_key": "ZHIPU_AI_API_KEY",
        "zhipu_ai_api_base": "ZHIPU_AI_API_BASE",
        "moonshot_api_key": "MOONSHOT_API_KEY",
        "moonshot_api_base": "MOONSHOT_API_BASE",
        "ark_api_key": "ARK_API_KEY",
        "ark_api_base": "ARK_API_BASE",
    }
    injected = 0
    for conf_key, env_key in _CONFIG_TO_ENV.items():
        if env_key not in os.environ:
            val = config.get(conf_key, "")
            if val:
                os.environ[env_key] = str(val)
                injected += 1
    if injected:
        logger.info("[INIT] Synced {} config values to environment variables".format(injected))
    config.load_user_datas()
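The sync step above follows two rules: an environment variable that already exists wins, and empty config values are skipped. A small sketch of that precedence follows. It assumes a plain dict in place of `os.environ` so it can run without touching the real environment; `sync_config_to_env` is an illustrative name and the key values are placeholders.

```python
def sync_config_to_env(config: dict, mapping: dict, environ: dict) -> int:
    """Copy non-empty config values into environ unless the variable is already set."""
    injected = 0
    for conf_key, env_key in mapping.items():
        if env_key in environ:
            continue  # the existing environment value takes precedence
        val = config.get(conf_key, "")
        if val:
            environ[env_key] = str(val)
            injected += 1
    return injected


mapping = {
    "open_ai_api_key": "OPENAI_API_KEY",
    "linkai_api_key": "LINKAI_API_KEY",
    "gemini_api_key": "GEMINI_API_KEY",
}
config = {"open_ai_api_key": "sk-from-config", "linkai_api_key": "", "gemini_api_key": "g-from-config"}
env = {"OPENAI_API_KEY": "sk-from-env"}

count = sync_config_to_env(config, mapping, env)
print(count, env)
```

Here only `GEMINI_API_KEY` is injected: `OPENAI_API_KEY` keeps its environment value and `LINKAI_API_KEY` stays unset because its config value is empty.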


@@ -23,7 +23,7 @@ Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent
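In later long-running conversations, the Agent records or retrieves memory when needed, keeps updating its own persona, user preferences, and memory files, and summarizes lessons learned, so that it can think autonomously and keep growing.
-<img width="800" src="https://cdn.link-ai.tech/doc/20260203000455.png">
+<img width="800" src="https://cdn.link-ai.tech/doc/20260203000455.png" />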
@@ -37,14 +37,14 @@ Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent
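Access to the operating system's terminal and files is the most basic and essential tool; many other tools and skills build on these base tools. Users can interact with the Agent from a phone to operate resources on a personal computer or server:
-<img width="800" src="https://cdn.link-ai.tech/doc/20260202181130.png">
+<img width="800" src="https://cdn.link-ai.tech/doc/20260202181130.png" />
#### 1.2 Programming
Building on its programming and system-access capabilities, the Agent can carry out the full vibe-coding workflow: information search, image and asset generation, coding, testing, deployment, Nginx configuration changes, and release. A single simple command from a phone is enough to produce a quick application demo:
-<img width="800" src="https://cdn.link-ai.tech/doc/20260203121008.png">
+<img width="800" src="https://cdn.link-ai.tech/doc/20260203121008.png" />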
@@ -53,7 +53,7 @@ Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent
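The scheduler tool provides dynamic scheduled tasks in three forms: **one-off tasks, fixed intervals, and Cron expressions**. A triggered task can run in one of two modes, **sending a fixed message** or **executing a dynamic Agent task**, which makes it highly flexible:
-<img width="800" src="https://cdn.link-ai.tech/doc/20260202195402.png">
+<img width="800" src="https://cdn.link-ai.tech/doc/20260202195402.png" />
You can also view and manage existing scheduled tasks quickly through natural language.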
@@ -62,7 +62,7 @@ Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent
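The secret keys that skills need are stored in an environment variable file and managed by the `env_config` tool. You can update keys through conversation; the tool has built-in security protection and masking to keep keys safe:
-<img width="800" src="https://cdn.link-ai.tech/doc/20260202234939.png">
+<img width="800" src="https://cdn.link-ai.tech/doc/20260202234939.png" />
### 3. Skills system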
@@ -77,7 +77,7 @@ Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent
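The `skill-creator` skill lets you create skills quickly through conversation. While working with the Agent, you can ask it to turn a workflow into a skill, or send it any API documentation and examples and have it complete the integration directly:
-<img width="800" src="https://cdn.link-ai.tech/doc/20260202202247.png">
+<img width="800" src="https://cdn.link-ai.tech/doc/20260202202247.png" />
#### 3.2 Search and image recognition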
@@ -85,7 +85,7 @@ Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent
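- **Search skill:** The built-in `bocha-search` (Bocha Search) Skill depends on the environment variable `BOCHA_SEARCH_API_KEY`, which you can create in the [console](https://open.bochaai.com/) and send to the Agent to complete the configuration
- **Image recognition skill:** The `openai-image-vision` plugin can use image recognition models such as gpt-4.1-mini and gpt-4.1. It depends on the key `OPENAI_API_KEY`, which can be maintained through config.json or the `env_config` tool.
-<img width="800" src="https://cdn.link-ai.tech/doc/20260202213219.png">
+<img width="800" src="https://cdn.link-ai.tech/doc/20260202213219.png" />
#### 3.3 Third-party knowledge bases and plugins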
@@ -113,7 +113,7 @@ Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent
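The Agent decides based on each agent's name and description and calls the API with its app_code to access the corresponding application or workflow. This skill gives flexible access to agents, knowledge bases, plugins, and other capabilities on the LinkAI platform. The result looks like this:
-<img width="750" src="https://cdn.link-ai.tech/doc/20260202234350.png">
+<img width="750" src="https://cdn.link-ai.tech/doc/20260202234350.png" />
Note: configure `LINKAI_API_KEY` through `env_config`, or add `linkai_api_key` to config.json.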
@@ -143,7 +143,8 @@ Agent模式推荐使用以下模型可根据效果及成本综合选择
- **Doubao**: `doubao-seed-2-0-code-preview-260215`
- **Qwen**: `qwen3.5-plus`
- **Claude**: `claude-sonnet-4-6`
- **Gemini**: `gemini-3.1-pro-preview`
- **Gemini**: `gemini-3.1-flash-lite-preview`
- **OpenAI**: `gpt-5.4`
For detailed model configuration, see [README.md 模型说明](../README.md#模型说明)


@@ -0,0 +1,56 @@
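---
title: DingTalk
description: Integrate CowAgent with a DingTalk application
---
Create a smart bot application on the DingTalk Open Platform to connect CowAgent to DingTalk.
## 1. Create an application
1. Go to the [DingTalk developer console](https://open-dev.dingtalk.com/fe/app#/corp/app), log in, click **Create Application**, and fill in the application details:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-create-app.png" width="800"/>
2. Click to add an application capability, select the **Bot** capability, and click **Add**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-add-bot.png" width="800"/>
3. Configure the bot details and click **Publish**. After publishing, click "**Click to Debug**"; a test group chat is created automatically and can be viewed in the client:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-config-bot.png" width="600"/>
4. Click **Version Management & Release** and create a new version to publish:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-publish-bot.png" width="700"/>
## 2. Project configuration
1. Click **Credentials & Basic Info** to obtain the `Client ID` and `Client Secret`:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-get-secret.png" width="700"/>
2. Add the following configuration to `config.json` in the project root:
```json
{
  "channel_type": "dingtalk",
  "dingtalk_client_id": "YOUR_CLIENT_ID",
  "dingtalk_client_secret": "YOUR_CLIENT_SECRET"
}
```
3. Install the dependency:
```bash
pip3 install dingtalk_stream
```
4. After starting the project, click **Event Subscription** in the DingTalk developer console, then click **Integration completed, verify connection channel**. When **Connection successful** is displayed, the configuration is complete:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-event-sub.png" width="700"/>
## 3. Usage
Chat with the bot privately, or add it to an enterprise group, to start a conversation:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-hosting-demo.png" width="650"/>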

docs/channels/feishu.mdx

@@ -0,0 +1,69 @@
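---
title: Feishu
description: Integrate CowAgent with a Feishu application
---
Connect CowAgent to Feishu through a custom app. You need to be a Feishu enterprise user with enterprise admin permissions.
## 1. Create a custom enterprise app
### 1. Create the app
Go to the [Feishu Open Platform](https://open.feishu.cn/app/), click **Create Custom App**, fill in the required information, and click **Create**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-create-app.jpg" width="500"/>
### 2. Add the bot capability
In the **Add Features** menu, add the **Bot** capability to the app:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-add-bot.jpg" width="800"/>
### 3. Configure app permissions
Click **Permission Management**, copy the permission list below, paste it into the input box under **Permission Configuration**, select all of the filtered permissions, click **Batch Enable**, and confirm:
```
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource
```
<img src="https://cdn.link-ai.tech/doc/feishu-hosting-add-auth2.png" width="800"/>
## 2. Project configuration
1. Obtain the `App ID` and `App Secret` under **Credentials & Basic Info**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-appid-secret.jpg" width="800"/>
2. Add the following configuration to `config.json` in the project root:
```json
{
  "channel_type": "feishu",
  "feishu_app_id": "YOUR_APP_ID",
  "feishu_app_secret": "YOUR_APP_SECRET",
  "feishu_bot_name": "YOUR_BOT_NAME"
}
```
| Parameter | Description |
| --- | --- |
| `feishu_app_id` | App ID of the Feishu bot app |
| `feishu_app_secret` | App Secret of the Feishu bot |
| `feishu_bot_name` | Feishu bot name (set when the app is created); group chat use depends on this setting |
Start the project once the configuration is complete.
## 3. Configure event subscription
1. After the project is running, click **Events & Callbacks** on the Feishu Open Platform, choose the **long connection** mode, and click Save:
<img src="https://cdn.link-ai.tech/doc/202601311731183.png" width="600"/>
2. Click **Add Event** below, search for "Receive message", select "**Receive message v2.0**", and confirm.
3. Click **Version Management & Release**, create a version, and apply for **production release**. Then check the approval message in the Feishu client and approve it:
<img src="https://cdn.link-ai.tech/doc/202601311807356.png" width="600"/>
After that, search for the bot's name in Feishu to start chatting.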

docs/channels/web.mdx

@@ -0,0 +1,75 @@
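---
title: Web Console
description: Use CowAgent through the Web console
---
The Web console is CowAgent's default channel and starts automatically on launch. You can chat with the Agent in a browser and manage models, skills, memory, channels, and other settings online.
## Configuration
```json
{
  "channel_type": "web",
  "web_port": 9899
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `channel_type` | Set to `web` | `web` |
| `web_port` | Port the Web service listens on | `9899` |
## Access URL
After starting the project, visit:
- Running locally: `http://localhost:9899`
- Running on a server: `http://<server-ip>:9899`
<Note>
Make sure the server firewall and security group allow traffic on this port.
</Note>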
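If the page does not load, a quick TCP probe can show whether the port is reachable at all. The helper below is a generic sketch built on the standard library and is not part of CowAgent; `is_port_open` is an illustrative name, and `9899` is the default `web_port` from the configuration above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace 127.0.0.1 with the server IP to probe from another machine.
    print(is_port_open("127.0.0.1", 9899))
```

A `False` result from another machine while the same check succeeds on the server itself usually points to a firewall or security group rule rather than to the application.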
## Features
### Chat interface
Supports streaming output and shows the Agent's reasoning and tool calls in real time, which makes its decision process easier to follow:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227180120.png" />
### Model management
Manage model configuration online, without editing the config file by hand:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173811.png" />
### Skill management
View and manage Agent skills online:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173403.png" />
### Memory management
View and manage Agent memory online:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173349.png" />
### Channel management
Manage connected channels online, with real-time connect and disconnect:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173331.png" />
### Scheduled tasks
View and manage scheduled tasks online, with visual management of one-off, fixed-interval, and Cron-expression schedules:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173704.png" />
### Logs
View the Agent's runtime logs online in real time to monitor its status and troubleshoot problems:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173514.png" />


@@ -0,0 +1,72 @@
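---
title: WeChat Official Account
description: Integrate CowAgent with a WeChat Official Account
---
CowAgent supports two Official Account types: personal subscription accounts and enterprise service accounts.
| Type | Requirements | Characteristics |
| --- | --- | --- |
| **Personal subscription account** | Available to individuals | Replies with a notice when a message arrives; once the answer is ready, the user must send another message to fetch it |
| **Enterprise service account** | Applied for by an enterprise; requires WeChat verification to enable the customer service API | Can push the answer to the user once it is ready |
<Note>
Official Accounts are supported only with server or Docker deployment, not local runs. Extra optional dependencies are required: `pip3 install -r requirements-optional.txt`
</Note>
## 1. Personal subscription account
Add the following configuration to `config.json`:
```json
{
  "channel_type": "wechatmp",
  "single_chat_prefix": [""],
  "wechatmp_app_id": "wx73f9******d1e48",
  "wechatmp_app_secret": "YOUR_APP_SECRET",
  "wechatmp_aes_key": "",
  "wechatmp_token": "YOUR_TOKEN",
  "wechatmp_port": 80
}
```
### Configuration steps
These settings must match those in the [WeChat Official Account admin console](https://mp.weixin.qq.com/advanced/advanced?action=dev&t=advanced/dev). On that page, choose **Settings & Development → Basic Configuration → Server Configuration** in the left menu and configure it as shown below:
<img src="https://cdn.link-ai.tech/doc/20260228103506.png" width="480"/>
1. Enable the developer password on the platform (it maps to `wechatmp_app_secret`) and add the server IP to the whitelist
2. Fill in the Official Account settings in `config.json` as shown above; they must match the admin console
3. Start the program. It listens on port 80 (if you lack permission to bind the port, prefix the start command with `sudo`; if port 80 is already in use, stop the process occupying it)
4. In the admin console, **enable the server configuration** and submit. A successful save means the setup is complete. Note that the **"Server URL"** must have the form `http://{HOST}/wx`, where `{HOST}` is the server's IP or domain name
Then follow the Official Account and send a message to see the following result:
<img src="https://cdn.link-ai.tech/doc/20260228103522.png" width="720"/>
Because of subscription account limits, short replies (within 15 s) are returned immediately. For slower replies the bot first answers "正在思考中" ("Thinking..."), and the user must then send any text to fetch the answer. Service accounts avoid this through the customer service API.
<Tip>
**Speech recognition**: you can use WeChat's built-in speech recognition. Enable "Receive speech recognition results" under "Settings & Development → API Permissions" in the admin console.
</Tip>
## 2. Enterprise service account
The setup is largely the same as for a personal subscription account, with these differences:
1. Apply for an enterprise service account on the platform and complete WeChat verification, then confirm under API permissions that the **customer service API** is granted
2. Set `"channel_type": "wechatmp_service"` in `config.json`; all other settings are the same as for the subscription account
3. Even slow replies are pushed to the user proactively, so the user does not need to fetch them manually
```json
{
  "channel_type": "wechatmp_service",
  "single_chat_prefix": [""],
  "wechatmp_app_id": "YOUR_APP_ID",
  "wechatmp_app_secret": "YOUR_APP_SECRET",
  "wechatmp_aes_key": "",
  "wechatmp_token": "YOUR_TOKEN",
  "wechatmp_port": 80
}
```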


@@ -0,0 +1,73 @@
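---
title: WeCom Smart Bot
description: Integrate CowAgent with a WeCom (WeChat Work) smart bot (long-connection mode)
---
> Connect CowAgent through a WeCom smart bot. It supports one-on-one and group chats inside the enterprise, needs no public IP, uses a WebSocket long connection, and supports Markdown rendering and streaming output.
<Note>
A smart bot and a WeCom custom app are two different integration methods. The smart bot uses a WebSocket long connection, so it needs no public server IP or domain and is simpler to configure.
</Note>
## 1. Create a smart bot
1. Open the WeCom client, go to the Workbench, and click **Smart Bot**:
<img src="https://cdn.link-ai.tech/doc/20260316180959.png" width="800"/>
2. Click Create Bot, then Manual Creation:
<img src="https://cdn.link-ai.tech/doc/20260316181118.png" width="800"/>
3. Scroll to the bottom of the right-hand panel and choose **Create in API mode**:
<img src="https://cdn.link-ai.tech/doc/20260316181215.png" width="800"/>
4. Set the bot's name, avatar, and visibility scope, choose **long-connection mode**, note the **Bot ID** and **Secret**, and click Save.
## 2. Configure and run
### Option 1: Web console
After starting the program, open the Web console (local address: http://127.0.0.1:9899/), choose the **Channels** menu, click **Connect Channel**, select **WeCom Smart Bot**, enter the Bot ID and Secret saved in the previous step, and click Connect.
<img src="https://cdn.link-ai.tech/doc/20260316181711.png" width="800"/>
### Option 2: Config file
Add the following configuration to `config.json`:
```json
{
  "channel_type": "wecom_bot",
  "wecom_bot_id": "YOUR_BOT_ID",
  "wecom_bot_secret": "YOUR_SECRET"
}
```
| Parameter | Description |
| --- | --- |
| `wecom_bot_id` | BotID of the smart bot |
| `wecom_bot_secret` | Secret of the smart bot |
After configuring, start the program. The log line `[WecomBot] Subscribe success` indicates a successful connection.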
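Before starting the program, it can help to confirm that every enabled channel has its credentials filled in. The check below is a hypothetical pre-flight sketch, not CowAgent's own validation: `REQUIRED_KEYS` and `missing_keys` are illustrative names, and the key lists are taken only from the channel configurations shown in this diff.

```python
# Credentials each channel needs, as listed in the channel docs in this diff.
REQUIRED_KEYS = {
    "feishu": ["feishu_app_id", "feishu_app_secret"],
    "dingtalk": ["dingtalk_client_id", "dingtalk_client_secret"],
    "wecom_bot": ["wecom_bot_id", "wecom_bot_secret"],
}


def missing_keys(config: dict) -> dict:
    """Map each enabled channel to the required keys that are absent or empty."""
    raw = config.get("channel_type", "")
    # channel_type may be a list or a comma-separated string.
    channels = raw if isinstance(raw, list) else raw.split(",")
    report = {}
    for ch in (c.strip() for c in channels):
        missing = [key for key in REQUIRED_KEYS.get(ch, []) if not config.get(key)]
        if missing:
            report[ch] = missing
    return report


config = {
    "channel_type": "feishu, wecom_bot",
    "feishu_app_id": "YOUR_APP_ID",
    "feishu_app_secret": "YOUR_APP_SECRET",
    "wecom_bot_id": "YOUR_BOT_ID",
}
result = missing_keys(config)
print(result)
```

Channels without an entry in `REQUIRED_KEYS` (for example `web`) are simply skipped, so the mapping can be extended one channel at a time.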
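## 3. Features
| Feature | Support |
| --- | --- |
| One-on-one chat | ✅ |
| Group chat (@bot) | ✅ |
| Text messages | ✅ send and receive |
| Image messages | ✅ send and receive |
| File messages | ✅ send and receive |
| Streaming replies | ✅ |
| Proactive push from scheduled tasks | ✅ |
## 4. Usage
Search for the bot's name in WeCom to start a one-on-one chat.
To use it in an internal WeCom group, add the bot to the group and @mention it in a message.
<img src="https://cdn.link-ai.tech/doc/20260316182902.png" width="800"/>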

docs/channels/wecom.mdx

@@ -0,0 +1,90 @@
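---
title: WeCom Custom App
description: Integrate CowAgent with a WeCom (WeChat Work) custom app
---
Connect CowAgent through a WeCom custom app for one-on-one chats with members inside the enterprise.
<Note>
WeCom requires Docker deployment or Python deployment on a server; local runs are not supported.
</Note>
## 1. Prerequisites
Required resources:
1. A server with a public IP
2. A registered WeCom organization (individuals can register too, but cannot be verified)
3. A verified WeCom organization additionally needs a domain with an ICP filing under the same entity
## 2. Create a WeCom app
1. In the [WeCom admin console](https://work.weixin.qq.com/wework_admin/frame#profile), click **My Enterprise** and copy the **Corp ID** at the bottom of the page (it goes into the `wechatcom_corp_id` field later).
2. Switch to **App Management** and click Create App:
<img src="https://cdn.link-ai.tech/doc/20260228103156.png" width="480"/>
3. On the app creation page, note the `AgentId` and `Secret`:
<img src="https://cdn.link-ai.tech/doc/20260228103218.png" width="580"/>
4. Click **Set API Receiving** and configure the app's callback:
<img src="https://cdn.link-ai.tech/doc/20260228103211.png" width="520"/>
- The URL format is `http://ip:port/wxcomapp` (verified enterprises must use their filed domain)
- Generate a random `Token` and `EncodingAESKey` and save them
<Note>
Saving the API receiving configuration fails at this point because the program is not running yet. Come back and save it after the project is running.
</Note>
## 3. Configure and run
Add the following configuration to `config.json` (the screenshots above show how each parameter maps to the WeCom admin console):
```json
{
  "channel_type": "wechatcom_app",
  "single_chat_prefix": [""],
  "wechatcom_corp_id": "YOUR_CORP_ID",
  "wechatcomapp_token": "YOUR_TOKEN",
  "wechatcomapp_secret": "YOUR_SECRET",
  "wechatcomapp_agent_id": "YOUR_AGENT_ID",
  "wechatcomapp_aes_key": "YOUR_AES_KEY",
  "wechatcomapp_port": 9898
}
```
| Parameter | Description |
| --- | --- |
| `wechatcom_corp_id` | Corp ID |
| `wechatcomapp_token` | Token from the API receiving configuration |
| `wechatcomapp_secret` | Secret of the app |
| `wechatcomapp_agent_id` | AgentId of the app |
| `wechatcomapp_aes_key` | EncodingAESKey from the API receiving configuration |
| `wechatcomapp_port` | Listening port, default 9898 |
After configuring, start the program. When the log shows `http://0.0.0.0:9898/`, the program is running; the port must then be opened to external traffic (for example, allowed in the cloud server's security group).
Once the program is running, return to the WeCom admin console and save the **message server configuration**. After it saves successfully, add the server IP to **Trusted Enterprise IPs**; otherwise messages cannot be sent or received:
<img src="https://cdn.link-ai.tech/doc/20260228103224.png" width="520"/>
<Warning>
If the URL callback verification fails or the configuration cannot be saved:
1. Make sure the server firewall is off and the security group allows the listening port
2. Check carefully that the Token, Secret Key, and other parameters match and that the URL format is correct
3. A verified WeCom organization must use a filed domain that matches its entity
</Warning>
## 4. Usage
Search for the newly created app's name in WeCom to chat with it directly:
<img src="https://cdn.link-ai.tech/doc/20260228103228.png" width="720"/>
To let external personal WeChat users use it, share the invitation QR code under **My Enterprise → WeChat Plugin**; after scanning and following, personal WeChat users can chat with the app:
<img src="https://cdn.link-ai.tech/doc/20260228103232.png" width="520"/>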

docs/docs.json

@@ -0,0 +1,327 @@
{
"$schema": "https://mintlify.com/docs.json",
"name": "CowAgent",
"description": "CowAgent - AI Super Assistant powered by LLMs, with autonomous task planning, long-term memory, skills system, and multi-channel deployment.",
"theme": "mint",
"appearance": {
"default": "light"
},
"colors": {
"primary": "#35A85B",
"light": "#4ABE6E",
"dark": "#228547"
},
"logo": {
"light": "/images/logo.jpg",
"dark": "/images/logo.jpg"
},
"favicon": "/images/favicon.ico",
"navbar": {
"links": [
{
"label": "官网",
"href": "https://cowagent.ai/"
},
{
"label": "GitHub",
"href": "https://github.com/zhayujie/chatgpt-on-wechat"
}
]
},
"footer": {
"socials": {
"github": "https://github.com/zhayujie/chatgpt-on-wechat"
}
},
"navigation": {
"languages": [
{
"language": "zh",
"default": true,
"tabs": [
{
"tab": "项目介绍",
"groups": [
{
"group": "概览",
"pages": [
"intro/index",
"intro/architecture",
"intro/features"
]
}
]
},
{
"tab": "快速开始",
"groups": [
{
"group": "安装部署",
"pages": [
"guide/quick-start",
"guide/manual-install"
]
}
]
},
{
"tab": "模型",
"groups": [
{
"group": "模型配置",
"pages": [
"models/index",
"models/minimax",
"models/glm",
"models/qwen",
"models/kimi",
"models/doubao",
"models/claude",
"models/gemini",
"models/openai",
"models/deepseek",
"models/linkai"
]
}
]
},
{
"tab": "工具",
"groups": [
{
"group": "工具系统",
"pages": [
"tools/index"
]
},
{
"group": "内置工具",
"pages": [
"tools/read",
"tools/write",
"tools/edit",
"tools/ls",
"tools/bash",
"tools/send",
"tools/memory",
"tools/env-config"
]
},
{
"group": "可选工具",
"pages": [
"tools/web-search",
"tools/scheduler"
]
}
]
},
{
"tab": "技能",
"groups": [
{
"group": "技能系统",
"pages": [
"skills/index",
"skills/skill-creator"
]
},
{
"group": "内置技能",
"pages": [
"skills/image-vision",
"skills/linkai-agent",
"skills/web-fetch"
]
}
]
},
{
"tab": "记忆",
"groups": [
{
"group": "记忆系统",
"pages": [
"memory"
]
}
]
},
{
"tab": "通道",
"groups": [
{
"group": "接入渠道",
"pages": [
"channels/web",
"channels/feishu",
"channels/dingtalk",
"channels/wecom-bot",
"channels/wecom",
"channels/wechatmp"
]
}
]
},
{
"tab": "版本",
"groups": [
{
"group": "发布记录",
"pages": [
"releases/overview",
"releases/v2.0.2",
"releases/v2.0.1",
"releases/v2.0.0"
]
}
]
}
]
},
{
"language": "en",
"tabs": [
{
"tab": "Introduction",
"groups": [
{
"group": "Overview",
"pages": [
"en/intro/index",
"en/intro/architecture",
"en/intro/features"
]
}
]
},
{
"tab": "Get Started",
"groups": [
{
"group": "Installation",
"pages": [
"en/guide/quick-start",
"en/guide/manual-install"
]
}
]
},
{
"tab": "Models",
"groups": [
{
"group": "Model Configuration",
"pages": [
"en/models/index",
"en/models/minimax",
"en/models/glm",
"en/models/qwen",
"en/models/kimi",
"en/models/doubao",
"en/models/claude",
"en/models/gemini",
"en/models/openai",
"en/models/deepseek",
"en/models/linkai"
]
}
]
},
{
"tab": "Tools",
"groups": [
{
"group": "Tools System",
"pages": [
"en/tools/index"
]
},
{
"group": "Built-in Tools",
"pages": [
"en/tools/read",
"en/tools/write",
"en/tools/edit",
"en/tools/ls",
"en/tools/bash",
"en/tools/send",
"en/tools/memory",
"en/tools/env-config"
]
},
{
"group": "Optional Tools",
"pages": [
"en/tools/web-search",
"en/tools/scheduler"
]
}
]
},
{
"tab": "Skills",
"groups": [
{
"group": "Skills System",
"pages": [
"en/skills/index",
"en/skills/skill-creator"
]
},
{
"group": "Built-in Skills",
"pages": [
"en/skills/image-vision",
"en/skills/linkai-agent",
"en/skills/web-fetch"
]
}
]
},
{
"tab": "Memory",
"groups": [
{
"group": "Memory System",
"pages": [
"en/memory"
]
}
]
},
{
"tab": "Channels",
"groups": [
{
"group": "Platforms",
"pages": [
"en/channels/web",
"en/channels/feishu",
"en/channels/dingtalk",
"en/channels/wecom-bot",
"en/channels/wecom",
"en/channels/wechatmp"
]
}
]
},
{
"tab": "Releases",
"groups": [
{
"group": "Release Notes",
"pages": [
"en/releases/overview",
"en/releases/v2.0.2",
"en/releases/v2.0.1",
"en/releases/v2.0.0"
]
}
]
}
]
}
]
}
}

docs/en/README.md

@@ -0,0 +1,178 @@
<p align="center"><img src="https://github.com/user-attachments/assets/eca9a9ec-8534-4615-9e0f-96c5ac1d10a3" alt="CowAgent" width="550" /></p>
<p align="center">
<a href="https://github.com/zhayujie/chatgpt-on-wechat/releases/latest"><img src="https://img.shields.io/github/v/release/zhayujie/chatgpt-on-wechat" alt="Latest release"></a>
<a href="https://github.com/zhayujie/chatgpt-on-wechat/blob/master/LICENSE"><img src="https://img.shields.io/github/license/zhayujie/chatgpt-on-wechat" alt="License: MIT"></a>
<a href="https://github.com/zhayujie/chatgpt-on-wechat"><img src="https://img.shields.io/github/stars/zhayujie/chatgpt-on-wechat?style=flat-square" alt="Stars"></a> <br/>
[<a href="https://github.com/zhayujie/chatgpt-on-wechat/blob/master/README.md">中文</a>] | [English]
</p>
**CowAgent** is an AI super assistant powered by LLMs, capable of autonomous task planning, operating computers and external resources, creating and executing Skills, and continuously growing with long-term memory. It supports flexible model switching, handles text, voice, images, and files, and can be integrated into Web, Feishu, DingTalk, WeCom Bot, WeCom App, and WeChat Official Account, running 24/7 on your personal computer or server.
<p align="center">
<a href="https://cowagent.ai/">🌐 Website</a> &nbsp;·&nbsp;
<a href="https://docs.cowagent.ai/en/intro/index">📖 Docs</a> &nbsp;·&nbsp;
<a href="https://docs.cowagent.ai/en/guide/quick-start">🚀 Quick Start</a>
</p>
## Introduction
> CowAgent is both an out-of-the-box AI super assistant and a highly extensible Agent framework. You can extend it with new model interfaces, channels, built-in tools, and the Skills system to flexibly implement various customization needs.
- **Autonomous Task Planning**: Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved. Supports accessing files, terminal, browser, schedulers, and other system resources via tools.
- **Long-term Memory**: Automatically persists conversation memory to local files and databases, including core memory and daily memory, with keyword and vector retrieval support.
- **Skills System**: Implements a Skills creation and execution engine with multiple built-in skills, and supports custom Skills development through natural language conversation.
- **Multimodal Messages**: Supports parsing, processing, generating, and sending text, images, voice, files, and other message types.
- **Multiple Model Support**: Supports OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao, and other mainstream model providers.
- **Multi-platform Deployment**: Runs on local computers or servers, integrable into Web, Feishu, DingTalk, WeChat Official Account, and WeCom applications.
- **Knowledge Base**: Integrates enterprise knowledge base capabilities via the [LinkAI](https://link-ai.tech) platform.
## Disclaimer
1. This project follows the [MIT License](/LICENSE) and is intended for technical research and learning. Users must comply with local laws, regulations, policies, and corporate bylaws. Any illegal or rights-infringing use is prohibited.
2. Agent mode consumes more tokens than normal chat mode. Choose models based on effectiveness and cost. Agent has access to the host OS — please deploy in trusted environments.
3. CowAgent focuses on open-source development and does not participate in, authorize, or issue any cryptocurrency.
## Changelog
> **2026.02.27:** [v2.0.2](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.2) — Web console overhaul (streaming chat, model/skill/memory/channel/scheduler/log management), multi-channel concurrent running, session persistence, new models including Gemini 3.1 Pro / Claude 4.6 Sonnet / Qwen3.5 Plus.
> **2026.02.13:** [v2.0.1](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.1) — Built-in Web Search tool, smart context trimming, runtime info dynamic update, Windows compatibility, fixes for scheduler memory loss, Feishu connection issues, and more.
> **2026.02.03:** [v2.0.0](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.0) — Full upgrade to AI super assistant with multi-step task planning, long-term memory, built-in tools, Skills framework, new models, and optimized channels.
> **2025.05.23:** [v1.7.6](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.6) — Web channel optimization, AgentMesh multi-agent plugin, Baidu TTS, claude-4-sonnet/opus support.
> **2025.04.11:** [v1.7.5](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.5) — wechatferry protocol, DeepSeek model, Tencent Cloud voice, ModelScope and Gitee-AI support.
> **2024.12.13:** [v1.7.4](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.4) — Gemini 2.0 model, Web channel, memory leak fix.
Full changelog: [Release Notes](https://docs.cowagent.ai/en/releases/overview)
<br/>
## 🚀 Quick Start
The project provides a one-click script for installation, configuration, startup, and management:
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
After running, the Web service starts by default. Access `http://localhost:9899/chat` to chat.
Script usage: [One-click Install](https://docs.cowagent.ai/en/guide/quick-start)
### Manual Installation
**1. Clone the project**
```bash
git clone https://github.com/zhayujie/chatgpt-on-wechat
cd chatgpt-on-wechat/
```
**2. Install dependencies**
```bash
pip3 install -r requirements.txt
pip3 install -r requirements-optional.txt # optional but recommended
```
**3. Configure**
```bash
cp config-template.json config.json
```
Fill in your model API key and channel type in `config.json`. See the [configuration docs](https://docs.cowagent.ai/en/guide/manual-install) for details.
**4. Run**
```bash
python3 app.py
```
To run in the background on a server:
```bash
nohup python3 app.py & tail -f nohup.out
```
### Docker Deployment
```bash
wget https://cdn.link-ai.tech/code/cow/docker-compose.yml
# Edit docker-compose.yml with your config
sudo docker compose up -d
sudo docker logs -f chatgpt-on-wechat
```
<br/>
## Models
Supports mainstream model providers. Recommended models for Agent mode:
| Provider | Recommended Model |
| --- | --- |
| MiniMax | `MiniMax-M2.5` |
| GLM | `glm-5` |
| Kimi | `kimi-k2.5` |
| Doubao | `doubao-seed-2-0-code-preview-260215` |
| Qwen | `qwen3.5-plus` |
| Claude | `claude-sonnet-4-6` |
| Gemini | `gemini-3.1-pro-preview` |
| OpenAI | `gpt-5.4` |
| DeepSeek | `deepseek-chat` |
For detailed configuration of each model, see the [Models documentation](https://docs.cowagent.ai/en/models/index).
<br/>
## Channels
Supports multiple platforms. Set `channel_type` in `config.json` to switch:
| Channel | `channel_type` | Docs |
| --- | --- | --- |
| Web (default) | `web` | [Web Channel](https://docs.cowagent.ai/en/channels/web) |
| Feishu | `feishu` | [Feishu Setup](https://docs.cowagent.ai/en/channels/feishu) |
| DingTalk | `dingtalk` | [DingTalk Setup](https://docs.cowagent.ai/en/channels/dingtalk) |
| WeCom Bot | `wecom_bot` | [WeCom Bot Setup](https://docs.cowagent.ai/en/channels/wecom-bot) |
| WeCom App | `wechatcom_app` | [WeCom Setup](https://docs.cowagent.ai/en/channels/wecom) |
| WeChat MP | `wechatmp` / `wechatmp_service` | [WeChat MP Setup](https://docs.cowagent.ai/en/channels/wechatmp) |
| Terminal | `terminal` | — |
Multiple channels can be enabled simultaneously, separated by commas: `"channel_type": "feishu,dingtalk"`.
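As a rough illustration of the comma-separated format, the sketch below splits a `channel_type` value into individual channel names. The helper name `parse_channel_types` is hypothetical and is not taken from the project's code; the project may parse this value differently.

```python
from typing import List


def parse_channel_types(value: str) -> List[str]:
    """Split a comma-separated channel_type string into channel names.

    Hypothetical helper for illustration; not the project's own parser.
    """
    # Trim whitespace and skip empty entries such as the gap in "a,,b".
    return [name.strip() for name in value.split(",") if name.strip()]


config = {"channel_type": "feishu,dingtalk"}
channels = parse_channel_types(config["channel_type"])
print(channels)
```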
<br/>
## Enterprise Services
<a href="https://link-ai.tech" target="_blank"><img width="720" src="https://cdn.link-ai.tech/image/link-ai-intro.jpg"></a>
> [LinkAI](https://link-ai.tech/) is a one-stop AI agent platform for enterprises and developers, integrating multimodal LLMs, knowledge bases, Agent plugins, and workflows. Supports one-click integration with mainstream platforms, SaaS and private deployment.
<br/>
## 🔗 Related Projects
- [bot-on-anything](https://github.com/zhayujie/bot-on-anything): Lightweight and highly extensible LLM application framework supporting Slack, Telegram, Discord, Gmail, and more.
- [AgentMesh](https://github.com/MinimalFuture/AgentMesh): Open-source Multi-Agent framework for complex problem solving through agent team collaboration.
## 🔎 FAQ
FAQs: <https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs>
## 🛠️ Contributing
Contributions of new channels are welcome; see the [Feishu channel](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py) for a reference implementation. New Skills are also welcome; see the [Skill Creator docs](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md).
## ✉ Contact
PRs and Issues are welcome, and you can support the project with a 🌟 Star. For questions, check the [FAQ list](https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs) or search [Issues](https://github.com/zhayujie/chatgpt-on-wechat/issues).
## 🌟 Contributors
![cow contributors](https://contrib.rocks/image?repo=zhayujie/chatgpt-on-wechat&max=1000)


@@ -0,0 +1,58 @@
---
title: DingTalk
description: Integrate CowAgent into DingTalk application
---
Integrate CowAgent into DingTalk by creating an intelligent robot app on the DingTalk Open Platform.
## 1. Create App
1. Go to [DingTalk Developer Console](https://open-dev.dingtalk.com/fe/app#/corp/app), log in and click **Create App**, fill in the app information:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-create-app.png" width="800"/>
2. Click **Add App Capability**, select **Robot** capability and click **Add**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-add-bot.png" width="800"/>
3. Configure the robot information and click **Publish**. After publishing, click "**Debug**" to automatically create a test group chat, which can be viewed in the client:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-config-bot.png" width="600"/>
4. Click **Version Management & Release**, create a new version and publish:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-publish-bot.png" width="700"/>
## 2. Project Configuration
1. Click **Credentials & Basic Info**, get the `Client ID` and `Client Secret`:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-get-secret.png" width="700"/>
2. Add the following configuration to `config.json` in the project root:
```json
{
"channel_type": "dingtalk",
"dingtalk_client_id": "YOUR_CLIENT_ID",
"dingtalk_client_secret": "YOUR_CLIENT_SECRET"
}
```
3. Install the dependency:
```bash
pip3 install dingtalk_stream
```
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-app-config.png" width="700"/>
4. After starting the project, go to the DingTalk Developer Console, click **Event Subscription**, then click **Connection verified, verify channel**. When "**Connection successful**" is displayed, the configuration is complete:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-event-sub.png" width="700"/>
## 3. Usage
Chat privately with the robot or add it to an enterprise group to start a conversation:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/dingtalk-hosting-demo.png" width="650"/>


@@ -0,0 +1,69 @@
---
title: Feishu (Lark)
description: Integrate CowAgent into Feishu application
---
Integrate CowAgent into Feishu by creating a custom enterprise app. You need to be a Feishu enterprise user with admin privileges.
## 1. Create Enterprise Custom App
### 1.1 Create App
Go to [Feishu Developer Platform](https://open.feishu.cn/app/), click **Create Enterprise Custom App**, fill in the required information and click **Create**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-create-app.jpg" width="500"/>
### 1.2 Add Bot Capability
In **Add App Capabilities**, add **Bot** capability to the app:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-add-bot.jpg" width="800"/>
### 1.3 Configure App Permissions
Click **Permission Management**, paste the following permission string into the input box below **Permission Configuration**, select all filtered permissions, click **Batch Enable** and confirm:
```
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource
```
<img src="https://cdn.link-ai.tech/doc/feishu-hosting-add-auth2.png" width="800"/>
## 2. Project Configuration
1. Get `App ID` and `App Secret` from **Credentials & Basic Info**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-appid-secret.jpg" width="800"/>
2. Add the following configuration to `config.json` in the project root:
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_bot_name": "YOUR_BOT_NAME"
}
```
| Parameter | Description |
| --- | --- |
| `feishu_app_id` | Feishu bot App ID |
| `feishu_app_secret` | Feishu bot App Secret |
| `feishu_bot_name` | Bot name (set when creating the app), required for group chat usage |
Start the project after configuration is complete.
## 3. Configure Event Subscription
1. After the project is running successfully, go to the Feishu Developer Platform, click **Events & Callbacks**, select **Long Connection** mode, and click save:
<img src="https://cdn.link-ai.tech/doc/202601311731183.png" width="600"/>
2. Click **Add Event** below, search for "Receive Message", select "**Receive Message v2.0**", and confirm.
3. Click **Version Management & Release**, create a new version and apply for **Production Release**. Check the approval message in the Feishu client and approve:
<img src="https://cdn.link-ai.tech/doc/202601311807356.png" width="600"/>
Once completed, search for the bot name in Feishu to start chatting.

docs/en/channels/web.mdx Normal file

@@ -0,0 +1,75 @@
---
title: Web Console
description: Use CowAgent through the web console
---
The Web Console is CowAgent's default channel. It starts automatically after launch, allowing you to chat with the Agent through a browser and manage models, skills, memory, channels, and other configurations online.
## Configuration
```json
{
"channel_type": "web",
"web_port": 9899
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `channel_type` | Set to `web` | `web` |
| `web_port` | Web service listen port | `9899` |
## Access URL
After starting the project, visit:
- Local: `http://localhost:9899`
- Server: `http://<server-ip>:9899`
<Note>
Ensure the server firewall and security group allow the corresponding port.
</Note>
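If the page does not load, a quick TCP check can help tell a firewall problem from a service that is not running. The sketch below is a generic reachability test using only the standard library; the function name `is_port_open` is illustrative, and `9899` is simply the default `web_port` from the table above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9899 is the default web_port; replace the host with your server IP.
    print(is_port_open("127.0.0.1", 9899))
```

Run it once on the server itself and once from your own machine: if only the remote check fails, the firewall or security group is the likely cause.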
## Features
### Chat Interface
Supports streaming output and displays the Agent's reasoning process and tool calls in real time, so you can follow its decision-making as it happens:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227180120.png" />
### Model Management
Manage model configurations online without manually editing config files:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173811.png" />
### Skill Management
View and manage the Agent's Skills online:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173403.png" />
### Memory Management
View and manage Agent memory online:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173349.png" />
### Channel Management
Manage connected channels online with real-time connect/disconnect operations:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173331.png" />
### Scheduled Tasks
View and manage scheduled tasks online, including one-time tasks, fixed intervals, and Cron expressions:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173704.png" />
### Logs
View Agent runtime logs in real-time for monitoring and troubleshooting:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173514.png" />


@@ -0,0 +1,72 @@
---
title: WeChat Official Account
description: Integrate CowAgent with WeChat Official Accounts
---
CowAgent supports both personal subscription accounts and enterprise service accounts.
| Type | Requirements | Features |
| --- | --- | --- |
| **Personal Subscription** | Available to individuals | Sends a placeholder reply first; users must send a message to retrieve the full response |
| **Enterprise Service** | Enterprise with verified customer service API | Can proactively push replies to users |
<Note>
Official Accounts only support server and Docker deployment, not local run mode. Install extended dependencies: `pip3 install -r requirements-optional.txt`
</Note>
## 1. Personal Subscription Account
Add the following configuration to `config.json`:
```json
{
"channel_type": "wechatmp",
"single_chat_prefix": [""],
"wechatmp_app_id": "wx73f9******d1e48",
"wechatmp_app_secret": "YOUR_APP_SECRET",
"wechatmp_aes_key": "",
"wechatmp_token": "YOUR_TOKEN",
"wechatmp_port": 80
}
```
### Setup Steps
These configurations must be consistent with the [WeChat Official Account Platform](https://mp.weixin.qq.com/advanced/advanced?action=dev&t=advanced/dev). Navigate to **Settings & Development → Basic Configuration → Server Configuration** and configure as shown below:
<img src="https://cdn.link-ai.tech/doc/20260228103506.png" width="480"/>
1. Enable the developer secret on the platform (corresponds to `wechatmp_app_secret`), and add the server IP to the whitelist
2. Fill in `config.json` with the official account parameters, matching the values configured on the platform
3. Start the program, which listens on port 80 (use `sudo` if you don't have permission; stop any process occupying port 80)
4. **Enable server configuration** on the official account platform and submit. A successful save means the configuration is complete. Note that the **"Server URL"** must be in the format `http://{HOST}/wx`, where `{HOST}` can be the server IP or domain
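For context, when the server configuration is submitted, the platform sends a verification request to the Server URL, and the save succeeds only if the program answers it correctly. The sketch below shows the signature check that WeChat's access guide describes for this request. The function name `check_signature` is illustrative, and CowAgent's own handler may be organized differently.

```python
import hashlib
import hmac


def check_signature(token: str, timestamp: str, nonce: str, signature: str) -> bool:
    """Verify a WeChat server-configuration request.

    WeChat sorts token, timestamp and nonce lexicographically, joins them,
    and sends the SHA-1 hex digest of the result as `signature`.
    """
    joined = "".join(sorted([token, timestamp, nonce]))
    expected = hashlib.sha1(joined.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)
```

Here `token` is the value of `wechatmp_token`. A mismatch at this step usually means the token in `config.json` differs from the one entered on the platform.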
After following the account and sending a message, you should see the following result:
<img src="https://cdn.link-ai.tech/doc/20260228103522.png" width="720"/>
Due to subscription account limitations, short replies (within 15s) can be returned immediately, but longer replies will first send a "Thinking..." placeholder, requiring users to send any text to retrieve the answer. Enterprise service accounts can solve this with the customer service API.
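The placeholder behaviour can be pictured as a reply deadline plus a per-user cache of unfinished jobs. The following is a minimal sketch of that pattern under stated assumptions: every name in it (`handle_message`, `_pending`, `slow_model`) is hypothetical, and it is not CowAgent's actual implementation. The usage lines pass a short deadline so the example finishes quickly.

```python
import concurrent.futures
import time
from typing import Callable, Dict

# Hypothetical names throughout; this is not CowAgent's implementation.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
_pending: Dict[str, concurrent.futures.Future] = {}


def handle_message(user_id: str, text: str, generate: Callable[[str], str],
                   deadline: float = 15.0) -> str:
    """Reply within the deadline, or send a placeholder and finish later."""
    # A follow-up message from a user with an unfinished job retrieves it.
    job = _pending.get(user_id)
    if job is None:
        job = _executor.submit(generate, text)
    try:
        reply = job.result(timeout=deadline)
    except concurrent.futures.TimeoutError:
        _pending[user_id] = job  # keep the job so the answer is not lost
        return "Thinking..."
    _pending.pop(user_id, None)
    return reply


def slow_model(text: str) -> str:
    time.sleep(0.3)  # stands in for a slow model call
    return f"answer to {text!r}"


print(handle_message("user-1", "hello", slow_model, deadline=0.05))
time.sleep(0.5)
print(handle_message("user-1", "any text", slow_model, deadline=0.05))
```

The second call ignores its own text and returns the stored answer, which mirrors the "send any text to retrieve the answer" behaviour described above.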
<Tip>
**Voice Recognition**: You can use WeChat's built-in voice recognition. Enable "Receive Voice Recognition Results" under "Settings & Development → API Permissions" on the official account management page.
</Tip>
## 2. Enterprise Service Account
The setup process for enterprise service accounts is essentially the same as personal subscription accounts, with the following differences:
1. Register an enterprise service account on the platform and complete WeChat certification. Confirm that the **Customer Service API** permission has been granted
2. Set `"channel_type": "wechatmp_service"` in `config.json`; other configurations remain the same
3. Longer replies are pushed proactively to users, with no manual retrieval required
```json
{
"channel_type": "wechatmp_service",
"single_chat_prefix": [""],
"wechatmp_app_id": "YOUR_APP_ID",
"wechatmp_app_secret": "YOUR_APP_SECRET",
"wechatmp_aes_key": "",
"wechatmp_token": "YOUR_TOKEN",
"wechatmp_port": 80
}
```


@@ -0,0 +1,73 @@
---
title: WeCom Bot
description: Connect CowAgent to WeCom AI Bot (WebSocket long connection)
---
Connect CowAgent via WeCom AI Bot, supporting both direct messages and group chats. No public IP required — uses WebSocket long connection with Markdown rendering and streaming output.
<Note>
WeCom Bot and WeCom App are two different integration methods. WeCom Bot uses WebSocket long connection, requiring no public IP or domain, making it easier to set up.
</Note>
## 1. Create an AI Bot
1. Open the WeCom client, go to **Workbench**, and click **AI Bot**:
<img src="https://cdn.link-ai.tech/doc/20260316180959.png" width="800"/>
2. Click **Create Bot** → **Manual Creation**:
<img src="https://cdn.link-ai.tech/doc/20260316181118.png" width="600"/>
3. Scroll to the bottom of the right panel and select **API Mode**:
<img src="https://cdn.link-ai.tech/doc/20260316181215.png" width="600"/>
4. Set the bot name, avatar, and visibility scope. Select **Long Connection** mode, note down the **Bot ID** and **Secret**, then click Save.
## 2. Configuration
### Option A: Web Console
Start the program and open the Web console (local access: http://127.0.0.1:9899). Go to the **Channels** tab, click **Connect Channel**, select **WeCom Bot**, fill in the Bot ID and Secret from the previous step, and click Connect.
<img src="https://cdn.link-ai.tech/doc/20260316181711.png" width="600"/>
### Option B: Config File
Add the following to your `config.json`:
```json
{
"channel_type": "wecom_bot",
"wecom_bot_id": "YOUR_BOT_ID",
"wecom_bot_secret": "YOUR_SECRET"
}
```
| Parameter | Description |
| --- | --- |
| `wecom_bot_id` | Bot ID of the AI Bot |
| `wecom_bot_secret` | Secret for the AI Bot |
After configuration, start the program. The log message `[WecomBot] Subscribe success` indicates a successful connection.
## 3. Supported Features
| Feature | Status |
| --- | --- |
| Direct Messages | ✅ |
| Group Chat (@bot) | ✅ |
| Text Messages | ✅ Send & Receive |
| Image Messages | ✅ Send & Receive |
| File Messages | ✅ Send & Receive |
| Streaming Reply | ✅ |
| Scheduled Push | ✅ |
## 4. Usage
Search for the bot name in WeCom to start a direct conversation.
To use in group chats, add the bot to a group and @mention it to send messages.
<img src="https://cdn.link-ai.tech/doc/20260316182902.png" width="800"/>


@@ -0,0 +1,90 @@
---
title: WeCom
description: Integrate CowAgent into WeCom enterprise app
---
Integrate CowAgent into WeCom through a custom enterprise app, supporting one-on-one chat for internal employees.
<Note>
WeCom only supports Docker deployment or server Python deployment. Local run mode is not supported.
</Note>
## 1. Prerequisites
Required resources:
1. A server with public IP (overseas server, or domestic server with a proxy for international API access)
2. A registered WeCom account (individual registration is possible but cannot be certified)
3. Certified WeCom accounts additionally require a domain filed under the corresponding entity
## 2. Create WeCom App
1. In the [WeCom Admin Console](https://work.weixin.qq.com/wework_admin/frame#profile), click **My Enterprise** and find the **Corp ID** at the bottom of the page. Save this ID for the `wechatcom_corp_id` configuration field.
2. Switch to **Application Management** and click Create Application:
<img src="https://cdn.link-ai.tech/doc/20260228103156.png" width="480"/>
3. On the application creation page, record the `AgentId` and `Secret`:
<img src="https://cdn.link-ai.tech/doc/20260228103218.png" width="580"/>
4. Click **Set API Reception** to configure the application interface:
<img src="https://cdn.link-ai.tech/doc/20260228103211.png" width="520"/>
- URL format: `http://ip:port/wxcomapp` (certified enterprises must use a filed domain)
- Generate random `Token` and `EncodingAESKey` and save them for the configuration file
<Note>
The API reception configuration cannot be saved at this point because the program hasn't started yet. Come back to save it after the project is running.
</Note>
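If you prefer to generate the `Token` and `EncodingAESKey` yourself, a short script is enough. The sketch below assumes a 32-character alphanumeric Token and a 43-character alphanumeric EncodingAESKey; treat those lengths as assumptions and check them against what the WeCom console accepts. The helper name `random_alphanumeric` is illustrative.

```python
import secrets
import string

ALPHANUMERIC = string.ascii_letters + string.digits


def random_alphanumeric(length: int) -> str:
    """Return a cryptographically random string of letters and digits."""
    return "".join(secrets.choice(ALPHANUMERIC) for _ in range(length))


# Assumed lengths: a 32-character Token and a 43-character EncodingAESKey.
token = random_alphanumeric(32)
encoding_aes_key = random_alphanumeric(43)
print("wechatcomapp_token:", token)
print("wechatcomapp_aes_key:", encoding_aes_key)
```

Enter the same two values in the WeCom console and in `config.json`, since a mismatch causes the callback verification to fail.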
## 3. Configuration and Run
Add the following configuration to `config.json` (the mapping between each parameter and the WeCom console is shown in the screenshots above):
```json
{
"channel_type": "wechatcom_app",
"single_chat_prefix": [""],
"wechatcom_corp_id": "YOUR_CORP_ID",
"wechatcomapp_token": "YOUR_TOKEN",
"wechatcomapp_secret": "YOUR_SECRET",
"wechatcomapp_agent_id": "YOUR_AGENT_ID",
"wechatcomapp_aes_key": "YOUR_AES_KEY",
"wechatcomapp_port": 9898
}
```
| Parameter | Description |
| --- | --- |
| `wechatcom_corp_id` | Corp ID |
| `wechatcomapp_token` | Token from API reception config |
| `wechatcomapp_secret` | App Secret |
| `wechatcomapp_agent_id` | App AgentId |
| `wechatcomapp_aes_key` | EncodingAESKey from API reception config |
| `wechatcomapp_port` | Listen port, default 9898 |
After configuration, start the program. When the log shows `http://0.0.0.0:9898/`, the program is running successfully. You need to open this port externally (e.g., allow it in the cloud server security group).
After the program starts, return to the WeCom Admin Console to save the **Message Server Configuration**. After saving successfully, you also need to add the server IP to **Enterprise Trusted IPs**, otherwise messages cannot be sent or received:
<img src="https://cdn.link-ai.tech/doc/20260228103224.png" width="520"/>
<Warning>
If the URL configuration callback fails or the configuration is unsuccessful:
1. Ensure the server firewall is disabled and the security group allows the listening port
2. Carefully check that Token, Secret Key and other parameter configurations are consistent, and that the URL format is correct
3. Certified WeCom accounts must configure a filed domain matching the entity
</Warning>
## 4. Usage
Search for the app name you just created in WeCom to start chatting directly. You can run multiple instances listening on different ports to create multiple WeCom apps:
<img src="https://cdn.link-ai.tech/doc/20260228103228.png" width="720"/>
To allow external personal WeChat users to use the app, go to **My Enterprise → WeChat Plugin** and share the invite QR code. After scanning the code and following, personal WeChat users can join and chat with the app:
<img src="https://cdn.link-ai.tech/doc/20260228103232.png" width="520"/>


@@ -0,0 +1,113 @@
---
title: Manual Install
description: Deploy CowAgent manually (source code / Docker)
---
## Source Code Deployment
### 1. Clone the project
```bash
git clone https://github.com/zhayujie/chatgpt-on-wechat
cd chatgpt-on-wechat/
```
<Tip>
For network issues, use the mirror: https://gitee.com/zhayujie/chatgpt-on-wechat
</Tip>
### 2. Install dependencies
Core dependencies (required):
```bash
pip3 install -r requirements.txt
```
Optional dependencies (recommended):
```bash
pip3 install -r requirements-optional.txt
```
### 3. Configure
Copy the config template and edit:
```bash
cp config-template.json config.json
```
Fill in model API keys, channel type, and other settings in `config.json`. See the [model docs](/en/models/index) for details.
### 4. Run
**Local run:**
```bash
python3 app.py
```
By default, the Web service starts. Access `http://localhost:9899/chat` to chat.
**Background run on server:**
```bash
nohup python3 app.py & tail -f nohup.out
```
## Docker Deployment
Docker deployment does not require cloning source code or installing dependencies. For Agent mode, source deployment is recommended for broader system access.
<Note>
Requires [Docker](https://docs.docker.com/engine/install/) with Docker Compose.
</Note>
**1. Download config**
```bash
wget https://cdn.link-ai.tech/code/cow/docker-compose.yml
```
Edit `docker-compose.yml` with your configuration.
**2. Start container**
```bash
sudo docker compose up -d
```
**3. View logs**
```bash
sudo docker logs -f chatgpt-on-wechat
```
## Core Configuration
```json
{
"channel_type": "web",
"model": "MiniMax-M2.5",
"agent": true,
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30,
"agent_max_steps": 15
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `channel_type` | Channel type | `web` |
| `model` | Model name | `MiniMax-M2.5` |
| `agent` | Enable Agent mode | `true` |
| `agent_workspace` | Agent workspace path | `~/cow` |
| `agent_max_context_tokens` | Max context tokens | `40000` |
| `agent_max_context_turns` | Max context turns | `30` |
| `agent_max_steps` | Max decision steps per task | `15` |
<Tip>
Full configuration options are in the project [`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py).
</Tip>
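To make the role of the defaults concrete, the sketch below overlays a partial `config.json` on the default values from the table and expands `~` in the workspace path. It is an illustration only: `resolve_config` and `DEFAULTS` are hypothetical names, and the project's `config.py` has its own loading logic.

```python
import json
import os
from typing import Any, Dict

# Defaults copied from the table above.
DEFAULTS: Dict[str, Any] = {
    "channel_type": "web",
    "model": "MiniMax-M2.5",
    "agent": True,
    "agent_workspace": "~/cow",
    "agent_max_context_tokens": 40000,
    "agent_max_context_turns": 30,
    "agent_max_steps": 15,
}


def resolve_config(raw_json: str) -> Dict[str, Any]:
    """Overlay user settings on the defaults and expand the workspace path."""
    user_settings = json.loads(raw_json)
    config = {**DEFAULTS, **user_settings}  # user values win over defaults
    config["agent_workspace"] = os.path.expanduser(config["agent_workspace"])
    return config


config = resolve_config('{"model": "glm-5", "agent_max_steps": 20}')
print(config["model"], config["agent_max_steps"], config["channel_type"])
```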


@@ -0,0 +1,39 @@
---
title: One-click Install
description: One-click install and manage CowAgent with scripts
---
The project provides scripts for one-click install, configuration, startup, and management. Script-based deployment is recommended for quick setup.
Supports Linux, macOS, and Windows. Requires Python 3.7-3.12 (3.9 recommended).
## Install Command
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
The script automatically performs these steps:
1. Check Python environment (requires Python 3.7+)
2. Install required tools (git, curl, etc.)
3. Clone project to `~/chatgpt-on-wechat`
4. Install Python dependencies
5. Guided configuration for AI model and channel
6. Start service
By default, the Web service starts after installation. Access `http://localhost:9899/chat` to begin chatting.
## Management Commands
After installation, use these commands to manage the service:
| Command | Description |
| --- | --- |
| `./run.sh start` | Start service |
| `./run.sh stop` | Stop service |
| `./run.sh restart` | Restart service |
| `./run.sh status` | Check run status |
| `./run.sh logs` | View real-time logs |
| `./run.sh config` | Reconfigure |
| `./run.sh update` | Update project code |


@@ -0,0 +1,77 @@
---
title: Architecture
description: CowAgent 2.0 system architecture and core design
---
CowAgent 2.0 has evolved from a simple chatbot into a super intelligent assistant with Agent architecture, featuring autonomous thinking, task planning, long-term memory, and skill extensibility.
## System Architecture
CowAgent's architecture consists of the following core modules:
<img src="https://cdn.link-ai.tech/doc/68ef7b212c6f791e0e74314b912149f9-sz_5847990.png" alt="CowAgent Architecture" />
### Core Modules
| Module | Description |
| --- | --- |
| **Channels** | Message channel layer for receiving and sending messages. Supports Web, Feishu, DingTalk, WeCom, WeChat Official Account, and more |
| **Agent Core** | Agent engine including task planning, memory system, and skills engine |
| **Tools** | Tool layer for Agent to access OS resources. 10+ built-in tools |
| **Models** | Model layer with unified access to mainstream LLMs |
## Agent Mode Workflow
When Agent mode is enabled, CowAgent runs as an autonomous agent with the following workflow:
1. **Receive Message** — Receive user input through channels
2. **Understand Intent** — Analyze task requirements and context
3. **Plan Task** — Break complex tasks into multiple steps
4. **Invoke Tools** — Select and execute appropriate tools for each step
5. **Update Memory** — Store important information in long-term memory
6. **Return Result** — Send execution results back to the user
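The steps above can be sketched as a bounded loop. In the toy version below, a `planner` function stands in for the LLM, the tools are plain Python callables, and the memory-update step is left out for brevity. All names (`run_agent`, `toy_planner`, the `upper` tool) are hypothetical; the sketch shows the general shape of such a loop, not CowAgent's Agent Core.

```python
from typing import Callable, Dict, List, Optional, Tuple

# The planner stands in for the LLM. Given the history it returns either
# ("tool", tool_name, argument) or ("final", answer, None).
Decision = Tuple[str, str, Optional[str]]


def run_agent(message: str,
              planner: Callable[[List[str]], Decision],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 15) -> str:
    """Ask the planner, run the chosen tool, and stop on a final answer."""
    history = [f"user: {message}"]
    for _ in range(max_steps):
        kind, name_or_answer, argument = planner(history)
        if kind == "final":
            return name_or_answer
        result = tools[name_or_answer](argument or "")
        history.append(f"tool {name_or_answer}: {result}")
    return "Stopped: step limit reached."


def toy_planner(history: List[str]) -> Decision:
    # Call the tool once, then answer with its output.
    if len(history) == 1:
        return ("tool", "upper", "hello")
    return ("final", history[-1].split(": ", 1)[1], None)


print(run_agent("shout hello", toy_planner, {"upper": str.upper}))
```

The `max_steps` argument plays the same role as `agent_max_steps`: it caps how many decisions one task may take, so a confused planner cannot loop forever.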
## Workspace Directory Structure
The Agent workspace is located at `~/cow` by default and stores system prompts, memory files, and skill files:
```
~/cow/
├── system.md          # Agent system prompt
├── user.md            # User profile
├── memory/            # Long-term memory storage
│   ├── core.md        # Core memory
│   └── daily/         # Daily memory
└── skills/            # Custom skills
    ├── skill-1/
    └── skill-2/
```
Secret keys are stored separately in `~/.cow` directory for security:
```
~/.cow/
└── .env # Secret keys for skills
```
## Core Configuration
Configure Agent mode parameters in `config.json`:
```json
{
"agent": true,
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30,
"agent_max_steps": 15
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `agent` | Enable Agent mode | `true` |
| `agent_workspace` | Workspace path | `~/cow` |
| `agent_max_context_tokens` | Max context tokens | `40000` |
| `agent_max_context_turns` | Max context turns | `30` |
| `agent_max_steps` | Max decision steps per task | `15` |
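To illustrate what a turn limit such as `agent_max_context_turns` means, the sketch below keeps only the most recent turns of a conversation. It is the simplest possible policy and rests on an assumption that a turn starts at each `user` message; the project's actual trimming strategy is more involved and is not described in this document. The name `trim_turns` is hypothetical.

```python
from typing import Dict, List


def trim_turns(messages: List[Dict[str, str]], max_turns: int) -> List[Dict[str, str]]:
    """Keep only the most recent `max_turns` turns.

    A turn is assumed to start at each message whose role is "user".
    """
    if max_turns <= 0:
        return []
    turn_starts = [i for i, m in enumerate(messages) if m["role"] == "user"]
    if len(turn_starts) <= max_turns:
        return list(messages)
    # Slice from the first message of the oldest turn we still keep.
    return messages[turn_starts[-max_turns]:]


history = [
    {"role": "user", "content": "q1"}, {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"}, {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"}, {"role": "assistant", "content": "a3"},
]
print([m["content"] for m in trim_turns(history, 2)])
```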

docs/en/intro/features.mdx Normal file

@@ -0,0 +1,105 @@
---
title: Features
description: CowAgent long-term memory, task planning, and skills system in detail
---
## 1. Long-term Memory
The memory system enables the Agent to remember important information over time. The Agent proactively stores information when users share preferences, decisions, or key facts, and automatically extracts summaries when conversations reach a certain length. Memory is divided into core memory and daily memory, with hybrid retrieval supporting both keyword search and vector search.
On first launch, the Agent proactively asks the user for key information and records it in the workspace (default `~/cow`) — including agent settings, user identity, and memory files.
In subsequent long-term conversations, the Agent intelligently stores or retrieves memory as needed, continuously updating its own settings, user preferences, and memory files, summarizing experiences and lessons learned — truly achieving autonomous thinking and continuous growth.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
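Hybrid retrieval, as mentioned above, combines a keyword score with a vector similarity score. The sketch below shows one common way to blend the two with a weight `alpha`. The two-dimensional vectors are toy values standing in for real embeddings, and the function names and the scoring formula are assumptions for illustration, not the project's memory implementation.

```python
import math
from typing import Dict, List, Tuple


def keyword_score(query: str, text: str) -> float:
    """Fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors, or 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, query_vec: List[float], memories: List[Dict],
                  alpha: float = 0.5) -> List[Tuple[float, str]]:
    """Rank memories by alpha * keyword score + (1 - alpha) * cosine similarity."""
    scored = [
        (alpha * keyword_score(query, m["text"])
         + (1 - alpha) * cosine(query_vec, m["vector"]), m["text"])
        for m in memories
    ]
    return sorted(scored, reverse=True)


memories = [
    {"text": "user prefers dark roast coffee", "vector": [1.0, 0.0]},
    {"text": "project deadline is friday", "vector": [0.0, 1.0]},
]
results = hybrid_search("coffee preference", [0.9, 0.1], memories)
print(results[0][1])
```

Keyword matching catches exact terms that embeddings may blur, while vector similarity catches paraphrases, which is the usual motivation for combining them.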
## 2. Task Planning and Tool Use
Tools are the core of how the Agent accesses operating system resources. The Agent intelligently selects and invokes tools based on task requirements, performing file read/write, command execution, scheduled tasks, and more. Built-in tools are implemented in the project's `agent/tools/` directory.
**Key tools:** file read/write/edit, Bash terminal, file send, scheduler, memory search, web search, environment config, and more.
### 2.1 Terminal and File Access
Access to the OS terminal and file system is the most fundamental and core capability. Many other tools and skills build on top of this. Users can interact with the Agent from a mobile device to operate resources on their personal computer or server:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202181130.png" width="800" />
</Frame>
### 2.2 Programming Capability
Combining programming and system access, the Agent can execute the complete **Vibecoding workflow** — from information search, asset generation, coding, testing, deployment, Nginx configuration, to publishing — all triggered by a single command from your phone:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
</Frame>
### 2.3 Scheduled Tasks
The `scheduler` tool enables dynamic scheduled tasks, supporting **one-time tasks, fixed intervals, and Cron expressions**. Tasks can be triggered as either a **fixed message send** or an **Agent dynamic task** execution:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
</Frame>
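The one-time and fixed-interval schedule types can be pictured with a small sketch. This is not the `scheduler` tool's actual implementation: `next_run` and the `kind` labels `"once"` and `"interval"` are hypothetical names chosen for illustration, and Cron expressions are left out because they would need a dedicated parser.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_run(kind: str, now: datetime, *,
             run_at: Optional[datetime] = None,
             interval_seconds: Optional[int] = None,
             last_run: Optional[datetime] = None) -> Optional[datetime]:
    """Return the next time a task should fire, or None if it will not fire again.

    `kind` is a hypothetical label: "once" or "interval".
    """
    if kind == "once":
        if run_at is None:
            raise ValueError("a one-time task needs run_at")
        # A one-time task fires at run_at unless that moment has already passed.
        return run_at if run_at > now else None
    if kind == "interval":
        if interval_seconds is None or interval_seconds <= 0:
            raise ValueError("an interval task needs a positive interval_seconds")
        # The first run is one interval from now; later runs follow the previous run.
        base = last_run if last_run is not None else now
        return base + timedelta(seconds=interval_seconds)
    raise ValueError(f"unsupported kind: {kind}")
```

A real scheduler would also have to persist tasks and decide what to do about runs missed while the process was down; the sketch ignores both.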
### 2.4 Environment Variable Management
Secrets required by skills are stored in an environment variable file, managed by the `env_config` tool. You can update secrets through conversation, with built-in protection and masking of sensitive values:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234939.png" width="800" />
</Frame>
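Masking a secret before it is echoed back in a conversation might look like the sketch below. The function name `mask_secret` and the choice to reveal only the last four characters are assumptions made for illustration; the `env_config` tool's real masking rules are not described here.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of a secret (hypothetical helper)."""
    if len(value) <= visible:
        # Too short to reveal anything safely, so mask every character.
        return "*" * len(value)
    hidden = len(value) - visible
    # Slice from an explicit index so that visible=0 reveals nothing.
    return "*" * hidden + value[hidden:]
```

Revealing a short suffix lets you confirm which key is configured without exposing the whole value.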
## 3. Skills System
The Skills system makes the Agent extensible. Each Skill consists of a description file, execution scripts (optional), and resources (optional), describing how to complete specific types of tasks. Skills allow the Agent to follow instructions for complex workflows, invoke tools, or integrate third-party systems.
- **Built-in skills:** Located in the project's `skills/` directory, including skill creator, image recognition, LinkAI agent, web fetch, and more. Built-in skills are automatically enabled based on dependency conditions (API keys, system commands, etc.).
- **Custom skills:** Created by users through conversation, stored in the workspace (`~/cow/skills/`), capable of implementing any complex business process or third-party integration.
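Enabling a built-in skill only when its dependencies are present can be sketched as follows. The helper `skill_enabled` and its parameters are hypothetical; the project's actual dependency check may be structured differently.

```python
import os
import shutil
from typing import Iterable, Mapping, Optional


def skill_enabled(required_env: Iterable[str] = (),
                  required_bins: Iterable[str] = (),
                  env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when every required API key and system command is available."""
    env = os.environ if env is None else env
    # An API key counts as present only if it is set to a non-empty value.
    has_env = all(env.get(name) for name in required_env)
    # shutil.which returns None when a command is not found on PATH.
    has_bins = all(shutil.which(cmd) is not None for cmd in required_bins)
    return has_env and has_bins
```

For example, `skill_enabled(["OPENAI_API_KEY"])` would stay `False` until that key is configured.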
### 3.1 Creating Skills
The `skill-creator` skill enables rapid skill creation through conversation. You can ask the Agent to codify a workflow as a skill, or send any API documentation and examples for the Agent to complete the integration directly:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
</Frame>
### 3.2 Web Search and Image Recognition
- **Web search:** Built-in `web_search` tool, supports multiple search engines. Configure `BOCHA_API_KEY` or `LINKAI_API_KEY` to enable.
- **Image recognition:** Built-in `openai-image-vision` skill, supports `gpt-4.1-mini`, `gpt-4.1`, and other models. Requires `OPENAI_API_KEY`.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
</Frame>
### 3.3 Third-party Knowledge Bases and Plugins
The `linkai-agent` skill makes all agents on [LinkAI](https://link-ai.tech/) available as Skills for the Agent, enabling multi-agent decision making.
Configuration: set `LINKAI_API_KEY` via `env_config`, then add agent descriptions in `skills/linkai-agent/config.json`:
```json
{
"apps": [
{
"app_code": "G7z6vKwp",
"app_name": "LinkAI Customer Support",
"app_description": "Select only when the user needs help with LinkAI platform questions"
},
{
"app_code": "SFY5x7JR",
"app_name": "Content Creator",
"app_description": "Use only when the user needs to create images or videos"
}
]
}
```
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
</Frame>
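Before restarting the Agent, it can help to sanity-check the edited `config.json`. The sketch below assumes that all three fields shown in the example (`app_code`, `app_name`, `app_description`) are required for every entry; that rule and the helper name `validate_apps` are assumptions, not documented behavior of the `linkai-agent` skill.

```python
import json
from typing import List

# Assumed to be required because every entry in the example config has them.
REQUIRED_FIELDS = ("app_code", "app_name", "app_description")


def validate_apps(config_text: str) -> List[str]:
    """Return a list of problems found in the config text; an empty list means OK."""
    config = json.loads(config_text)
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]
    apps = config.get("apps")
    if not isinstance(apps, list) or not apps:
        return ["'apps' must be a non-empty list"]
    problems = []
    for i, app in enumerate(apps):
        for field in REQUIRED_FIELDS:
            # Report each missing or empty field separately.
            if not isinstance(app, dict) or not app.get(field):
                problems.append(f"apps[{i}] is missing '{field}'")
    return problems
```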

68
docs/en/intro/index.mdx Normal file

@@ -0,0 +1,68 @@
---
title: Introduction
description: CowAgent - AI Super Assistant powered by LLMs
---
<img src="https://cdn.link-ai.tech/doc/78c5dd674e2c828642ecc0406669fed7.png" alt="CowAgent" width="600px"/>
**CowAgent** is an AI super assistant powered by LLMs with autonomous task planning, long-term memory, skills system, multimodal messages, multiple model support, and multi-platform deployment.
CowAgent can proactively think and plan tasks, operate computers and external resources, create and execute Skills, and continuously grow with long-term memory. It supports flexible switching between multiple models, handles text, voice, images, files, and other multimodal messages, and can be integrated into web, Feishu, DingTalk, WeCom, and WeChat Official Account. It can run 24/7 on your personal computer or server.
<Card title="GitHub" icon="github" href="https://github.com/zhayujie/chatgpt-on-wechat">
github.com/zhayujie/chatgpt-on-wechat
</Card>
## Core Capabilities
<CardGroup cols={2}>
<Card title="Autonomous Task Planning" icon="brain" href="/en/intro/architecture">
Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved. Supports accessing file systems, terminals, browsers, schedulers, and other system resources through tools.
</Card>
<Card title="Long-term Memory" icon="database" href="/en/memory">
Automatically persists conversation memory to local files and databases, including core memory and daily memory, with keyword and vector retrieval support.
</Card>
<Card title="Skills System" icon="puzzle-piece" href="/en/skills/index">
Implements a Skills creation and execution engine with built-in skills, and supports custom Skills development through natural language conversation.
</Card>
<Card title="Multimodal Messages" icon="image" href="/en/channels/web">
Supports parsing, processing, generating, and sending text, images, voice, files, and other message types.
</Card>
<Card title="Multiple Model Support" icon="microchip" href="/en/models/index">
Supports mainstream model providers including OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao, and more.
</Card>
<Card title="Multi-platform Deployment" icon="server" href="/en/channels/web">
Runs on local computers or servers, integrable into web, Feishu, DingTalk, WeChat Official Account, and WeCom applications.
</Card>
</CardGroup>
## Quick Experience
Run the following command in your terminal for one-click install, configuration, and startup:
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
By default, the Web service starts after running. Access `http://localhost:9899/chat` to chat in the web interface.
<CardGroup cols={2}>
<Card title="Quick Start" icon="rocket" href="/en/guide/quick-start">
Complete installation and run guide
</Card>
<Card title="Architecture" icon="sitemap" href="/en/intro/architecture">
CowAgent system architecture design
</Card>
</CardGroup>
## Disclaimer
1. This project follows the [MIT License](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/LICENSE) and is intended for technical research and learning. Users must comply with local laws, regulations, policies, and corporate bylaws. Any illegal or rights-infringing use is prohibited.
2. Agent mode consumes more tokens than normal chat mode. Choose models based on effectiveness and cost. The Agent has access to the host operating system, so deploy with caution.
3. CowAgent focuses on open-source development and does not participate in, authorize, or issue any cryptocurrency.
## Community
Add our assistant on WeChat to join the open-source community:
<img width="140" src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/open-community.png" />

66
docs/en/memory.mdx Normal file

@@ -0,0 +1,66 @@
---
title: Memory
description: CowAgent long-term memory system
---
The memory system enables the Agent to remember important information over time, accumulating experience and learning user preferences across conversations.
## Memory Types
### Core Memory (MEMORY.md)
Stored in `~/cow/MEMORY.md`, containing long-term user preferences, important decisions, key facts, and other information that doesn't fade over time. Automatically injected into the system prompt on every conversation turn as background knowledge.
### Daily Memory (memory/YYYY-MM-DD.md)
Stored in `~/cow/memory/` directory, named by date (e.g. `2026-03-08.md`), recording daily conversation summaries and key events. Files are only created on first write to avoid generating empty files.
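The date-based naming and create-on-first-write behavior can be sketched like this. The helper names `daily_memory_path` and `append_daily_memory` are hypothetical; only the `memory/YYYY-MM-DD.md` layout comes from the description above.

```python
from datetime import date
from pathlib import Path


def daily_memory_path(workspace: Path, day: date) -> Path:
    """Build the path of the daily memory file for a given day."""
    # date.isoformat() yields YYYY-MM-DD, matching the documented file names.
    return workspace / "memory" / f"{day.isoformat()}.md"


def append_daily_memory(workspace: Path, day: date, text: str) -> Path:
    """Append a summary; the directory and file are created only on this first write."""
    path = daily_memory_path(workspace, day)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(text.rstrip("\n") + "\n")
    return path
```

Because nothing touches the file system until `append_daily_memory` is called, days without any summary leave no empty file behind.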
## Memory Writing
The Agent automatically persists conversation content to daily memory through the following mechanisms:
- **On context trimming** — When conversation turns or tokens exceed the configured limit, the oldest half of the context is trimmed in batch, and the discarded content is summarized by LLM into key information and written to the daily memory file
- **Daily scheduled summary** — A full summary is automatically triggered at 23:55 every day, ensuring memory is preserved even on low-activity days (skipped if content hasn't changed)
- **On API context overflow** — When the model API returns a context overflow error, the current conversation summary is saved as an emergency measure
All memory writes run asynchronously in a background thread (LLM summarization + file writing), never blocking normal conversation replies.
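The trimming step can be illustrated with a minimal sketch. It only covers the split by turn count; the token-based limit, the LLM summarization of the discarded part, and the background thread are omitted, and `trim_oldest_half` is a hypothetical name rather than the project's function.

```python
from typing import List, Tuple


def trim_oldest_half(turns: List[str], max_turns: int) -> Tuple[List[str], List[str]]:
    """Split turns into (kept, discarded) once the turn limit is exceeded."""
    if len(turns) <= max_turns:
        # Under the limit: keep everything and discard nothing.
        return turns, []
    cut = len(turns) // 2
    # The oldest half is discarded in one batch; the newest turns stay in context.
    return turns[cut:], turns[:cut]
```

In this sketch the `discarded` list is what would be handed to the summarizer and written to the daily memory file.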
## First Launch
On first launch, the Agent will proactively ask the user for key information and save it to the workspace (default `~/cow`):
| File | Description |
| --- | --- |
| `system.md` | Agent system prompt and behavior settings |
| `user.md` | User identity information and preferences |
| `MEMORY.md` | Core memory (long-term) |
| `memory/YYYY-MM-DD.md` | Daily memory (created on demand) |
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
## Memory Retrieval
The memory system supports hybrid retrieval modes:
- **Keyword retrieval** — Match historical memory based on keywords
- **Vector retrieval** — Semantic similarity search, finds relevant memory even with different wording
The Agent automatically triggers memory retrieval during conversation as needed, incorporating relevant historical information into context. Core memory (`MEMORY.md`) is always injected into the system prompt, while daily memory is loaded on demand via retrieval.
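One common way to combine the two retrieval modes is to blend a keyword score with a vector similarity score. The sketch below is a toy illustration under that assumption: the weight `alpha`, the word-overlap scoring, and the hand-written two-dimensional vectors are all made up, and a real system would obtain vectors from an embedding model. It is not CowAgent's actual retrieval code.

```python
import math
from typing import List, Sequence, Tuple


def keyword_score(query: str, text: str) -> float:
    """Fraction of query words that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_rank(query: str, query_vec: Sequence[float],
                entries: List[Tuple[str, Sequence[float]]],
                alpha: float = 0.5) -> List[str]:
    """Rank memory texts by a weighted blend of keyword and vector scores."""
    scored = [
        (alpha * keyword_score(query, text) + (1 - alpha) * cosine(query_vec, vec), text)
        for text, vec in entries
    ]
    return [text for _, text in sorted(scored, key=lambda s: s[0], reverse=True)]
```

With this blend, a query that shares no words with a memory entry can still rank it first when the vectors are close, which is the "different wording" case mentioned above.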
## Configuration
```json
{
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 20
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `agent_workspace` | Workspace path, memory files stored under this directory | `~/cow` |
| `agent_max_context_tokens` | Max context tokens; when exceeded, half is trimmed and summarized into memory | `40000` |
| `agent_max_context_turns` | Max context turns; when exceeded, half is trimmed and summarized into memory | `20` |
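Reading these three settings with their documented defaults might look like the sketch below. The helper `load_memory_settings` is hypothetical and is not how the project itself loads `config.json`; only the keys and default values come from the table above.

```python
import json
import os

# Defaults as listed in the parameter table.
DEFAULTS = {
    "agent_workspace": "~/cow",
    "agent_max_context_tokens": 40000,
    "agent_max_context_turns": 20,
}


def load_memory_settings(config_text: str) -> dict:
    """Merge user-provided values over the defaults and expand the workspace path."""
    user = json.loads(config_text)
    settings = {key: user.get(key, default) for key, default in DEFAULTS.items()}
    # Expand "~" so that the path can be used directly for file operations.
    settings["agent_workspace"] = os.path.expanduser(settings["agent_workspace"])
    return settings
```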

17
docs/en/models/claude.mdx Normal file

@@ -0,0 +1,17 @@
---
title: Claude
description: Claude model configuration
---
```json
{
"model": "claude-sonnet-4-6",
"claude_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, etc. See [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
| `claude_api_key` | Create at [Claude Console](https://console.anthropic.com/settings/keys) |
| `claude_api_base` | Optional. Defaults to `https://api.anthropic.com/v1`. Change to use third-party proxy |


@@ -0,0 +1,22 @@
---
title: DeepSeek
description: DeepSeek model configuration
---
Use OpenAI-compatible configuration:
```json
{
"model": "deepseek-chat",
"bot_type": "chatGPT",
"open_ai_api_key": "YOUR_API_KEY",
"open_ai_api_base": "https://api.deepseek.com/v1"
}
```
| Parameter | Description |
| --- | --- |
| `model` | `deepseek-chat` (DeepSeek-V3), `deepseek-reasoner` (DeepSeek-R1) |
| `bot_type` | Must be `chatGPT` (OpenAI-compatible mode) |
| `open_ai_api_key` | Create at [DeepSeek Platform](https://platform.deepseek.com/api_keys) |
| `open_ai_api_base` | DeepSeek API base URL |

17
docs/en/models/doubao.mdx Normal file

@@ -0,0 +1,17 @@
---
title: Doubao (ByteDance)
description: Doubao (Volcano Ark) model configuration
---
```json
{
"model": "doubao-seed-2-0-code-preview-260215",
"ark_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `doubao-seed-2-0-code-preview-260215`, `doubao-seed-2-0-pro-260215`, `doubao-seed-2-0-lite-260215`, etc. |
| `ark_api_key` | Create at [Volcano Ark Console](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) |
| `ark_base_url` | Optional. Defaults to `https://ark.cn-beijing.volces.com/api/v3` |

16
docs/en/models/gemini.mdx Normal file

@@ -0,0 +1,16 @@
---
title: Gemini
description: Google Gemini model configuration
---
```json
{
"model": "gemini-3.1-pro-preview",
"gemini_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `gemini-3.1-flash-lite-preview`, `gemini-3.1-pro-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, etc. See [official docs](https://ai.google.dev/gemini-api/docs/models) |
| `gemini_api_key` | Create at [Google AI Studio](https://aistudio.google.com/app/apikey) |

27
docs/en/models/glm.mdx Normal file

@@ -0,0 +1,27 @@
---
title: GLM (Zhipu AI)
description: Zhipu AI GLM model configuration
---
```json
{
"model": "glm-5",
"zhipu_ai_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, etc. See [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
| `zhipu_ai_api_key` | Create at [Zhipu AI Console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) |
OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "chatGPT",
"model": "glm-5",
"open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
"open_ai_api_key": "YOUR_API_KEY"
}
```

55
docs/en/models/index.mdx Normal file

@@ -0,0 +1,55 @@
---
title: Models Overview
description: Supported models and recommended choices for CowAgent
---
CowAgent supports mainstream LLMs from Chinese and international providers. Model interfaces are implemented in the project's `models/` directory.
<Note>
For Agent mode, the following models are recommended based on quality and cost: MiniMax-M2.5, glm-5, kimi-k2.5, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview
</Note>
## Configuration
Configure the model name and API key in `config.json` according to your chosen model. Each model also supports OpenAI-compatible access by setting `bot_type` to `chatGPT` and configuring `open_ai_api_base` and `open_ai_api_key`.
You can also use the [LinkAI](https://link-ai.tech) platform interface to flexibly switch between multiple models with support for knowledge base, workflows, and other Agent capabilities.
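The OpenAI-compatible access described above always uses the same four keys, so generating the config can be sketched with a small helper. The base URLs are the ones listed on the individual model pages of these docs; the provider labels (`"deepseek"`, `"glm"`, and so on) and the function `compatible_config` are hypothetical names introduced for illustration.

```python
import json

# Base URLs as listed on the individual model pages; the dictionary keys are made-up labels.
COMPATIBLE_BASES = {
    "deepseek": "https://api.deepseek.com/v1",
    "glm": "https://open.bigmodel.cn/api/paas/v4",
    "kimi": "https://api.moonshot.cn/v1",
    "minimax": "https://api.minimaxi.com/v1",
    "qwen": "https://dashscope.aliyuncs.com/compatible-mode/v1",
}


def compatible_config(provider: str, model: str, api_key: str) -> dict:
    """Build the config.json fragment for OpenAI-compatible access."""
    return {
        "bot_type": "chatGPT",  # selects OpenAI-compatible mode
        "model": model,
        "open_ai_api_base": COMPATIBLE_BASES[provider],
        "open_ai_api_key": api_key,
    }


if __name__ == "__main__":
    print(json.dumps(compatible_config("kimi", "kimi-k2.5", "YOUR_API_KEY"), indent=2))
```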
## Supported Models
<CardGroup cols={2}>
<Card title="MiniMax" href="/en/models/minimax">
MiniMax-M2.5 and other series models
</Card>
<Card title="GLM (Zhipu AI)" href="/en/models/glm">
glm-5, glm-4.7 and other series models
</Card>
<Card title="Qwen (Tongyi Qianwen)" href="/en/models/qwen">
qwen3.5-plus, qwen3-max and more
</Card>
<Card title="Kimi" href="/en/models/kimi">
kimi-k2.5, kimi-k2 and more
</Card>
<Card title="Doubao (ByteDance)" href="/en/models/doubao">
doubao-seed series models
</Card>
<Card title="Claude" href="/en/models/claude">
claude-sonnet-4-6 and more
</Card>
<Card title="Gemini" href="/en/models/gemini">
gemini-3.1-pro-preview and more
</Card>
<Card title="OpenAI" href="/en/models/openai">
gpt-5.4, gpt-4.1, o-series and more
</Card>
<Card title="DeepSeek" href="/en/models/deepseek">
deepseek-chat, deepseek-reasoner
</Card>
<Card title="LinkAI" href="/en/models/linkai">
Unified multi-model interface + knowledge base
</Card>
</CardGroup>
<Tip>
For a full list of model names, refer to the project's [`common/const.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py) file.
</Tip>

27
docs/en/models/kimi.mdx Normal file
View File

@@ -0,0 +1,27 @@
---
title: Kimi (Moonshot)
description: Kimi (Moonshot) model configuration
---
```json
{
"model": "kimi-k2.5",
"moonshot_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
| `moonshot_api_key` | Create at [Moonshot Console](https://platform.moonshot.cn/console/api-keys) |
OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "chatGPT",
"model": "kimi-k2.5",
"open_ai_api_base": "https://api.moonshot.cn/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```

21
docs/en/models/linkai.mdx Normal file

@@ -0,0 +1,21 @@
---
title: LinkAI
description: Unified access to multiple models via LinkAI platform
---
The [LinkAI](https://link-ai.tech) platform lets you flexibly switch between OpenAI, Claude, Gemini, DeepSeek, Qwen, Kimi, and other models, with support for knowledge base, workflows, plugins, and other Agent capabilities.
```json
{
"use_linkai": true,
"linkai_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `use_linkai` | Set to `true` to enable LinkAI interface |
| `linkai_api_key` | Create at [LinkAI Console](https://link-ai.tech/console/interface) |
| `model` | Leave empty to use the agent's default model. Can be switched flexibly on the platform. All models in the [model list](https://link-ai.tech/console/models) are supported |
See the [API documentation](https://docs.link-ai.tech/platform/api) for more details.


@@ -0,0 +1,27 @@
---
title: MiniMax
description: MiniMax model configuration
---
```json
{
"model": "MiniMax-M2.5",
"minimax_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `MiniMax-M2.5`, `MiniMax-M2.1`, `MiniMax-M2.1-lightning`, `MiniMax-M2`, etc. |
| `minimax_api_key` | Create at [MiniMax Console](https://platform.minimaxi.com/user-center/basic-information/interface-key) |
OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "chatGPT",
"model": "MiniMax-M2.5",
"open_ai_api_base": "https://api.minimaxi.com/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```

19
docs/en/models/openai.mdx Normal file
View File

@@ -0,0 +1,19 @@
---
title: OpenAI
description: OpenAI model configuration
---
```json
{
"model": "gpt-5.4",
"open_ai_api_key": "YOUR_API_KEY",
"open_ai_api_base": "https://api.openai.com/v1"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Matches the [model parameter](https://platform.openai.com/docs/models) of the OpenAI API. Supports o-series, gpt-5.4, gpt-5 series, gpt-4.1, etc. Recommended for Agent mode: `gpt-5.4` |
| `open_ai_api_key` | Create at [OpenAI Platform](https://platform.openai.com/api-keys) |
| `open_ai_api_base` | Optional. Change to use third-party proxy |
| `bot_type` | Not required for official OpenAI models. Set to `chatGPT` when using Claude or other non-OpenAI models via proxy |

27
docs/en/models/qwen.mdx Normal file

@@ -0,0 +1,27 @@
---
title: Qwen (Tongyi Qianwen)
description: Tongyi Qianwen model configuration
---
```json
{
"model": "qwen3.5-plus",
"dashscope_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `qwen3.5-plus`, `qwen3-max`, `qwen-max`, `qwen-plus`, `qwen-turbo`, `qwq-plus`, etc. |
| `dashscope_api_key` | Create at [Bailian Console](https://bailian.console.aliyun.com/?tab=model#/api-key). See [official docs](https://bailian.console.aliyun.com/?tab=api#/api) |
OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "chatGPT",
"model": "qwen3.5-plus",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```


@@ -0,0 +1,23 @@
---
title: Changelog
description: CowAgent version history
---
| Version | Date | Description |
| --- | --- | --- |
| [2.0.2](/en/releases/v2.0.2) | 2026.02.27 | Web Console upgrade, multi-channel concurrency, session persistence |
| [2.0.1](/en/releases/v2.0.1) | 2026.02.27 | Built-in Web Search tool, smart context management, multiple fixes |
| [2.0.0](/en/releases/v2.0.0) | 2026.02.03 | Full upgrade to AI super assistant |
| 1.7.6 | 2025.05.23 | Web Channel optimization, AgentMesh plugin |
| 1.7.5 | 2025.04.11 | DeepSeek model |
| 1.7.4 | 2024.12.13 | Gemini 2.0 model, Web Channel |
| 1.7.3 | 2024.10.31 | Stability improvements, database features |
| 1.7.2 | 2024.09.26 | One-click install script, o1 model |
| 1.7.0 | 2024.08.02 | iFlytek 4.0 model, knowledge base references |
| 1.6.9 | 2024.07.19 | gpt-4o-mini, Alibaba voice recognition |
| 1.6.8 | 2024.07.05 | Claude 3.5, Gemini 1.5 Pro |
| 1.6.0 | 2024.04.26 | Kimi integration, gpt-4-turbo upgrade |
| 1.5.0 | 2023.11.10 | gpt-4-turbo, dall-e-3, tts multimodal |
| 1.0.0 | 2022.12.12 | Project created, first ChatGPT integration |
See [GitHub Releases](https://github.com/zhayujie/chatgpt-on-wechat/releases) for full history.


@@ -0,0 +1,63 @@
---
title: v2.0.0
description: CowAgent 2.0 - Full upgrade from chatbot to AI super assistant
---
CowAgent 2.0 is a comprehensive upgrade from a chatbot to an **AI super assistant** — capable of autonomous thinking and task planning, long-term memory, operating computers, and creating and executing skills.
**Release Date**: 2026.02.03 | [GitHub Release](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.0)
## Key Updates
### Agent Core
- **Complex Task Planning**: Autonomous planning with multi-turn reasoning
- **Long-term Memory**: Persistent memory with keyword and vector search
- **Built-in Tools**: 10+ tools including file ops, Bash, browser, scheduler
- **Web Search**: Built-in `web_search` tool that supports multiple search engines; configure the corresponding API key to use it
- **Skills System**: Skill engine with built-in and custom skill support
- **Security & Cost**: Secret management, prompt controls, token limits
### Other
- **Channels**: Feishu/DingTalk WebSocket support, image/file messages
- **Models**: claude-sonnet-4-5, gemini-3-pro-preview, glm-4.7, MiniMax-M2.1, qwen3-max
- **Deployment**: One-click install, configure, run, and management script
## Long-term Memory
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
## Task Planning & Tools
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202181130.png" width="800" />
</Frame>
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
</Frame>
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
</Frame>
## Skills System
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
</Frame>
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
</Frame>
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
</Frame>
## Contributing
You are welcome to [submit feedback](https://github.com/zhayujie/chatgpt-on-wechat/issues) and [contribute code](https://github.com/zhayujie/chatgpt-on-wechat/pulls).


@@ -0,0 +1,36 @@
---
title: v2.0.1
description: CowAgent 2.0.1 - Built-in Web Search, smart context management, multiple fixes
---
**Release Date**: 2026.02.27 | [Full Changelog](https://github.com/zhayujie/chatgpt-on-wechat/compare/2.0.0..2.0.1)
## New Features
- **Built-in Web Search tool**: Integrated web search as a built-in Agent tool, reducing decision cost ([4f0ea5d](https://github.com/zhayujie/chatgpt-on-wechat/commit/4f0ea5d7568d61db91ff69c91c429e785fd1b1c2))
- **Claude Opus 4.6 model support**: Added support for Claude Opus 4.6 model ([#2661](https://github.com/zhayujie/chatgpt-on-wechat/pull/2661))
- **WeCom image recognition**: Support image message recognition in WeCom channel ([#2667](https://github.com/zhayujie/chatgpt-on-wechat/pull/2667))
## Improvements
- **Smart context management**: Resolved chat context overflow with an intelligent context trimming strategy that keeps conversations within token limits ([cea7fb7](https://github.com/zhayujie/chatgpt-on-wechat/commit/cea7fb7490c53454602bf05955a0e9f059bcf0fd), [8acf2db](https://github.com/zhayujie/chatgpt-on-wechat/commit/8acf2dbdfe713b84ad74b761b7f86674b1c1904d)) [#2663](https://github.com/zhayujie/chatgpt-on-wechat/issues/2663)
- **Runtime info dynamic update**: Automatic update of timestamps and other runtime info in system prompts via dynamic functions ([#2655](https://github.com/zhayujie/chatgpt-on-wechat/pull/2655), [#2657](https://github.com/zhayujie/chatgpt-on-wechat/pull/2657))
- **Skill prompt optimization**: Improved Skill system prompt generation, simplified tool descriptions for better Agent performance ([6c21833](https://github.com/zhayujie/chatgpt-on-wechat/commit/6c218331b1f1208ea8be6bf226936d3b556ade3e))
- **GLM custom API Base URL**: Support custom API Base URL for GLM models ([#2660](https://github.com/zhayujie/chatgpt-on-wechat/pull/2660))
- **Startup script optimization**: Improved `run.sh` script interaction and configuration flow ([#2656](https://github.com/zhayujie/chatgpt-on-wechat/pull/2656))
- **Decision step logging**: Added Agent decision step logging for debugging ([cb303e6](https://github.com/zhayujie/chatgpt-on-wechat/commit/cb303e6109c50c8dfef1f5e6c1ec47223bf3cd11))
## Bug Fixes
- **Scheduler memory loss**: Fixed memory loss caused by Scheduler dispatcher ([a77a874](https://github.com/zhayujie/chatgpt-on-wechat/commit/a77a8741b500a408c6f5c8868856fb4b018fe9db))
- **Empty tool calls & long results**: Fixed handling of empty tool calls and excessively long tool results ([0542700](https://github.com/zhayujie/chatgpt-on-wechat/commit/0542700f9091ebb08c1a56103b0f0f45f24aa621))
- **OpenAI Function Call**: Fixed function call compatibility with OpenAI models ([158c87a](https://github.com/zhayujie/chatgpt-on-wechat/commit/158c87ab8b05bae054cc1b4eacdbb64fc1062ba9))
- **Claude tool name field**: Removed extraneous tool name field from Claude model responses ([eec10cb](https://github.com/zhayujie/chatgpt-on-wechat/commit/eec10cb5db6a3d5bc12ef606606532237d2c5f6e))
- **MiniMax reasoning**: Optimized MiniMax model reasoning content handling and hid the thinking process output ([c72cda3](https://github.com/zhayujie/chatgpt-on-wechat/commit/c72cda33864bd1542012ee6e0a8bd8c6c88cb5ed), [72b1cac](https://github.com/zhayujie/chatgpt-on-wechat/commit/72b1cacea1ba0d1f3dedacbab2e088e98fd7e172))
- **GLM thinking process**: Hid the GLM model's thinking process from display ([72b1cac](https://github.com/zhayujie/chatgpt-on-wechat/commit/72b1cacea1ba0d1f3dedacbab2e088e98fd7e172))
- **Feishu connection & SSL**: Fixed Feishu channel SSL certificate errors and connection issues ([229b14b](https://github.com/zhayujie/chatgpt-on-wechat/commit/229b14b6fcabe7123d53cab1dea39f38dab26d6d), [8674421](https://github.com/zhayujie/chatgpt-on-wechat/commit/867442155e7f095b4f38b0856f8c1d8312b5fcf7))
- **model_type validation**: Fixed `AttributeError` caused by non-string `model_type` ([#2666](https://github.com/zhayujie/chatgpt-on-wechat/pull/2666))
## Platform Compatibility
- **Windows compatibility**: Fixed path handling, file encoding, and `os.getuid()` unavailability on Windows across multiple tool modules ([051ffd7](https://github.com/zhayujie/chatgpt-on-wechat/commit/051ffd78a372f71a967fd3259e37fe19131f83cf), [5264f7c](https://github.com/zhayujie/chatgpt-on-wechat/commit/5264f7ce18360ee4db5dcb4ebe67307977d40014))

Some files were not shown because too many files have changed in this diff.